1
Puławski W, Koliński A, Koliński M. Integrative modeling of diverse protein-peptide systems using CABS-dock. PLoS Comput Biol 2023; 19:e1011275. [PMID: 37405984 DOI: 10.1371/journal.pcbi.1011275] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Accepted: 06/15/2023] [Indexed: 07/07/2023] Open
Abstract
The CABS model can be applied to a wide range of protein-protein and protein-peptide molecular modeling tasks, such as simulating folding pathways, predicting structures, docking, and analyzing the structural dynamics of molecular complexes. In this work, we use the CABS-dock tool in two diverse modeling tasks: 1) predicting the structures of amyloid protofilaments and 2) identifying cleavage sites in the peptide substrates of proteolytic enzymes. In the first case, simulations of the simultaneous docking of amyloidogenic peptides indicated that the CABS model can accurately predict the structures of amyloid protofilaments which have an in-register parallel architecture. Scoring based on a combination of symmetry criteria and estimated interaction energy values for bound monomers enables the identification of protofilament models that closely match their experimental structures for 5 out of 6 analyzed systems. For the second task, it has been shown that CABS-dock coarse-grained docking simulations can be used to identify the positions of cleavage sites in the peptide substrates of proteolytic enzymes. The cleavage site position was correctly identified for 12 out of 15 analyzed peptides. When combined with sequence-based methods, these docking simulations may lead to an efficient way of predicting cleavage sites in degraded proteins. The method also provides the atomic structures of enzyme-substrate complexes, which can give insights into enzyme-substrate interactions that are crucial for the design of new potent inhibitors.
Affiliation(s)
- Wojciech Puławski
- Bioinformatics Laboratory, Mossakowski Medical Research Institute, Polish Academy of Sciences, Warsaw, Poland
- Michał Koliński
- Bioinformatics Laboratory, Mossakowski Medical Research Institute, Polish Academy of Sciences, Warsaw, Poland
2
Onah E, Uzor PF, Ugwoke IC, Eze JU, Ugwuanyi ST, Chukwudi IR, Ibezim A. Prediction of HIV-1 protease cleavage site from octapeptide sequence information using selected classifiers and hybrid descriptors. BMC Bioinformatics 2022; 23:466. [DOI: 10.1186/s12859-022-05017-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 10/11/2022] [Indexed: 11/10/2022] Open
Abstract
Background
In most parts of the world, especially in underdeveloped countries, acquired immunodeficiency syndrome (AIDS) still remains a major cause of death, disability, and unfavorable economic outcomes. This has necessitated intensive research to develop effective therapeutic agents for the treatment of human immunodeficiency virus (HIV) infection, which is responsible for AIDS. Peptide cleavage by HIV-1 protease is an essential step in the replication of HIV-1. Thus, correct and timely prediction of the cleavage site of HIV-1 protease can significantly speed up and optimize the drug discovery process of novel HIV-1 protease inhibitors. In this work, we built and compared the performance of selected machine learning models for the prediction of HIV-1 protease cleavage site utilizing a hybrid of octapeptide sequence information comprising bond composition, amino acid binary profile (AABP), and physicochemical properties as numerical descriptors serving as input variables for some selected machine learning algorithms. Our work differs from antecedent studies exploring the same subject in the combination of octapeptide descriptors and method used. Instead of using various subsets of the dataset for training and testing the models, we combined the dataset, applied a 3-way data split, and then used a "stratified" 10-fold cross-validation technique alongside the testing set to evaluate the models.
Results
Among the 8 models evaluated in the “stratified” 10-fold CV experiment, logistic regression, multi-layer perceptron classifier, linear discriminant analysis, gradient boosting classifier, Naive Bayes classifier, and decision tree classifier, with AUC, F-score, and B. Acc. scores in the ranges of 0.91–0.96, 0.81–0.88, and 80.1–86.4%, respectively, had the closest predictive performance to the state-of-the-art model (AUC 0.96, F-score 0.80, and B. Acc. ~80.0%). In contrast, the perceptron classifier and K-nearest neighbors had statistically lower performance (AUC 0.77–0.82, F-score 0.53–0.69, and B. Acc. 60.0–68.5%) at p < 0.05. On further evaluation on the testing set, logistic regression and the multi-layer perceptron classifier performed best (AUC of 0.97, F-score > 0.89, and B. Acc. > 90.0%), though linear discriminant analysis, gradient boosting classifier, and Naive Bayes classifier also performed well (AUC > 0.94, F-score > 0.87, and B. Acc. > 86.0%).
Conclusions
Logistic regression and multi-layer perceptron classifiers have comparable predictive performances to the state-of-the-art model when octapeptide sequence descriptors consisting of AABP, bond composition, and standard physicochemical properties are used as input variables. In our future work, we hope to develop standalone software for HIV-1 protease cleavage site prediction utilizing the logistic regression algorithm and the aforementioned octapeptide sequence descriptors.
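The evaluation protocol described in this abstract (stratified k-fold cross-validation scored with F-score and B. Acc.) can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, B. Acc. is read here as balanced accuracy (an assumption), and the folds are dealt out without shuffling to keep the example deterministic.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Assign sample indices to k folds so each fold keeps similar class ratios."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal the indices of one class round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity (assumed meaning of 'B. Acc.')."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def f_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * tp / (2 * tp + fp + fn)


# Toy labels (hypothetical data): 1 = cleaved octapeptide, 0 = not cleaved.
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=10)
```

With these toy labels every fold receives 2 positives and 8 negatives, so each held-out fold mirrors the 20:80 class ratio of the full set. A real experiment would shuffle within each class before dealing.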
3
Prescott L. SARS-CoV-2 3CLpro whole human proteome cleavage prediction and enrichment/depletion analysis. Comput Biol Chem 2022; 98:107671. [PMID: 35429835 PMCID: PMC8958254 DOI: 10.1016/j.compbiolchem.2022.107671] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/09/2021] [Revised: 03/21/2022] [Accepted: 03/25/2022] [Indexed: 12/12/2022]
Abstract
A novel coronavirus (SARS-CoV-2) has devastated the globe as a pandemic that has killed millions of people. Widespread vaccination is still uncertain, so many scientific efforts have been directed toward discovering antiviral treatments. Many drugs are being investigated to inhibit the coronavirus main protease, 3CLpro, from cleaving its viral polyprotein, but few publications have addressed this protease’s interactions with the host proteome or their probable contribution to virulence. Too few host protein cleavages have been experimentally verified to fully understand 3CLpro’s global effects on relevant cellular pathways and tissues. Here, I set out to determine this protease’s targets and corresponding potential drug targets. Using a neural network trained on cleavages from 392 coronavirus proteomes with a Matthews correlation coefficient of 0.985, I predict that a large proportion of the human proteome is vulnerable to 3CLpro, with 4898 out of approximately 20,000 human proteins containing at least one putative cleavage site. These cleavages are nonrandomly distributed and are enriched in the epithelium along the respiratory tract, brain, testis, plasma, and immune tissues and depleted in olfactory and gustatory receptors despite the prevalence of anosmia and ageusia in COVID-19 patients. Affected cellular pathways include cytoskeleton/motor/cell adhesion proteins, nuclear condensation and other epigenetics, host transcription and RNAi, ribosomal stoichiometry and nascent-chain detection and degradation, ubiquitination, pattern recognition receptors, coagulation, lipoproteins, redox, and apoptosis. This whole proteome cleavage prediction demonstrates the importance of 3CLpro in expected and nontrivial pathways affecting virulence, leads me to propose more than a dozen potential therapeutic targets against coronaviruses, and should therefore be applied to all viral proteases and subsequently experimentally verified.
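The Matthews correlation coefficient quoted in this abstract is computed from the four cells of a binary confusion matrix. The sketch below shows the standard formula only; the function name `mcc` and the example counts are illustrative assumptions and do not come from the paper.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


perfect = mcc(tp=50, fp=0, tn=50, fn=0)     # every site classified correctly
chance = mcc(tp=25, fp=25, tn=25, fn=25)    # no better than random guessing
inverted = mcc(tp=0, fp=50, tn=0, fn=50)    # every site classified wrongly
```

The coefficient ranges from -1 to 1 and, unlike plain accuracy, stays informative when cleaved sites are much rarer than uncleaved ones.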
4
Li Z, Hu L, Tang Z, Zhao C. Predicting HIV-1 Protease Cleavage Sites With Positive-Unlabeled Learning. Front Genet 2021; 12:658078. [PMID: 33868387 PMCID: PMC8044780 DOI: 10.3389/fgene.2021.658078] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/25/2021] [Accepted: 03/08/2021] [Indexed: 11/13/2022] Open
Abstract
Understanding the substrate specificity of HIV-1 protease plays an essential role in the prevention of HIV infection. A variety of computational models have thus been developed to predict substrate sites that are cleaved by HIV-1 protease, but most of them follow a supervised learning scheme, building classifiers that treat experimentally verified cleavable sites as positive samples and unknown sites as negative samples. However, the negative set can contain noise, because it may include false negative samples. Hence, the classifiers are less accurate than they could be, owing to biased prediction results. In this work, unknown substrate sites are regarded as unlabeled samples instead of negative ones. We propose a novel positive-unlabeled learning algorithm, namely PU-HIV, for effective prediction of HIV-1 protease cleavage sites. Features used by PU-HIV are encoded from different perspectives of substrate sequences, including amino acid identities, coevolutionary patterns, and chemical properties. By adjusting the weights of errors generated by positive and unlabeled samples, a biased support vector machine classifier can be built to complete the prediction task. In comparison with state-of-the-art prediction models, benchmarking experiments using cross-validation and independent tests demonstrated the superior performance of PU-HIV in terms of AUC, PR-AUC, and F-measure. Thus, with PU-HIV, it is possible to identify previously unknown but physiologically existing substrate sites that can be cleaved by HIV-1 protease, providing valuable insights into designing novel HIV-1 protease inhibitors for HIV treatment.
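The idea of weighting errors on positive and unlabeled samples differently can be sketched as a class-weighted hinge-loss objective for a linear classifier. This is a generic illustration of a biased SVM objective, not the PU-HIV implementation: the abstract does not give the exact formulation, so the function name, the parameters `c_pos` and `c_unl`, and the toy data are all assumptions.

```python
def biased_svm_objective(w, b, samples, c_pos, c_unl):
    """Regularized hinge loss with separate penalties for positive and unlabeled samples.

    samples: list of (feature_vector, is_positive) pairs; unlabeled samples are
    treated as tentative negatives (y = -1) but penalized with c_unl instead of c_pos.
    """
    regularizer = 0.5 * sum(wi * wi for wi in w)
    loss = 0.0
    for x, is_positive in samples:
        y = 1.0 if is_positive else -1.0
        score = sum(wi * xi for wi, xi in zip(w, x)) + b
        hinge = max(0.0, 1.0 - y * score)
        loss += (c_pos if is_positive else c_unl) * hinge
    return regularizer + loss


# Toy example (hypothetical numbers): one positive and one unlabeled sample
# with identical features, so only the penalty weights differ.
samples = [([0.5, 0.0], True), ([0.5, 0.0], False)]
value = biased_svm_objective([1.0, 0.0], 0.0, samples, c_pos=10.0, c_unl=1.0)
```

Choosing `c_pos` larger than `c_unl` makes a misclassified verified site cost more than a misclassified unlabeled site, which tolerates the possibility that some unlabeled sites are in fact cleavable.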
Affiliation(s)
- Zhenfeng Li
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Lun Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- Zehai Tang
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
- Cheng Zhao
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
5
Hu L, Hu P, Luo X, Yuan X, You ZH. Incorporating the Coevolving Information of Substrates in Predicting HIV-1 Protease Cleavage Sites. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:2017-2028. [PMID: 31056514 DOI: 10.1109/tcbb.2019.2914208] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/09/2023]
Abstract
Human immunodeficiency virus 1 (HIV-1) protease (PR) plays a crucial role in the maturation of the virus. The study of the substrate specificity of HIV-1 PR strives to increase our understanding of how HIV-1 PR recognizes its various cleavage sites. To predict HIV-1 PR cleavage sites, most existing approaches have been developed solely on the basis of the homogeneity of substrate sequence information, using supervised classification techniques. Although efficient, these approaches are limited in their ability to explain their results and probably provide few insights into the mechanisms by which HIV-1 PR cleaves its substrates in a site-specific manner. In this work, a coevolutionary pattern-based prediction model for HIV-1 PR cleavage sites, namely EvoCleave, is proposed by integrating the coevolving information obtained from substrate sequences with a linear SVM classifier. The experimental results showed that EvoCleave yielded very promising performance in terms of ROC analysis and F-measure. We also prospectively assessed the biological significance of coevolutionary patterns by applying them to study three fundamental issues of HIV-1 PR cleavage sites. The analysis demonstrated that the coevolutionary patterns offer valuable insights into the substrate specificity of HIV-1 PR.
6
Singh D, Sisodia DS, Singh P. Multiobjective evolutionary-based multi-kernel learner for realizing transfer learning in the prediction of HIV-1 protease cleavage sites. Soft Comput 2020. [DOI: 10.1007/s00500-019-04487-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
7
Singh D, Sisodia DS, Singh P. Compositional framework for multitask learning in the identification of cleavage sites of HIV-1 protease. J Biomed Inform 2020; 102:103376. [PMID: 31935461 DOI: 10.1016/j.jbi.2020.103376] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/30/2019] [Revised: 12/19/2019] [Accepted: 01/08/2020] [Indexed: 11/18/2022]
Abstract
Inadequate patient samples and the cost of generating annotated data result in small datasets in the biomedical domain. Consequently, predictions from a model trained on a single small dataset usually reflect only that dataset's associations and fail to yield robust insights. To cope with data sparsity, a promising strategy of combining data from different related tasks has been applied in various applications. Motivated by successful work in various bioinformatics applications, we propose a multitask learning model based on multi-kernel learning that exploits the dependencies among related tasks. This work aims to combine the knowledge from experimental studies on different datasets to build stronger predictive models for HIV-1 protease cleavage site prediction. In this study, a set of peptide data from one source is referred to as a 'task', and to integrate interactions from multiple tasks, our method exploits the features and parameters shared across the data sources. The proposed framework uses feature integration, feature selection, multi-kernel learning, and a multifactorial evolutionary algorithm to model multitask learning. The framework considered seven different feature descriptors and four different kernel variants of support vector machines to form the optimal multi-kernel learning model. To validate the effectiveness of the model, performance measures such as average accuracy and area under the curve were evaluated. We also carried out Friedman and post hoc statistical tests to substantiate the significant improvement achieved by the proposed framework. The results of extensive experiments confirm that multitask learning can improve performance in cleavage site identification.
Affiliation(s)
- Deepak Singh
- Department of Computer Science and Engineering, National Institute of Technology, Raipur, C.G, India.
- Dilip Singh Sisodia
- Department of Computer Science and Engineering, National Institute of Technology, Raipur, C.G, India.
- Pradeep Singh
- Department of Computer Science and Engineering, National Institute of Technology, Raipur, C.G, India.
8
Cognitive Framework for HIV-1 Protease Cleavage Site Classification Using Evolutionary Algorithm. Arabian Journal for Science and Engineering 2019. [DOI: 10.1007/s13369-019-03871-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022]
9
Evolutionary based ensemble framework for realizing transfer learning in HIV-1 Protease cleavage sites prediction. Appl Intell 2018. [DOI: 10.1007/s10489-018-1323-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/14/2022]
10
Fathi A, Sadeghi R. A genetic programming method for feature mapping to improve prediction of HIV-1 protease cleavage site. Appl Soft Comput 2018. [DOI: 10.1016/j.asoc.2018.06.045] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 10/28/2022]
11
Singh O, Su ECY. Prediction of HIV-1 protease cleavage site using a combination of sequence, structural, and physicochemical features. BMC Bioinformatics 2016; 17:478. [PMID: 28155640 PMCID: PMC5259813 DOI: 10.1186/s12859-016-1337-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 12/04/2022] Open
Abstract
Background
The human immunodeficiency virus type 1 (HIV-1) aspartic protease is an important enzyme owing to its essential role in viral development; HIV-1 is the causative agent of the deadly disease known as acquired immune deficiency syndrome (AIDS). Development of HIV-1 protease inhibitors can help understand the specificity of substrates, which can restrain the replication of HIV-1 and thus antagonize AIDS. However, experimental methods for identifying HIV-1 protease cleavage sites are generally time-consuming and labor-intensive. Therefore, using computational methods to predict cleavage sites has become highly desirable.
Results
In this study, we propose a prediction method in which sequence, structural, and physicochemical features are incorporated in various machine learning algorithms. A bidirectional stepwise selection algorithm is then used in feature selection to identify discriminative features. Only the selected features are calculated by various encoding schemes and used as input for decision trees, logistic regression, and artificial neural networks. Moreover, a more rigorous three-way data split procedure is applied to evaluate the objective performance of cleavage site prediction. Four benchmark datasets collected from previous studies are used to evaluate the predictive performance.
Conclusions
Experimental results showed that combinations of sequence, structure, and physicochemical features performed better than a single feature type for identification of HIV-1 protease cleavage sites. In addition, stepwise feature selection is effective in identifying interpretable biological features that depict the specificity of the substrates. Moreover, artificial neural networks perform significantly better than the other two classifiers. Finally, the proposed method achieved 80.0% ~ 97.4% in accuracy and 0.815 ~ 0.995 evaluated by independent test sets in a three-way data split procedure.
Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1337-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Onkar Singh
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
- Emily Chia-Yu Su
- Graduate Institute of Biomedical Informatics, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan.
12
Koçak Y, Özyer T, Alhajj R. Utilizing maximal frequent itemsets and social network analysis for HIV data analysis. J Cheminform 2016. [PMCID: PMC5395515 DOI: 10.1186/s13321-016-0184-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/13/2023] Open
Abstract
Acquired immune deficiency syndrome is a deadly disease caused by the human immunodeficiency virus (HIV). This virus attacks the patient's immune system and affects its ability to fight diseases. Developing effective medicine requires understanding the life cycle and replication ability of the virus. The HIV-1 protease enzyme cleaves an octamer peptide into peptides that the virus uses to create proteins. In this paper, a novel feature extraction method is proposed for understanding important patterns in octamer cleavability. This feature extraction method is based on data mining techniques, which find important relations inside a dataset by comprehensively analyzing the given data. As demonstrated in this paper, using the extracted information in the classification process yields important results which may be taken into consideration when developing a new medicine. We have used 746 and 1625, Impens and Schilling data instances from the 746-dataset. In addition, we have performed social network analysis as a complementary method.
13
Manning T, Walsh P. The importance of physicochemical characteristics and nonlinear classifiers in determining HIV-1 protease specificity. Bioengineered 2016; 7:65-78. [PMID: 27212259 PMCID: PMC4879986 DOI: 10.1080/21655979.2016.1149271] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/04/2015] [Revised: 01/25/2016] [Accepted: 01/26/2016] [Indexed: 10/21/2022] Open
Abstract
This paper reviews recent research relating to the application of bioinformatics approaches to determining HIV-1 protease specificity, outlines outstanding issues, and presents a new approach to addressing these issues. Leading machine learning theory for the problem currently suggests that the direct encoding of the physicochemical properties of the amino acid substrates is not required for optimal performance. A number of amino acid encoding approaches which incorporate potentially relevant physicochemical properties of the substrate are identified, and are evaluated using a nonlinear task decomposition based neuroevolution algorithm. The results are evaluated, and compared against a recent benchmark set on a nonlinear classifier using only amino acid sequence and identity information. Ensembles of these nonlinear classifiers using the physicochemical properties of the substrate are demonstrated to consistently outperform the recently published state-of-the-art linear support vector machine based approach in out-of-sample evaluations.
Affiliation(s)
- Timmy Manning
- Department of Computer Science, Cork Institute of Technology, Cork, Ireland
- Paul Walsh
- Department of Computer Science, Cork Institute of Technology, Cork, Ireland
- NSilico Ltd, Rubicon Innovation Center, Cork, Ireland
14
Feature Selection Combined with Neural Network Structure Optimization for HIV-1 Protease Cleavage Site Prediction. BioMed Research International 2015; 2015:263586. [PMID: 25961009 PMCID: PMC4413510 DOI: 10.1155/2015/263586] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 09/29/2014] [Accepted: 01/07/2015] [Indexed: 11/17/2022]
Abstract
It is crucial to understand the specificity of HIV-1 protease when designing HIV-1 protease inhibitors. In this paper, a new feature selection method combined with neural network structure optimization is proposed to analyze the specificity of HIV-1 protease and find the important positions in an octapeptide that determine its cleavability. Two kinds of newly proposed features based on the Amino Acid Index database, plus traditional orthogonal encoding features, are used in this paper, taking both physicochemical and sequence information into consideration. Results of feature selection show that p2, p1, p1′, and p2′ are the most important positions. Two feature fusion methods are used in this paper, combination fusion and decision fusion, aiming to obtain a comprehensive feature representation and improve prediction performance. Decision fusion of the subsets obtained after feature selection achieves excellent prediction performance, which shows that feature selection combined with decision fusion is an effective and useful method for HIV-1 protease cleavage site prediction. The results and analysis in this paper can provide useful guidance for designing HIV-1 protease inhibitors in the future.
15
Rögnvaldsson T, You L, Garwicz D. State of the art prediction of HIV-1 protease cleavage sites. Bioinformatics 2014; 31:1204-10. [PMID: 25504647 DOI: 10.1093/bioinformatics/btu810] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 09/30/2014] [Accepted: 12/04/2014] [Indexed: 02/01/2023] Open
Abstract
MOTIVATION Understanding the substrate specificity of human immunodeficiency virus (HIV)-1 protease is important when designing effective HIV-1 protease inhibitors. Furthermore, characterizing and predicting the cleavage profile of HIV-1 protease is essential to generate and test hypotheses of how HIV-1 affects proteins of the human host. Currently available tools for predicting cleavage by HIV-1 protease can be improved. RESULTS The linear support vector machine with orthogonal encoding is shown to be the best predictor for HIV-1 protease cleavage. It is considerably better than current publicly available predictor services. It is also found that schemes using physicochemical properties do not improve over the standard orthogonal encoding scheme. Some issues with the currently available data are discussed. AVAILABILITY AND IMPLEMENTATION The datasets used, which are the most important part, are available at the UCI Machine Learning Repository. The tools used are all standard and easily available. CONTACT thorsteinn.rognvaldsson@hh.se.
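The orthogonal encoding named in this abstract can be illustrated with a short sketch. It represents each of the eight residues of an octapeptide as a 20-dimensional indicator vector, giving 160 binary features that a linear classifier can consume. The residue ordering, the function name, and the example peptide are illustrative assumptions, not details taken from the paper.

```python
# The 20 standard amino acids in one-letter code; the ordering is an arbitrary choice.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def orthogonal_encode(peptide):
    """One-hot encode a peptide: 20 binary features per residue, concatenated."""
    vector = []
    for residue in peptide:
        one_hot = [0] * len(AMINO_ACIDS)
        # str.index raises ValueError for a non-standard residue.
        one_hot[AMINO_ACIDS.index(residue)] = 1
        vector.extend(one_hot)
    return vector


# Example octapeptide (illustrative input only).
features = orthogonal_encode("SQNYPIVQ")
```

Because every residue maps to its own axis, the encoding assumes no similarity between amino acids; a linear model on these features learns one weight per residue per position.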
Affiliation(s)
- Thorsteinn Rögnvaldsson
- CAISR, School of Information Science, Computer and Electrical Engineering, Halmstad University, Halmstad, Sweden and Division of Clinical Chemistry and Pharmacology, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
- Liwen You
- CAISR, School of Information Science, Computer and Electrical Engineering, Halmstad University, Halmstad, Sweden and Division of Clinical Chemistry and Pharmacology, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
- Daniel Garwicz
- CAISR, School of Information Science, Computer and Electrical Engineering, Halmstad University, Halmstad, Sweden and Division of Clinical Chemistry and Pharmacology, Department of Medical Sciences, Uppsala University, Uppsala, Sweden
16
Rögnvaldsson T, You L, Garwicz D. Bioinformatic approaches for modeling the substrate specificity of HIV-1 protease: an overview. Expert Rev Mol Diagn 2007; 7:435-51. [PMID: 17620050 DOI: 10.1586/14737159.7.4.435] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 11/08/2022]
Abstract
HIV-1 protease has a broad and complex substrate specificity, which hitherto has escaped a simple comprehensive definition. This, and the relatively high mutation rate of the retroviral protease, makes it challenging to design effective protease inhibitors. Several attempts have been made during the last two decades to elucidate the enigmatic cleavage specificity of HIV-1 protease and to predict cleavage of novel substrates using bioinformatic analysis methods. This review describes the methods that have been utilized to date to address this important problem and the results achieved. The data sets used are also reviewed and important aspects of these are highlighted.
Affiliation(s)
- Thorsteinn Rögnvaldsson
- Halmstad University, School of Information Science, Computer & Electrical Engineering, Halmstad, Sweden.
17
Öztürk O, Aksaç A, Elsheikh A, Özyer T, Alhajj R. A consistency-based feature selection method allied with linear SVMs for HIV-1 protease cleavage site prediction. PLoS One 2013; 8:e63145. [PMID: 24058397 PMCID: PMC3751940 DOI: 10.1371/journal.pone.0063145] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 07/04/2012] [Accepted: 04/02/2013] [Indexed: 01/12/2023] Open
Abstract
Background
Predicting type-1 Human Immunodeficiency Virus (HIV-1) protease cleavage sites in protein molecules and determining the protease's specificity is an important task which has attracted considerable attention in the research community. Achievements in this area are expected to result in effective drug design (especially for HIV-1 protease inhibitors) against this life-threatening virus. However, some drawbacks (like the shortage of available training data and the high dimensionality of the feature space) turn this task into a difficult classification problem. Thus, various machine learning techniques, and specifically several classification methods, have been proposed in order to increase the accuracy of the classification model. In addition, for classification problems characterized by few samples and many features, selecting the most relevant features is a major factor in increasing classification accuracy.
Results
We propose for HIV-1 data a consistency-based feature selection approach in conjunction with recursive feature elimination of support vector machines (SVMs). We used various classifiers to evaluate the results obtained from the feature selection process. We further demonstrated the effectiveness of our proposed method by comparing it with a state-of-the-art feature selection method applied to HIV-1 data, and we evaluated the reported results based on attributes which have been selected from different combinations.
Conclusion
Applying feature selection to training data before carrying out the classification task seems to be a reasonable data-mining process when working with types of data similar to HIV-1. On HIV-1 data, some feature selection or extraction operations in conjunction with different classifiers have been tested, and noteworthy outcomes have been reported. These facts motivate the work presented in this paper.
Software availability
The software is available at http://ozyer.etu.edu.tr/c-fs-svm.rar.
The software can be downloaded at esnag.etu.edu.tr/software/hiv_cleavage_site_prediction.rar; a readme file explains how to set up the software.
Affiliation(s)
- Orkun Öztürk
- eSNAg Research Group, Department of Computer Engineering, TOBB University, Ankara, Turkey
- Raccoon Software Computer R&D Ltd., Ankara, Turkey
| | - Alper Aksaç
- eSNAg Research Group, Department of Computer Engineering, TOBB University, Ankara, Turkey
- Raccoon Software Computer R&D Ltd., Ankara, Turkey
| | - Abdallah Elsheikh
- Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
| | - Tansel Özyer
- eSNAg Research Group, Department of Computer Engineering, TOBB University, Ankara, Turkey
- Raccoon Software Computer R&D Ltd., Ankara, Turkey
| | - Reda Alhajj
- Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
- Department of Computer Science, Global University, Beirut, Lebanon
|
18
|
Asadollahi M, Fekete E, Karaffa L, Flipphi M, Árnyasi M, Esmaeili M, Váczy KZ, Sándor E. Comparison of Botrytis cinerea populations isolated from two open-field cultivated host plants. Microbiol Res 2013; 168:379-388. [PMID: 23353014 DOI: 10.1016/j.micres.2012.12.008] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2012] [Revised: 12/04/2012] [Accepted: 12/20/2012] [Indexed: 11/17/2022]
Abstract
The necrotrophic fungus Botrytis cinerea is reported to infect more than 220 host plants worldwide. In phylogenetical-taxonomical terms, the pathogen is considered a complex of two cryptic species, group I and group II. We sampled populations of B. cinerea on sympatric strawberry and raspberry cultivars in the North-East of Hungary for three years during flowering and the harvest period. Four hundred and ninety group II B. cinerea isolates were analyzed for the current study. Three different data sets were generated: (i) PCR-RFLP patterns of the ADP-ATP translocase and nitrate reductase genes, (ii) MSB1 minisatellite sequence data, and (iii) the fragment sizes of five microsatellite loci. The structures of the different populations were similar as indicated by Nei's gene diversity and haplotype diversity. The F statistics (Fst, Gst), and the gene flow indicated ongoing differentiation within sympatric populations. The population genetic parameters were influenced by polymorphisms within the three data sets as assessed using Bayesian algorithms. Data Mining analysis pointed towards the five microsatellite loci as the most defining markers to study differentiation in the 490 isolates. The results suggest the occurrence of host-specific, sympatric divergence of generalist phytoparasites in perennial hosts.
Affiliation(s)
- Mojtaba Asadollahi
  - Department of Biochemical Engineering, Faculty of Science and Technology, University of Debrecen, Egyetem tér 1, 4032 Debrecen, Hungary; Institute of Food Processing, Quality Assurance and Microbiology, Faculty of Agricultural and Food Sciences and Environmental Management, University of Debrecen, Böszörményi út 138, 4032 Debrecen, Hungary
- Erzsébet Fekete
  - Department of Biochemical Engineering, Faculty of Science and Technology, University of Debrecen, Egyetem tér 1, 4032 Debrecen, Hungary
- Levente Karaffa
  - Department of Biochemical Engineering, Faculty of Science and Technology, University of Debrecen, Egyetem tér 1, 4032 Debrecen, Hungary
- Michel Flipphi
  - Department of Biochemical Engineering, Faculty of Science and Technology, University of Debrecen, Egyetem tér 1, 4032 Debrecen, Hungary
- Mariann Árnyasi
  - Sámuel Diószegi Institute of Agricultural Innovation, Faculty of Agricultural and Food Sciences and Environmental Management, University of Debrecen, Böszörményi út 138, 4032 Debrecen, Hungary
- Mahdi Esmaeili
  - Department of Computer Science, Islamic Azad University, Kashan Branch, Kashan, Iran
- Kálmán Zoltán Váczy
  - KRC Research Institute for Viticulture and Enology, Kőlyuktető, PO Box 83, 3301 Eger, Hungary
- Erzsébet Sándor
  - Institute of Food Processing, Quality Assurance and Microbiology, Faculty of Agricultural and Food Sciences and Environmental Management, University of Debrecen, Böszörményi út 138, 4032 Debrecen, Hungary.
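The abstract of this entry relies on Nei's gene diversity and on the differentiation statistic Gst. As a minimal sketch of the textbook definitions, gene diversity at a locus is H = 1 - Σ p_i², and Gst = (H_T - H_S) / H_T, where H_S is the mean within-population diversity and H_T is the diversity of the pooled allele frequencies. The code assumes equally weighted populations and applies no small-sample correction; the function names and the allele frequencies are made up for illustration and are not the authors' data or pipeline.

```python
# Hypothetical sketch of Nei's gene diversity and G_ST for one locus.
# Assumptions: populations are weighted equally; no sample-size correction.


def gene_diversity(allele_freqs):
    """Nei's gene diversity H = 1 - sum(p_i^2) for one population."""
    return 1.0 - sum(p * p for p in allele_freqs)


def gst(populations):
    """G_ST = (H_T - H_S) / H_T for a list of allele-frequency lists."""
    n_pops = len(populations)
    n_alleles = len(populations[0])
    # H_S: average of the within-population diversities
    h_s = sum(gene_diversity(freqs) for freqs in populations) / n_pops
    # H_T: diversity computed from the mean allele frequencies
    mean_freqs = [
        sum(freqs[i] for freqs in populations) / n_pops for i in range(n_alleles)
    ]
    h_t = gene_diversity(mean_freqs)
    if h_t == 0.0:
        return 0.0  # a single fixed allele everywhere: no diversity to partition
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    strawberry = [0.7, 0.2, 0.1]  # made-up allele frequencies
    raspberry = [0.3, 0.3, 0.4]   # made-up allele frequencies
    print(round(gst([strawberry, raspberry]), 4))
```

Values near 0 indicate that the populations share allele frequencies, while values near 1 indicate that they are fixed for different alleles.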
|
19
|
Newell NE. Cascade detection for the extraction of localized sequence features; specificity results for HIV-1 protease and structure-function results for the Schellman loop. Bioinformatics 2011; 27:3415-22. [PMID: 22039211 DOI: 10.1093/bioinformatics/btr594] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023]
Abstract
MOTIVATION: The extraction of the set of features most relevant to function from classified biological sequence sets is still a challenging problem. A central issue is the determination of expected counts for higher order features so that artifact features may be screened. RESULTS: Cascade detection (CD), a new algorithm for the extraction of localized features from sequence sets, is introduced. CD is a natural extension of the proportional modeling techniques used in contingency table analysis into the domain of feature detection. The algorithm is successfully tested on synthetic data and then applied to feature detection problems from two different domains to demonstrate its broad utility. An analysis of HIV-1 protease specificity reveals patterns of strong first-order features that group hydrophobic residues by side chain geometry and exhibit substantial symmetry about the cleavage site. Higher order results suggest that favorable cooperativity is weak by comparison and broadly distributed, but indicate possible synergies between negative charge and hydrophobicity in the substrate. Structure-function results for the Schellman loop, a helix-capping motif in proteins, contain strong first-order features and also show statistically significant cooperativities that provide new insights into the design of the motif. These include a new 'hydrophobic staple' and multiple amphipathic and electrostatic pair features. CD should prove useful not only for sequence analysis, but also for the detection of multifactor synergies in cross-classified data from clinical studies or other sources. AVAILABILITY: Windows XP/7 application and data files available at: https://sites.google.com/site/cascadedetect/home. CONTACT: nacnewell@comcast.net SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.
|
20
|
Ode H, Yokoyama M, Kanda T, Sato H. Identification of folding preferences of cleavage junctions of HIV-1 precursor proteins for regulation of cleavability. J Mol Model 2010; 17:391-9. [PMID: 20480379 DOI: 10.1007/s00894-010-0739-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2010] [Accepted: 04/30/2010] [Indexed: 11/30/2022]
Abstract
Human immunodeficiency virus type 1 protease (HIV-1 PR) cleaves two viral precursor proteins, Gag and Gag-Pol, at multiple sites. Although the processing proceeds in a rank order to ensure effective viral replication, the molecular mechanisms by which the order is regulated are not fully understood. In this study, we used bioinformatics approaches to examine whether the folding preferences of the cleavage junctions influence their cleavabilities by HIV-1 PR. The folding of the eight-amino-acid peptides corresponding to the seven cleavage junctions of the HIV-1(HXB2) Gag and Gag-Pol precursors was simulated in the PR-free and PR-bound states with molecular dynamics and homology modeling methods, and the relationships between the folding parameters and the reported kinetic parameters of the HIV-1(HXB2) peptides were analyzed. We found that a folding preference for forming a dihedral angle of Cβ(P1)-Cα(P1)-Cα(P1')-Cβ(P1') in the range of 150 to 180 degrees in the PR-free state was positively correlated with 1/K(m) (R = 0.95, P = 0.0008) and that the dihedral angle of O(P2)-C(P2)-C(P1)-O(P1) of the main chains in the PR-bound state was negatively correlated with k(cat) (R = 0.94, P = 0.001). We further found that these two folding properties influenced the overall cleavability of the precursor protein when the sizes of the side chains at the P1 site were similar. These data suggest that the dihedral angles at the specific positions around the cleavage junctions before and after binding to PR are both critical for regulating the cleavability of precursor proteins by HIV-1 PR.
Affiliation(s)
- Hirotaka Ode
- Pathogen Genomics Center, National Institute of Infectious Diseases, Tokyo, Japan.
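The folding parameters in the abstract of this entry are dihedral angles defined by four atoms, such as Cβ(P1)-Cα(P1)-Cα(P1')-Cβ(P1'). As a minimal sketch, a dihedral angle can be computed from four Cartesian coordinates with the usual cross-product and `atan2` construction. The function names, the use of the absolute angle for the 150 to 180 degree window, and the test coordinates are assumptions made for illustration; the abstract does not describe the authors' implementation or sign convention.

```python
import math

# Hypothetical sketch: dihedral angle from four 3-D points (e.g. atom positions).
# The sign convention is one common choice; the cited study may use another.


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 axis."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    m1 = _cross(n1, tuple(c / b1_len for c in b1))
    return math.degrees(math.atan2(_dot(m1, n2), _dot(n1, n2)))


def in_extended_window(angle_deg, low=150.0, high=180.0):
    """True if |angle| lies in [low, high]; using |angle| is an assumption."""
    return low <= abs(angle_deg) <= high


if __name__ == "__main__":
    # Planar zig-zag (trans) arrangement of four points
    angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
    print(abs(angle), in_extended_window(angle))
```

In practice the four points would come from a simulation trajectory, and the fraction of frames inside the window would serve as the folding preference.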
|
21
|
Identification of structural mechanisms of HIV-1 protease specificity using computational peptide docking: implications for drug resistance. Structure 2010; 17:1636-1648. [PMID: 20004167 DOI: 10.1016/j.str.2009.10.008] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2009] [Revised: 10/01/2009] [Accepted: 10/04/2009] [Indexed: 11/23/2022]
Abstract
Drug-resistant mutations (DRMs) in HIV-1 protease are a major challenge to antiretroviral therapy. Protease-substrate interactions that are determined to be critical for native selectivity could serve as robust targets for drug design that are immune to DRMs. In order to identify the structural mechanisms of selectivity, we developed a peptide-docking algorithm to predict the atomic structure of protease-substrate complexes and applied it to a large and diverse set of cleavable and noncleavable peptides. Cleavable peptides showed significantly lower energies of interaction than noncleavable peptides with six protease active-site residues playing the most significant role in discrimination. Surprisingly, all six residues correspond to sequence positions associated with drug resistance mutations, demonstrating that the very residues that are responsible for native substrate specificity in HIV-1 protease are altered during its evolution to drug resistance, suggesting that drug resistance and substrate selectivity may share common mechanisms.
|
22
|
Kim G, Kim Y, Lim H, Kim H. An MLP-based feature subset selection for HIV-1 protease cleavage site analysis. Artif Intell Med 2010; 48:83-9. [DOI: 10.1016/j.artmed.2009.07.010] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2008] [Revised: 07/03/2009] [Accepted: 07/20/2009] [Indexed: 10/20/2022]
|
23
|
Li X, Hu H, Shu L. Predicting human immunodeficiency virus protease cleavage sites in nonlinear projection space. Mol Cell Biochem 2010; 339:127-33. [PMID: 20054614 DOI: 10.1007/s11010-009-0376-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2009] [Accepted: 12/21/2009] [Indexed: 11/30/2022]
Abstract
HIV-1 protease has a broad and complex substrate specificity. The discovery of an accurate, robust, and rapid method for predicting the cleavage sites in proteins by HIV protease would greatly expedite the search for inhibitors of HIV protease. During the last two decades, various methods have been developed to explore the specificity of HIV protease cleavage activity. However, because little advancement has been made in the understanding of HIV-1 protease cleavage site specificity, not much progress has been reported in either extracting effective methods or maintaining high prediction accuracy. In this article, a theoretical framework is developed, based on the kernel method for dimensionality reduction and prediction for HIV-1 protease cleavage site specificity. A nonlinear dimensionality reduction kernel method, based on manifold learning, is proposed to reduce the high dimensions of protease specificity. A support vector machine is applied to predict the protease cleavage. Superior performance in comparison to that previously published in literature is obtained using numerical simulations showing that the basic specificities of the HIV-1 protease are maintained in reduction feature space, and by combining the nonlinear dimensionality reduction algorithm with a support vector machine classifier.
Affiliation(s)
- Xuehua Li
- School of Applied Mathematics, University of Electronic Science and Technology of China, 610054 Chengdu, People's Republic of China.
|
24
|
Rögnvaldsson T, Etchells TA, You L, Garwicz D, Jarman I, Lisboa PJG. How to find simple and accurate rules for viral protease cleavage specificities. BMC Bioinformatics 2009; 10:149. [PMID: 19445713 PMCID: PMC2698905 DOI: 10.1186/1471-2105-10-149] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2009] [Accepted: 05/16/2009] [Indexed: 01/02/2023] Open
Abstract
Background: Proteases of human pathogens are becoming increasingly important drug targets, hence it is necessary to understand their substrate specificity and to interpret this knowledge in practically useful ways. New methods are being developed that produce large amounts of cleavage information for individual proteases and some have been applied to extract cleavage rules from data. However, the hitherto proposed methods for extracting rules have been neither easy to understand nor very accurate. To be practically useful, cleavage rules should be accurate, compact, and expressed in an easily understandable way. Results: A new method is presented for producing cleavage rules for viral proteases with seemingly complex cleavage profiles. The method is based on orthogonal search-based rule extraction (OSRE) combined with spectral clustering. It is demonstrated on substrate data sets for human immunodeficiency virus type 1 (HIV-1) protease and hepatitis C (HCV) NS3/4A protease, showing excellent prediction performance for both HIV-1 cleavage and HCV NS3/4A cleavage, agreeing with observed HCV genotype differences. New cleavage rules (consensus sequences) are suggested for HIV-1 and HCV NS3/4A cleavages. The practical usability of the method is also demonstrated by using it to predict the location of an internal cleavage site in the HCV NS3 protease and to correct the location of a previously reported internal cleavage site in the HCV NS3 protease. The method is fast to converge and yields accurate rules, on par with previous results for HIV-1 protease and better than previous state-of-the-art for HCV NS3/4A protease. Moreover, the rules are fewer and simpler than previously obtained with rule extraction methods. Conclusion: A rule extraction methodology by searching for multivariate low-order predicates yields results that significantly outperform existing rule bases on out-of-sample data, but are more transparent to expert users. The approach yields rules that are easy to use and useful for interpreting experimental data.
|
25
|
Study of Inhibitors Against SARS Coronavirus by Computational Approaches. VIRAL PROTEASES AND ANTIVIRAL PROTEASE INHIBITOR THERAPY 2009. [PMCID: PMC7122585 DOI: 10.1007/978-90-481-2348-3_1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
|
26
|
Shen HB, Chou KC. Identification of proteases and their types. Anal Biochem 2008; 385:153-60. [PMID: 19007742 DOI: 10.1016/j.ab.2008.10.020] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2009] [Revised: 10/13/2008] [Accepted: 10/14/2008] [Indexed: 10/21/2022]
Abstract
Called by many as biology's version of Swiss army knives, proteases cut long sequences of amino acids into fragments and regulate most physiological processes. They are vitally important in the life cycle. Different types of proteases have different action mechanisms and biological processes. With the avalanche of protein sequences generated during the postgenomic age, it is highly desirable for both basic research and drug design to develop a fast and reliable method for identifying the types of proteases according to their sequences or even just for whether they are proteases or not. In this article, three recently developed identification methods in this regard are discussed: (i) FunD-PseAAC, (ii) GO-PseAAC, and (iii) FunD-PsePSSM. The first two were established by hybridizing the FunD (functional domain) approach and the GO (gene ontology) approach, respectively, with the PseAAC (pseudo amino acid composition) approach. The third method was established by fusing the FunD approach with the PsePSSM (pseudo position-specific scoring matrix) approach. Of these three methods, only FunD-PsePSSM has provided a server called ProtIdent (protease identifier), which is freely accessible to the public via the website at http://www.csbio.sjtu.edu.cn/bioinf/Protease. For the convenience of users, a step-by-step guide on how to use ProtIdent is illustrated. Meanwhile, the caveat in using ProtIdent and how to understand the success expectancy rate of a statistical predictor are discussed. Finally, the essence of why ProtIdent can yield a high success rate in identifying proteases and their types is elucidated.
Affiliation(s)
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai 200240, China.
|
27
|
Chou KC, Shen HB. ProtIdent: a web server for identifying proteases and their types by fusing functional domain and sequential evolution information. Biochem Biophys Res Commun 2008; 376:321-5. [PMID: 18774775 DOI: 10.1016/j.bbrc.2008.08.125] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2008] [Accepted: 08/26/2008] [Indexed: 10/21/2022]
Abstract
Proteases are vitally important to life cycles and have become a main target in drug development. According to their action mechanisms, proteases are classified into six types: (1) aspartic, (2) cysteine, (3) glutamic, (4) metallo, (5) serine, and (6) threonine. Given the sequence of an uncharacterized protein, can we identify whether it is a protease or non-protease? If it is, what type does it belong to? To address these problems, a 2-layer predictor, called "ProtIdent", is developed by fusing the functional domain and sequential evolution information: the first layer is for identifying the query protein as protease or non-protease; if it is a protease, the process will automatically go to the second layer to further identify it among the six types. The overall success rates in both cases by rigorous cross-validation tests were higher than 92%. ProtIdent is freely accessible to the public as a web server at http://www.csbio.sjtu.edu.cn/bioinf/Protease.
Affiliation(s)
- Kuo-Chen Chou
- Institute of Image Processing & Pattern Recognition, Shanghai Jiaotong University, 800 Dongchuan Road, Shanghai, 200240, China.
|
28
|
HIVcleave: a web-server for predicting human immunodeficiency virus protease cleavage sites in proteins. Anal Biochem 2008; 375:388-90. [PMID: 18249180 DOI: 10.1016/j.ab.2008.01.012] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2007] [Revised: 01/08/2008] [Accepted: 01/09/2008] [Indexed: 11/24/2022]
Abstract
According to the "distorted key theory" [K.C. Chou, Analytical Biochemistry, 233 (1996) 1-14], information on the cleavage sites of proteins by HIV (human immunodeficiency virus) protease is very useful for finding effective inhibitors against HIV, the culprit of AIDS (acquired immunodeficiency syndrome). To meet the increasing need in this regard, a web-server called HIVcleave was established at http://chou.med.harvard.edu/bioinf/HIV/. In this note we provide a step-by-step guide on how to use HIVcleave to identify the cleavage sites of a query protein sequence by HIV-1 and HIV-2 proteases, respectively.
|
29
|
Kim H, Zhang Y, Heo YS, Oh HB, Chen SS. Specificity rule discovery in HIV-1 protease cleavage site analysis. Comput Biol Chem 2007; 32:71-8. [PMID: 18006382 DOI: 10.1016/j.compbiolchem.2007.09.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2007] [Revised: 08/28/2007] [Accepted: 09/09/2007] [Indexed: 10/22/2022]
Abstract
Several machine learning algorithms have recently been applied to modeling the specificity of HIV-1 protease. The problem is challenging because of three issues: (1) datasets with high dimensionality and a small number of samples could misguide classification modeling and its interpretation; (2) symbolic interpretation is desirable because it provides insight into the specificity in the form of human-understandable rules, and thus helps in designing effective HIV inhibitors; (3) the interpretation should take into account complexity or dependency between positions in sequences. Therefore, it is necessary to investigate multivariate and feature-selective methods to model the specificity and to extract rules from the model. We have extensively tested various machine learning methods, and we have found that the combination of neural networks and a decompositional approach can generate a set of effective rules. When validated against experimental results for the HIV-1 protease, the specificity rules outperform the ones generated by frequency-based, univariate or black-box methods.
Affiliation(s)
- Hyeoncheol Kim
- Department of Computer Science Education, Korea University, Seoul, Republic of Korea.
|
30
|
|
31
|
Coren LV, Thomas JA, Chertova E, Sowder RC, Gagliardi TD, Gorelick RJ, Ott DE. Mutational analysis of the C-terminal gag cleavage sites in human immunodeficiency virus type 1. J Virol 2007; 81:10047-54. [PMID: 17634233 PMCID: PMC2045408 DOI: 10.1128/jvi.02496-06] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022] Open
Abstract
Human immunodeficiency virus type 1 (HIV-1) Gag is expressed as a polyprotein that is cleaved into six proteins by the viral protease in a maturation process that begins during assembly and budding. While processing of the N terminus of Gag is strictly required for virion maturation and infectivity, the necessity for the C-terminal cleavages of Gag is less well defined. To examine the importance of this process, we introduced a series of mutations into the C terminus of Gag that interrupted the cleavage sites that normally produce the nucleocapsid (NC), spacer 2 (SP2), or p6(Gag) proteins. Protein analysis showed that all of the mutant constructs produced virions efficiently upon transfection of cells and appropriately processed Gag polyprotein at the nonmutated sites. Mutants that produced a p9(NC/SP2) protein exhibited only minor effects on HIV-1 infectivity and replication. In contrast, mutants that produced only the p8(SP2/p6) or p15(NC/SP2/p6) protein had severe defects in infectivity and replication. To identify the key defective step, we quantified reverse transcription and integration products isolated from infected cells by PCR. All mutants tested produced levels of reverse transcription products either similar to or only somewhat lower than that of wild type. In contrast, mutants that failed to cleave the SP2-p6(Gag) site produced drastically less provirus than the wild type. Together, our results show that processing of the SP2-p6(Gag) and not the NC-SP2 cleavage site is important for efficient viral DNA integration during infection in vitro. In turn, this finding suggests an important role for the p9(NC/SP2) species in some aspect of integration.
Affiliation(s)
- Lori V Coren
- AIDS Vaccine Program, SAIC-Frederick, Inc., National Cancer Institute at Frederick, Frederick, MD 21702-1201, USA.
|
32
|
Lau TS, Li Y, Kameoka M, Ng TB, Wan DCC. Suppression of HIV replication using RNA interference against HIV-1 integrase. FEBS Lett 2007; 581:3253-9. [PMID: 17592732 DOI: 10.1016/j.febslet.2007.06.011] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2007] [Revised: 05/04/2007] [Accepted: 06/01/2007] [Indexed: 11/22/2022]
Abstract
RNA interference (RNAi) has become one of the most powerful and popular approaches to gene silencing in clinical research, especially in virology, owing to the gene-specific suppression property of small interfering RNA (siRNA). In this report, we demonstrate that expression of vector-mediated small hairpin RNA (shRNA) against human immunodeficiency virus type 1 (HIV-1) integrase (IN), one of the three important enzymes in HIV infection, which controls the integration of viral RNA into host DNA, could efficiently suppress the protein synthesis of EGFP-tagged IN in a HeLa cell model. Furthermore, we show that IN shRNA can successfully reduce HIV particle production in 293T cells to a level similar to that of the positive control, HIV-1 tat shRNA. These results suggest the therapeutic possibility of suppressing HIV replication using RNAi against HIV-1 integrase.
Affiliation(s)
- Tat San Lau
- Department of Biochemistry, The Chinese University of Hong Kong, Shatin, NT, Hong Kong, China
|
33
|
Kontijevskis A, Wikberg JES, Komorowski J. Computational proteomics analysis of HIV-1 protease interactome. Proteins 2007; 68:305-12. [PMID: 17427231 DOI: 10.1002/prot.21415] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
HIV-1 protease is a small homodimeric enzyme that ensures maturation of HIV virions by cleaving the viral precursor Gag and Gag-Pol polyproteins into structural and functional elements. The cleavage sites in the viral polyproteins share neither sequence homology nor a binding motif, and the specificity of the HIV-1 protease is therefore only partially understood. Using an extensive data set collected from 16 years of HIV proteome research, we have here created a general and predictive rule-based model for HIV-1 protease specificity based on rough sets. We demonstrate that HIV-1 protease specificity is much more complex than previously anticipated and cannot be defined based solely on the amino acids at the substrate's scissile bond or on any other single substrate amino acid position. Our results show that the combination of at least three particular amino acids is needed in the substrate for a cleavage event to occur. Only by combining and analyzing massive amounts of HIV proteome data was it possible to discover these novel and general patterns of physico-chemical substrate cleavage determinants. Our study is an example of how computational biology methods can advance the understanding of viral interactomes.
|
34
|
Bukrinskaya A. HIV-1 matrix protein: a mysterious regulator of the viral life cycle. Virus Res 2007; 124:1-11. [PMID: 17210199 DOI: 10.1016/j.virusres.2006.07.001] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/15/2006] [Revised: 06/30/2006] [Accepted: 07/05/2006] [Indexed: 01/17/2023]
Abstract
Significant progress has been achieved in the last few years concerning the human immunodeficiency virus (HIV-1) life cycle, mostly in the fields of cellular receptors for the virus, virus assembly and budding of virus particles from the cell surface. Meanwhile, some aspects, such as postentry events, virus maturation and the regulatory role of individual viral proteins remain poorly defined. This review summarizes some recent findings concerning the role of Gag Pr55 and its proteolytic processing in the HIV-1 life cycle with particular emphasis on the functions of matrix protein p17 (MA), the protein which plays a key role in regulation of the early and late steps of viral morphogenesis. Based on our recent observations, the possibility is discussed that two subsets of MA exist, one cleaved from the Gag precursor in the host cell (cMA), and the other cleaved in the virions (vMA). It is suggested that two MA fractions possess diverse functions and are involved in different stages of virus morphogenesis as key regulators of the viral life cycle.
Affiliation(s)
- Alissa Bukrinskaya
- D.I.Ivanovsky Institute of Virology, Russian Academy of Medical Sciences, Moscow 123098, RF, Russia.
|
35
|
Liang GZ, Li SZ. A new sequence representation as applied in better specificity elucidation for human immunodeficiency virus type 1 protease. Biopolymers 2007; 88:401-12. [PMID: 17206631 DOI: 10.1002/bip.20669] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
Factor analysis scales of generalized amino acid information (FASGAI) involving hydrophobicity, alpha and turn propensities, bulky properties, compositional characteristics, local flexibility, and electronic properties were derived from 516 property parameters of the 20 coded amino acids, and were then employed to represent the sequence structures of 746 peptides with 8 amino acid residues. Cleavage site prediction models for human immunodeficiency virus type 1 protease, based on linear discriminant analysis and a support vector machine with a radial basis function kernel, were constructed to identify whether the peptides could be cleaved, and were further utilized to investigate the cleavage specificity. These diversified properties, including the bulky properties, secondary conformation characteristics, electronic properties, and hydrophobicity at the first, the second, the fourth, the fifth, and the sixth residue, are possibly important factors in determining whether HIV PR cleaves a peptide. In particular, maximal positive and negative influences result from the bulky properties of different sites. Further results from analysis of variance also likely reflect that the HIV PR recognizes diversified key properties of various sites in the octameric sequences. Satisfactory results show that FASGAI can not only be used to represent sequence structures of various functional peptides, but also provide a potentially feasible measure for exploring the relationship between protein motif sequences and their functions.
Affiliation(s)
- Gui Z Liang
- College of Bioengineering, Chongqing University, Chongqing 400030, People's Republic of China.
|