1
Hacisuleyman A, Erman B. Fine tuning rigid body docking results using the Dreiding force field: A computational study of 36 known nanobody-protein complexes. Proteins 2023; 91:1417-1426. [PMID: 37232507 DOI: 10.1002/prot.26529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 05/03/2023] [Accepted: 05/08/2023] [Indexed: 05/27/2023]
Abstract
This paper aims to understand the binding strategies of a nanobody-protein pair by studying known complexes. Rigid body protein-ligand docking programs produce several complexes, called decoys, which are good candidates with high scores of shape complementarity, electrostatic interactions, desolvation, buried surface area, and Lennard-Jones potentials. However, the decoy that corresponds to the native structure is not known. We studied 36 nanobody-protein complexes from the single domain antibody database, sd-Ab DB, http://www.sdab-db.ca/. For each structure, a large number of decoys were generated using the Fast Fourier Transform algorithm of the software ZDOCK. The decoys were ranked according to their target protein-nanobody interaction energies, calculated by using the Dreiding Force Field, with rank 1 having the lowest interaction energy. Out of 36 protein data bank (PDB) structures, 25 true structures were predicted as rank 1. The remaining eleven structures required Ångström-scale rigid body translations of the nanobody relative to the protein to match the given PDB structure. After the translation, the Dreiding interaction (DI) energies of all these complexes decreased and the complexes became rank 1. In one case, rigid body rotations as well as translations of the nanobody were required for matching the crystal structure. We used a Monte Carlo algorithm that randomly translates and rotates the nanobody of a decoy and calculates the DI energy. Results show that rigid body translations and the DI energy are sufficient for determining the correct binding location and pose of ZDOCK-generated decoys. A survey of the sd-Ab DB showed that each nanobody makes at least one salt bridge with its partner protein, indicating that salt bridge formation is an essential strategy in nanobody-protein recognition. Based on the analysis of the 36 crystal structures and evidence from existing literature, we propose a set of principles that could be used in the design of nanobodies.
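The refinement idea in this abstract (random rigid-body moves of the nanobody, scored by interaction energy) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy 12-6 Lennard-Jones-style pair energy in place of the Dreiding force field, translations only (no rotations), and greedy acceptance of lower-energy moves. All names, `SIGMA`, `EPSILON`, and the step size are hypothetical.

```python
import math
import random

# Hypothetical toy parameters; these are NOT Dreiding force field values.
SIGMA = 3.5    # distance scale in Angstrom
EPSILON = 0.1  # well depth in arbitrary energy units


def toy_interaction_energy(protein, nanobody):
    """Sum a 12-6 Lennard-Jones-style term over all protein-nanobody atom pairs."""
    energy = 0.0
    for p in protein:
        for n in nanobody:
            r = max(math.dist(p, n), 1e-6)  # guard against division by zero
            sr6 = (SIGMA / r) ** 6
            energy += 4.0 * EPSILON * (sr6 * sr6 - sr6)
    return energy


def translate(coords, shift):
    """Apply the same rigid-body translation to every atom."""
    dx, dy, dz = shift
    return [(x + dx, y + dy, z + dz) for x, y, z in coords]


def refine_by_translation(protein, nanobody, steps=2000, max_step=0.5, seed=0):
    """Greedy Monte Carlo search: keep a random translation only if it lowers the energy."""
    rng = random.Random(seed)
    best = list(nanobody)
    best_energy = toy_interaction_energy(protein, best)
    for _ in range(steps):
        shift = tuple(rng.uniform(-max_step, max_step) for _ in range(3))
        trial = translate(best, shift)
        trial_energy = toy_interaction_energy(protein, trial)
        if trial_energy < best_energy:
            best, best_energy = trial, trial_energy
    return best, best_energy
```

Because only energy-lowering moves are accepted, the returned energy can never exceed that of the starting pose; a Metropolis acceptance rule would be a natural alternative if escaping local minima matters.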
Affiliation(s)
- Aysima Hacisuleyman
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Burak Erman
- Chemical and Biological Engineering, Koc University, Istanbul, Turkey
2
Biró B, Zhao B, Kurgan L. Complementarity of the residue-level protein function and structure predictions in human proteins. Comput Struct Biotechnol J 2022; 20:2223-2234. [PMID: 35615015 PMCID: PMC9118482 DOI: 10.1016/j.csbj.2022.05.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 05/02/2022] [Accepted: 05/02/2022] [Indexed: 11/24/2022]
Abstract
Sequence-based predictors of the residue-level protein function and structure cover a broad spectrum of characteristics including intrinsic disorder, secondary structure, solvent accessibility and binding to nucleic acids. They were catalogued and evaluated in numerous surveys and assessments. However, methods focusing on a given characteristic are studied separately from predictors of other characteristics, while they are typically used on the same proteins. We fill this void by studying complementarity of a representative collection of methods that target different predictions using a large, taxonomically consistent, and low similarity dataset of human proteins. First, we bridge the gap between the communities that develop structure-trained vs. disorder-trained predictors of binding residues. Motivated by a recent study of the protein-binding residue predictions, we empirically find that combining the structure-trained and disorder-trained predictors of the DNA-binding and RNA-binding residues leads to substantial improvements in predictive quality. Second, we investigate whether diverse predictors generate results that accurately reproduce relations between secondary structure, solvent accessibility, interaction sites, and intrinsic disorder that are present in the experimental data. Our empirical analysis concludes that predictions accurately reflect all combinations of these relations. Altogether, this study provides unique insights that support combining results produced by diverse residue-level predictors of protein function and structure.
Affiliation(s)
- Bálint Biró
- Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Gödöllő, Hungary
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Bi Zhao
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
3
Görmez Y, Sabzekar M, Aydın Z. IGPRED: Combination of convolutional neural and graph convolutional networks for protein secondary structure prediction. Proteins 2021; 89:1277-1288. [PMID: 33993559 DOI: 10.1002/prot.26149] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/07/2021] [Revised: 04/21/2021] [Accepted: 05/11/2021] [Indexed: 11/10/2022]
Abstract
There is a close relationship between the tertiary structure and the function of a protein. One of the important steps to determine the tertiary structure is protein secondary structure prediction (PSSP). For this reason, predicting secondary structure with higher accuracy will give valuable information about the tertiary structure. Recently, deep learning techniques have obtained promising improvements in several machine learning applications including PSSP. In this article, a novel deep learning model based on a convolutional neural network and a graph convolutional network is proposed. PSI-BLAST PSSM, HHMAKE PSSM, and physico-chemical properties of amino acids are combined with structural profiles to generate a rich feature set. Furthermore, the hyper-parameters of the proposed network are optimized using Bayesian optimization. The proposed model IGPRED obtained 89.19%, 86.34%, 87.87%, 85.76%, and 86.54% Q3 accuracies for CullPDB, EVAset, CASP10, CASP11, and CASP12 datasets, respectively.
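For readers unfamiliar with the Q3 metric quoted above: it is the percentage of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The sketch below is a generic illustration of that definition, not code from IGPRED; the function name and the string-based input format are assumptions.

```python
def q3_accuracy(predicted, observed):
    """Return the percentage of positions where the three-state labels agree."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# One mismatch in eight residues gives 87.5
print(q3_accuracy("HHHEECCC", "HHHEEECC"))
```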
Affiliation(s)
- Yasin Görmez
- Faculty of Economics and Administrative Sciences, Management Information Systems, Sivas Cumhuriyet University, Sivas, Turkey
- Mostafa Sabzekar
- Department of Computer Engineering, Birjand University of Technology, Birjand, Iran
- Zafer Aydın
- Engineering Faculty, Computer Engineering Department, Abdullah Gül University, Kayseri, Turkey
4
Tavakkoli H, Khosravi A, Sharifi I, Salari Z, Salarkia E, Kheirandish R, Dehghantalebi K, Jajarmi M, Mosallanejad SS, Dabiri S, Keyhani A. Partridge and embryonated partridge egg as new preclinical models for candidiasis. Sci Rep 2021; 11:2072. [PMID: 33483560 PMCID: PMC7822824 DOI: 10.1038/s41598-021-81592-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/14/2020] [Accepted: 01/06/2021] [Indexed: 12/20/2022]
Abstract
Candida albicans (C. albicans) is the most common cause of candidiasis in humans and animals. This study aimed to establish a new experimental infection model for systemic candidiasis using partridge and embryonated partridge egg. First, we tested the induction of systemic candidiasis in partridge and embryonated partridge egg. Finally, interaction between virulence factors of C. albicans and Bcl-2 family members was predicted. We observed that embryonic infection causes a decrease in survival time and that, at later embryonic days (11–12th), embryos showed lesions. Morphometric analysis of the extra-embryonic membrane (EEM) vasculature revealed a vascular apoptotic effect of C. albicans, shown by a significant reduction in capillary area. In the immunohistochemistry assay, low expression of Bcl-2 and increased expression of Bax confirmed apoptosis. The gene expression of Bax and Bcl-2 was also altered in fungi-exposed EEM. Our in silico simulation showed an accurate interaction between aspartic proteinase, polyamine oxidase, Bcl-2 and Bax. We observed that the disease was associated with adverse consequences, which were similar to human candidiasis. The results support the idea that partridge and embryonated partridge egg can be utilized as appropriate preclinical models to investigate the pathological effects of candidiasis.
Affiliation(s)
- Hadi Tavakkoli
- Department of Clinical Science, School of Veterinary Medicine, Shahid Bahonar University of Kerman, 22 Bahman Boulevard, Pajouhesh Square, Kerman, 7616914111, Iran
- Ahmad Khosravi
- Leishmaniasis Research Center, Kerman University of Medical Sciences, 22 Bahman Boulevard, Pajouhesh Square, Kerman, 7616914115, Iran
- Iraj Sharifi
- Leishmaniasis Research Center, Kerman University of Medical Sciences, 22 Bahman Boulevard, Pajouhesh Square, Kerman, 7616914115, Iran
- Zohreh Salari
- Obstetrics and Gynecology Center, Afzalipour School of Medicine, Kerman University of Medical Sciences, Kerman, Iran
- Ehsan Salarkia
- Leishmaniasis Research Center, Kerman University of Medical Sciences, 22 Bahman Boulevard, Pajouhesh Square, Kerman, 7616914115, Iran
- Reza Kheirandish
- Department of Pathobiology, Faculty of Veterinary Medicine, Shahid Bahonar University of Kerman, Kerman, Iran
- Kazem Dehghantalebi
- Department of Clinical Science, School of Veterinary Medicine, Shahid Bahonar University of Kerman, 22 Bahman Boulevard, Pajouhesh Square, Kerman, 7616914111, Iran
- Maziar Jajarmi
- Department of Pathobiology, Faculty of Veterinary Medicine, Shahid Bahonar University of Kerman, Kerman, Iran
- Seyedeh Saedeh Mosallanejad
- Afzalipour School of Medicine and Biochemistry Department, Kerman University of Medical Sciences, Kerman, Iran
- Shahriar Dabiri
- Afzalipour School of Medicine and Pathology and Stem Cells Research Center, Kerman University of Medical Sciences, Kerman, Iran
- Alireza Keyhani
- Leishmaniasis Research Center, Kerman University of Medical Sciences, 22 Bahman Boulevard, Pajouhesh Square, Kerman, 7616914115, Iran
5
Salari Z, Tavakkoli H, Khosravi A, Karamad E, Salarkia E, Ansari M, Dabiri S, Mortazaeizdeh A, Mosallanejad SS, Sharifi F. Embryo-toxicity of docosahexaenoic and eicosapentaenoic acids: In vivo and in silico investigations using the chick embryo model. Biomed Pharmacother 2021; 136:111218. [PMID: 33450494 DOI: 10.1016/j.biopha.2021.111218] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/04/2020] [Revised: 12/30/2020] [Accepted: 12/31/2020] [Indexed: 01/11/2023]
Abstract
OBJECTIVE The objective of the current study was to evaluate the embryo-toxicity of omega-3 fatty acids. METHODS Firstly, the embryo-toxicity of docosahexaenoic (DHA) and eicosapentaenoic acids (EPA), as well as their interaction with Bcl-2 family members, were predicted using an in silico assay. In the next step, the embryonic pathological lesions and amniotic fluid biochemical changes following omega-3 treatment were investigated using a chick embryo model. Finally, the drug's vascular apoptotic effect on the chick's yolk sac membrane (YSM) was assessed. RESULTS In silico simulations revealed the embryo-toxicity, tissue-toxicity (respiratory and cardiovascular), and vascular-toxicity (apoptotic activity) of DHA and EPA. There was also an accurate interaction of DHA and EPA with Bax (binding affinity: -7.6 and -10.6 kcal/mol, respectively) and Bcl-2 (binding affinity: -8.0 and -12.2 kcal/mol, respectively). Moreover, DHA and EPA administrations were related to various adverse consequences, including weight loss and lesions in the respiratory and cardiovascular systems. Histopathological findings consisted of pulmonary edema, airway dilatation, increased interstitial tissue, and hyperemia in the lungs, heart, liver, kidney, and brain. Morphometric evaluation of the YSM vasculature revealed that the vascular apoptotic effect of omega-3 was associated with a significant reduction in mean capillary area. In the immunohistochemistry assay, increased expression of Bax and low expression of Bcl-2 affirmed apoptosis in YSM vessels. CONCLUSION The data presented in this research support the possible embryo-toxicity of omega-3. The obtained results also support the suspicion that alteration of the apoptosis-related proteins in vessels is an essential pathway in embryo-toxicity of omega-3.
Affiliation(s)
- Zohreh Salari
- Obstetrics and Gynecology Center, Afzalipour School of Medicine, Kerman University of Medical Sciences, Kerman, Iran
- Hadi Tavakkoli
- Department of Clinical Science, School of Veterinary Medicine, Shahid Bahonar University of Kerman, Kerman, Iran
- Ahmad Khosravi
- Leishmaniasis Research Center, Kerman University of Medical Sciences, Kerman, Iran
- Elahe Karamad
- Obstetrics and Gynecology Center, Afzalipour School of Medicine, Kerman University of Medical Sciences, Kerman, Iran
- Ehsan Salarkia
- Leishmaniasis Research Center, Kerman University of Medical Sciences, Kerman, Iran
- Mehdi Ansari
- Pharmaceutics Research Center, Institute of Neuropharmacology, Kerman University of Medical Sciences, Kerman, Iran
- Shahriar Dabiri
- Afzalipour School of Medicine & Pathology and Stem Cells Research Center, Kerman University of Medical Sciences, Kerman, Iran
- Abbas Mortazaeizdeh
- Afzalipour School of Medicine & Pathology and Stem Cells Research Center, Kerman University of Medical Sciences, Kerman, Iran
- Fatemeh Sharifi
- Leishmaniasis Research Center, Kerman University of Medical Sciences, Kerman, Iran
6
Azginoglu N, Aydin Z, Celik M. Structural profile matrices for predicting structural properties of proteins. J Bioinform Comput Biol 2020; 18:2050022. [PMID: 32649260 DOI: 10.1142/s0219720020500225] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Predicting structural properties of proteins plays a key role in predicting the 3D structure of proteins. In this study, new structural profile matrices (SPM) are developed for protein secondary structure, solvent accessibility and torsion angle class predictions, which could be used as input to 3D prediction algorithms. The structural templates employed in computing SPMs are detected by eight alignment methods in LOMETS server, gap affine alignment method, ScanProsite, PfamScan, and HHblits. The contribution of each template is weighted by its similarity to target, which is assessed by several sequence alignment scores. For comparison, the SPMs are also computed using Homolpro, which uses BLAST for target template alignments and does not assign weights to templates. Incorporating the SPMs into DSPRED classifier, the prediction accuracy improves significantly as demonstrated by cross-validation experiments on two difficult benchmarks. The most accurate predictions are obtained using the SPMs derived by threading methods in LOMETS server. On the other hand, the computational cost of computing these SPMs was the highest.
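As a rough illustration of what a similarity-weighted structural profile matrix looks like, the sketch below accumulates, at each target position, the class labels of aligned templates weighted by a per-template similarity score and then normalises each row to a distribution. This is an assumption-laden toy, not the SPM procedure of the paper: the input format (a weight plus a label string with `-` for unaligned positions), the three-class alphabet, and the uniform fallback for uncovered positions are all hypothetical.

```python
CLASSES = "HEC"  # assumed three-state secondary structure alphabet


def structural_profile(target_length, templates):
    """Build a per-position class distribution from weighted template labels.

    templates: list of (weight, labels) pairs, where labels is a string of
    length target_length and '-' marks positions the template does not cover.
    """
    profile = []
    for i in range(target_length):
        counts = {c: 0.0 for c in CLASSES}
        for weight, labels in templates:
            label = labels[i]
            if label in counts:  # skip gaps and unknown symbols
                counts[label] += weight
        total = sum(counts.values())
        if total == 0.0:
            # No template covers this position: fall back to a uniform row.
            row = {c: 1.0 / len(CLASSES) for c in CLASSES}
        else:
            row = {c: v / total for c, v in counts.items()}
        profile.append(row)
    return profile
```

Each row sums to one, so the matrix can be appended directly to other per-residue features as classifier input.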
Affiliation(s)
- Nuh Azginoglu
- Department of Computer Engineering, Nevsehir Haci Bektas Veli University, Nevsehir 50300, Turkey
- Zafer Aydin
- Department of Computer Engineering, Abdullah Gul University, Kayseri 38080, Turkey
- Mete Celik
- Department of Computer Engineering, Erciyes University, Kayseri 38039, Turkey
7
Tavakkoli H, Attaran R, Khosravi A, Salari Z, Salarkia E, Dabiri S, Mosallanejad SS. Vascular alteration in relation to fosfomycine: In silico and in vivo investigations using a chick embryo model. Biomed Pharmacother 2019; 118:109240. [PMID: 31401391 DOI: 10.1016/j.biopha.2019.109240] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/13/2019] [Revised: 07/10/2019] [Accepted: 07/16/2019] [Indexed: 02/08/2023]
Abstract
Fosfomycin residues are found in the egg following administration in the layer hen. In this regard, some aspects of embryo-toxicity of fosfomycin have been documented previously. The exact mechanism by which fosfomycin causes embryo-toxicity is not clearly understood. We hypothesize that fosfomycin may alter vasculature as well as normal expression of genes associated with vascular development. Therefore, the present study aimed to address these issues through in silico and in vivo investigations. At first, embryo-toxicity and anti-angiogenic effects of fosfomycin were tested using computerized programs. After that, fertile chicken eggs were treated with fosfomycin and chorioallantoic membrane vasculature was assessed through morphometric, molecular and histopathological assays. The results showed that fosfomycin not only interacted with VEGF-A protein and promoter, but also altered embryonic vasculature and decreased the expression level of VEGF-A. Reticulin staining of the treated group also confirmed decreased vasculature. The minor groove of DNA was the preferential binding site for fosfomycin, with selective binding to GC-rich sequences. We suggest that the affinity of fosfomycin for VEGF-A protein and promoter, as well as alteration of the angiogenic signaling pathway, may cause vascular damage during embryonic growth. Hence, veterinarians should be aware of such effects and limit the use of this drug during the developmental stages of the embryo, particularly in breeder farms. Given the anti-angiogenic activity and sequence selectivity of fosfomycin, a promising prospect is the development of a sequence-selective binding drug for cancer.
Affiliation(s)
- Hadi Tavakkoli
- Department of Clinical Science, School of Veterinary Medicine, Shahid Bahonar University of Kerman, Kerman, Iran
- Reza Attaran
- Department of Clinical Science, School of Veterinary Medicine, Shahid Bahonar University of Kerman, Kerman, Iran
- Ahmad Khosravi
- Leishmaniasis Research Center, Kerman University of Medical Sciences, Kerman, Iran
- Zohreh Salari
- Obstetrics and Gynecology Center, Afzalipour School of Medicine, Kerman University of Medical Sciences, Kerman, Iran
- Ehsan Salarkia
- Leishmaniasis Research Center, Kerman University of Medical Sciences, Kerman, Iran
- Shahriar Dabiri
- Afzalipour School of Medicine & Pathology and Stem Cells Research Center, Kerman University of Medical Sciences, Kerman, Iran
- Seyedeh Saedeh Mosallanejad
- Afzalipour School of Medicine & Biochemistry Department, Kerman University of Medical Sciences, Kerman, Iran
8
Hasani N, Mohseni Meybodi A, Rafaee A, Sadighi Gilani MA, Mohammadzadeh R, Sabbaghian M. Spermatogenesis disorder is associated with mutations in the ligand-binding domain of an androgen receptor. Andrologia 2019; 51:e13376. [PMID: 31373714 DOI: 10.1111/and.13376] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/06/2019] [Revised: 05/08/2019] [Accepted: 05/10/2019] [Indexed: 12/26/2022]
Abstract
Androgens play a key role in spermatogenesis, and their functions are mediated by the androgen receptor (AR). Some mutations in the AR gene have the potential to alter the primary structure and function of the protein. The aim of this study was to investigate the AR gene mutations in a cohort of males with idiopathic azoospermia referred to Royan Institute. Fifty-one biopsy samples were obtained for routine clinical purposes from 15 men with hypospermatogenesis (HS), 17 patients with maturation arrest (MA) and 19 patients with Sertoli cell-only syndrome (SCOS). The AR cDNAs were prepared from tissue mRNAs and were sequenced. One synonymous variant and three nonsynonymous protein coding single nucleotide polymorphisms (nsSNPs) were detected. Protein structure prediction demonstrated that the S815I and M746T nonsynonymous variants would affect protein structure and its normal function. Our study suggests that mutations in the AR gene would change or disturb the receptor's normal activity. Although these variations may influence spermatogenesis, it is difficult to say that they lead to a lack of spermatogenesis.
Affiliation(s)
- Nafiseh Hasani
- Department of Cell and Molecular Biology, Faculty of Basic Science, University of Maragheh, Maragheh, Iran
- Anahita Mohseni Meybodi
- Department of Genetics, Reproductive Biomedicine Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran
- Alemeh Rafaee
- Department of Andrology, Reproductive Biomedicine Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran
- Mohammad Ali Sadighi Gilani
- Department of Andrology, Reproductive Biomedicine Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran; Department of Urology, Shariati Hospital, Tehran University of Medical Sciences, Tehran, Iran
- Reza Mohammadzadeh
- Department of Cell and Molecular Biology, Faculty of Basic Science, University of Maragheh, Maragheh, Iran
- Marjan Sabbaghian
- Department of Andrology, Reproductive Biomedicine Research Center, Royan Institute for Reproductive Biomedicine, ACECR, Tehran, Iran
9
Wardah W, Khan M, Sharma A, Rashid MA. Protein secondary structure prediction using neural networks and deep learning: A review. Comput Biol Chem 2019; 81:1-8. [DOI: 10.1016/j.compbiolchem.2019.107093] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 07/12/2018] [Revised: 12/28/2018] [Accepted: 07/10/2019] [Indexed: 02/02/2023]
10
Aydin Z, Azginoglu N, Bilgin HI, Celik M. Developing structural profile matrices for protein secondary structure and solvent accessibility prediction. Bioinformatics 2019; 35:4004-4010. [DOI: 10.1093/bioinformatics/btz238] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/19/2018] [Revised: 02/17/2019] [Accepted: 03/29/2019] [Indexed: 11/13/2022]
Abstract
Motivation
Predicting the secondary structure and solvent accessibility of proteins are among the essential steps that precede more elaborate 3D structure prediction tasks. Incorporating class label information contained in templates with known structures has the potential to improve the accuracy of prediction methods. Building a structural profile matrix is one such technique that provides a distribution for class labels at each amino acid position of the target.
Results
In this paper, a new structural profiling technique is proposed that is based on deriving PFAM families and is combined with an existing approach. Cross-validation experiments on two benchmark datasets and at various similarity intervals demonstrate that the proposed profiling strategy performs significantly better than Homolpro, a state-of-the-art method for incorporating template information, as assessed by statistical hypothesis tests.
Availability and implementation
The DSPRED method can be accessed by visiting the PSP server at http://psp.agu.edu.tr. Source code and binaries are freely available at https://github.com/yusufzaferaydin/dspred.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zafer Aydin
- Department of Computer Engineering, Abdullah Gul University, Kayseri, Turkey
- Nuh Azginoglu
- Department of Computer Engineering, Nevsehir Haci Bektas Veli University, Nevsehir, Turkey
- Mete Celik
- Department of Computer Engineering, Erciyes University, Kayseri, Turkey
11
Oldfield CJ, Chen K, Kurgan L. Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences. Methods Mol Biol 2019; 1958:73-100. [PMID: 30945214 DOI: 10.1007/978-1-4939-9161-7_4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/12/2022]
Abstract
Many new methods for the sequence-based prediction of the secondary and supersecondary structures have been developed over the last several years. These and older sequence-based predictors are widely applied for the characterization and prediction of protein structure and function. These efforts have produced countless accurate predictors, many of which rely on state-of-the-art machine learning models and evolutionary information generated from multiple sequence alignments. We describe and motivate both types of predictions. We introduce concepts related to the annotation and computational prediction of the three-state and eight-state secondary structure as well as several types of supersecondary structures, such as β hairpins, coiled coils, and α-turn-α motifs. We review 34 predictors focusing on recent tools and provide detailed information for a selected set of 14 secondary structure and 3 supersecondary structure predictors. We conclude with several practical notes for the end users of these predictive methods.
Affiliation(s)
- Christopher J Oldfield
- Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA
- Ke Chen
- School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin, People's Republic of China
- Lukasz Kurgan
- Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA
12
Aydin Z, Kaynar O, Görmez Y. Dimensionality reduction for protein secondary structure and solvent accesibility prediction. J Bioinform Comput Biol 2018; 16:1850020. [PMID: 30353781 DOI: 10.1142/s0219720018500208] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/18/2022]
Abstract
Secondary structure and solvent accessibility prediction provide valuable information for estimating the three dimensional structure of a protein. As new feature extraction methods are developed the dimensionality of the input feature space increases steadily. Reducing the number of dimensions provides several advantages such as faster model training, faster prediction and noise elimination. In this work, several dimensionality reduction techniques have been employed including various feature selection methods, autoencoders and PCA for protein secondary structure and solvent accessibility prediction. The reduced feature set is used to train a support vector machine at the second stage of a hybrid classifier. Cross-validation experiments on two difficult benchmarks demonstrate that the dimension of the input space can be reduced substantially while maintaining the prediction accuracy. This will enable the incorporation of additional informative features derived for predicting the structural properties of proteins without reducing the accuracy due to overfitting.
Affiliation(s)
- Zafer Aydin
- Department of Computer Engineering, Abdullah Gul University, Kayseri 38080, Turkey
- Oğuz Kaynar
- Department of Management Information Systems, Cumhuriyet University, Sivas 58000, Turkey
- Yasin Görmez
- Department of Management Information Systems, Cumhuriyet University, Sivas 58000, Turkey
13
Sultan S, Huma N, Butt MS, Aleem M, Abbas M. Therapeutic potential of dairy bioactive peptides: A contemporary perspective. Crit Rev Food Sci Nutr 2017; 58:105-115. [PMID: 26852912 DOI: 10.1080/10408398.2015.1136590] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Indexed: 10/22/2022]
Abstract
Dairy products are associated with numerous health benefits. They are a good source of nutrients such as carbohydrates, protein (bioactive peptides), lipids, minerals, and vitamins, which are essential for growth, development, and maintenance of the human body. Accordingly, dairy bioactive peptides are one of the targeted compounds present in different dairy products. Dairy bioactive compounds can be classified as antihypertensive, anti-oxidative, immunomodulant, anti-mutagenic, antimicrobial, opioid, anti-thrombotic, anti-obesity, and mineral-binding agents, depending upon biological functions. These bioactive peptides can easily be produced by enzymatic hydrolysis, and during fermentation and gastrointestinal digestion. For this reason, fermented dairy products, such as yogurt, cheese, and sour milk, are gaining popularity worldwide, and are considered an excellent source of dairy peptides. Furthermore, fermented and non-fermented dairy products are associated with lower risks of hypertension, coagulopathy, stroke, and cancer insurgences. The current review article is an attempt to disseminate general information about dairy peptides and their health claims to scientists, allied stakeholders, and, certainly, readers.
Affiliation(s)
- Saira Sultan
- National Institute of Food Science and Technology, University of Agriculture Faisalabad, Faisalabad, Pakistan; Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Queensland, Australia
- Nuzhat Huma
- National Institute of Food Science and Technology, University of Agriculture Faisalabad, Faisalabad, Pakistan
- Masood Sadiq Butt
- National Institute of Food Science and Technology, University of Agriculture Faisalabad, Faisalabad, Pakistan
- Muhammad Aleem
- Institute of Biological Chemistry and Nutritional Science (140a), Universität Hohenheim, Stuttgart, Germany
- Munawar Abbas
- Institute of Home & Food Sciences, Government College University, Faisalabad, Pakistan
14
Hidden Markov model and Chapman Kolmogrov for protein structures prediction from images. Comput Biol Chem 2017; 68:231-244. [DOI: 10.1016/j.compbiolchem.2017.04.003] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 01/18/2017] [Revised: 03/11/2017] [Accepted: 04/11/2017] [Indexed: 11/20/2022]
15
Peng W, Li M, Chen L, Wang L. Predicting Protein Functions by Using Unbalanced Random Walk Algorithm on Three Biological Networks. IEEE/ACM Trans Comput Biol Bioinform 2017; 14:360-369. [PMID: 28368814 DOI: 10.1109/tcbb.2015.2394314] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 06/07/2023]
Abstract
As the gap between sequence data and their functional annotations becomes increasingly wide, many computational methods have been proposed to annotate functions for unknown proteins. However, designing effective methods to make good use of various biological resources is still a big challenge for researchers due to the functional diversity of proteins. In this work, we propose a new method named ThrRW, which takes several steps of random walking on three different biological networks: protein interaction network (PIN), domain co-occurrence network (DCN), and functional interrelationship network (FIN), respectively, so as to infer functional information from neighbors in the corresponding networks. With respect to the topological and structural differences of the three networks, the number of walking steps in the three networks will be different. In the course of working, the functional information will be transferred from one network to another according to the associations between the nodes in different networks. The results of experiments on S. cerevisiae data show that our method achieves better prediction performance not only than the methods that consider both PIN data and GO term similarities, but also than the methods using both PIN data and protein domain information, which verifies the effectiveness of our method in integrating multiple biological data sources.
|
16
|
Meng F, Kurgan L. Computational Prediction of Protein Secondary Structure from Sequence. ACTA ACUST UNITED AC 2016; 86:2.3.1-2.3.10. [DOI: 10.1002/cpps.19] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/22/2023]
Affiliation(s)
- Fanchi Meng
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia
|
17
|
Al-Khayyat MZS, Al-Dabbagh AGA. In silico Prediction and Docking of Tertiary Structure of LuxI, an Inducer Synthase of Vibrio fischeri. Rep Biochem Mol Biol 2016; 4:66-75. [PMID: 27536699 PMCID: PMC4986264] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/17/2015] [Accepted: 06/10/2015] [Indexed: 06/06/2023]
Abstract
BACKGROUND LuxI is a component of the quorum sensing signaling pathway in Vibrio fischeri responsible for the inducer synthesis that is essential for bioluminescence. METHODS Homology modeling of LuxI was carried out using Phyre2 and refined with the GalaxyWEB server. Five models were generated and evaluated by ERRAT, ANOLEA, QMEAN6, and Procheck. RESULTS Five refined models were generated by the GalaxyWEB server, with Model 4 having the greatest quality based on the QMEAN6 score of 0.732. ERRAT analysis revealed an overall quality of 98.9%, while the overall quality of the initial model was 54%. The mean force potential energy, as analyzed by ANOLEA, was better compared to the initial model. Stereochemical quality estimation by Procheck showed that the refined Model 4 had a reliable structure, and was therefore submitted to the protein model database. Drug Discovery Workbench V.2 was used to screen 2700 experimental compounds from the DrugBank database to identify inhibitors that can bind to the active site between amino acids 24 and 110. Ten compounds with high negative scores were selected as the best in binding. CONCLUSION The model produced, and the predicted acetyltransferase binding site, could be useful in modeling homologous sequences from other microorganisms and the design of new antimicrobials.
|
18
|
Reaching optimized parameter set: protein secondary structure prediction using neural network. Neural Comput Appl 2016. [DOI: 10.1007/s00521-015-2150-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
|
19
|
Schwede T. Protein modeling: what happened to the "protein structure gap"? Structure 2014; 21:1531-40. [PMID: 24010712 DOI: 10.1016/j.str.2013.08.007] [Citation(s) in RCA: 83] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2013] [Revised: 08/12/2013] [Accepted: 08/12/2013] [Indexed: 11/27/2022]
Abstract
Computational modeling of three-dimensional macromolecular structures and complexes from their sequence has been a long-standing vision in structural biology. Over the last 2 decades, a paradigm shift has occurred: starting from a large "structure knowledge gap" between the huge number of protein sequences and small number of known structures, today, some form of structural information, either experimental or template-based models, is available for the majority of amino acids encoded by common model organism genomes. With the scientific focus of interest moving toward larger macromolecular complexes and dynamic networks of interactions, the integration of computational modeling methods with low-resolution experimental techniques allows the study of large and complex molecular machines. One of the open challenges for computational modeling and prediction techniques is to convey the underlying assumptions, as well as the expected accuracy and structural variability of a specific model, which is crucial to understanding its limitations.
Affiliation(s)
- Torsten Schwede
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Klingelbergstrasse 50-70, 4056 Basel, Switzerland.
|
20
|
Prediction of multi-type membrane proteins in human by an integrated approach. PLoS One 2014; 9:e93553. [PMID: 24676214 PMCID: PMC3968155 DOI: 10.1371/journal.pone.0093553] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2013] [Accepted: 03/05/2014] [Indexed: 11/29/2022] Open
Abstract
Membrane proteins were found to be involved in various cellular processes performing various important functions, which are mainly associated with their types. However, it is very time-consuming and expensive for traditional biophysical methods to identify membrane protein types. Although some computational tools predicting membrane protein types have been developed, most of them can only recognize one kind of type. Therefore, they are not as effective as they could be, since one membrane protein can have several types at the same time. To our knowledge, few methods handling multiple types of membrane proteins were reported. In this study, we proposed an integrated approach to predict multiple types of membrane proteins by employing sequence homology and protein-protein interaction network. As a result, the prediction accuracies reached 87.65%, 81.39% and 70.79%, respectively, by the leave-one-out test on three datasets. It outperformed the nearest neighbor algorithm adopting pseudo amino acid composition. The method is anticipated to be an alternative tool for identifying membrane protein types. New metrics for evaluating performances of methods dealing with multi-label problems were also presented. The program of the method is available upon request.
|
21
|
Webb B, Eswar N, Fan H, Khuri N, Pieper U, Dong G, Sali A. Comparative Modeling of Drug Target Proteins☆. REFERENCE MODULE IN CHEMISTRY, MOLECULAR SCIENCES AND CHEMICAL ENGINEERING 2014. [PMCID: PMC7157477 DOI: 10.1016/b978-0-12-409547-2.11133-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state-of-the-art by a number of specific examples.
|
22
|
Hamp T, Kassner R, Seemayer S, Vicedo E, Schaefer C, Achten D, Auer F, Boehm A, Braun T, Hecht M, Heron M, Hönigschmid P, Hopf TA, Kaufmann S, Kiening M, Krompass D, Landerer C, Mahlich Y, Roos M, Rost B. Homology-based inference sets the bar high for protein function prediction. BMC Bioinformatics 2013; 14 Suppl 3:S7. [PMID: 23514582 PMCID: PMC3584931 DOI: 10.1186/1471-2105-14-s3-s7] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Any method that de novo predicts protein function should do better than random. More challenging, it also ought to outperform simple homology-based inference. METHODS Here, we describe a few methods that predict protein function exclusively through homology. Together, they set the bar or lower limit for future improvements. RESULTS AND CONCLUSIONS During the development of these methods, we faced two surprises. Firstly, our most successful implementation for the baseline ranked very high at CAFA1. In fact, our best combination of homology-based methods fared only slightly worse than the top-of-the-line prediction method from the Jones group. Secondly, although the concept of homology-based inference is simple, this work revealed that the precise details of the implementation are crucial: not only did the methods span from top to bottom performers at CAFA, but also the reasons for these differences were unexpected. In this work, we also propose a new rigorous measure to compare predicted and experimental annotations. It puts more emphasis on the details of protein function than the other measures employed by CAFA and may best reflect the expectations of users. Clearly, the definition of proper goals remains one major objective for CAFA.
Affiliation(s)
- Tobias Hamp
- TUM, Department of Informatics, Bioinformatics & Computational Biology - I12 Boltzmannstr, 3, 85748 Garching/Munich, Germany
|
23
|
Miller EB, Murrett CS, Zhu K, Zhao S, Goldfeld DA, Bylund JH, Friesner RA. Prediction of Long Loops with Embedded Secondary Structure using the Protein Local Optimization Program. J Chem Theory Comput 2013; 9:1846-1864. [PMID: 23814507 DOI: 10.1021/ct301083q] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Robust homology modeling to atomic-level accuracy requires in the general case successful prediction of protein loops containing small segments of secondary structure. Further, as loop prediction advances to success with larger loops, the exclusion of loops containing secondary structure becomes awkward. Here, we extend the applicability of the Protein Local Optimization Program (PLOP) to loops up to 17 residues in length that contain either helical or hairpin segments. In general, PLOP hierarchically samples conformational space and ranks candidate loops with a high-quality molecular mechanics force field. For loops identified to possess α-helical segments, we employ an alternative dihedral library composed of (ϕ,ψ) angles commonly found in helices. The alternative library is searched over a user-specified range of residues that define the helical bounds. The source of these helical bounds can be from popular secondary structure prediction software or from analysis of past loop predictions where a propensity to form a helix is observed. Due to the maturity of our energy model, the lowest energy loop across all experiments can be selected with an accuracy of sub-Ångström RMSD in 80% of cases, 1.0 to 1.5 Å RMSD in 14% of cases, and poorer than 1.5 Å RMSD in 6% of cases. The effectiveness of our current methods in predicting hairpin-containing loops is explored with hairpins up to 13 residues in length and again reaching an accuracy of sub-Ångström RMSD in 83% of cases, 1.0 to 1.5 Å RMSD in 10% of cases, and poorer than 1.5 Å RMSD in 7% of cases. Finally, we explore the effect of an imprecise surrounding environment, in which side chains, but not the backbone, are initially in perturbed geometries. In these cases, loops perturbed to 3 Å RMSD from the native environment were restored to their native conformation with sub-Ångström RMSD.
Affiliation(s)
- Edward B Miller
- Department of Chemistry, Columbia University, New York, New York
|
24
|
Puton T, Kozlowski LP, Rother KM, Bujnicki JM. CompaRNA: a server for continuous benchmarking of automated methods for RNA secondary structure prediction. Nucleic Acids Res 2013; 41:4307-23. [PMID: 23435231 PMCID: PMC3627593 DOI: 10.1093/nar/gkt101] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open
Abstract
We present a continuous benchmarking approach for the assessment of RNA secondary structure prediction methods implemented in the CompaRNA web server. As of 3 October 2012, the performance of 28 single-sequence and 13 comparative methods has been evaluated on RNA sequences/structures released weekly by the Protein Data Bank. We also provide a static benchmark generated on RNA 2D structures derived from the RNAstrand database. Benchmarks on both data sets offer insight into the relative performance of RNA secondary structure prediction methods on RNAs of different size and with respect to different types of structure. According to our tests, on the average, the most accurate predictions obtained by a comparative approach are generated by CentroidAlifold, MXScarna, RNAalifold and TurboFold. On the average, the most accurate predictions obtained by single-sequence analyses are generated by CentroidFold, ContextFold and IPknot. The best comparative methods typically outperform the best single-sequence methods if an alignment of homologous RNA sequences is available. This article presents the results of our benchmarks as of 3 October 2012, whereas the rankings presented online are continuously updated. We will gladly include new prediction methods and new measures of accuracy in the new editions of CompaRNA benchmarks.
Affiliation(s)
- Tomasz Puton
- Bioinformatics Laboratory, Institute for Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, ul. Umultowska 89, 61-614 Poznan, Poland
|
25
|
Yan J, Marcus M, Kurgan L. Comprehensively designed consensus of standalone secondary structure predictors improves Q3 by over 3%. J Biomol Struct Dyn 2013; 32:36-51. [PMID: 23298369 DOI: 10.1080/07391102.2012.746945] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
Protein fold is defined by a spatial arrangement of three types of secondary structures (SSs) including helices, sheets, and coils/loops. Current methods that predict SS from sequences rely on complex machine learning-derived models and provide the three-state accuracy (Q3) at about 82%. Further improvements in predictive quality could be obtained with a consensus-based approach, which so far received limited attention. We perform first-of-its-kind comprehensive design of a SS consensus predictor (SScon), in which we consider 12 modern standalone SS predictors and utilize Support Vector Machine (SVM) to combine their predictions. Using a large benchmark data-set with 10 random training-test splits, we show that a simple, voting-based consensus of carefully selected base methods improves Q3 by 1.9% when compared to the best single predictor. Use of SVM provides additional 1.4% improvement with the overall Q3 at 85.6% and segment overlap (SOV3) at 83.7%, when compared to 82.3 and 80.9%, respectively, obtained by the best individual methods. We also show strong improvements when the consensus is based on ab-initio methods, with Q3 = 82.3% and SOV3 = 80.7% that match the results from the best template-based approaches. Our consensus reduces the number of significant errors where helix is confused with a strand, provides particularly good results for short helices and strands, and gives the most accurate estimates of the content of individual SSs in the chain. Case studies are used to visualize the improvements offered by the consensus at the residue level. A web-server and a standalone implementation of SScon are available at http://biomine.ece.ualberta.ca/SSCon/.
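As an illustration of the two quantities this abstract relies on, the following minimal Python sketch computes the three-state accuracy (Q3) and a simple per-residue voting consensus. It is not the SScon implementation: the function names, the H/E/C string encoding, and the tie-breaking rule (the earliest-listed predictor wins) are assumptions made for the example, and the SVM stage described in the abstract is not shown.

```python
from collections import Counter


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E or C) matches."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def majority_vote(predictions: list) -> str:
    """Per-residue majority vote over several predicted SS strings.

    Assumption for this sketch: ties are resolved in favour of the label
    given by the earliest-listed predictor among the tied labels.
    """
    if not predictions or len({len(p) for p in predictions}) != 1:
        raise ValueError("need at least one prediction, all of equal length")
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        top = max(counts.values())
        # First label, in predictor order, that reaches the top vote count.
        consensus.append(next(label for label in column if counts[label] == top))
    return "".join(consensus)


# Toy data (hypothetical, not taken from the paper).
observed = "HHHHCCEEEE"
predictions = ["HHHHCCEEEC", "HHHCCCEEEE", "HHHHCCCEEE"]
consensus = majority_vote(predictions)
```

On this toy input each base prediction has exactly one wrong residue, and the errors fall at different positions, so the vote corrects all of them. Real predictors make correlated errors, which is why the abstract stresses careful selection of base methods.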
Affiliation(s)
- Jing Yan
- a Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada
|
26
|
Ochoa D, García-Gutiérrez P, Juan D, Valencia A, Pazos F. Incorporating information on predicted solvent accessibility to the co-evolution-based study of protein interactions. MOLECULAR BIOSYSTEMS 2012; 9:70-6. [PMID: 23104128 DOI: 10.1039/c2mb25325a] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
Abstract
A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility to three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it neither improves these methods, nor the original mirrortree method, in predicting other types of interactions. That improvement comes at no cost in terms of applicability since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
Affiliation(s)
- David Ochoa
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), C/Darwin, 3, Cantoblanco, 28049 Madrid, Spain
|
27
|
Using Homology Information From PDB to Improve The Accuracy of Protein β-turn Prediction by NetTurnP*. PROG BIOCHEM BIOPHYS 2012. [DOI: 10.3724/sp.j.1206.2011.00370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
28
|
Sun J, Tang S, Xiong W, Cong P, Li T. DSP: a protein shape string and its profile prediction server. Nucleic Acids Res 2012; 40:W298-302. [PMID: 22553364 PMCID: PMC3394270 DOI: 10.1093/nar/gks361] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Many studies have demonstrated that shape string is an extremely important structure representation, since it is more complete than the classical secondary structure. The shape string provides detailed information also in the regions denoted random coil. But few services are provided for systematic analysis of protein shape string. To fill this gap, we have developed an accurate shape string predictor based on two innovative technologies: a knowledge-driven sequence alignment and a sequence shape string profile method. The performance on blind test data demonstrates that the proposed method can be used for accurate prediction of protein shape string. The DSP server provides both predicted shape string and sequence shape string profile for each query sequence. Using this information, the users can compare protein structure or display protein evolution in shape string space. The DSP server is available at both http://cheminfo.tongji.edu.cn/dsp/ and its main mirror http://chemcenter.tongji.edu.cn/dsp/.
Affiliation(s)
- Jiangming Sun
- Department of Chemistry, Tongji University, 1239 Siping Road, Shanghai 200092, China
|
29
|
Sun JM, Li TH, Cong PS, Tang SN, Xiong WW. Retrieving backbone string neighbors provides insights into structural modeling of membrane proteins. Mol Cell Proteomics 2012; 11:M111.016808. [PMID: 22415040 DOI: 10.1074/mcp.m111.016808] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023] Open
Abstract
Identification of protein structural neighbors to a query is fundamental in structure and function prediction. Here we present BS-align, a systematic method to retrieve backbone string neighbors from primary sequences as templates for protein modeling. The backbone conformation of a protein is represented by the backbone string, as defined in Ramachandran space. The backbone string of a query can be accurately predicted by two innovative technologies: a knowledge-driven sequence alignment and encoding of a backbone string element profile. Then, the predicted backbone string is employed to align against a backbone string database and retrieve a set of backbone string neighbors. The backbone string neighbors were shown to be close to native structures of query proteins. BS-align was successfully employed to predict models of 10 membrane proteins with lengths ranging between 229 and 595 residues, and whose high-resolution structural determinations were difficult to elucidate both by experiment and prediction. The obtained TM-scores and root mean square deviations of the models confirmed that the models based on the backbone string neighbors retrieved by the BS-align were very close to the native membrane structures although the query and the neighbor shared a very low sequence identity. The backbone string system represents a new road for the prediction of protein structure from sequence, and suggests that the similarity of the backbone string would be more informative than describing a protein as belonging to a fold.
Affiliation(s)
- Jiang-Ming Sun
- Department of Chemistry, Tongji University, 1239 Siping Road, Shanghai 200092, China
|
30
|
Bettella F, Rasinski D, Knapp EW. Protein Secondary Structure Prediction with SPARROW. J Chem Inf Model 2012; 52:545-56. [DOI: 10.1021/ci200321u] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Affiliation(s)
- Francesco Bettella
- Freie Universität Berlin, Institut für Chemie, Fabeckstr. 36a, D-14195 Berlin, Germany
- deCODE genetics, Sturlugata 8, 101 Reykjavik, Iceland
- Dawid Rasinski
- Freie Universität Berlin, Institut für Chemie, Fabeckstr. 36a, D-14195 Berlin, Germany
- Ernst Walter Knapp
- Freie Universität Berlin, Institut für Chemie, Fabeckstr. 36a, D-14195 Berlin, Germany
|
31
|
Chen K, Kurgan L. Computational prediction of secondary and supersecondary structures. Methods Mol Biol 2012; 932:63-86. [PMID: 22987347 DOI: 10.1007/978-1-62703-065-6_5] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
The sequence-based prediction of the secondary and supersecondary structures enjoys strong interest and finds applications in numerous areas related to the characterization and prediction of protein structure and function. Substantial efforts in these areas over the last three decades resulted in the development of accurate predictors, which take advantage of modern machine learning models and availability of evolutionary information extracted from multiple sequence alignment. In this chapter, we first introduce and motivate both prediction areas and introduce basic concepts related to the annotation and prediction of the secondary and supersecondary structures, focusing on the β hairpin, coiled coil, and α-turn-α motifs. Next, we overview state-of-the-art prediction methods, and we provide details for 12 modern secondary structure predictors and 4 representative supersecondary structure predictors. Finally, we provide several practical notes for the users of these prediction tools.
Affiliation(s)
- Ke Chen
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada
|
32
|
Poleksic A. Optimizing a widely used protein structure alignment measure in expected polynomial time. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 8:1716-1720. [PMID: 21904019 DOI: 10.1109/tcbb.2011.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
Protein structure alignment is an important tool in many biological applications, such as protein evolution studies, protein structure modeling, and structure-based, computer-aided drug design. Protein structure alignment is also one of the most challenging problems in computational molecular biology, due to an infinite number of possible spatial orientations of any two protein structures. We study one of the most commonly used measures of pairwise protein structure similarity, defined as the number of pairs of atoms in two proteins that can be superimposed under a predefined distance cutoff. We prove that the expected running time of a recently published algorithm for optimizing this (and some other, derived measures of protein structure similarity) is polynomial.
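The similarity measure named in this abstract can be made concrete with a short Python sketch. It only evaluates the score for one fixed superposition and one given residue pairing; the search over all rigid-body transformations, which is the hard part the paper analyzes, is not shown. The function name, the coordinate format, and the choice to count pairs at exactly the cutoff distance are assumptions for illustration.

```python
import math


def pairs_within_cutoff(coords_a, coords_b, alignment, cutoff):
    """Count aligned atom pairs lying within `cutoff` of each other.

    coords_a, coords_b: lists of (x, y, z) tuples, already superimposed.
    alignment: list of (i, j) index pairs into coords_a and coords_b.
    Assumption for this sketch: a pair at exactly the cutoff distance counts.
    """
    return sum(
        1
        for i, j in alignment
        if math.dist(coords_a[i], coords_b[j]) <= cutoff
    )


# Toy coordinates (hypothetical): pair distances are 0.5, 3.0 and 1.0.
protein_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
protein_b = [(0.0, 0.0, 0.5), (1.0, 0.0, 3.0), (2.0, 1.0, 0.0)]
pairing = [(0, 0), (1, 1), (2, 2)]
score = pairs_within_cutoff(protein_a, protein_b, pairing, cutoff=1.0)
```

Optimizing this count means maximizing it over every rotation and translation of one structure relative to the other, which is where the "infinite number of possible spatial orientations" mentioned in the abstract comes in.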
Affiliation(s)
- Aleksandar Poleksic
- Department of Computer Science, University of Northern Iowa, 305 ITTC, Cedar Falls, IA 50614-0507, USA.
|
33
|
Poleksic A. On complexity of protein structure alignment problem under distance constraint. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011; 9:511-516. [PMID: 22025757 DOI: 10.1109/tcbb.2011.133] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
We study the well known LCP (Largest Common Point-Set) under Bottleneck Distance Problem. Given two proteins a and b (as sequences of points in 3D space) and a distance cutoff σ, the goal is to find a spatial superposition and an alignment that maximizes the number of pairs of points from a and b that can be fit under the distance σ from each other. The best to date algorithms for approximate and exact solution to this problem run in time O(n^8) and O(n^32), respectively, where n represents the protein length. This work improves the runtime of the approximation algorithm and the algorithm for absolute optimum for both order-dependent and order-independent alignments. More specifically, our algorithms for near-optimal and optimal sequential alignments run in time O(n^7 log n) and O(n^14 log n), respectively. For non-sequential alignments, corresponding running times are O(n^7.5) and O(n^14.5).
|
34
|
Kedarisetti KD, Mizianty MJ, Dick S, Kurgan L. Improved sequence-based prediction of strand residues. J Bioinform Comput Biol 2011; 9:67-89. [PMID: 21328707 DOI: 10.1142/s0219720011005355] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/05/2010] [Revised: 11/19/2010] [Accepted: 11/19/2010] [Indexed: 01/02/2023]
Abstract
Accurate identification of strand residues aids prediction and analysis of numerous structural and functional aspects of proteins. We propose a sequence-based predictor, BETArPRED, which improves prediction of strand residues and β-strand segments. BETArPRED uses a novel design that accepts strand residues predicted by SSpro and predicts the remaining positions utilizing a logistic regression classifier with nine custom-designed features. These are derived from the primary sequence, the secondary structure (SS) predicted by SSpro, PSIPRED and SPINE, and residue depth as predicted by RDpred. Our features utilize certain local (window-based) patterns in the predicted SS and combine information about the predicted SS and residue depth. BETArPRED is evaluated on 432 sequences that share low identity with the training chains, and on the CASP8 dataset. We compare BETArPRED with seven modern SS predictors, and the top-performing automated structure predictor in CASP8, the ZHANG-server. BETArPRED provides statistically significant improvements over each of the SS predictors; it improves prediction of strand residues and β-strands, and it finds β-strands that were missed by the other methods. When compared with the ZHANG-server, we improve predictions of strand segments and predict more actual strand residues, while the other predictor achieves higher rate of correct strand residue predictions when under-predicting them.
|
35
|
di Luccio E, Koehl P. A quality metric for homology modeling: the H-factor. BMC Bioinformatics 2011; 12:48. [PMID: 21291572 PMCID: PMC3213331 DOI: 10.1186/1471-2105-12-48] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2010] [Accepted: 02/04/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The analysis of protein structures provides fundamental insight into most biochemical functions and consequently into the cause and possible treatment of diseases. As the structures of most known proteins cannot be solved experimentally for technical or sometimes simply for time constraints, in silico protein structure prediction is expected to step in and generate a more complete picture of the protein structure universe. Molecular modeling of protein structures is a fast growing field and a tremendous amount of work has been done since the publication of the very first model. The growth of modeling techniques and more specifically of those that rely on the existing experimental knowledge of protein structures is intimately linked to the developments of high resolution, experimental techniques such as NMR, X-ray crystallography and electron microscopy. This strong connection between experimental and in silico methods is however not devoid of criticisms and concerns among modelers as well as among experimentalists. RESULTS In this paper, we focus on homology-modeling and more specifically, we review how it is perceived by the structural biology community and what can be done to impress on the experimentalists that it can be a valuable resource to them. We review the common practices and provide a set of guidelines for building better models. For that purpose, we introduce the H-factor, a new indicator for assessing the quality of homology models, mimicking the R-factor in X-ray crystallography. The method for computing the H-factor is fully described and validated on a series of test cases. CONCLUSIONS We have developed a web service for computing the H-factor for models of a protein structure. This service is freely accessible at http://koehllab.genomecenter.ucdavis.edu/toolkit/h-factor.
Affiliation(s)
- Eric di Luccio
- Computer Science Department, Room 4337, Genome Center, GBSF University of California Davis 451 East Health Sciences Drive Davis, CA 95616, USA.
|
36
|
Wishart DS. Interpreting protein chemical shift data. Prog Nucl Magn Reson Spectrosc 2011; 58:62-87. [PMID: 21241884 DOI: 10.1016/j.pnmrs.2010.07.004] [Citation(s) in RCA: 184] [Impact Index Per Article: 14.2] [Received: 06/14/2010] [Accepted: 07/29/2010] [Indexed: 05/12/2023]
Affiliation(s)
- David S Wishart
- Department of Biological Sciences, National Institute for Nanotechnology (NINT), Edmonton, AB, Canada T6G 2E8.
|
37
|
Zhang H, Zhang T, Chen K, Kedarisetti KD, Mizianty MJ, Bao Q, Stach W, Kurgan L. Critical assessment of high-throughput standalone methods for secondary structure prediction. Brief Bioinform 2011; 12:672-88. [PMID: 21252072 DOI: 10.1093/bib/bbq088] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Indexed: 11/15/2022]
Abstract
Sequence-based prediction of protein secondary structure (SS) enjoys widespread and increasing use for the analysis and prediction of numerous structural and functional characteristics of proteins. The lack of a recent comprehensive and large-scale comparison of the numerous prediction methods results in an often arbitrary selection of an SS predictor. To address this void, we compare and analyze 12 popular, standalone and high-throughput predictors on a large set of 1975 proteins to provide in-depth, novel and practical insights. We show that there is no universally best predictor and thus detailed comparative studies are needed to support informed selection of SS predictors for a given application. Our study shows that the three-state accuracy (Q3) and segment overlap (SOV3) of the SS prediction currently reach 82% and 81%, respectively. We demonstrate that carefully designed consensus-based predictors improve the Q3 by an additional 2% and that homology modeling-based methods are significantly better by 1.5% Q3 than ab initio approaches. Our empirical analysis reveals that solvent-exposed and flexible coils are predicted with a higher quality than the buried and rigid coils, while the inverse is true for the strands and helices. We also show that longer helices are easier to predict, which is in contrast to longer strands that are harder to find. The current methods confuse 1-6% of strand residues with helical residues and vice versa, and they perform poorly for residues in the β-bridge and 3(10)-helix conformations. Finally, we compare predictions of the standalone implementations of four well-performing methods with their corresponding web servers.
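For readers unfamiliar with the three-state accuracy (Q3) quoted in this abstract, the following is a minimal sketch of how the metric is conventionally computed: the percentage of residues whose predicted state (helix H, strand E, coil C) matches the observed state. The function name and the example strings are hypothetical and are not taken from the cited study.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues whose predicted three-state
    label (H, E or C) equals the observed label."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    # Count positions where the two label strings agree.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 10-residue example: 8 of 10 labels agree.
example_q3 = q3_accuracy("HHHHCCEEEE", "HHHCCCEEEC")
```

The segment overlap score (SOV) also mentioned in the abstract is a segment-level measure and is not covered by this per-residue sketch.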
Affiliation(s)
- Hua Zhang
- Zhejiang Gongshang University, Hangzhou, Zhejiang, P.R. China
|
38
|
Automated protein structure modeling with SWISS-MODEL Workspace and the Protein Model Portal. Methods Mol Biol 2011; 857:107-36. [PMID: 22323219 DOI: 10.1007/978-1-61779-588-6_5] [Citation(s) in RCA: 103] [Impact Index Per Article: 7.9] [Indexed: 01/03/2023]
Abstract
Comparative protein structure modeling is a computational approach to build three-dimensional structural models for proteins using experimental structures of related protein family members as templates. Regular blind assessments of modeling accuracy have demonstrated that comparative protein structure modeling is currently the most reliable technique to model protein structures. Homology models are often sufficiently accurate to substitute for experimental structures in a wide variety of applications. Since the usefulness of a model for a specific application is determined by its accuracy, model quality estimation is an essential component of protein structure prediction. Comparative protein modeling has become a routine approach in many areas of life science research since fully automated modeling systems also allow nonexperts to build reliable models. In this chapter, we describe practical approaches for automated protein structure modeling with SWISS-MODEL Workspace and the Protein Model Portal.
|
39
|
Benkert P, Biasini M, Schwede T. Toward the estimation of the absolute quality of individual protein structure models. Bioinformatics 2010; 27:343-50. [PMID: 21134891 PMCID: PMC3031035 DOI: 10.1093/bioinformatics/btq662] [Citation(s) in RCA: 1484] [Impact Index Per Article: 106.0] [Indexed: 11/21/2022]
Abstract
Motivation: Quality assessment of protein structures is an important part of experimental structure validation and plays a crucial role in protein structure prediction, where the predicted models may contain substantial errors. Most current scoring functions are primarily designed to rank alternative models of the same sequence supporting model selection, whereas the prediction of the absolute quality of an individual protein model has received little attention in the field. However, reliable absolute quality estimates are crucial to assess the suitability of a model for specific biomedical applications. Results: In this work, we present a new absolute measure for the quality of protein models, which provides an estimate of the ‘degree of nativeness’ of the structural features observed in a model and describes the likelihood that a given model is of comparable quality to experimental structures. Model quality estimates based on the QMEAN scoring function were normalized with respect to the number of interactions. The resulting scoring function is independent of the size of the protein and may therefore be used to assess both monomers and entire oligomeric assemblies. Model quality scores for individual models are then expressed as ‘Z-scores’ in comparison to scores obtained for high-resolution crystal structures. We demonstrate the ability of the newly introduced QMEAN Z-score to detect experimentally solved protein structures containing significant errors, as well as to evaluate theoretical protein models. In a comprehensive QMEAN Z-score analysis of all experimental structures in the PDB, membrane proteins accumulate on one side of the score spectrum and thermostable proteins on the other. Proteins from the thermophilic organism Thermotoga maritima received significantly higher QMEAN Z-scores in a pairwise comparison with their homologous mesophilic counterparts, underlining the significance of the QMEAN Z-score as an estimate of protein stability.
Availability: The Z-score calculation has been integrated in the QMEAN server available at: http://swissmodel.expasy.org/qmean. Contact: torsten.schwede@unibas.ch Supplementary information: Supplementary data are available at Bioinformatics online.
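The idea of expressing a model's score as a Z-score relative to scores of high-resolution crystal structures can be illustrated with a generic sketch. This is the textbook Z-score, z = (score - mean) / standard deviation, and is not the QMEAN server's actual implementation; the function name and the reference values below are hypothetical.

```python
import statistics


def z_score(model_score: float, reference_scores: list[float]) -> float:
    """Express model_score in standard deviations from the mean of a
    reference distribution (here assumed to be scores of high-resolution
    crystal structures)."""
    if len(reference_scores) < 2:
        raise ValueError("need at least two reference scores")
    mean = statistics.fmean(reference_scores)
    # Population standard deviation of the reference set (an assumption;
    # a sample standard deviation could be used instead).
    std = statistics.pstdev(reference_scores)
    if std == 0:
        raise ValueError("reference scores have zero variance")
    return (model_score - mean) / std


# Hypothetical reference scores; a model scoring below the reference
# mean receives a negative Z-score.
reference = [0.7, 0.8, 0.9]
low_quality_z = z_score(0.6, reference)
```

Under this convention, a strongly negative Z-score marks a model whose score is unusually low compared with the reference structures.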
|
40
|
Madera M, Calmus R, Thiltgen G, Karplus K, Gough J. Improving protein secondary structure prediction using a simple k-mer model. Bioinformatics 2010; 26:596-602. [PMID: 20130034 PMCID: PMC2828123 DOI: 10.1093/bioinformatics/btq020] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 12/04/2022]
Abstract
Motivation: Some first-order methods for protein sequence analysis inherently treat each position as independent. We develop a general framework for introducing longer-range interactions. We then demonstrate the power of our approach by applying it to secondary structure prediction; under the independence assumption, sequences produced by existing methods can produce features that are not protein-like, an extreme example being a helix of length 1. Our goal was to make the predictions from state-of-the-art methods more realistic, without loss of performance by other measures. Results: Our framework for longer-range interactions is described as a k-mer order model. We succeeded in applying our model to the specific problem of secondary structure prediction, to be used as an additional layer on top of existing methods. We achieved our goal of making the predictions more realistic and protein-like, and remarkably this also improved the overall performance. We improve the Segment OVerlap (SOV) score by 1.8%, but more importantly we radically improve the probability of the real sequence given a prediction from an average of 0.271 per residue to 0.385. Crucially, this improvement is obtained using no additional information. Availability: http://supfam.cs.bris.ac.uk/kmer Contact: gough@cs.bris.ac.uk
Affiliation(s)
- Martin Madera
- Department of Computer Science, University of Bristol, Woodland Road, Bristol BS8 1UB, UK
|
41
|
Tress ML, Ezkurdia I, Richardson JS. Target domain definition and classification in CASP8. Proteins 2010; 77 Suppl 9:10-7. [PMID: 19603487 DOI: 10.1002/prot.22497] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 12/27/2022]
Abstract
In order to be successful, CASP experiments require experimentally determined protein structures. These structures form the basis of the experiment. Structural genomics groups have provided the vast majority of these structures in recent editions of CASP. Before the structure prediction assessment can begin, these target structures must be divided into structural domains for assessment purposes, and each assessment unit must be assigned to one or more tertiary structure prediction categories. In CASP8, target domain boundaries were based on visual inspection of targets and their experimental data, and on superpositions of the target structures with related template structures. As in CASP7, target domains were broadly classified into two different categories: "template-based modeling" and "free modeling." Assessment categories were determined by structural similarity between the target domain and the nearest structural templates in the PDB and by whether or not related structural templates were used to build the models. The vast majority of the 164 assessment units in CASP8 were classified as template-based modeling. Just 10 target domains were defined as free modeling. In addition, three targets were assessed in both the free modeling and template-based categories, and a subset of 50 template-based models was evaluated as part of the "high accuracy" subset. The targets submitted for CASP8 confirmed a trend that has been apparent since CASP5: targets submitted to the CASP experiments are becoming easier to predict.
Affiliation(s)
- Michael L Tress
- Structural and Computational Biology Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain.
|
42
|
Abstract
While the prediction of a native protein structure from sequence continues to remain a challenging problem, over the past decades computational methods have become quite successful in exploiting the mechanisms behind secondary structure formation. The great effort expended in this area has resulted in the development of a vast number of secondary structure prediction methods. Especially the combination of well-optimized/sensitive machine-learning algorithms and inclusion of homologous sequence information has led to increased prediction accuracies of up to 80%. In this chapter, we will first introduce some basic notions and provide a brief history of secondary structure prediction advances. Then a comprehensive overview of state-of-the-art prediction methods will be given. Finally, we will discuss open questions and challenges in this field and provide some practical recommendations for the user.
Affiliation(s)
- Walter Pirovano
- Centre for Integrative Bioinformatics VU, VU University, Amsterdam, The Netherlands
|
43
|
Abstract
As the field of protein structure prediction continues to expand at an exponential rate, the bench-biologist might feel overwhelmed by the sheer range of available applications. This review presents the three main approaches in computational structure prediction from a non-bioinformatician's point of view and makes a selection of tools and servers freely available. These tools are evaluated from several aspects, such as number of citations, ease of usage and quality of the results. Finally, the applications of models generated by computational structure prediction are discussed.
|
44
|
|
45
|
Guex N, Peitsch MC, Schwede T. Automated comparative protein structure modeling with SWISS-MODEL and Swiss-PdbViewer: a historical perspective. Electrophoresis 2009; 30 Suppl 1:S162-73. [PMID: 19517507 DOI: 10.1002/elps.200900140] [Citation(s) in RCA: 1296] [Impact Index Per Article: 86.4] [Indexed: 11/09/2022]
Abstract
SWISS-MODEL pioneered the field of automated modeling as the first protein modeling service on the Internet. In combination with the visualization tool Swiss-PdbViewer, the Internet-based Workspace and the SWISS-MODEL Repository, it provides a fully integrated sequence to structure analysis and modeling platform. This computational environment is made freely available to the scientific community with the aim to hide the computational complexity of structural bioinformatics and encourage bench scientists to make use of the ever-increasing structural information available. Indeed, over the last decade, the availability of structural information has significantly increased for many organisms as a direct consequence of the complementary nature of comparative protein modeling and experimental structure determination. This has a very positive and enabling impact on many different applications in biomedical research as described in this paper.
Affiliation(s)
- Nicolas Guex
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
46
|
McGuire AT, Keates RAB, Cook S, Mangroo D. Structural modeling identified the tRNA-binding domain of Utp8p, an essential nucleolar component of the nuclear tRNA export machinery of Saccharomyces cerevisiae. Biochem Cell Biol 2009; 87:431-43. [PMID: 19370060 DOI: 10.1139/o08-145] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
Utp8p is an essential 80 kDa intranuclear tRNA chaperone that transports tRNAs from the nucleolus to the nuclear tRNA export receptors in Saccharomyces cerevisiae. To help understand the mechanism of Utp8p function, predictive tools were used to derive a partial model of the tertiary structure of Utp8p. Secondary structure prediction, supported by circular dichroism measurements, indicated that Utp8p is divided into 2 domains: the N-terminal beta sheet and the C-terminal alpha helical domain. Tertiary structure prediction was more challenging, because the amino acid sequence of Utp8p is not directly homologous to any known protein structure. The tertiary structures predicted by threading and fold recognition had generally modest scores, but for the C-terminal domain, threading and fold recognition consistently pointed to an alpha-alpha superhelix. Because of the sequence diversity of this fold type, no single structural template was an ideal fit to the Utp8p sequence. Instead, a composite template was constructed from 3 different alpha-alpha superhelix structures that gave the best matches to different portions of the C-terminal domain sequence. In the resulting model, the most conserved sequences grouped in a tight cluster of positive charges on a protein that is otherwise predominantly negative, suggesting that the positive-charge cleft may be the tRNA-binding site. Mutations of conserved positive residues in the proposed binding site resulted in a reduction in the affinity of Utp8p for tRNA both in vivo and in vitro. Models were also derived for the 10 fungal homologues of Utp8p, and the localization of the positive charges on the conserved surface was found in all cases. Taken together, these data suggest that the positive-charge cleft of the C-terminal domain of Utp8p is involved in tRNA-binding.
Affiliation(s)
- Andrew T McGuire
- Department of Molecular and Cellular Biology, University of Guelph, Guelph, ON N1G2W1, Canada
|
47
|
Mooney C, Pollastri G. Beyond the Twilight Zone: Automated prediction of structural properties of proteins by recursive neural networks and remote homology information. Proteins 2009; 77:181-90. [DOI: 10.1002/prot.22429] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 11/06/2022]
|
48
|
Bordoli L, Kiefer F, Arnold K, Benkert P, Battey J, Schwede T. Protein structure homology modeling using SWISS-MODEL workspace. Nat Protoc 2009; 4:1-13. [PMID: 19131951 DOI: 10.1038/nprot.2008.197] [Citation(s) in RCA: 912] [Impact Index Per Article: 60.8] [Indexed: 11/09/2022]
Abstract
Homology modeling aims to build three-dimensional protein structure models using experimentally determined structures of related family members as templates. SWISS-MODEL workspace is an integrated Web-based modeling expert system. For a given target protein, a library of experimental protein structures is searched to identify suitable templates. On the basis of a sequence alignment between the target protein and the template structure, a three-dimensional model for the target protein is generated. Model quality assessment tools are used to estimate the reliability of the resulting models. Homology modeling is currently the most accurate computational method to generate reliable structural models and is routinely used in many biological applications. Typically, the computational effort for a modeling project is less than 2 h. However, this does not include the time required for visualization and interpretation of the model, which may vary depending on personal experience working with protein structures.
Affiliation(s)
- Lorenza Bordoli
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH 4056 Basel, Switzerland
|
49
|
Nair R, Liu J, Soong TT, Acton TB, Everett JK, Kouranov A, Fiser A, Godzik A, Jaroszewski L, Orengo C, Montelione GT, Rost B. Structural genomics is the largest contributor of novel structural leverage. J Struct Funct Genomics 2009; 10:181-91. [PMID: 19194785 PMCID: PMC2705706 DOI: 10.1007/s10969-008-9055-6] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Received: 11/15/2008] [Accepted: 12/08/2008] [Indexed: 11/28/2022]
Abstract
The Protein Structural Initiative (PSI) at the US National Institutes of Health (NIH) is funding four large-scale centers for structural genomics (SG). These centers systematically target many large families without structural coverage, as well as very large families with inadequate structural coverage. Here, we report a few simple metrics that demonstrate how successfully these efforts optimize structural coverage: while the PSI-2 (2005-now) contributed more than 8% of all structures deposited into the PDB, it contributed over 20% of all novel structures (i.e. structures for protein sequences with no structural representative in the PDB on the date of deposition). The structural coverage of the protein universe represented by today’s UniProt (v12.8) has increased linearly from 1992 to 2008; structural genomics has contributed significantly to the maintenance of this growth rate. Success in increasing novel leverage (defined in Liu et al. in Nat Biotechnol 25:849–851, 2007) has resulted from systematic targeting of large families. PSI’s per structure contribution to novel leverage was over 4-fold higher than that for non-PSI structural biology efforts during the past 8 years. If the success of the PSI continues, it may just take another ~15 years to cover most sequences in the current UniProt database.
Affiliation(s)
- Rajesh Nair
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
|
50
|
Abstract
Both supervised and unsupervised neural networks have been applied to the prediction of protein structure and function. Here, we focus on feedforward neural networks and describe how these learning machines can be applied to protein prediction. We discuss how to select an appropriate data set, how to choose and encode protein features into the neural network input, and how to assess the predictor's performance.
Affiliation(s)
- Marco Punta
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
|