101
Altshuler EP, Serebryanaya DV, Katrukha AG. Generation of recombinant antibodies and means for increasing their affinity. Biochemistry (Moscow) 2011; 75:1584-605. [PMID: 21417996 DOI: 10.1134/s0006297910130067] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 12/14/2022]
Abstract
Highly specific interaction with foreign molecules is a unique feature of antibodies. Since 1975, when Köhler and Milstein proposed the method of hybridoma technology and prepared mouse monoclonal antibodies, many antibodies specific to various antigens have been obtained. Recent development of methods for preparation of recombinant DNA libraries and in silico bioinformatics approaches for protein structure analysis makes antibody preparation possible using gene engineering approaches. The development of gene engineering methods allowed creating recombinant antibodies and improving characteristics of existing antibodies; this significantly extends the applicability of antibodies. By changing the amino acid sequences of antibodies, and thereby their biochemical and immunochemical properties, it is possible to create antibodies with properties optimal for certain tasks. For example, application of recombinant technologies has yielded antibodies whose affinity significantly exceeds the initial affinity of natural antibodies. In this review we summarize information about the structure, modes of preparation, and application of recombinant antibodies and their fragments and also consider the main approaches used to increase antibody affinity.
Affiliation(s)
- E P Altshuler
- Department of Biochemistry, Faculty of Biology, Lomonosov Moscow State University, Russia
102
Shi X, Zhang J, He Z, Shang Y, Xu D. A sampling-based method for ranking protein structural models by integrating multiple scores and features. Curr Protein Pept Sci 2011; 12:540-8. [PMID: 21787308 PMCID: PMC4368063 DOI: 10.2174/138920311796957658] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/01/2011] [Revised: 04/01/2011] [Accepted: 05/04/2011] [Indexed: 11/22/2022]
Abstract
One of the major challenges in protein tertiary structure prediction is structure quality assessment. In many cases, protein structure prediction tools generate good structural models but fail to select the best models from a huge number of candidates as the final output. In this study, we developed a sampling-based machine-learning method to rank protein structural models by integrating multiple scores and features. First, features such as predicted secondary structure, solvent accessibility and residue-residue contact information are integrated by two Radial Basis Function (RBF) models trained from different datasets. Then, the two RBF scores and five selected scoring functions developed by others, i.e., Opus-CA, Opus-PSP, DFIRE, RAPDF, and Cheng Score, are synthesized by a sampling method. Finally, another integrated RBF model ranks the structural models according to the features of the sampling distribution. We tested the proposed method using two different datasets: the CASP server prediction models of all CASP8 targets and a set of models generated by our in-house software MUFOLD. The test results show that our method outperforms any individual scoring function in both best-model selection and overall correlation between the predicted ranking and the actual ranking of structural quality.
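The two evaluation criteria named in this abstract, best-model selection and correlation between predicted and actual ranking, can be illustrated with a small sketch. It assumes Spearman rank correlation and a simple "quality lost by picking the top-ranked model" measure; the abstract does not say which correlation statistic or quality metric the authors used, and all scores below are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def best_model_loss(predicted, actual):
    """Quality gap between the truly best model and the top-ranked one."""
    top = max(range(len(predicted)), key=lambda i: predicted[i])
    return max(actual) - actual[top]


# Hypothetical data: higher means better for both lists.
predicted_scores = [0.62, 0.48, 0.71, 0.30, 0.55]
actual_quality = [0.58, 0.51, 0.66, 0.25, 0.60]

print(spearman(predicted_scores, actual_quality))
print(best_model_loss(predicted_scores, actual_quality))
```

A correlation near 1 and a loss near 0 would indicate a scoring function that both orders the candidates well and picks the best one.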
Affiliation(s)
- Xiaohu Shi
- College of Computer Science and Technology, Jilin University, Jilin, Changchun 130012, China
103
Huang SY, Zou X. Statistical mechanics-based method to extract atomic distance-dependent potentials from protein structures. Proteins 2011; 79:2648-61. [PMID: 21732421 PMCID: PMC11108592 DOI: 10.1002/prot.23086] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Received: 01/08/2011] [Revised: 04/21/2011] [Accepted: 05/09/2011] [Indexed: 12/25/2022]
Abstract
In this study, we have developed a statistical mechanics-based iterative method to extract statistical atomic interaction potentials from known, nonredundant protein structures. Our method circumvents the long-standing reference state problem in deriving traditional knowledge-based scoring functions, by using rapid iterations through a physical, global convergence function. The rapid convergence of this physics-based method, unlike other parameter optimization methods, warrants the feasibility of deriving distance-dependent, all-atom statistical potentials to keep the scoring accuracy. The derived potentials, referred to as ITScore/Pro, have been validated using three diverse benchmarks: the high-resolution decoy set, the AMBER benchmark decoy set, and the CASP8 decoy set. Significant improvement in performance has been achieved. Finally, comparisons between the potentials of our model and potentials of a knowledge-based scoring function with a randomized reference state have revealed the reason for the better performance of our scoring function, which could provide useful insight into the development of other physical scoring functions. The potentials developed in this study are generally applicable for structural selection in protein structure prediction.
Affiliation(s)
- Sheng-You Huang
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
- Xiaoqin Zou
- Department of Physics and Astronomy, Department of Biochemistry, Dalton Cardiovascular Research Center, and Informatics Institute, University of Missouri, Columbia, MO 65211
104
Xu D, Zhang J, Roy A, Zhang Y. Automated protein structure modeling in CASP9 by I-TASSER pipeline combined with QUARK-based ab initio folding and FG-MD-based structure refinement. Proteins 2011; 79 Suppl 10:147-60. [PMID: 22069036 PMCID: PMC3228277 DOI: 10.1002/prot.23111] [Citation(s) in RCA: 118] [Impact Index Per Article: 8.4] [Received: 04/03/2011] [Revised: 06/07/2011] [Accepted: 06/26/2011] [Indexed: 11/09/2022]
Abstract
I-TASSER is an automated pipeline for protein tertiary structure prediction using multiple threading alignments and iterative structure assembly simulations. In CASP9 experiments, two new algorithms, QUARK and fragment-guided molecular dynamics (FG-MD), were added to the I-TASSER pipeline for improving the structural modeling accuracy. QUARK is a de novo structure prediction algorithm used for structure modeling of proteins that lack detectable template structures. For distantly homologous targets, QUARK models are found useful as a reference structure for selecting good threading alignments and guiding the I-TASSER structure assembly simulations. FG-MD is an atomic-level structural refinement program that uses structural fragments collected from the PDB structures to guide molecular dynamics simulation and improve the local structure of predicted model, including hydrogen-bonding networks, torsion angles, and steric clashes. Despite considerable progress in both the template-based and template-free structure modeling, significant improvements on protein target classification, domain parsing, model selection, and ab initio folding of β-proteins are still needed to further improve the I-TASSER pipeline.
Affiliation(s)
- Dong Xu
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
105
Optimal mutation sites for PRE data collection and membrane protein structure prediction. Structure 2011; 19:484-95. [PMID: 21481772 DOI: 10.1016/j.str.2011.02.002] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 10/24/2010] [Revised: 02/11/2011] [Accepted: 02/11/2011] [Indexed: 01/16/2023]
Abstract
Nuclear magnetic resonance paramagnetic relaxation enhancement (PRE) measures long-range distances to isotopically labeled residues, providing useful constraints for protein structure prediction. The method usually requires labor-intensive conjugation of nitroxide labels to multiple locations on the protein, one at a time. Here a computational procedure, based on protein sequence and simple secondary structure models, is presented to facilitate optimal placement of a minimum number of labels needed to determine the correct topology of a helical transmembrane protein. Tests on DsbB (four helices) using just one label lead to correct topology predictions in four of five cases, with the predicted structures <6 Å to the native structure. Benchmark results using simulated PRE data show that we can generally predict the correct topology for five and six to seven helices using two and three labels, respectively, with an average success rate of 76% and structures of similar precision. The results show promise in facilitating experimentally constrained structure prediction of membrane proteins.
106
Huang NK, Lin JH, Lin JT, Lin CI, Liu EM, Lin CJ, Chen WP, Shen YC, Chen HM, Chen JB, Lai HL, Yang CW, Chiang MC, Wu YS, Chang C, Chen JF, Fang JM, Lin YL, Chern Y. A new drug design targeting the adenosinergic system for Huntington's disease. PLoS One 2011; 6:e20934. [PMID: 21713039 PMCID: PMC3119665 DOI: 10.1371/journal.pone.0020934] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.7] [Received: 02/02/2011] [Accepted: 05/13/2011] [Indexed: 02/01/2023]
Abstract
Background: Huntington's disease (HD) is a neurodegenerative disease caused by a CAG trinucleotide expansion in the Huntingtin (Htt) gene. The expanded CAG repeats are translated into polyglutamine (polyQ), causing aberrant functions as well as aggregate formation of mutant Htt. Effective treatments for HD are yet to be developed.
Methodology/Principal Findings: Here, we report a novel dual-function compound, N6-(4-hydroxybenzyl)adenine riboside (designated T1-11) which activates the A2AR and a major adenosine transporter (ENT1). T1-11 was originally isolated from a Chinese medicinal herb. Molecular modeling analyses showed that T1-11 binds to the adenosine pockets of the A2AR and ENT1. Introduction of T1-11 into the striatum significantly enhanced the level of striatal adenosine as determined by a microdialysis technique, demonstrating that T1-11 inhibited adenosine uptake in vivo. A single intraperitoneal injection of T1-11 in wildtype mice, but not in A2AR knockout mice, increased cAMP level in the brain. Thus, T1-11 enters the brain and elevates cAMP via activation of the A2AR in vivo. Most importantly, addition of T1-11 (0.05 mg/ml) to the drinking water of a transgenic mouse model of HD (R6/2) ameliorated the progressive deterioration in motor coordination, reduced the formation of striatal Htt aggregates, elevated proteasome activity, and increased the level of an important neurotrophic factor (brain derived neurotrophic factor) in the brain. These results demonstrate the therapeutic potential of T1-11 for treating HD.
Conclusions/Significance: The dual functions of T1-11 enable T1-11 to effectively activate the adenosinergic system and subsequently delay the progression of HD. This is a novel therapeutic strategy for HD. Similar dual-function drugs aimed at a particular neurotransmitter system as proposed herein may be applicable to other neurotransmitter systems (e.g., the dopamine receptor/dopamine transporter and the serotonin receptor/serotonin transporter) and may facilitate the development of new drugs for other neurodegenerative diseases.
Affiliation(s)
- Nai-Kuei Huang
- National Research Institute of Chinese Medicine, Taipei, Taiwan
- Jung-Hsin Lin
- Division of Mechanics, Research Center for Applied Sciences, Academia Sinica, Taipei, Taiwan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- School of Pharmacy, National Taiwan University, Taipei, Taiwan
- Jiun-Tsai Lin
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Chia-I Lin
- Department of Chemistry, National Taiwan University, Taipei, Taiwan
- Eric Minwei Liu
- School of Pharmacy, National Taiwan University, Taipei, Taiwan
- Chun-Jung Lin
- School of Pharmacy, National Taiwan University, Taipei, Taiwan
- Wan-Ping Chen
- National Research Institute of Chinese Medicine, Taipei, Taiwan
- Yuh-Chiang Shen
- National Research Institute of Chinese Medicine, Taipei, Taiwan
- Hui-Mei Chen
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Jhih-Bin Chen
- Department of Chemistry, National Taiwan University, Taipei, Taiwan
- Hsing-Lin Lai
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Chieh-Wen Yang
- Department of Chemistry, National Taiwan University, Taipei, Taiwan
- Ming-Chang Chiang
- Graduate Institute of Biotechnology, Chinese Culture University, Taipei, Taiwan
- Yu-Shuo Wu
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Chen Chang
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Jiang-Fan Chen
- Department of Neurology, Boston University School of Medicine, Boston, Massachusetts, United States of America
- Jim-Min Fang
- Department of Chemistry, National Taiwan University, Taipei, Taiwan
- The Genomics Research Center, Academia Sinica, Taipei, Taiwan
- * E-mail: (YC); (YLL); (JMF)
- Yun-Lian Lin
- National Research Institute of Chinese Medicine, Taipei, Taiwan
- * E-mail: (YC); (YLL); (JMF)
- Yijuang Chern
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- * E-mail: (YC); (YLL); (JMF)
107
Brylinski M, Gao M, Skolnick J. Why not consider a spherical protein? Implications of backbone hydrogen bonding for protein structure and function. Phys Chem Chem Phys 2011; 13:17044-55. [PMID: 21655593 DOI: 10.1039/c1cp21140d] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 12/14/2022]
Abstract
The intrinsic ability of protein structures to exhibit the geometric features required for molecular function in the absence of evolution is examined in the context of three systems: the reference set of real, single-domain protein structures; a library of computationally generated, compact, artificial homopolypeptide structures with protein-like secondary structural elements; and quasi-spherical random proteins packed at the same density as proteins but lacking backbone secondary structure and hydrogen bonding. Without any evolutionary selection, the library of artificial structures has similar backbone hydrogen bonding, global shape, surface to volume ratio and statistically significant structural matches to real protein global structures. Moreover, these artificial structures have native-like ligand binding cavities, and a tiny subset has interfacial geometries consistent with native-like protein-protein interactions and DNA binding. In contrast, the quasi-spherical random proteins, being devoid of secondary structure, have a lower surface to volume ratio and lack ligand binding pockets and intermolecular interaction interfaces. Surprisingly, these quasi-spherical random proteins exhibit protein-like distributions of virtual bond angles and almost all have a statistically significant structural match to real protein structures. This implies that it is local chain stiffness, even without backbone hydrogen bonding, and compactness that give rise to the likely completeness of the library of solved single-domain protein structures. These studies also suggest that the packing of secondary structural elements generates the requisite geometry for intermolecular binding. Thus, backbone hydrogen bonding plays an important role not only in protein structure but also in protein function.
Such ability to bind biological molecules is an inherent feature of protein structure; if combined with appropriate protein sequences, it could provide the non-zero background probability for low-level function that evolution requires for selection to occur.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, Georgia Institute of Technology, 250 14th St NW, Atlanta, GA 30076, USA
108
Kifer I, Nussinov R, Wolfson HJ. Protein structure prediction using a docking-based hierarchical folding scheme. Proteins 2011; 79:1759-73. [PMID: 21445943 PMCID: PMC3092838 DOI: 10.1002/prot.22999] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 10/25/2010] [Revised: 01/02/2011] [Accepted: 01/18/2011] [Indexed: 12/13/2022]
Abstract
The pathways by which proteins fold into their specific native structure are still an unsolved mystery. Currently, many methods for protein structure prediction are available, and most of them tackle the problem by relying on the vast amounts of data collected from known protein structures. These methods are often not concerned with the route the protein follows to reach its final fold. This work is based on the premise that proteins fold in a hierarchical manner. We present FOBIA, an automated method for predicting a protein structure. FOBIA consists of two main stages: the first finds matches between parts of the target sequence and independently folding structural units using profile-profile comparison. The second assembles these units into a 3D structure by searching and ranking their possible orientations toward each other using a docking-based approach. We have previously reported an application of an initial version of this strategy to homology-based targets. Since then we have considerably enhanced our method's abilities to allow it to address the more difficult template-based target category. This allows us to now apply FOBIA to the template-based targets of CASP8 and to show that it is both very efficient and promising. Our method can provide an alternative for template-based structure prediction, and in particular, the docking-based ranking technique presented here can be incorporated into any profile-profile comparison based method.
Affiliation(s)
- Ilona Kifer
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel.
109
Reyes S, Park S, Johnson BD, Terzic A, Olson TM. KATP channel Kir6.2 E23K variant overrepresented in human heart failure is associated with impaired exercise stress response. Hum Genet 2011; 126:779-89. [PMID: 19685080 DOI: 10.1007/s00439-009-0731-9] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Received: 06/23/2009] [Accepted: 08/05/2009] [Indexed: 12/13/2022]
Abstract
ATP-sensitive K+ (K(ATP)) channels maintain cardiac homeostasis under stress, as revealed by murine gene knockout models of the KCNJ11-encoded Kir6.2 pore. However, the translational significance of K(ATP) channels in human cardiac physiology remains largely unknown. Here, the frequency of the minor K23 allele of the common functional Kir6.2 E23K polymorphism was found to be overrepresented in 115 subjects with congestive heart failure compared to 2,031 community-based controls (69 vs. 56%, P < 0.001). Moreover, the KK genotype, present in 18% of heart failure patients, was associated with abnormal cardiopulmonary exercise stress testing. In spite of similar baseline heart rates at rest among genotypic subgroups (EE: 72.2 ± 2.3, EK: 75.0 ± 1.8 and KK: 77.1 ± 3.0 bpm), subjects with the KK genotype had a significantly reduced heart rate increase at matched workload (EE: 32.8 ± 2.7%, EK: 28.8 ± 2.1%, KK: 21.7 ± 2.6%, P < 0.05), at 75% of maximum oxygen consumption (EE: 53.9 ± 3.9%, EK: 49.9 ± 3.1%, KK: 36.8 ± 5.3%, P < 0.05), and at peak V(O2) (EE: 82.8 ± 6.0%, EK: 80.5 ± 4.7%, KK: 59.7 ± 8.1%, P < 0.05). Molecular modeling of the tetrameric Kir6.2 pore structure revealed the E23 residue within the functionally relevant intracellular slide helix region. Substitution of the wild-type E residue with an oppositely charged, bulkier K residue would potentially result in a significant structural rearrangement and disrupted interactions with neighboring Kir6.2 subunits, providing a basis for altered high-fidelity K(ATP) channel gating, particularly in the homozygous state. Blunted heart rate response during exercise is a risk factor for mortality in patients with heart failure, establishing the clinical relevance of Kir6.2 E23K as a biomarker for impaired stress performance and underscoring the essential role of K(ATP) channels in human cardiac physiology.
Affiliation(s)
- Santiago Reyes
- Marriott Heart Disease Research Program, Mayo Clinic, Rochester, MN 55905, USA
110
Taylor CM, Fischer K, Abubucker S, Wang Z, Martin J, Jiang D, Magliano M, Rosso MN, Li BW, Fischer PU, Mitreva M. Targeting protein-protein interactions for parasite control. PLoS One 2011; 6:e18381. [PMID: 21556146 PMCID: PMC3083401 DOI: 10.1371/journal.pone.0018381] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 12/01/2010] [Accepted: 02/28/2011] [Indexed: 01/24/2023]
Abstract
Finding new drug targets for pathogenic infections would be of great utility for humanity, as there is a large need to develop new drugs to fight infections due to the developing resistance and side effects of current treatments. Current drug targets for pathogen infections involve only a single protein. However, proteins rarely act in isolation, and the majority of biological processes occur via interactions with other proteins, so protein-protein interactions (PPIs) offer a realm of unexplored potential drug targets and are thought to be the next generation of drug targets. Parasitic worms were chosen for this study because they have deleterious effects on human health, livestock, and plants, costing society billions of dollars annually, and because many sequenced genomes are available. In this study, we present a computational approach that utilizes whole genomes of 6 parasitic and 1 free-living worm species and 2 hosts. The species were placed in orthologous groups, then binned in species-specific orthologous groups. Proteins that are essential and conserved among species that span a phylum are of greatest value, as they provide foundations for developing broad-control strategies. Two PPI databases were used to find PPIs within the species-specific bins. PPIs with unique helminth proteins and helminth proteins with unique features relative to the host, such as indels, were prioritized as drug targets. The PPIs were scored based on RNAi phenotype and homology to the PDB (Protein DataBank). EST data for the various life stages, GO annotation, and druggability were also taken into consideration. Several PPIs emerged from this study as potential drug targets. A few interactions were supported by co-localization of expression in M. incognita (plant parasite) and B. malayi (H. sapiens parasite), which have extremely different modes of parasitism. As more genomes of pathogens are sequenced and PPI databases expanded, this methodology will become increasingly applicable.
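The binning step described in this abstract, grouping orthologous groups by the set of species they cover and then setting aside bins that include a host, might look roughly like the sketch below. The group identifiers, protein identifiers, and the choice of H. sapiens as the only host are hypothetical; the abstract does not describe the authors' data structures or tools.

```python
from collections import defaultdict

# Hypothetical host set; the study used two hosts, not named in the abstract.
HOSTS = frozenset({"H. sapiens"})


def bin_groups(groups):
    """Map each set of species to the orthologous groups covering exactly that set."""
    bins = defaultdict(list)
    for group_id, members in groups.items():
        species = frozenset(sp for sp, _protein in members)
        bins[species].append(group_id)
    return dict(bins)


def parasite_only_bins(bins, hosts=HOSTS):
    """Keep only bins whose species set contains no host species."""
    return {species: ids for species, ids in bins.items() if not (species & hosts)}


# Hypothetical orthologous groups: group id -> list of (species, protein id).
groups = {
    "OG1": [("B. malayi", "p1"), ("M. incognita", "p2")],
    "OG2": [("B. malayi", "p3"), ("H. sapiens", "p4")],
    "OG3": [("B. malayi", "p5"), ("M. incognita", "p6")],
}

bins = bin_groups(groups)
for species, ids in parasite_only_bins(bins).items():
    print(sorted(species), ids)
```

Using a frozenset of species as the bin key makes "conserved across exactly these species" a direct lookup, so PPIs can then be searched within one bin at a time.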
Affiliation(s)
- Christina M. Taylor
- Department of Genetics, The Genome Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Kerstin Fischer
- Infectious Diseases Division, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Sahar Abubucker
- Department of Genetics, The Genome Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Zhengyuan Wang
- Department of Genetics, The Genome Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
- John Martin
- Department of Genetics, The Genome Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Daojun Jiang
- Infectious Diseases Division, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Marc Magliano
- INRA 1301, CNRS 6243, UNSA, Interactions Biotiques et Santé Végétale, Sophia-Antipolis, France
- Marie-Noëlle Rosso
- INRA 1301, CNRS 6243, UNSA, Interactions Biotiques et Santé Végétale, Sophia-Antipolis, France
- Ben-Wen Li
- Infectious Diseases Division, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Peter U. Fischer
- Infectious Diseases Division, Department of Internal Medicine, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Makedonka Mitreva
- Department of Genetics, The Genome Center, Washington University School of Medicine, St. Louis, Missouri, United States of America
111
Shirota M, Ishida T, Kinoshita K. Absolute quality evaluation of protein model structures using statistical potentials with respect to the native and reference states. Proteins 2011; 79:1550-63. [PMID: 21365682 DOI: 10.1002/prot.22982] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/15/2010] [Revised: 11/19/2010] [Accepted: 12/19/2010] [Indexed: 11/06/2022]
Abstract
In protein structure prediction, it is crucial to evaluate the degree of native-likeness of given model structures. Statistical potentials extracted from protein structure data sets are widely used for such quality assessment problems, but they are only applicable for comparing different models of the same protein. Although various other methods, such as machine learning approaches, were developed to predict the absolute similarity of model structures to the native ones, they required a set of decoy structures in addition to the model structures. In this paper, we tried to reformulate the statistical potentials as absolute quality scores, without using the information from decoy structures. For this purpose, we regarded the native state and the reference state, which are necessary components of statistical potentials, as the good and bad standard states, respectively, and first showed that the statistical potentials can be regarded as the state functions, which relate a model structure to the native and reference states. Then, we proposed a standardized measure of protein structure, called native-likeness, by interpolating the score of a model structure between the native and reference state scores defined for each protein. The native-likeness correlated with the similarity to the native structures and discriminated the native structures from the models, with better accuracy than the raw score. Our results show that statistical potentials can quantify the native-like properties of protein structures, if they fully utilize the statistical information obtained from the data set.
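The interpolation idea in this abstract can be made concrete with a short sketch. It assumes a plain linear interpolation in which a model scoring the same as the native state gets 1 and a model scoring the same as the reference state gets 0; the abstract does not give the authors' exact formula, so the function name, the scaling, and the numbers are illustrative only.

```python
def native_likeness(model_score, native_score, reference_score):
    """Linearly place a model's score between the reference (0) and native (1) scores.

    Assumes all three scores come from the same statistical potential and
    were computed for the same protein.
    """
    span = reference_score - native_score
    if span == 0:
        raise ValueError("native and reference scores must differ")
    return (reference_score - model_score) / span


# Hypothetical energies (lower is better): native -120, reference -20.
print(native_likeness(-95.0, -120.0, -20.0))   # model three quarters of the way to native
print(native_likeness(-120.0, -120.0, -20.0))  # native-state score
print(native_likeness(-20.0, -120.0, -20.0))   # reference-state score
```

Because the native and reference scores are defined per protein, a normalized value like this is comparable across different proteins, which a raw potential score is not.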
Affiliation(s)
- Matsuyuki Shirota
- Department of Applied Information Sciences, Graduate School of Information Science, Tohoku University, 6-3-09, Aoba, Aramaki, Aoba-Ku, Sendai, Miyagi 980-8579, Japan
112
Pandit SB, Skolnick J. TASSER_low-zsc: an approach to improve structure prediction using low z-score-ranked templates. Proteins 2011; 78:2769-80. [PMID: 20635423 DOI: 10.1002/prot.22791] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022]
Abstract
In a variety of threading methods, poorly ranked (low z-score) templates often have good alignments. Here, a new method, TASSER_low-zsc, which identifies these low z-score-ranked templates to improve protein structure prediction accuracy, is described. The approach consists of clustering of threading templates by affinity propagation on the basis of structural similarity (thread_cluster) followed by TASSER modeling, with final models selected by using a TASSER_QA variant. To establish the generality of the approach, templates provided by two threading methods, SP(3) and SPARKS(2), are examined. The SP(3) and SPARKS(2) benchmark datasets consist of 351 and 357 medium/hard proteins (those with moderate to poor quality templates and/or alignments) of length ≤250 residues, respectively. For SP(3) medium and hard targets, using thread_cluster, the TM-scores of the best template improve by approximately 4 and 9% over the original set (without low z-score templates), respectively; after TASSER modeling/refinement and ranking, the best model improves by approximately 7 and 9% over the best model generated with the original template set. Moreover, TASSER_low-zsc generates 22% (43%) more foldable medium (hard) targets. Similar improvements are observed with low-ranked templates from SPARKS(2). The template clustering approach could be applied to other modeling methods that utilize multiple templates to improve structure prediction.
Affiliation(s)
- Shashi B Pandit
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
113
Gopal V, Guruprasad K. Structure prediction and validation of an affibody engineered for cell-specific nucleic acid targeting. Systems and Synthetic Biology 2011; 4:293-7. [PMID: 22132056 DOI: 10.1007/s11693-011-9074-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/01/2010] [Revised: 11/23/2010] [Accepted: 02/03/2011] [Indexed: 12/18/2022]
Abstract
Cell-penetrating peptides comprising cloned epitopes that contribute to membrane transduction, DNA-binding and cell targeting functions are known to facilitate nucleic acid delivery. Using the I-TASSER software, we predicted the 3-D structure of a well-characterized and efficient transfecting cell-penetrating peptide, namely TAT-Mu, and its derivative TAT-Mu-AF protein that harbors a targeting ligand, the HER2-binding affibody. Our model predicts that the TAT-Mu-AF fusion protein primarily comprises α-helices. The affibody in TAT-Mu-AF is predicted as a 3-helical domain that is distinct from the TAT-Mu domain. Its positioning in the three-dimensional structure is oriented in a manner that possibly favors interactions with the receptor and facilitates transport to the target site. The linker region between TAT-Mu and the affibody is also predicted as a helix that is likely to stabilize the overall fold of the TAT-Mu-AF complex. Further, the evaluation of secondary structure of the designed TAT-Mu-AF fusion protein by circular dichroism supports our predictions.
Affiliation(s)
- Vijaya Gopal
- Centre for Cellular and Molecular Biology, (Council for Scientific and Industrial Research), Uppal Road, Hyderabad, Andhra Pradesh 500007 India
|
114
|
Brylinski M, Skolnick J. Comprehensive structural and functional characterization of the human kinome by protein structure modeling and ligand virtual screening. J Chem Inf Model 2011; 50:1839-54. [PMID: 20853887 DOI: 10.1021/ci100235n] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 12/12/2022]
Abstract
The growing interest in the identification of kinase inhibitors, promising therapeutics in the treatment of many diseases, has created a demand for the structural characterization of the entire human kinome. At the outset of the drug development process, the lead-finding stage, approaches that enrich the screening library with bioactive compounds are needed. Here, protein structure based methods can play an important role, but despite structural genomics efforts, it is unlikely that the three-dimensional structures of the entire kinome will be available soon. Therefore, at the proteome level, structure-based approaches must rely on predicted models, with a key issue being their utility in virtual ligand screening. In this study, we employ the recently developed FINDSITE/Q-Dock ligand homology modeling approach, which is well-suited for proteome-scale applications using predicted structures, to provide extensive structural and functional characterization of the human kinome. Specifically, we construct structure models for the human kinome; these are subsequently subjected to virtual screening against a library of more than 2 million compounds. To rank the compounds, we employ a hierarchical approach that combines ligand- and structure-based filters. Modeling accuracy is carefully validated using available experimental data, with particularly encouraging results found for the ability to identify, without prior knowledge, specific kinase inhibitors. More generally, the modeling procedure results in a large number of predicted molecular interactions between kinases and small ligands that should be of practical use in the development of novel inhibitors. The data set is freely available to the academic community via a user-friendly Web interface at http://cssb.biology.gatech.edu/kinomelhm/ as well as at the ZINC Web site (http://zinc.docking.org/applications/2010Apr/Brylinski-2010.tar.gz).
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
|
115
|
Lee SY, Skolnick J. TASSER_WT: a protein structure prediction algorithm with accurate predicted contact restraints for difficult protein targets. Biophys J 2011; 99:3066-75. [PMID: 21044605 DOI: 10.1016/j.bpj.2010.09.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 07/30/2010] [Revised: 08/29/2010] [Accepted: 09/07/2010] [Indexed: 12/29/2022] Open
Abstract
To improve the prediction accuracy in the regime where template alignment quality is poor, an updated version of TASSER_2.0, namely TASSER_WT, was developed. TASSER_WT incorporates more accurate contact restraints from a new method, COMBCON. COMBCON uses confidence-weighted contacts from PROSPECTOR_3.5, the latest version, PROSPECTOR_4, and a new local structural fragment-based threading algorithm, STITCH, implemented in two variants depending on expected fragment prediction accuracy. TASSER_WT is tested on 622 Hard proteins, the most difficult targets (incorrect alignments and/or templates and incorrect side-chain contact restraints) in a comprehensive benchmark of 2591 nonhomologous, single domain proteins ≤ 200 residues that cover the PDB at 35% pairwise sequence identity. For 454 of 622 Hard targets, COMBCON provides contact restraints with higher accuracy and a higher number of contacts per residue. The higher the contact coverage with confidence weight ≥ 3, F(wt ≥ 3)(cov), the more TASSER_WT models improve. When F(wt ≥ 3)(cov) > 1.0 and > 0.4, the average root mean-square deviation of TASSER_WT (TASSER_2.0) models is 4.11 Å (6.72 Å) and 5.03 Å (6.40 Å), respectively. Regarding a structure prediction as successful when a model has a TM-score to the native structure ≥ 0.4, when F(wt ≥ 3)(cov) > 1.0 and > 0.4, the success rate of TASSER_WT (TASSER_2.0) is 98.8% (76.2%) and 93.7% (81.1%), respectively.
Affiliation(s)
- Seung Yup Lee
- Center for Study of Systems Biology, Georgia Institute of Technology, Atlanta, Georgia, USA
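Editorial sketch (not taken from the paper): the TASSER_WT abstract above counts a prediction as successful when the model's TM-score to the native structure is ≥ 0.4. As a rough illustration, the Python below evaluates the standard TM-score sum for one fixed superposition, given per-residue Cα distances in Å. The function names, the 0.5 Å floor on d0 and the example distances are assumptions made for illustration; a full TM-score calculation also searches over superpositions to maximize the sum, which is not shown here.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (in Å) used by the TM-score."""
    # The cube-root formula is undefined or tiny for very short chains,
    # so a 0.5 Å floor is applied here (an assumption of this sketch).
    if target_length <= 15:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length: int) -> float:
    """TM-score sum for ONE given superposition.

    distances: Cα-Cα distances (Å) for the aligned residue pairs.
    target_length: number of residues in the native (target) structure.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


def is_successful(score: float, threshold: float = 0.4) -> bool:
    """Success criterion quoted in the abstract: TM-score >= 0.4."""
    return score >= threshold


# Hypothetical example: 100-residue target, 80 aligned residues at 2 Å.
example = tm_score([2.0] * 80, 100)
print(round(example, 3), is_successful(example))
```

Because unaligned residues contribute nothing to the sum but still count in the normalization, partial models are penalized in proportion to the fraction of the target they miss.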
|
116
|
RNA and protein 3D structure modeling: similarities and differences. J Mol Model 2011; 17:2325-36. [PMID: 21258831 PMCID: PMC3168752 DOI: 10.1007/s00894-010-0951-x] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Received: 10/01/2010] [Accepted: 12/29/2010] [Indexed: 02/06/2023]
Abstract
In analogy to proteins, the function of RNA depends on its structure and dynamics, which are encoded in the linear sequence. While there are numerous methods for computational prediction of protein 3D structure from sequence, there have been very few such methods for RNA. This review discusses template-based and template-free approaches for macromolecular structure prediction, with special emphasis on comparison between the already tried-and-tested methods for protein structure modeling and the very recently developed “protein-like” modeling methods for RNA. We highlight analogies between many successful methods for modeling of these two types of biological macromolecules and argue that RNA 3D structure can be modeled using “protein-like” methodology. We also highlight the areas where the differences between RNA and proteins require the development of RNA-specific solutions.
[Figure: Approaches for predicting RNA structure. Top: template-free modeling. Bottom: template-based modeling.]
|
117
|
Betancourt MR. Optimization of Monte Carlo trial moves for protein simulations. J Chem Phys 2011; 134:014104. [PMID: 21218994 DOI: 10.1063/1.3515960] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
Closed rigid-body rotations of residue segments under bond-angle restraints are simple and effective Monte Carlo moves for searching the conformational space of proteins. The efficiency of these moves is examined here as a function of the number of moving residues and the magnitude of their displacement. It is found that the efficiency of folding and equilibrium simulations can be significantly improved by tailoring the distribution of the number of moving residues to the simulation temperature. In general, simulations exploring compact conformations are more efficient when the average number of moving residues is smaller. It is also demonstrated that the moves do not require additional restrictions on the magnitude of the rotation displacements and perform much better than other rotation moves that do not restrict the bond angles a priori. As an example, these results are applied to the replica exchange method. By assigning distributions that generate a smaller number of moving residues to lower temperature replicas, the simulation times are decreased as long as the higher temperature replicas are effective.
Affiliation(s)
- Marcos R Betancourt
- Department of Physics, Indiana University Purdue University Indianapolis, 402 N. Blackford St., LD156-J Indianapolis, Indiana 46202, USA.
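Editorial sketch (not taken from the paper): the abstract above applies its findings to the replica exchange method, giving lower temperature replicas move distributions with fewer moving residues. The Python below shows the standard Metropolis criterion for swapping two replicas, plus a purely hypothetical linear schedule for the mean number of moving residues as a function of temperature. The schedule, its default endpoints and all names are assumptions for illustration; the abstract does not state the distributions actually used.

```python
import math
import random


def swap_probability(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Standard replica-exchange acceptance probability.

    P = min(1, exp[(beta_i - beta_j) * (E_i - E_j)]), with beta = 1 / (k_B T).
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_swap(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the two replicas should exchange configurations."""
    return rng.random() < swap_probability(energy_i, energy_j, temp_i, temp_j)


def mean_moving_residues(temp, t_low, t_high, n_low=2.0, n_high=8.0):
    """Hypothetical schedule: fewer moving residues at lower temperature.

    Linear interpolation between n_low (at t_low) and n_high (at t_high);
    the endpoints are illustrative values, not taken from the paper.
    """
    fraction = (temp - t_low) / (t_high - t_low)
    return n_low + fraction * (n_high - n_low)


# The cold replica holds the higher energy, so the swap is always accepted.
print(swap_probability(-5.0, -10.0, temp_i=1.0, temp_j=2.0))
```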
|
118
|
Vorobjev YN. Advances in implicit models of water solvent to compute conformational free energy and molecular dynamics of proteins at constant pH. Advances in Protein Chemistry and Structural Biology 2011; 85:281-322. [PMID: 21920327 DOI: 10.1016/b978-0-12-386485-7.00008-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 12/04/2022]
Abstract
Modern implicit solvent models for macromolecular simulations in a water-proton bath are considered. The fundamental quantity that implicit models approximate is the solute potential of mean force, which is obtained by averaging over solvent degrees of freedom. The implicit solvent models suggest practical ways to calculate free energies of macromolecular conformations taking into account equilibrium interactions with water solvent and the proton bath, while the explicit solvent approach is unable to do that due to the need to account for a large number of solvent degrees of freedom. The most advanced realizations of the implicit continuum models by different research groups are discussed, their accuracy is examined, and some applications of the implicit solvent models to macromolecular modeling, such as free energy calculations, protein folding, and constant pH molecular dynamics, are highlighted.
|
119
|
Zhou Y, Duan Y, Yang Y, Faraggi E, Lei H. Trends in template/fragment-free protein structure prediction. Theor Chem Acc 2011; 128:3-16. [PMID: 21423322 PMCID: PMC3030773 DOI: 10.1007/s00214-010-0799-2] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Received: 05/17/2010] [Accepted: 08/15/2010] [Indexed: 12/13/2022]
Abstract
Predicting the structure of a protein from its amino acid sequence is a long-standing unsolved problem in computational biology. Its solution would be of both fundamental and practical importance as the gap between the number of known sequences and the number of experimentally solved structures widens rapidly. Currently, the most successful approaches are based on fragment/template reassembly. The lack of progress in template-free structure prediction calls for novel ideas and approaches. This article reviews trends in the development of physical and specific knowledge-based energy functions as well as sampling techniques for fragment-free structure prediction. Recent physical- and knowledge-based studies demonstrated that it is possible to sample and predict highly accurate protein structures without borrowing native fragments from known protein structures. These emerging approaches with fully flexible sampling have the potential to move the field forward.
Affiliation(s)
- Yaoqi Zhou
- School of Informatics, Indiana Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indiana University Purdue University, 719 Indiana Ave #319, Walker Plaza Building, Indianapolis, IN 46202 USA
- Yong Duan
- UC Davis Genome Center and Department of Applied Science, University of California, One Shields Avenue, Davis, CA USA
- College of Physics, Huazhong University of Science and Technology, 1037 Luoyu Road, 430074 Wuhan, China
- Yuedong Yang
- School of Informatics, Indiana Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indiana University Purdue University, 719 Indiana Ave #319, Walker Plaza Building, Indianapolis, IN 46202 USA
- Eshel Faraggi
- School of Informatics, Indiana Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indiana University Purdue University, 719 Indiana Ave #319, Walker Plaza Building, Indianapolis, IN 46202 USA
- Hongxing Lei
- UC Davis Genome Center and Department of Applied Science, University of California, One Shields Avenue, Davis, CA USA
- Beijing Institute of Genomics, Chinese Academy of Sciences, 100029 Beijing, China
|
120
|
Brylinski M, Skolnick J. FINDSITE-metal: integrating evolutionary information and machine learning for structure-based metal-binding site prediction at the proteome level. Proteins 2010; 79:735-51. [PMID: 21287609 DOI: 10.1002/prot.22913] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 08/04/2010] [Revised: 09/27/2010] [Accepted: 10/07/2010] [Indexed: 12/13/2022]
Abstract
The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure-based approaches showing considerable promise. In this article, we present FINDSITE-metal, a new threading-based method designed specifically to detect metal-binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE-metal. Combining structure/evolutionary information with machine learning results in highly accurate metal-binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal-binding sites are detected with the best predicted binding site at rank 1 and within the top two ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium, and magnesium ions, the binding metal can be predicted with high, typically 70% to 90%, accuracy. FINDSITE-metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome-wide application of FINDSITE-metal that quantifies the metal-binding complement of the human proteome. FINDSITE-metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite-metal/.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
|
121
|
Zhou H, Skolnick J. Improving threading algorithms for remote homology modeling by combining fragment and template comparisons. Proteins 2010; 78:2041-8. [PMID: 20455261 DOI: 10.1002/prot.22717] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 01/01/2023]
Abstract
In this work, we develop a method called fragment comparison and template comparison (FTCOM) for assessing the global quality of protein structural models for targets of medium and hard difficulty (remote homology) produced by structure prediction approaches such as threading or ab initio structure prediction. FTCOM requires the Cα coordinates of full length models and assesses model quality based on fragment comparison and a score derived from comparison of the model to top threading templates. On a set of 361 medium/hard targets, FTCOM was assessed for its ability to improve on the results from the SP(3), SPARKS, PROSPECTOR_3, and PRO-SP(3)-TASSER threading algorithms. The average TM-score improves by 5-10% for the first model selected by the new method over models obtained by the original selection procedure in the respective threading methods. Moreover, the number of foldable targets (TM-score ≥ 0.4) increases by at least 7.6% (for SP(3)) and by as much as 54% (for SPARKS). Thus, FTCOM is a promising approach to template selection. Proteins 2010. © 2010 Wiley-Liss, Inc.
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
|
122
|
Gracy J, Chiche L. Optimizing structural modeling for a specific protein scaffold: knottins or inhibitor cystine knots. BMC Bioinformatics 2010; 11:535. [PMID: 21029427 PMCID: PMC2984590 DOI: 10.1186/1471-2105-11-535] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 06/17/2010] [Accepted: 10/28/2010] [Indexed: 12/03/2022] Open
Abstract
Background: Knottins are small, diverse and stable proteins with important drug design potential. They can be classified into 30 families which cover a wide range of sequences (1621 sequenced), three-dimensional structures (155 solved) and functions (> 10). Inter-knottin similarity lies mainly between 15% and 40% sequence identity and 1.5 to 4.5 Å backbone deviations, although they all share a tightly knotted disulfide core. This important variability is likely to arise from the highly diverse loops which connect the successive knotted cysteines. The prediction of structural models for all knottin sequences would open new directions for the analysis of interaction sites and provide a better understanding of the structural and functional organization of proteins sharing this scaffold.
Results: We have designed an automated modeling procedure for predicting the three-dimensional structure of knottins. The different steps of the homology modeling pipeline were carefully optimized relative to a test set of knottins with known structures: template selection and alignment, extraction of structural constraints and model building, model evaluation and refinement. After optimization, the accuracy of predicted models was shown to lie between 1.50 and 1.96 Å from native structures at 50% and 10% maximum sequence identity levels, respectively. These average model deviations represent an improvement varying between 0.74 and 1.17 Å over a basic homology modeling derived from a unique template. A database of 1621 structural models for all known knottin sequences was generated and is freely accessible from our web server at http://knottin.cbs.cnrs.fr. Models can also be interactively constructed from any knottin sequence using the structure prediction module Knoter1D3D available from our protein analysis toolkit PAT at http://pat.cbs.cnrs.fr.
Conclusions: This work explores different directions for a systematic homology modeling of a diverse family of protein sequences. In particular, we have shown that the accuracy of the models constructed at a low level of sequence identity can be improved by 1) a careful optimization of the modeling procedure, 2) the combination of multiple structural templates and 3) the use of conserved structural features as modeling restraints.
Affiliation(s)
- Jérôme Gracy
- CNRS, UMR5048, Université Montpellier 1 et 2, Centre de Biochimie Structurale, 34090 Montpellier, France.
|
123
|
Zhang J, Zhang Y. A novel side-chain orientation dependent potential derived from random-walk reference state for protein fold selection and structure prediction. PLoS One 2010; 5:e15386. [PMID: 21060880 PMCID: PMC2965178 DOI: 10.1371/journal.pone.0015386] [Citation(s) in RCA: 173] [Impact Index Per Article: 11.5] [Received: 08/05/2010] [Accepted: 09/01/2010] [Indexed: 11/18/2022] Open
Abstract
Background: An accurate potential function is essential to attack protein folding and structure prediction problems. The key to developing efficient knowledge-based potential functions is to design reference states that can appropriately counteract generic interactions. However, the reference states of many knowledge-based distance-dependent atomic potential functions were derived from non-interacting particles such as an ideal gas, which ignores the inherent sequence connectivity and entropic elasticity of proteins.
Methodology: We developed a new pair-wise distance-dependent, atomic statistical potential function (RW), using an ideal random-walk chain as reference state, which was optimized on CASP models and then benchmarked on nine structural decoy sets. We then incorporated a new side-chain orientation-dependent energy term into RW (RWplus) and found that the side-chain packing orientation specificity can further improve the decoy recognition ability of the statistical potential.
Significance: RW and RWplus demonstrate a significantly better ability than the best performing pair-wise distance-dependent atomic potential functions in both native and near-native model selections. They have higher energy-RMSD and energy-TM-score correlations compared with other potentials of the same type in real-life structure assembly decoys. When benchmarked with a comprehensive list of publicly available potentials, RW and RWplus show comparable performance to the state-of-the-art scoring functions, including those combining terms from multiple resources. These data demonstrate the usefulness of the random-walk chain as a reference state that correctly accounts for sequence connectivity and entropic elasticity of proteins. The potentials may be useful in structure recognition and protein folding simulations. The RW and RWplus potentials, as well as the newly generated I-TASSER decoys, are freely available at http://zhanglab.ccmb.med.umich.edu/RW.
Affiliation(s)
- Jian Zhang
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Yang Zhang
- Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
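Editorial sketch (not taken from the paper): distance-dependent statistical potentials of the kind described in the abstract above are commonly built by inverse Boltzmann weighting, where the energy of a distance bin is minus kT times the log ratio of observed to reference frequencies. The Python below illustrates only that generic step. The reference histogram is simply passed in as input; deriving it from an ideal random-walk chain, as RW does, is not shown, and the function name, pseudocount and example counts are assumptions for illustration.

```python
import math


def statistical_potential(observed_counts, reference_counts,
                          kT=1.0, pseudocount=1e-6):
    """Per-bin energies u(r) = -kT * ln(p_obs(r) / p_ref(r)).

    observed_counts: pair counts per distance bin from known structures.
    reference_counts: pair counts per bin expected in the reference state.
    pseudocount: small value added to each frequency to avoid log(0)
                 (an assumption of this sketch).
    """
    obs_total = sum(observed_counts)
    ref_total = sum(reference_counts)
    energies = []
    for n_obs, n_ref in zip(observed_counts, reference_counts):
        p_obs = n_obs / obs_total + pseudocount
        p_ref = n_ref / ref_total + pseudocount
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies


# Hypothetical counts: the first bin is over-represented relative to the
# reference, so it receives a favorable (negative) energy.
print(statistical_potential([50, 50], [25, 75]))
```

Bins seen more often than the reference predicts come out favorable (negative), and bins seen less often come out unfavorable, which is why the choice of reference state matters so much for this class of potentials.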
|
124
|
Wu S, Zhang Y. Recognizing protein substructure similarity using segmental threading. Structure 2010; 18:858-67. [PMID: 20637422 DOI: 10.1016/j.str.2010.04.007] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 02/09/2010] [Revised: 04/02/2010] [Accepted: 04/03/2010] [Indexed: 11/15/2022]
Abstract
Protein template identification is essential to protein structure and function predictions. However, conventional whole-chain threading approaches often fail to recognize conserved substructure motifs when the target and templates do not share the same fold. We developed a new approach, SEGMER, for identifying protein substructure similarities by segmental threading. The target sequence is split into segments of two to four consecutive or nonconsecutive secondary structural elements, which are then threaded through the PDB to identify appropriate substructure motifs. SEGMER is tested on 144 nonredundant hard proteins. When SEGMER is combined with whole-chain threading, the TM-score of alignments and the accuracy of spatial restraints increase by 16% and 25%, respectively, compared with whole-chain threading alone. When tested on 12 free modeling targets from CASP8, SEGMER increases the TM-score and contact accuracy by 28% and 48%, respectively. This significant improvement should have an important impact on protein structure modeling and functional inference.
Affiliation(s)
- Sitao Wu
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Drive, Lawrence, KS 66047, USA
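Editorial sketch (not taken from the paper): the SEGMER abstract above splits the target into segments of two to four secondary structural elements (SSEs) before threading. The Python below illustrates one way such segments could be enumerated from a secondary structure string. It covers only the consecutive case; nonconsecutive combinations, the threading step itself, and all names and the H/E/C alphabet are assumptions for illustration rather than the SEGMER implementation.

```python
from itertools import groupby


def secondary_structure_elements(ss_string):
    """Extract helix (H) and strand (E) runs as (type, start, end) tuples.

    Positions are 0-based and end is exclusive; coil (any other letter)
    is skipped.
    """
    elements, position = [], 0
    for state, run in groupby(ss_string):
        length = len(list(run))
        if state in ("H", "E"):
            elements.append((state, position, position + length))
        position += length
    return elements


def consecutive_segments(n_elements, min_size=2, max_size=4):
    """All windows of min_size..max_size consecutive SSE indices."""
    segments = []
    for size in range(min_size, max_size + 1):
        for start in range(0, n_elements - size + 1):
            segments.append(tuple(range(start, start + size)))
    return segments


# Hypothetical predicted secondary structure for a short target.
sses = secondary_structure_elements("CCHHHHCCEEEECCHHHHHCC")
print(sses)
print(consecutive_segments(len(sses)))
```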
|
125
|
Crowding and hydrodynamic interactions likely dominate in vivo macromolecular motion. Proc Natl Acad Sci U S A 2010; 107:18457-62. [PMID: 20937902 DOI: 10.1073/pnas.1011354107] [Citation(s) in RCA: 291] [Impact Index Per Article: 19.4] [Indexed: 01/23/2023] Open
Abstract
To begin to elucidate the principles of intermolecular dynamics in the crowded environment of cells, we used Brownian dynamics (BD) simulations to examine possible mechanism(s) responsible for the great reduction in diffusion constants of macromolecules in vivo from that at infinite dilution. In an Escherichia coli cytoplasm model composed of 15 different macromolecule types at physiological concentrations, BD simulations of molecular-shaped and equivalent sphere representations were performed with a soft repulsive potential. At cellular concentrations, the calculated diffusion constant of GFP is much larger than the experimental value, with no significant shape dependence. Next, using the equivalent sphere system, hydrodynamic interactions (HI) were considered. Without adjustable parameters, the in vivo experimental GFP diffusion constant was reproduced. Finally, the effects of nonspecific attractive interactions were examined. The reduction in diffusivity is very sensitive to macromolecular radius, with the motion of the largest macromolecules dramatically slowed down; this is not seen if HI dominate. In addition, long-lived clusters involving the largest macromolecules form if attractions dominate, whereas HI give rise to significant, size-independent intermolecular dynamic correlations. These qualitative differences provide a testable means of differentiating the importance of HI vs. nonspecific attractive interactions on macromolecular motion in cells.
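Editorial sketch (not taken from the paper): the abstract above compares in vivo diffusion constants with the infinite-dilution value. As a baseline illustration only, the Python below propagates non-interacting particles with the free Brownian dynamics update and recovers the diffusion constant from the mean squared displacement, MSD = 6Dt in three dimensions. It includes no crowding, repulsive potential or hydrodynamic interactions, and the function names, reduced units and parameter values are assumptions for illustration.

```python
import math
import random


def brownian_msd(diffusion, dt, n_steps, n_particles, seed=0):
    """Mean squared displacement of free 3D Brownian particles.

    Each coordinate receives an independent Gaussian displacement with
    variance 2 * D * dt per step (no forces, no particle interactions).
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * diffusion * dt)
    total = 0.0
    for _ in range(n_particles):
        x = y = z = 0.0
        for _ in range(n_steps):
            x += rng.gauss(0.0, sigma)
            y += rng.gauss(0.0, sigma)
            z += rng.gauss(0.0, sigma)
        total += x * x + y * y + z * z
    return total / n_particles


def estimate_diffusion(msd, elapsed_time):
    """Invert MSD = 6 * D * t for a three-dimensional walk."""
    return msd / (6.0 * elapsed_time)


# Hypothetical reduced units: D = 1, total time = 50 * 0.02 = 1.
msd = brownian_msd(diffusion=1.0, dt=0.02, n_steps=50, n_particles=1000)
print(estimate_diffusion(msd, elapsed_time=1.0))
```

In a crowded simulation the same MSD analysis would be applied to trajectories that include interparticle forces, and the ratio of the fitted value to the dilute one gives the reduction in diffusivity.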
|
126
|
Lin MS, Head-Gordon T. Reliable protein structure refinement using a physical energy function. J Comput Chem 2010; 32:709-17. [DOI: 10.1002/jcc.21664] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 02/28/2010] [Revised: 08/02/2010] [Accepted: 08/07/2010] [Indexed: 11/10/2022]
|
127
|
Meng L, Feldman LJ. CLE14/CLE20 peptides may interact with CLAVATA2/CORYNE receptor-like kinases to irreversibly inhibit cell division in the root meristem of Arabidopsis. Planta 2010; 232:1061-74. [PMID: 20697738 PMCID: PMC2940047 DOI: 10.1007/s00425-010-1236-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 05/24/2010] [Accepted: 07/15/2010] [Indexed: 05/03/2023]
Abstract
Towards an understanding of the interacting nature of the CLAVATA (CLV) complex, we predicted the 3D structures of CLV3/ESR-related (CLE) peptides and the ectodomain of their potential receptor proteins/kinases, and docking models of these molecules. The results show that the ectodomain of CLV1 can form homodimers and that the 12-/13-amino-acid CLV3 peptide fits into the binding clefts of the CLV1 dimers. Our results also demonstrate that the receptor domain of CORYNE (CRN), a recently identified receptor-like kinase, binds tightly to the ectodomain of CLV2, and this likely leads to an increased possibility for docking with CLV1. Furthermore, our docking models reveal that two CRN-CLV2 ectodomain heterodimers are able to form a tetramer receptor complex. Peptides of CLV3, CLE14, CLE19, and CLE20 are also able to bind a potential CLV2-CRN heterodimer or heterotetramer complex. Using a cell-division reporter line, we found that synthetic 12-amino-acid CLE14 and CLE20 peptides inhibit, irreversibly, root growth by reducing cell division rates in the root apical meristem, resulting in a short-root phenotype. Intriguingly, we observed that exogenous application of cytokinin can partially rescue the short-root phenotype induced by over-expression of either CLE14 or CLE20 in planta. However, cytokinin treatment does not rescue the short-root phenotype caused by exogenous application of the synthetic CLE14/CLE20 peptides, suggesting a requirement for a condition provided only in living plants. These results therefore imply that the CLE14/CLE20 peptides may act through the CLV2-CRN receptor kinase, and that their availabilities and/or abundances may be affected by cytokinin activity in planta.
Affiliation(s)
- Ling Meng
- Department of Plant and Microbial Biology, University of California, 111 Koshland Hall, Berkeley, CA 94720-3102 USA
- Lewis J. Feldman
- Department of Plant and Microbial Biology, University of California, 111 Koshland Hall, Berkeley, CA 94720-3102 USA
|
128
|
Brylinski M, Lee SY, Zhou H, Skolnick J. The utility of geometrical and chemical restraint information extracted from predicted ligand-binding sites in protein structure refinement. J Struct Biol 2010; 173:558-69. [PMID: 20850544 DOI: 10.1016/j.jsb.2010.09.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/13/2010] [Revised: 09/08/2010] [Accepted: 09/10/2010] [Indexed: 01/01/2023]
Abstract
Exhaustive exploration of molecular interactions at the level of complete proteomes requires efficient and reliable computational approaches to protein function inference. Ligand docking and ranking techniques show considerable promise in their ability to quantify the interactions between proteins and small molecules. Despite the advances in the development of docking approaches and scoring functions, the genome-wide application of many ligand docking/screening algorithms is limited by the quality of the binding sites in theoretical receptor models constructed by protein structure prediction. In this study, we describe a new template-based method for the local refinement of ligand-binding regions in protein models using remotely related templates identified by threading. We designed a Support Vector Regression (SVR) model that selects correct binding site geometries in a large ensemble of multiple receptor conformations. The SVR model employs several scoring functions that impose geometrical restraints on the Cα positions, account for the specific chemical environment within a binding site and optimize the interactions with putative ligands. The SVR score is well correlated with the RMSD from the native structure; in 47% (70%) of the cases, the Pearson's correlation coefficient is >0.5 (>0.3). When applied to weakly homologous models, the average heavy atom, local RMSD from the native structure of the top-ranked (best of top five) binding site geometries is 3.1 Å (2.9 Å) for roughly half of the targets; this represents a 0.1 Å (0.3 Å) average improvement over the original predicted structure. Focusing on the subset of strongly conserved residues, the average heavy atom RMSD is 2.6 Å (2.3 Å). Furthermore, we estimate the upper bound of template-based binding site refinement using only weakly related proteins to be ~2.6 Å RMSD. This value also corresponds to the plasticity of the ligand-binding regions in distant homologues. The Binding Site Refinement (BSR) approach is available to the scientific community as a web server that can be accessed at http://cssb.biology.gatech.edu/bsr/.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, Georgia Institute of Technology, Atlanta, GA 30318, USA
|
129
|
Michino M, Chen J, Stevens RC, Brooks CL. FoldGPCR: structure prediction protocol for the transmembrane domain of G protein-coupled receptors from class A. Proteins 2010; 78:2189-201. [PMID: 20544957 PMCID: PMC2933064 DOI: 10.1002/prot.22731] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 11/10/2022]
Abstract
Building reliable structural models of G protein-coupled receptors (GPCRs) is a difficult task because of the paucity of suitable templates, low sequence identity, and the wide variety of ligand specificities within the superfamily. Template-based modeling is known to be the most successful method for protein structure prediction. However, refinement of homology models within 1-3 Å Cα RMSD of the native structure remains a major challenge. Here, we address this problem by developing a novel protocol (foldGPCR) for modeling the transmembrane (TM) region of GPCRs in complex with a ligand, with the aim of accurately modeling the structural divergence between the template and target in the TM helices. The protocol is based on predicted conserved inter-residue contacts between the template and target, and exploits an all-atom implicit membrane force field. The placement of the ligand in the binding pocket is guided by biochemical data. The foldGPCR protocol is implemented by a stepwise hierarchical approach, in which the TM helical bundle and the ligand are assembled by simulated annealing trials in the first step, and the receptor-ligand complex is refined with replica exchange sampling in the second step. The protocol is applied to model the human β2-adrenergic receptor (β2AR) bound to carazolol, using contacts derived from the template structure of bovine rhodopsin. Comparison with the X-ray crystal structure of the β2AR shows that our protocol is particularly successful in accurately capturing helix backbone irregularities and helix-helix packing interactions that distinguish rhodopsin from β2AR.
Affiliation(s)
- Mayako Michino
- Department of Molecular Biology, The Scripps Research Institute, 10550 N. Torrey Pines Rd, La Jolla, CA 92037
| | - Jianhan Chen
- Department of Biochemistry, Kansas State University, Manhattan, KS 66506
| | - Raymond C. Stevens
- Departments of Molecular Biology and Chemistry, The Scripps Research Institute, La Jolla, CA 92037
| | - Charles L. Brooks
- Department of Chemistry and Biophysics Program, University of Michigan, 930 N University Ave, Ann Arbor, MI 48109
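The replica exchange refinement step mentioned in the abstract can be illustrated with the standard Metropolis swap criterion between two temperature replicas. This is a generic sketch, not the foldGPCR implementation: the function name, the Boltzmann constant in kcal/(mol·K), and the example energies and temperatures are assumptions made for illustration.

```python
import math

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption of this sketch.
K_B = 0.0019872041


def swap_probability(energy_i, energy_j, temp_i, temp_j, k_b=K_B):
    """Metropolis probability of exchanging replicas i and j.

    Uses the textbook criterion
        P = min(1, exp[(beta_i - beta_j) * (E_i - E_j)])
    with beta = 1 / (k_b * T).
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    if delta >= 0.0:
        return 1.0  # swap always accepted
    return math.exp(delta)


# Hypothetical energies (kcal/mol) for a cold and a hot replica.
p_favourable = swap_probability(-100.0, -120.0, 300.0, 350.0)
p_unfavourable = swap_probability(-120.0, -100.0, 300.0, 350.0)
print(p_favourable, p_unfavourable)
```

A swap that moves the lower-energy conformation to the colder replica is always accepted; the reverse swap is accepted only occasionally, which lets conformations diffuse across temperatures.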
|
130
|
Application of biasing-potential replica-exchange simulations for loop modeling and refinement of proteins in explicit solvent. Proteins 2010; 78:2809-19. [DOI: 10.1002/prot.22796] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 12/25/2022]
|
131
|
Shah AA, Folino G, Krasnogor N. Toward High-Throughput, Multicriteria Protein-Structure Comparison and Analysis. IEEE Trans Nanobioscience 2010; 9:144-55. [DOI: 10.1109/tnb.2010.2043851] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
|
132
|
Karakaş M, Woetzel N, Meiler J. BCL::contact-low confidence fold recognition hits boost protein contact prediction and de novo structure determination. J Comput Biol 2010; 17:153-68. [PMID: 19772383 DOI: 10.1089/cmb.2009.0030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
Knowledge of all residue-residue contacts within a protein allows determination of the protein fold. Accurate prediction of even a subset of long-range contacts (contacts between amino acids far apart in sequence) can be instrumental for determining tertiary structure. Here we present BCL::Contact, a novel contact prediction method that utilizes artificial neural networks (ANNs) and specializes in the prediction of medium to long-range contacts. BCL::Contact comes in two modes: sequence-based and structure-based. The sequence-based mode uses only sequence information and has individual ANNs specialized for helix-helix, helix-strand, strand-helix, strand-strand, and sheet-sheet contacts. The structure-based mode combines results from 32 fold recognition methods with sequence information into a consensus prediction. The two methods were presented in the 6th and 7th Critical Assessment of Techniques for Protein Structure Prediction (CASP) experiments. The present work focuses on elucidating the impact of fold recognition results on contact prediction via a direct comparison of both methods on a joint benchmark set of proteins. The sequence-based mode predicted contacts with 42% accuracy (7% false positive rate), while the structure-based mode achieved 45% accuracy (2% false positive rate). Predictions by both modes of BCL::Contact were supplied as input to the protein tertiary structure prediction program Rosetta for a benchmark of 17 proteins with no close sequence homologs in the Protein Data Bank (PDB). Rosetta created higher accuracy models, signified by an improvement of 1.3 Å on average root mean square deviation (RMSD), when driven by the predicted contacts. Further, filtering Rosetta models by agreement with the predicted contacts enriches for native-like fold topologies.
Affiliation(s)
- Mert Karakaş
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, USA
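The accuracy and false positive rate quoted in the abstract can be made concrete with a small evaluation routine. The sketch below is not BCL::Contact code, and the abstract does not give its exact definitions: here accuracy is assumed to mean the fraction of predicted contacts that are native, the false positive rate is assumed to mean wrongly predicted pairs divided by all native non-contact pairs, and the minimum sequence separation of 12 residues is likewise an assumption.

```python
def normalise(pairs, min_separation):
    """Order each pair as (low, high) and drop pairs that are too close in sequence."""
    return {
        (min(i, j), max(i, j))
        for i, j in pairs
        if abs(i - j) >= min_separation
    }


def contact_metrics(predicted, native, n_residues, min_separation=12):
    """Return (accuracy, false_positive_rate) for a set of predicted contacts.

    accuracy            = true positives / predicted contacts        (assumed definition)
    false_positive_rate = false positives / native non-contact pairs (assumed definition)
    """
    pred = normalise(predicted, min_separation)
    nat = normalise(native, min_separation)
    true_pos = len(pred & nat)
    false_pos = len(pred - nat)
    # Number of residue pairs whose sequence separation is at least min_separation.
    candidates = sum(n_residues - sep for sep in range(min_separation, n_residues))
    negatives = candidates - len(nat)
    accuracy = true_pos / len(pred) if pred else 0.0
    fpr = false_pos / negatives if negatives else 0.0
    return accuracy, fpr


# Hypothetical toy data for a 30-residue chain (0-based residue indices).
native_contacts = {(0, 15), (2, 20), (5, 29)}
predicted_contacts = {(0, 15), (20, 2), (1, 14), (3, 4)}
print(contact_metrics(predicted_contacts, native_contacts, n_residues=30))
```

The short-range pair (3, 4) is discarded before scoring, so only medium- and long-range predictions influence the two numbers.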
|
133
|
Prediction of protein long-range contacts using an ensemble of genetic algorithm classifiers with sequence profile centers. BMC Structural Biology 2010; 10 Suppl 1:S2. [PMID: 20487509 PMCID: PMC2873825 DOI: 10.1186/1472-6807-10-s1-s2] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]
Abstract
Background: Prediction of long-range inter-residue contacts is an important topic in bioinformatics research. It is helpful for determining protein structures, understanding protein folding, and therefore advancing the annotation of protein functions. Results: In this paper, we propose a novel ensemble of genetic algorithm classifiers (GaCs) to address the long-range contact prediction problem. Our method is based on the key idea of sequence profile centers (SPCs). Each SPC is the average sequence profile of residue pairs belonging to the same contact class or non-contact class. GaCs train on multiple but different pairs of long-range contact data (positive data) and long-range non-contact data (negative data). The negative data sets, having roughly the same sizes as the positive ones, are constructed by random sampling over the original imbalanced negative data. As a result, about 21.5% of long-range contacts are correctly predicted. We also found that the ensemble of GaCs improves accuracy by around 5.6% over the single GaC. Conclusions: Classifiers that use sequence profile centers may advance long-range contact prediction. In line with this approach, key structural features in proteins could be determined with high efficiency and accuracy.
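Two ideas from the abstract, averaging profiles into class centers and undersampling the negative class to the size of the positive class, can be sketched in a few lines. This is an illustration only: the paper trains genetic algorithm classifiers, whereas the sketch substitutes a plain nearest-center rule, and every function name, feature vector, and value below is hypothetical.

```python
import random


def profile_center(profiles):
    """Element-wise mean of equally long profile vectors (one center per class)."""
    if not profiles:
        raise ValueError("at least one profile is required")
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def balanced_negatives(positives, negatives, rng):
    """Randomly undersample the negative set to (at most) the size of the positive set."""
    size = min(len(positives), len(negatives))
    return rng.sample(negatives, size)


def classify(profile, contact_center, noncontact_center):
    """Nearest-center rule; a stand-in for the genetic algorithm classifier."""
    if squared_distance(profile, contact_center) < squared_distance(profile, noncontact_center):
        return "contact"
    return "non-contact"


# Hypothetical two-dimensional "profiles" for residue pairs.
positives = [[1.0, 0.0], [3.0, 2.0]]
negatives = [[-1.0, -1.0], [-3.0, -1.0], [-2.0, -4.0]]
rng = random.Random(0)
sampled = balanced_negatives(positives, negatives, rng)
contact_center = profile_center(positives)
noncontact_center = profile_center(sampled)
print(classify([1.5, 0.5], contact_center, noncontact_center))
```

Repeating the undersampling with different random seeds yields the multiple, different training pairs on which an ensemble could be built.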
|
134
|
Huang C, Yang X, He Z. Protein folding simulations of 2D HP model by the genetic algorithm based on optimal secondary structures. Comput Biol Chem 2010; 34:137-42. [PMID: 20627698 DOI: 10.1016/j.compbiolchem.2010.04.002] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 02/12/2009] [Revised: 12/20/2009] [Accepted: 04/27/2010] [Indexed: 11/16/2022]
Abstract
In this paper, we make four improvements to the evolutionary Monte Carlo (EMC) algorithm and propose a genetic algorithm based on optimal secondary structures (GAOSS) to efficiently predict protein folding conformations in the two-dimensional hydrophobic-hydrophilic (2D HP) model. Nine benchmarks are tested to verify the effectiveness of the proposed approach, and the results show that, for the listed benchmarks, GAOSS finds the best solutions reported so far. This indicates that reasonable, effective, and compact secondary structures (SSs) can avoid blind searches and significantly reduce computing time. As examples, we also discuss the diversity of protein GSCs for the 24-mer and 85-mer sequences. Several GSCs were found by GAOSS, and some of the conformations differ substantially from each other, which could be useful for designing protein molecules. GAOSS could be an efficient tool for protein structure prediction (PSP).
Affiliation(s)
- Chenhua Huang
- MOE Key Laboratory of Laser Life Science & Institute of Laser Life Science, South China Normal University, Zhongshan Road, Guangzhou 510631, China
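The 2D HP model that the abstract refers to can be illustrated by scoring a single conformation on a square lattice. The sketch below is not GAOSS: it only evaluates the energy of a given fold, assuming the conventional scoring of -1 for each pair of H residues that are lattice neighbours but not adjacent in sequence. The move encoding (U, D, L, R) and the example sequence are assumptions for illustration.

```python
# Unit steps on the square lattice; the letter encoding is an assumption of this sketch.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_coordinates(moves):
    """Turn a string of moves into lattice coordinates, starting at the origin."""
    x, y = 0, 0
    coords = [(x, y)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def hp_energy(sequence, moves):
    """Energy of a 2D HP conformation: -1 per non-sequential H-H lattice contact."""
    coords = fold_coordinates(moves)
    if len(coords) != len(sequence):
        raise ValueError("need exactly len(sequence) - 1 moves")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    index_at = {coord: i for i, coord in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each lattice contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


# A 4-residue chain folded into a unit square brings its two terminal H residues together.
print(hp_energy("HPPH", "RUL"))
```

A search method such as a genetic algorithm would call an energy function of this kind repeatedly while proposing new move strings.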
|
135
|
Zhang J, Wang Q, Barz B, He Z, Kosztin I, Shang Y, Xu D. MUFOLD: A new solution for protein 3D structure prediction. Proteins 2010; 78:1137-52. [PMID: 19927325 DOI: 10.1002/prot.22634] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Indexed: 11/10/2022]
Abstract
There have been steady improvements in protein structure prediction during the past 2 decades. However, current methods are still far from consistently predicting structural models accurately with computing power accessible to common users. Toward achieving more accurate and efficient structure prediction, we developed a number of novel methods and integrated them into a software package, MUFOLD. First, a systematic protocol was developed to identify useful templates and fragments from the Protein Data Bank for a given target protein. Then, an efficient process was applied for iterative coarse-grain model generation and evaluation at the Cα or backbone level. In this process, we construct models using interresidue spatial restraints derived from alignments by multidimensional scaling, evaluate and select models through clustering and static scoring functions, and iteratively improve the selected models by integrating spatial restraints and previous models. Finally, the full-atom models were evaluated using molecular dynamics simulations based on structural changes under simulated heating. We have continuously improved the performance of MUFOLD by using a benchmark of 200 proteins from the Astral database, where no template with >25% sequence identity to any target protein is included. The average root-mean-square deviation of the best models from the native structures is 4.28 Å, which shows significant and systematic improvement over our previous methods. The computing time of MUFOLD is much shorter than that of many other tools, such as Rosetta. MUFOLD demonstrated some success in CASP8, the 2008 community-wide experiment for protein structure prediction.
Affiliation(s)
- Jingfen Zhang
- Department of Computer Science, University of Missouri, Columbia, Missouri 65211, USA
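The root-mean-square deviation used in the abstract to compare models with native structures reduces to a short calculation once two sets of Cα coordinates are in the same frame. The sketch below assumes the model and native structure have already been optimally superimposed and that atoms are paired one to one; it is not taken from MUFOLD, and the coordinates are made up.

```python
import math


def ca_rmsd(model, native):
    """RMSD (same units as the input, e.g. Å) between paired Cα coordinates.

    Assumes both coordinate lists are already superimposed and ordered identically.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (a - b) ** 2
        for atom_model, atom_native in zip(model, native)
        for a, b in zip(atom_model, atom_native)
    )
    return math.sqrt(total / len(model))


# Hypothetical two-atom example: every atom is displaced by 2 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
native = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(ca_rmsd(model, native))
```

In practice the superposition step (for example a least-squares fit) comes first; omitting it here keeps the example free of external dependencies.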
|
136
|
Abstract
The success of ligand docking calculations typically depends on the quality of the receptor structure. Given improvements in protein structure prediction approaches, approximate protein models now can be routinely obtained for the majority of gene products in a given proteome. Structure-based virtual screening of large combinatorial libraries of lead candidates against theoretically modeled receptor structures requires fast and reliable docking techniques capable of dealing with structural inaccuracies in protein models. Here, we present Q-Dock(LHM), a method for low-resolution refinement of binding poses provided by FINDSITE(LHM), a ligand homology modeling approach. We compare its performance to that of classical ligand docking approaches in ligand docking against a representative set of experimental (both holo and apo) as well as theoretically modeled receptor structures. Docking benchmarks reveal that unlike all-atom docking, Q-Dock(LHM) exhibits the desired tolerance to the receptor's structure deformation. Our results suggest that the use of an evolution-based approach to ligand homology modeling followed by fast low-resolution refinement is capable of achieving satisfactory performance in ligand-binding pose prediction with promising applicability to proteome-scale applications.
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318
| | - Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318
|
137
|
Kalman M, Ben-Tal N. Quality assessment of protein model-structures using evolutionary conservation. Bioinformatics 2010; 26:1299-307. [PMID: 20385730 PMCID: PMC2865859 DOI: 10.1093/bioinformatics/btq114] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 11/13/2022]
Abstract
Motivation: Programs that evaluate the quality of a protein structural model are important both for validating the structure determination procedure and for guiding the model-building process. Such programs are based on properties of native structures that are generally not expected for faulty models. One such property, which is rarely used for automatic structure quality assessment, is the tendency for conserved residues to be located at the structural core and for variable residues to be located at the surface. Results: We present ConQuass, a novel quality assessment program based on the consistency between the model structure and the protein's conservation pattern. We show that it can identify problematic structural models, and that the scores it assigns to the server models in CASP8 correlate with the similarity of the models to the native structure. We also show that when the conservation information is reliable, the method's performance is comparable and complementary to that of the other single-structure quality assessment methods that participated in CASP8 and that do not use additional structural information from homologs. Availability: A Perl implementation of the method, as well as the various Perl and R scripts used for the analysis, are available at http://bental.tau.ac.il/ConQuass/. Contact: nirb@tauex.tau.ac.il Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matan Kalman
- Department of Biochemistry, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel
|
138
|
Batyanovskii AV, Esipova NG, Shnoll SE. Mutual disposition of short conformationally stanch oligopeptides in the 3D structure of globular proteins. Biophysics (Nagoya-shi) 2010. [DOI: 10.1134/s0006350909060153] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
|
139
|
Roy A, Kucukural A, Zhang Y. I-TASSER: a unified platform for automated protein structure and function prediction. Nat Protoc 2010; 5:725-38. [PMID: 20360767 PMCID: PMC2849174 DOI: 10.1038/nprot.2010.5] [Citation(s) in RCA: 4836] [Impact Index Per Article: 322.4] [Indexed: 11/09/2022]
Abstract
The iterative threading assembly refinement (I-TASSER) server is an integrated platform for automated protein structure and function prediction based on the sequence-to-structure-to-function paradigm. Starting from an amino acid sequence, I-TASSER first generates three-dimensional (3D) atomic models from multiple threading alignments and iterative structural assembly simulations. The function of the protein is then inferred by structurally matching the 3D models with other known proteins. The output from a typical server run contains full-length secondary and tertiary structure predictions, and functional annotations on ligand-binding sites, Enzyme Commission numbers and Gene Ontology terms. An estimate of the accuracy of the predictions is provided based on the confidence score of the modeling. This protocol provides new insights and guidelines for designing online server systems for state-of-the-art protein structure and function prediction. The server is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER.
Affiliation(s)
- Ambrish Roy
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
| | - Alper Kucukural
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
| | - Yang Zhang
- Center for Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Ave, Ann Arbor, MI 48109, USA
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, KS 66047, USA
|
140
|
Zhang N, Duan G, Gao S, Ruan J, Zhang T. Prediction of the parallel/antiparallel orientation of beta-strands using amino acid pairing preferences and support vector machines. J Theor Biol 2010; 263:360-8. [PMID: 20035768 DOI: 10.1016/j.jtbi.2009.12.019] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 09/22/2009] [Revised: 11/05/2009] [Accepted: 12/17/2009] [Indexed: 10/20/2022]
|
141
|
Chugunov AO, Efremov RG. [Prediction of the spatial structure of proteins: emphasis on membrane targets]. Russian Journal of Bioorganic Chemistry 2010; 35:744-60. [PMID: 20208575 DOI: 10.1134/s106816200906003x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Knowledge of the spatial structure of proteins is a prerequisite both for understanding their functional mechanisms and for rational drug discovery and design. Meanwhile, direct structural determination is often hampered or impractical due to the complexity, expense, and limited capabilities of experimental techniques. These issues are especially pronounced for integral membrane proteins. On numerous occasions, the theoretical prediction of protein structures may facilitate the process by exploiting physical or empirical principles. This paper surveys modern techniques for the prediction of the spatial structure of proteins using computer algorithms, with the main emphasis placed on the most "complex" targets, membrane proteins (MPs). The first part of the review describes de novo methods based on empirical physical principles; in the second part, a comparative modeling philosophy, which accounts for the structure of related proteins, is described. Special focus is placed on pharmacologically relevant classes of G-coupled receptors, receptor tyrosine kinases, and other MPs. Algorithms for the assessment of model quality and potential fields of application of computer models are discussed.
|
142
|
Faraggi E, Yang Y, Zhang S, Zhou Y. Predicting continuous local structure and the effect of its substitution for secondary structure in fragment-free protein structure prediction. Structure 2010; 17:1515-27. [PMID: 19913486 DOI: 10.1016/j.str.2009.09.006] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.1] [Received: 04/15/2009] [Revised: 09/01/2009] [Accepted: 09/03/2009] [Indexed: 11/30/2022]
Abstract
Local structures predicted from protein sequences are used extensively in every aspect of modeling and prediction of protein structure and function. For more than 50 years, they have been predicted at a low-resolution coarse-grained level (e.g., three-state secondary structure). Here, we combine a two-state classifier with real-value predictor to predict local structure in continuous representation by backbone torsion angles. The accuracy of the angles predicted by this approach is close to that derived from NMR chemical shifts. Their substitution for predicted secondary structure as restraints for ab initio structure prediction doubles the success rate. This result demonstrates the potential of predicted local structure for fragment-free tertiary-structure prediction. It further implies potentially significant benefits from using predicted real-valued torsion angles as a replacement for or supplement to the secondary-structure prediction tools used almost exclusively in many computational methods ranging from sequence alignment to function prediction.
Affiliation(s)
- Eshel Faraggi
- Indiana University School of Informatics, Indiana University-Purdue University and Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
|
143
|
McAllister SR, Floudas CA. An improved hybrid global optimization method for protein tertiary structure prediction. Computational Optimization and Applications 2010; 45:377-413. [PMID: 20357906 PMCID: PMC2847311 DOI: 10.1007/s10589-009-9277-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 05/29/2023]
Abstract
First principles approaches to the protein structure prediction problem must search through an enormous conformational space to identify low-energy, near-native structures. In this paper, we describe the formulation of the tertiary structure prediction problem as a nonlinear constrained minimization problem, where the goal is to minimize the energy of a protein conformation subject to constraints on torsion angles and interatomic distances. The core of the proposed algorithm is a hybrid global optimization method that combines the benefits of the αBB deterministic global optimization approach with conformational space annealing. These global optimization techniques employ a local minimization strategy that combines torsion angle dynamics and rotamer optimization to identify and improve the selection of initial conformations and then applies a sequential quadratic programming approach to further minimize the energy of the protein conformations subject to constraints. The proposed algorithm demonstrates the ability to identify both lower energy protein structures and larger ensembles of low-energy conformations.
|
144
|
Liang S, Wang G, Zhou Y. Refining near-native protein-protein docking decoys by local resampling and energy minimization. Proteins 2010; 76:309-16. [PMID: 19156819 DOI: 10.1002/prot.22343] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 01/07/2023]
Abstract
How to refine a near-native structure to make it closer to its native conformation is an unsolved problem in protein-structure and protein-protein complex-structure prediction. In this article, we first test several scoring functions for selecting locally resampled near-native protein-protein docking conformations and then propose a computationally efficient protocol for structure refinement via local resampling and energy minimization. The proposed method employs a statistical energy function based on a Distance-scaled Ideal-gas REference state (DFIRE) as an initial filter and an empirical energy function EMPIRE (EMpirical Protein-InteRaction Energy) for optimization and re-ranking. Significant improvement of final top-1 ranked structures over initial near-native structures is observed in the ZDOCK 2.3 decoy set for Benchmark 1.0 (74% whose global RMSD reduced by 0.5 Å or more and only 7% increased by 0.5 Å or more). Less significant improvement is observed for Benchmark 2.0 (38% versus 33%). Possible reasons are discussed.
Affiliation(s)
- Shide Liang
- Indiana University School of Informatics, Indiana University-Purdue University, Indianapolis, 46202, USA
|
145
|
Chi YH, Koo YD, Dai SY, Ahn JE, Yun DJ, Lee SY, Zhu-Salzman K. N-glycosylation at non-canonical Asn-X-Cys sequence of an insect recombinant cathepsin B-like counter-defense protein. Comp Biochem Physiol B Biochem Mol Biol 2010; 156:40-7. [PMID: 20139027 DOI: 10.1016/j.cbpb.2010.01.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 12/29/2009] [Revised: 01/29/2010] [Accepted: 01/30/2010] [Indexed: 10/19/2022]
Abstract
CmCatB, a cowpea bruchid cathepsin B-like cysteine protease, helps insects cope with dietary protease inhibitor challenge. Expression of recombinant CmCatB using a Pichia pastoris system yielded an enzymatically active protein that was heterogeneously glycosylated, migrating as a smear of ≥50 kDa on SDS-PAGE. Treatment with peptide:N-glycosidase F indicated that N-glycosylation was predominant. CmCatB contains three N-glycosylation Asn-X-Ser/Thr consensus sequences. Simultaneously replacing all three Asn residues with Gln via site-directed mutagenesis did not result in completely unglycosylated protein, suggesting the existence of additional atypical glycosylation sites. We subsequently investigated potential N-glycosylation at the two Asn-X-Cys sites (Asn100 and Asn236) in CmCatB. Asn to Gln substitution at Asn100-X-Cys on the background of the double mutation at the canonical sites (m1m2, Asn97→Gln and Asn207→Gln) resulted in a single discrete band on the gel, namely m1m2c1 (Asn97→Gln, Asn207→Gln and Asn100→Gln). However, another triple mutant protein, m1m2c2 (Asn97→Gln, Asn207→Gln and Asn236→Gln), and the quadruple mutant protein m1m2c1c2 could not be expressed in Pichia cells. Thus Asn236 appears necessary for protein expression while Asn100 is responsible for non-canonical glycosylation. Removal of carbohydrate moieties, particularly at Asn100, substantially enhanced proteolytic activity but compromised protein stability. Thus, glycosylation could significantly impact biochemical properties of CmCatB.
Affiliation(s)
- Yong Hun Chi
- Department of Entomology, Texas A&M University, College Station, 77843, USA
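The two sequence motifs discussed in the abstract, the canonical Asn-X-Ser/Thr sequon and the non-canonical Asn-X-Cys site, are simple enough to locate with a pattern search. The sketch below is an illustration and was not part of the study: it assumes the usual convention that X is any residue except proline (applied here to both motifs), and the example peptide is invented rather than taken from CmCatB.

```python
import re

# Zero-width lookaheads so that overlapping motifs are all reported.
# Excluding proline at position X is the usual sequon convention (assumed here for both motifs).
CANONICAL = re.compile(r"(?=N[^P][ST])")
ASN_X_CYS = re.compile(r"(?=N[^P]C)")


def find_sequons(sequence):
    """Return 1-based positions of the Asn in each candidate N-glycosylation motif."""
    seq = sequence.upper()
    return {
        "canonical": [m.start() + 1 for m in CANONICAL.finditer(seq)],
        "asn_x_cys": [m.start() + 1 for m in ASN_X_CYS.finditer(seq)],
    }


# Hypothetical peptide: N2-G-T is canonical, N6-P-S is excluded, N10-A-C is Asn-X-Cys.
print(find_sequons("MNGTANPSQNACK"))
```

A motif match only marks a candidate site; as the abstract shows, whether a site is actually glycosylated has to be established experimentally.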
|
146
|
Zhou H, Pandit SB, Skolnick J. Performance of the Pro-sp3-TASSER server in CASP8. Proteins 2010; 77 Suppl 9:123-7. [PMID: 19639638 DOI: 10.1002/prot.22501] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 12/29/2022]
Abstract
The performance of the protein structure prediction server pro-sp3-TASSER in CASP8 is described. Compared to CASP7, the major improvement in prediction is in the quality of input models to TASSER. These improvements are due to the PRO-SP(3) threading method, the improved quality of contact predictions provided by TASSER_2.0, multiple short TASSER simulations for building the full-length model, and the accuracy of model selection using the TASSER-QA quality assessment method. Finally, we analyze the overall performance and highlight some successful predictions of the pro-sp3-TASSER server.
Affiliation(s)
- Hongyi Zhou
- Center for Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
|
147
|
Abstract
The I-TASSER algorithm for 3D protein structure prediction was tested in CASP8, with the procedure fully automated in both the Server and Human sections. The quality of the server models is close to that of the human ones, but the human predictions incorporate more diverse templates from other servers, which improves the human predictions for some of the distant-homology targets. For the first time, the sequence-based contact predictions from machine learning techniques are found helpful for both template-based modeling (TBM) and template-free modeling (FM). In TBM, although the accuracy of the sequence-based contact predictions is on average lower than that of the template-based ones, the novel contacts in the sequence-based predictions, which are complementary to the threading templates in the weakly aligned or unaligned regions, are important for improving the global and local packing in these regions. Moreover, the newly developed atomic structural refinement algorithm was tested in CASP8 and found to improve the hydrogen-bonding networks and the overall TM-score, mainly because of its ability to remove steric clashes so that the models can be generated from cluster centroids. Nevertheless, one of the major issues of the I-TASSER pipeline is model selection: the best models could not be appropriately recognized when the correct templates were detected by only a minority of the threading algorithms. There are also problems related to domain splitting and mirror-image recognition, which mainly influence the performance of I-TASSER modeling in FM-based structure predictions.
Affiliation(s)
- Yang Zhang
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, Lawrence, Kansas 66047, USA.
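The TM-score mentioned in the abstract can be written down compactly for a fixed superposition. The sketch below uses the commonly published form, TM = (1/L) Σ 1/(1 + (d_i/d0)²) with d0 = 1.24·(L − 15)^(1/3) − 1.8, where L is the target length and d_i are distances between aligned residue pairs. It is not I-TASSER code: the full TM-score additionally maximises over superpositions, which is omitted here, and the lower bound of 0.5 Å on d0 as well as the rejection of targets with 15 or fewer residues are assumptions of this sketch.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (Å) used to normalise TM-score."""
    if target_length <= 15:
        raise ValueError("this sketch only handles targets longer than 15 residues")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return max(d0, 0.5)  # floor for short chains; an assumption of this sketch


def tm_score(distances, target_length):
    """TM-score for one given superposition.

    distances: Å distances between aligned residue pairs (unaligned residues contribute 0).
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# Hypothetical case: half of a 100-residue target is aligned perfectly, the rest not at all.
print(tm_score([0.0] * 50, 100))
```

Because unaligned residues add nothing to the sum while still counting in the normalisation, the score rewards coverage as well as local accuracy.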
|
148
|
Pandit SB, Brylinski M, Zhou H, Gao M, Arakaki AK, Skolnick J. PSiFR: an integrated resource for prediction of protein structure and function. Bioinformatics 2010; 26:687-8. [PMID: 20080513 DOI: 10.1093/bioinformatics/btq006] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 12/26/2022]
Abstract
In the post-genomic era, the annotation of protein function facilitates the understanding of various biological processes. To extend the range of function annotation methods to the twilight zone of sequence identity, we have developed approaches that exploit protein tertiary structure and/or protein sequence evolutionary relationships. To serve the scientific community, we have integrated the structure prediction tools TASSER, TASSER-Lite and METATASSER, and the functional inference tools FINDSITE (a structure-based algorithm for binding site prediction, Gene Ontology molecular function inference and ligand screening), EFICAz(2) (a sequence-based approach to enzyme function inference) and DBD-hunter (an algorithm for predicting DNA-binding proteins and associated DNA-binding residues), into a unified web resource, the Protein Structure and Function prediction Resource (PSiFR). Availability and implementation: PSiFR is freely available for use on the web at http://psifr.cssb.biology.gatech.edu/
Affiliation(s)
- Shashi B Pandit
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA 30318, USA
|
149
|
Abstract
The quaternary structure (QS) of a protein is determined by measuring its molecular weight in solution. The data have to be extracted from the literature, and they may be missing even for proteins that have a crystal structure reported in the Protein Data Bank (PDB). The PDB and other databases derived from it report QS information that either was obtained from the depositors or is based on an analysis of the contacts between polypeptide chains in the crystal, and this frequently differs from the QS determined in solution. The QS of a protein can be predicted from its sequence using either homology or threading methods. However, a majority of the proteins with less than 30% sequence identity have different QSs. A model of the QS can also be derived by docking the subunits when their 3D structure is independently known, but the model is likely to be incorrect if large conformation changes take place when the oligomer assembles.
Affiliation(s)
- Anne Poupon
- Yeast Structural Genomics, IBBMC UMR 8619 CNRS, Université Paris-Sud, Orsay, France
|
150
|
Lescat M, Hoede C, Clermont O, Garry L, Darlu P, Tuffery P, Denamur E, Picard B. aes, the gene encoding the esterase B in Escherichia coli, is a powerful phylogenetic marker of the species. BMC Microbiol 2009; 9:273. [PMID: 20040078 PMCID: PMC2805673 DOI: 10.1186/1471-2180-9-273] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 01/27/2009] [Accepted: 12/29/2009] [Indexed: 11/30/2022]
Abstract
Background: Previous studies have established a correlation between electrophoretic polymorphism of esterase B, and virulence and phylogeny of Escherichia coli. Strains belonging to the phylogenetic group B2 are more frequently implicated in extraintestinal infections and include esterase B2 variants, whereas phylogenetic groups A, B1 and D contain less virulent strains and include esterase B1 variants. We investigated esterase B as a marker of phylogeny and/or virulence, in a thorough analysis of the esterase B-encoding gene. Results: We identified the gene encoding esterase B as the acetyl-esterase gene (aes) using gene disruption. The analysis of aes nucleotide sequences in a panel of 78 reference strains, including the E. coli reference (ECOR) strains, demonstrated that the gene is under purifying selection. The phylogenetic tree reconstructed from aes sequences showed a strong correlation with the species phylogenetic history, based on multi-locus sequence typing using six housekeeping genes. The unambiguous distinction between variants B1 and B2 by electrophoresis was consistent with Aes amino-acid sequence analysis and protein modelling, which showed that substituted amino acids in the two esterase B variants occurred mostly at different sites on the protein surface. Studies in an experimental mouse model of septicaemia using mutant strains did not reveal a direct link between aes and extraintestinal virulence. Moreover, we did not find any genes in the chromosomal region of aes to be associated with virulence. Conclusion: Our findings suggest that aes does not play a direct role in the virulence of E. coli extraintestinal infection. However, this gene acts as a powerful marker of phylogeny, illustrating the extensive divergence of B2 phylogenetic group strains from the rest of the species.
|