1
Caffrey PJ, Eckenroth BE, Burkhart BW, Zatopek KM, McClung CM, Santangelo TJ, Doublié S, Gardner AF. Thermococcus kodakarensis TK0353 is a novel AP lyase with a new fold. J Biol Chem 2024; 300:105503. [PMID: 38013090 PMCID: PMC10731606 DOI: 10.1016/j.jbc.2023.105503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Revised: 11/02/2023] [Accepted: 11/12/2023] [Indexed: 11/29/2023] Open
Abstract
Hyperthermophilic organisms thrive in extreme environments prone to high levels of DNA damage. Growth at high temperature stimulates DNA base hydrolysis resulting in apurinic/apyrimidinic (AP) sites that destabilize the genome. Organisms across all domains have evolved enzymes to recognize and repair AP sites to maintain genome stability. The hyperthermophilic archaeon Thermococcus kodakarensis encodes several enzymes to repair AP site damage including the essential AP endonuclease TK endonuclease IV. Recently, using functional genomic screening, we discovered a new family of AP lyases typified by TK0353. Here, using biochemistry, structural analysis, and genetic deletion, we have characterized the TK0353 structure and function. TK0353 lacks glycosylase activity on a variety of damaged bases and is therefore either a monofunctional AP lyase or may be a glycosylase-lyase on a yet unidentified substrate. The crystal structure of TK0353 revealed a novel fold, which does not resemble other known DNA repair enzymes. The TK0353 gene is not essential for T. kodakarensis viability presumably because of redundant base excision repair enzymes involved in AP site processing. In summary, TK0353 is a novel AP lyase unique to hyperthermophiles that provides redundant repair activity necessary for genome maintenance.
Affiliation(s)
- Brian E Eckenroth
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont, USA
- Brett W Burkhart
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, Colorado, USA
- Thomas J Santangelo
  - Department of Biochemistry and Molecular Biology, Colorado State University, Fort Collins, Colorado, USA
- Sylvie Doublié
  - Department of Microbiology and Molecular Genetics, University of Vermont, Burlington, Vermont, USA
2
Wei Q, Wang R, Jiang Y, Wei L, Sun Y, Geng J, Su R. ConPep: Prediction of peptide contact maps with pre-trained biological language model and multi-view feature extracting strategy. Comput Biol Med 2023; 167:107631. [PMID: 37948966 DOI: 10.1016/j.compbiomed.2023.107631] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/04/2023] [Revised: 10/16/2023] [Accepted: 10/23/2023] [Indexed: 11/12/2023]
Abstract
The accurate prediction of peptide contact maps remains a challenging task due to the difficulty in obtaining the interactive information between residues on short sequences. To address this challenge, we propose ConPep, a deep learning framework designed for predicting the contact map of peptides based on sequences only. To sufficiently incorporate the sequential semantic information between residues in peptide sequences, we use a pre-trained biological language model and transfer prior knowledge from large scale databases. Additionally, to extract and integrate sequential local information and residue-based global correlations, our model incorporates Bidirectional Gated Recurrent Unit and attention mechanisms. They can obtain multi-view features and thus enhance the accuracy and robustness of our prediction. Comparative results on independent tests demonstrate that our proposed method significantly outperforms state-of-the-art methods even with short peptides. Notably, our method exhibits superior performance at the sequence level, suggesting the robust ability of our model compared with the multiple sequence alignment (MSA) analysis-based methods. We expect it can be meaningful research for facilitating the wide use of our method.
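For orientation, the prediction target named in this abstract can be illustrated with a small sketch. It is not ConPep's model: it only shows how a reference contact map is commonly derived from known coordinates. The 8 Å cutoff and the use of one representative atom per residue are assumptions, not details taken from the abstract, and `contact_map` is a hypothetical helper name.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return an n x n boolean matrix of residue-residue contacts.

    coords    : one (x, y, z) tuple per residue, in angstroms
                (assumed to be a representative atom such as C-alpha).
    threshold : assumed contact cutoff in angstroms.
    """
    n = len(coords)
    return [
        [math.dist(coords[i], coords[j]) < threshold for j in range(n)]
        for i in range(n)
    ]


# Three residues on a line: the first two are adjacent, the third is far away.
example = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
cmap = contact_map(example)
```

A sequence-based predictor is then scored by how well its predicted matrix matches a map of this kind.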
Affiliation(s)
- Qingxin Wei
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Ruheng Wang
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Yi Jiang
  - School of Software, Shandong University, Jinan, China; Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Leyi Wei
  - School of Software, Shandong University, Jinan, China; Centre for Artificial Intelligence driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University, Macao SAR, China
- Yu Sun
  - Beidahuang Industry Group General Hospital, Harbin, China
- Jie Geng
  - Department of Cardiology, Tianjin Chest Hospital, Tianjin, China
- Ran Su
  - College of Intelligence and Computing, Tianjin University, Tianjin, China
3
Zheng W, Wuyun Q, Freddolino PL, Zhang Y. Integrating deep learning, threading alignments, and a multi-MSA strategy for high-quality protein monomer and complex structure prediction in CASP15. Proteins 2023; 91:1684-1703. [PMID: 37650367 PMCID: PMC10840719 DOI: 10.1002/prot.26585] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 04/13/2023] [Revised: 08/04/2023] [Accepted: 08/14/2023] [Indexed: 09/01/2023]
Abstract
We report the results of the "UM-TBM" and "Zheng" groups in CASP15 for protein monomer and complex structure prediction. These prediction sets were obtained using the D-I-TASSER and DMFold-Multimer algorithms, respectively. For monomer structure prediction, D-I-TASSER introduced four new features during CASP15: (i) a multiple sequence alignment (MSA) generation protocol that combines multi-source MSA searching and a structural modeling-based MSA ranker; (ii) attention-network based spatial restraints; (iii) a multi-domain module containing domain partition and arrangement for domain-level templates and spatial restraints; (iv) an optimized I-TASSER-based folding simulation system for full-length model creation guided by a combination of deep learning restraints, threading alignments, and knowledge-based potentials. For 47 free modeling targets in CASP15, the final models predicted by D-I-TASSER showed average TM-score 19% higher than the standard AlphaFold2 program. We thus showed that traditional Monte Carlo-based folding simulations, when appropriately coupled with deep learning algorithms, can generate models with improved accuracy over end-to-end deep learning methods alone. For protein complex structure prediction, DMFold-Multimer generated models by integrating a new MSA generation algorithm (DeepMSA2) with the end-to-end modeling module from AlphaFold2-Multimer. For the 38 complex targets, DMFold-Multimer generated models with an average TM-score of 0.83 and Interface Contact Score of 0.60, both significantly higher than those of competing complex prediction tools. Our analyses on complexes highlighted the critical role played by MSA generating, ranking, and pairing in protein complex structure prediction. We also discuss future room for improvement in the areas of viral protein modeling and complex model ranking.
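The TM-score values quoted in this abstract can be read against the standard definition of the metric. The sketch below evaluates that formula for one fixed superposition, given per-residue distances between aligned residue pairs. It does not search for the optimal superposition as the real TM-score program does. The function names and the 0.5 Å floor on `d0` for very short chains are assumptions made for illustration.

```python
def tm_score_d0(target_length):
    """Length-dependent distance scale d0 (angstroms) in the TM-score formula."""
    if target_length <= 15:
        return 0.5  # assumed floor; the cube-root term is undefined here
    return max(1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8, 0.5)


def tm_score(distances, target_length):
    """TM-score for a fixed superposition.

    distances     : distance (angstroms) for each aligned residue pair.
    target_length : number of residues in the target structure, used both
                    for normalization and for d0.
    """
    d0 = tm_score_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A perfect match over half of a 100-residue target scores 0.5.
half_aligned = tm_score([0.0] * 50, 100)
```

Because each aligned pair contributes at most 1 and the sum is divided by the target length, the score lies between 0 and 1, and unaligned residues lower it.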
Affiliation(s)
- Wei Zheng
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan 48109, USA
- Qiqige Wuyun
  - Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Peter L Freddolino
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan 48109, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Computer Science, School of Computing, National University of Singapore, 117417 Singapore
  - Cancer Science Institute of Singapore, National University of Singapore, 117599, Singapore
  - Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, 117596, Singapore
4
Sikorska C, Liwo A. Origin of Correlations between Local Conformational States of Consecutive Amino Acid Residues and Their Role in Shaping Protein Structures and in Allostery. J Phys Chem B 2022; 126:9493-9505. [PMID: 36367920 PMCID: PMC9706564 DOI: 10.1021/acs.jpcb.2c04610] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 10/27/2022] [Indexed: 11/13/2022]
Abstract
By analyzing the Kubo-cluster-cumulant expansion of the potential of mean force of polypeptide chains corresponding to backbone-local interactions averaged over the rotation of the peptide groups about the Cα···Cα virtual bonds, we identified two important kinds of "along-chain" correlations that pertain to extended chain segments bordered by turns (usually the β-strands) and to the folded spring-like segments (usually α-helices), respectively, and are expressed as multitorsional potentials. These terms affect the positioning of structural elements with respect to each other and, consequently, contribute to determining their packing. Additionally, for extended chain segments, the correlation terms contribute to propagating the conformational change at one end to the other end, which is characteristic of allosteric interactions. We confirmed both findings by statistical analysis of the virtual-bond geometry of 77 950 proteins. Augmenting coarse-grained and, possibly, all-atom force fields with these correlation terms could improve their capacity to model protein structure and dynamics.
Affiliation(s)
- Celina Sikorska
  - The MacDiarmid Institute for Advanced Materials and Nanotechnology, Department of Physics, The University of Auckland, Private Bag 92019, Auckland 1142, New Zealand
- Adam Liwo
  - Faculty of Chemistry, University of Gdańsk, Fahrenheit Union of Universities in Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
5
Wang L, Fan R, Li Z, Wang L, Bai X, Bu T, Dong Y, Xu Y, Quan C. Insights into the structure and function of the histidine kinase ComP from Bacillus amyloliquefaciens based on molecular modeling. Biosci Rep 2022; 42:BSR20220352. [PMID: 36052710 PMCID: PMC9620489 DOI: 10.1042/bsr20220352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 08/01/2022] [Accepted: 09/01/2022] [Indexed: 11/21/2022] Open
Abstract
The ComPA two-component signal transduction system (TCS) is essential in Bacillus spp. However, the molecular mechanism of the histidine kinase ComP remains unclear. Here, we predicted the structure of ComP from Bacillus amyloliquefaciens Q-426 (BaComP) using an artificial intelligence approach, analyzed the structural characteristics based on the molecular docking results and compared homologous proteins, and then investigated the biochemical properties of BaComP. We obtained a truncated ComPS protein with high purity and correct folding in solution based on the predicted structures. The expression and purification of BaComP proteins suggested that the subdomains in the cytoplasmic region influenced the expression and stability of the recombinant proteins. ComPS is a bifunctional enzyme that exhibits the activity of both histidine kinase and phosphotransferase. We found that His571 played an obligatory role in the autophosphorylation of BaComP based on the analysis of the structures and mutagenesis studies. The molecular docking results suggested that the HATPase_c domain contained an ATP-binding pocket, and the ATP molecule was coordinated by eight conserved residues from the N, G1, and G2 boxes. Our study provides novel insight into the histidine kinase BaComP and its homologous proteins.
Affiliation(s)
- Lulu Wang
  - School of Life Science and Biotechnology, Dalian University of Technology, No. 2 Linggong Road, Dalian 116024, Liaoning, China
  - Key Laboratory of Biotechnology and Bioresources Utilization of Ministry of Education, College of Life Science, Dalian Minzu University, China
- Ruochen Fan
  - School of Life Science and Biotechnology, Dalian University of Technology, No. 2 Linggong Road, Dalian 116024, Liaoning, China
  - Key Laboratory of Biotechnology and Bioresources Utilization of Ministry of Education, College of Life Science, Dalian Minzu University, China
- Zhuting Li
  - Key Laboratory of Biotechnology and Bioresources Utilization of Ministry of Education, College of Life Science, Dalian Minzu University, China
  - Department of Bioengineering, College of Life Science, Dalian Minzu University, Dalian 116600, Liaoning, China
- Lina Wang
  - Institute of Cancer Stem Cell, Dalian Medical University, 9 Western Lvshun Road, Dalian 116044, Liaoning, China
- Xue Bai
  - Key Laboratory of Biotechnology and Bioresources Utilization of Ministry of Education, College of Life Science, Dalian Minzu University, China
  - Department of Bioengineering, College of Life Science, Dalian Minzu University, Dalian 116600, Liaoning, China
- Tingting Bu
  - Key Laboratory of Biotechnology and Bioresources Utilization of Ministry of Education, College of Life Science, Dalian Minzu University, China
  - Department of Bioengineering, College of Life Science, Dalian Minzu University, Dalian 116600, Liaoning, China
- Yuesheng Dong
  - School of Life Science and Biotechnology, Dalian University of Technology, No. 2 Linggong Road, Dalian 116024, Liaoning, China
- Yongbin Xu
  - Key Laboratory of Biotechnology and Bioresources Utilization of Ministry of Education, College of Life Science, Dalian Minzu University, China
  - Department of Bioengineering, College of Life Science, Dalian Minzu University, Dalian 116600, Liaoning, China
- Chunshan Quan
  - Key Laboratory of Biotechnology and Bioresources Utilization of Ministry of Education, College of Life Science, Dalian Minzu University, China
  - Department of Bioengineering, College of Life Science, Dalian Minzu University, Dalian 116600, Liaoning, China
6
I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nat Protoc 2022; 17:2326-2353. [PMID: 35931779 DOI: 10.1038/s41596-022-00728-0] [Citation(s) in RCA: 128] [Impact Index Per Article: 64.0] [Received: 02/04/2022] [Accepted: 05/24/2022] [Indexed: 01/17/2023]
Abstract
Most proteins in cells are composed of multiple folding units (or domains) to perform complex functions in a cooperative manner. Relative to the rapid progress in single-domain structure prediction, there are few effective tools available for multi-domain protein structure assembly, mainly due to the complexity of modeling multi-domain proteins, which involves higher degrees of freedom in domain-orientation space and various levels of continuous and discontinuous domain assembly and linker refinement. To meet the challenge and the high demand of the community, we developed I-TASSER-MTD to model the structures and functions of multi-domain proteins through a progressive protocol that combines sequence-based domain parsing, single-domain structure folding, inter-domain structure assembly and structure-based function annotation in a fully automated pipeline. Advanced deep-learning models have been incorporated into each of the steps to enhance both the domain modeling and inter-domain assembly accuracy. The protocol allows for the incorporation of experimental cross-linking data and cryo-electron microscopy density maps to guide the multi-domain structure assembly simulations. I-TASSER-MTD is built on I-TASSER but substantially extends its ability and accuracy in modeling large multi-domain protein structures and provides meaningful functional insights for the targets at both the domain- and full-chain levels from the amino acid sequence alone.
7
Biomarkers and De Novo Protein Design Can Improve Precise Amino Acid Nutrition in Broilers. Animals (Basel) 2022; 12:ani12070935. [PMID: 35405923 PMCID: PMC8997161 DOI: 10.3390/ani12070935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 03/28/2022] [Accepted: 03/29/2022] [Indexed: 12/10/2022] Open
Abstract
Simple Summary:
Almost half of the protein ingested by broilers is not retained and is excreted, impairing the nitrogen utilization, health and productivity of the animals, and intensifying the environmental impact of poultry meat production. This work proposes two potential tools, combining traditional nutrition with biotechnological, metabolomic, computational and protein engineering knowledge, which can contribute to improving precise amino acid nutrition in broilers in the future: (i) the use of serum uric nitrogen content as a rapid biomarker of amino acid imbalances, and (ii) the design and modeling of de novo proteins that are fully digestible and fit exactly to the animal’s requirements. Both tools can open up new opportunities to form an integrated framework for precise amino acid nutrition in broilers, helping us to achieve more efficient, resilient, and sustainable production. This information can help to determine the exact ratio of amino acids that will improve the efficiency of the use of nitrogen by poultry.
Abstract:
Precision nutrition in broilers requires tools capable of identifying amino acid imbalances individually or in groups, as well as knowledge on how more digestible proteins can be designed for innovative feeding programs adjusted to animals’ dynamic requirements. This work proposes two potential tools, combining traditional nutrition with biotechnological, metabolomic, computational and protein engineering knowledge, which can contribute to improving the precise amino acid nutrition of broilers in the future: (i) the use of serum uric nitrogen content as a rapid biomarker of amino acid imbalances, and (ii) the design and modeling of de novo proteins that are fully digestible and fit exactly to the animal’s requirements. Each application is illustrated with a case study.
Case study 1 demonstrates that serum uric nitrogen can be a useful rapid indicator of individual or group amino acid deficiencies or imbalances when reducing dietary protein and adjusting the valine and arginine to lysine ratios in broilers. Case study 2 describes a stepwise approach to design an ideal protein, resulting in a potential amino acid sequence and structure prototype that is ideally adjusted to the requirements of the targeted animal, and is theoretically completely digestible. Both tools can open up new opportunities to form an integrated framework for precise amino acid nutrition in broilers, helping us to achieve more efficient, resilient, and sustainable production. This information can help to determine the exact ratio of amino acids that will improve the efficiency of the use of nitrogen by poultry.
8
Mathur Y, Mohammad T, Anjum F, Shafie A, Elasbali AM, Uversky VN, Hassan MI. PyPAn: An Automated Graphical User Interface for Protein Sequence and Structure Analyses. Protein Pept Lett 2022; 29:306-312. [DOI: 10.2174/0929866529666220210155421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2021] [Revised: 12/17/2021] [Accepted: 12/27/2021] [Indexed: 11/22/2022]
Abstract
Background:
Protein sequence and structure analyses have been essential components of bioinformatics and structural biology. They provide a deeper insight into the physicochemical properties, structure, and subsequent functions of a protein. Advanced computational approaches and bioinformatics utilities help solve several issues related to protein analysis. Still, beginners and non-professionals may struggle when encountering a wide variety of computational tools and the sheer number of input parameter variables required by each tool.
Methods:
We introduce a free-to-access graphical user interface (GUI) named PyPAn ('Python-based Protein Analysis') for a variety of protein sequence/structure analyses. PyPAn serves as a universal platform to analyze protein sequences, structures, and their properties. PyPAn facilitates onboard analysis of each task in just a single click. It can be used to calculate the physicochemical properties of a protein, including the instability index and molar extinction coefficient. PyPAn is one of the few computational tools that allow users to generate a Ramachandran plot and calculate solvent accessibility and the radius of gyration (Rg) of proteins at once. In addition, it can refine the protein model along with computation and minimization of its energy.
Results:
PyPAn can generate a recommendation for an appropriate structure modelling method to employ for a query protein sequence. PyPAn is one of the few, if not the only, Python-based computational GUI tools with an array of options for the user to employ as they see fit.
Conclusion:
PyPAn aims to unify many successful academically significant proteomic applications and is freely available for academic and industrial research uses at https://hassanlab.org/pypan.
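Of the quantities listed in the Methods above, the radius of gyration (Rg) has a compact closed form, sketched below. This is not PyPAn's code. The function name and the option to pass per-atom masses are assumptions made for illustration. Rg is taken as the mass-weighted root-mean-square distance of the atoms from their center of mass.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration, in the units of the input coordinates.

    coords : one (x, y, z) tuple per atom.
    masses : optional per-atom masses; equal weights are assumed if omitted.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Center of mass, one coordinate axis at a time.
    com = [
        sum(m * c[k] for m, c in zip(masses, coords)) / total
        for k in range(3)
    ]
    # Mass-weighted sum of squared distances from the center of mass.
    weighted_sq = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


# Two equal-mass atoms 2 units apart lie 1 unit from their midpoint.
rg = radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
```

A smaller Rg for a chain of a given length indicates a more compact structure.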
Affiliation(s)
- Yash Mathur
  - Department of Computer Science, Jamia Millia Islamia, Jamia Nagar, New Delhi 110025, India
- Taj Mohammad
  - Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, Jamia Nagar, New Delhi 110025, India
- Farah Anjum
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Alaa Shafie
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Abdelbaset M. Elasbali
  - Clinical Laboratory Science, College of Applied Sciences-Qurayyat, Jouf University, Jouf, Saudi Arabia
- Vladimir N. Uversky
  - Department of Molecular Medicine and Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, Tampa, Florida, USA
- Md. Imtaiyaz Hassan
  - Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, Jamia Nagar, New Delhi 110025, India
9
Philip J, Örd M, Silva A, Singh S, Diffley JFX, Remus D, Loog M, Ikui AE. Cdc6 is sequentially regulated by PP2A-Cdc55, Cdc14, and Sic1 for origin licensing in S. cerevisiae. eLife 2022; 11:e74437. [PMID: 35142288 PMCID: PMC8830886 DOI: 10.7554/elife.74437] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/04/2021] [Accepted: 12/15/2021] [Indexed: 01/31/2023] Open
Abstract
Cdc6, a subunit of the pre-replicative complex (pre-RC), contains multiple regulatory cyclin-dependent kinase (Cdk1) consensus sites, SP or TP motifs. In Saccharomyces cerevisiae, Cdk1 phosphorylates Cdc6-T7 to recruit Cks1, the Cdk1 phospho-adaptor in S phase, for subsequent multisite phosphorylation and protein degradation. Cdc6 accumulates in mitosis and is tightly bound by Clb2 through N-terminal phosphorylation in order to prevent premature origin licensing and degradation. It has been extensively studied how Cdc6 phosphorylation is regulated by the cyclin-Cdk1 complex. However, a detailed mechanism on how Cdc6 phosphorylation is reversed by phosphatases has not been elucidated. Here, we show that PP2A-Cdc55 dephosphorylates Cdc6 N-terminal sites to release Clb2. Cdc14 dephosphorylates the C-terminal phospho-degron, leading to Cdc6 stabilization in mitosis. In addition, Cdk1 inhibitor Sic1 releases Clb2·Cdk1·Cks1 from Cdc6 to load Mcm2-7 on the chromatin upon mitotic exit. Thus, pre-RC assembly and origin licensing are promoted by phosphatases through the attenuation of distinct Cdk1-dependent Cdc6 inhibitory mechanisms.
Affiliation(s)
- Jasmin Philip
  - The PhD Program in Biochemistry, The Graduate Center, CUNY, Brooklyn, United States
  - Brooklyn College, Brooklyn, United States
- Andriele Silva
  - The PhD Program in Biochemistry, The Graduate Center, CUNY, Brooklyn, United States
  - Brooklyn College, Brooklyn, United States
- Shaneen Singh
  - The PhD Program in Biochemistry, The Graduate Center, CUNY, Brooklyn, United States
  - Brooklyn College, Brooklyn, United States
- Dirk Remus
  - Memorial Sloan-Kettering Cancer Center, New York, United States
- Amy E Ikui
  - The PhD Program in Biochemistry, The Graduate Center, CUNY, Brooklyn, United States
  - Brooklyn College, Brooklyn, United States
10
Han Y, Wang Z, Chen A, Ali I, Cai J, Ye S, Li J. An inductive transfer learning force field (ITLFF) protocol builds protein force fields in seconds. Brief Bioinform 2022; 23:6509736. [PMID: 35039818 DOI: 10.1093/bib/bbab590] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/12/2021] [Revised: 12/19/2021] [Accepted: 12/23/2021] [Indexed: 01/15/2023] Open
Abstract
Accurate simulation of protein folding is a unique challenge in understanding the physical process of protein folding, with important implications for protein design and drug discovery. Molecular dynamics simulation strongly requires advanced force fields with high accuracy to achieve correct folding. However, the current force fields are inaccurate, inapplicable and inefficient. We propose a machine learning protocol, the inductive transfer learning force field (ITLFF), to construct protein force fields in seconds with any level of accuracy from a small dataset. This process is achieved by incorporating an inductive transfer learning algorithm into deep neural networks, which learn knowledge of any high-level calculations from a large dataset of a low-level method. Here, we use a double-hybrid density functional theory (DFT) as a case functional, but ITLFF is suitable for any high-precision functional. The performance on the selected 18 proteins indicates that, compared with the fragment-based double-hybrid DFT algorithm, the force field constructed by ITLFF achieves considerable accuracy with a mean absolute error of 0.0039 kcal/mol/atom for energy and a root mean square error of 2.57 kcal/mol/Å for force, and it is more than 30 000 times faster and obtains more significant efficiency benefits as the system size increases. The outstanding performance of ITLFF provides promising prospects for accurate and efficient protein dynamic simulations and makes an important step toward protein folding simulation. Due to the ability of ITLFF to utilize the knowledge acquired in one task to solve related problems, it is also applicable for various problems in biology, chemistry and material science.
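The two error measures quoted in this abstract follow standard definitions, sketched below for reference. The function names and the toy numbers are illustrative assumptions. Nothing here reproduces the ITLFF data or code.

```python
import math


def mean_absolute_error(predicted, reference):
    """Average of |predicted - reference| over all paired values."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def root_mean_square_error(predicted, reference):
    """Square root of the mean squared difference over all paired values."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    mse = sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)
    return math.sqrt(mse)


# Toy values only: two exact predictions and one that is off by 2.
pred = [1.0, 2.0, 3.0]
ref = [1.0, 2.0, 5.0]
mae = mean_absolute_error(pred, ref)
rmse = root_mean_square_error(pred, ref)
```

RMSE weights large deviations more heavily than MAE, so on the same data it is never smaller than MAE.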
Affiliation(s)
- Yanqiang Han
  - National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Key Laboratory for Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
- Zhilong Wang
  - National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Key Laboratory for Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
- An Chen
  - National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Key Laboratory for Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
- Imran Ali
  - National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Key Laboratory for Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
- Junfei Cai
  - National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Key Laboratory for Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
- Simin Ye
  - National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Key Laboratory for Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
- Jinjin Li
  - National Key Laboratory of Science and Technology on Micro/Nano Fabrication, Shanghai Jiao Tong University, Shanghai, 200240, China
  - Key Laboratory for Thin Film and Microfabrication of Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
11
Overhoff B, Falls Z, Mangione W, Samudrala R. A Deep-Learning Proteomic-Scale Approach for Drug Design. Pharmaceuticals (Basel) 2021; 14:1277. [PMID: 34959678 PMCID: PMC8709297 DOI: 10.3390/ph14121277] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/24/2021] [Revised: 11/27/2021] [Accepted: 11/29/2021] [Indexed: 12/26/2022] Open
Abstract
Computational approaches have accelerated novel therapeutic discovery in recent decades. The Computational Analysis of Novel Drug Opportunities (CANDO) platform for shotgun multitarget therapeutic discovery, repurposing, and design aims to improve their efficacy and safety by employing a holistic approach that computes interaction signatures between every drug/compound and a large library of non-redundant protein structures corresponding to the human proteome fold space. These signatures are compared and analyzed to determine if a given drug/compound is efficacious and safe for a given indication/disease. In this study, we used a deep learning-based autoencoder to first reduce the dimensionality of CANDO-computed drug-proteome interaction signatures. We then employed a reduced conditional variational autoencoder to generate novel drug-like compounds when given a target encoded "objective" signature. Using this approach, we designed compounds to recreate the interaction signatures for twenty approved and experimental drugs and showed that 16/20 designed compounds were predicted to be significantly (p-value ≤ 0.05) more behaviorally similar relative to all corresponding controls, and 20/20 were predicted to be more behaviorally similar relative to a random control. We further observed that redesigns of objectives developed via rational drug design performed significantly better than those derived from natural sources (p-value ≤ 0.05), suggesting that the model learned an abstraction of rational drug design. We also show that the designed compounds are structurally diverse and synthetically feasible when compared to their respective objective drugs despite consistently high predicted behavioral similarity. Finally, we generated new designs that enhanced thirteen drugs/compounds associated with non-small cell lung cancer and anti-aging properties using their predicted proteomic interaction signatures. 
This study represents a significant step forward in automating holistic therapeutic design with machine learning, enabling the rapid generation of novel, effective, and safe drug leads for any indication.
Affiliation(s)
| | | | | | - Ram Samudrala
- Department of Biomedical Informatics, Jacobs School of Medicine and Biomedical Sciences, University at Buffalo, Buffalo, NY 14203, USA; (B.O.); (Z.F.); (W.M.)
|
12
|
Perera DDBD, Perera KML, Peiris DC. A Novel In Silico Benchmarked Pipeline Capable of Complete Protein Analysis: A Possible Tool for Potential Drug Discovery. BIOLOGY 2021; 10:biology10111113. [PMID: 34827106 PMCID: PMC8615085 DOI: 10.3390/biology10111113] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/17/2021] [Revised: 10/16/2021] [Accepted: 10/25/2021] [Indexed: 01/11/2023]
Abstract
Simple Summary: Protein interactions govern the majority of an organism’s biological processes. Therefore, to fully understand the functionality of an organism, we must know how proteins work at a molecular level. This study assembled a protocol that enables scientists to construct a protein’s tertiary structure easily and subsequently to investigate its mechanism and function. Each step involved in prediction, validation, and functional analysis of a protein is crucial to obtain an accurate result. We have dubbed this the trifecta analysis. It was clear early in our research that no single study in the literature had previously encompassed the complete trifecta analysis. In particular, studies that recommend free, open-source tools that have been benchmarked for each step are lacking. The present study ensures that predictions are accurate and validated and will greatly benefit new and experienced scientists alike in obtaining a strong understanding of the trifecta analysis, resulting in a domino effect that could lead to drug development.
Abstract: Current in silico proteomics require the trifecta analysis, namely, prediction, validation, and functional assessment of a modeled protein. The main drawback of this endeavor is the lack of a single protocol that utilizes a proper set of benchmarked open-source tools to predict a protein’s structure and function accurately. The present study rectifies this drawback through the design and development of such a protocol. The protocol begins with the characterization of a novel coding sequence to identify the expressed protein. It then recognizes and isolates evolutionarily conserved sequence motifs through phylogenetics. The next step is to predict the protein’s secondary structure, followed by the prediction, refinement, and validation of its three-dimensional tertiary structure. 
These steps enable the functional analysis of the macromolecule through protein docking, which facilitates the identification of the protein’s active site. Each of these steps is crucial for the complete characterization of the protein under study. We have dubbed this process the trifecta analysis. In this study, we have proven the effectiveness of our protocol using the cystatin C and AChE proteins. Beginning with just their sequences, we have characterized both proteins’ structures and functions, including identifying the cystatin C protein’s seven-residue active site and the AChE protein’s active-site gorge via protein–protein and protein–ligand docking, respectively. This process will greatly benefit new and experienced scientists alike in obtaining a strong understanding of the trifecta analysis, resulting in a domino effect that could expand drug development.
Affiliation(s)
- D. D. B. D. Perera
- Department of Zoology, Faculty of Applied Sciences, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka;
- Correspondence: (D.D.B.D.P.); (D.C.P.); Tel.: +94-714-018-537 (D.C.P.)
| | - K. Minoli L. Perera
- Department of Zoology, Faculty of Applied Sciences, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka;
| | - Dinithi C. Peiris
- Genetics & Molecular Biology Unit (Center for Biotechnology), Department of Zoology, Faculty of Applied Sciences, University of Sri Jayewardenepura, Nugegoda 10250, Sri Lanka
- Correspondence: (D.D.B.D.P.); (D.C.P.); Tel.: +94-714-018-537 (D.C.P.)
|
13
|
Antoniak A, Biskupek I, Bojarski KK, Czaplewski C, Giełdoń A, Kogut M, Kogut MM, Krupa P, Lipska AG, Liwo A, Lubecka EA, Marcisz M, Maszota-Zieleniak M, Samsonov SA, Sieradzan AK, Ślusarz MJ, Ślusarz R, Wesołowski PA, Ziȩba K. Modeling protein structures with the coarse-grained UNRES force field in the CASP14 experiment. J Mol Graph Model 2021; 108:108008. [PMID: 34419932 DOI: 10.1016/j.jmgm.2021.108008] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2021] [Revised: 08/12/2021] [Accepted: 08/13/2021] [Indexed: 12/31/2022]
Abstract
The UNited RESidue (UNRES) force field was tested in the 14th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP14), in which larger oligomeric and multimeric targets were present compared to previous editions. Three prediction modes were tested: (i) ab initio (the UNRES group), (ii) contact-assisted (the UNRES-contact group), and (iii) template-assisted (the UNRES-template group). For most of the targets, the contact restraints were derived from the server models top-ranked by the DeepQA method, while the DNCON2 method was used for 11 targets. Our consensus-fragment procedure was used to run template-assisted predictions. Each group also processed the Nuclear Magnetic Resonance (NMR)- and Small Angle X-Ray Scattering (SAXS)-data assisted targets. The average Global Distance Test Total Score (GDT_TS) values of the 'Model 1' predictions were 29.17, 39.32, and 56.37 for the UNRES, UNRES-contact, and UNRES-template predictions, respectively, increasing by 0.53, 2.24, and 3.76, respectively, compared to CASP13. It was also found that the GDT_TS of the UNRES models obtained in ab initio mode and in the contact-assisted mode decreases with the square root of chain length, while the exponent in this relationship is 0.20 for the UNRES-template group models and 0.11 for the best performing AlphaFold2 models, which suggests that incorporation of database information, which stems from protein evolution, brings in long-range correlations, thus enabling the correction of force-field inaccuracies.
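The chain-length dependence described in this abstract can be illustrated with a log-log regression. The following is a minimal sketch, assuming the relationship has the form GDT_TS ≈ c · N^(−α), where N is the chain length; this functional form is an interpretation of the abstract, not a formula taken from the paper. The function name `fit_power_law_exponent` is hypothetical and the data are synthetic, not CASP14 results.

```python
import numpy as np

def fit_power_law_exponent(chain_lengths, gdt_ts):
    """Estimate alpha and c in GDT_TS ~ c * N**(-alpha).

    Assumed model (not from the paper): taking logarithms gives
    log(GDT_TS) = log(c) - alpha * log(N), a straight line whose slope
    is -alpha, so an ordinary least-squares line fit recovers alpha.
    """
    log_n = np.log(np.asarray(chain_lengths, dtype=float))
    log_g = np.log(np.asarray(gdt_ts, dtype=float))
    slope, intercept = np.polyfit(log_n, log_g, 1)
    return -slope, float(np.exp(intercept))

# Synthetic, noise-free illustration: scores that fall off as 1/sqrt(N).
lengths = np.array([50, 100, 200, 400, 800])
scores = 300.0 * lengths ** -0.5
alpha, prefactor = fit_power_law_exponent(lengths, scores)
```

Under this reading, a square-root decrease corresponds to α = 0.5, and the smaller exponents quoted for the template-assisted and AlphaFold2 models correspond to a weaker dependence on chain length.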
Affiliation(s)
- Anna Antoniak
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Iga Biskupek
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Krzysztof K Bojarski
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Cezary Czaplewski
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Artur Giełdoń
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Mateusz Kogut
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Małgorzata M Kogut
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Paweł Krupa
- Institute of Physics, Polish Academy of Sciences, Aleja Lotników 32/46, Warsaw, PL-02668, Poland
| | - Agnieszka G Lipska
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Adam Liwo
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland; School of Computational Sciences, Korea Institute for Advanced Study, 87 Hoegiro, Dongdaemun-gu, 130-722, Seoul, Republic of Korea.
| | - Emilia A Lubecka
- Faculty of Electronics, Telecommunications and Informatics, Gdańsk University of Technology, G. Narutowicza 11/12, 80-233, Gdańsk, Poland
| | - Mateusz Marcisz
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland; Intercollegiate Faculty of Biotechnology, University of Gdańsk and Medical University of Gdańsk, ul. Abrahama 58, 80-307, Gdańsk, Poland
| | | | - Sergey A Samsonov
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Adam K Sieradzan
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Magdalena J Ślusarz
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Rafał Ślusarz
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
| | - Patryk A Wesołowski
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland; Intercollegiate Faculty of Biotechnology, University of Gdańsk and Medical University of Gdańsk, ul. Abrahama 58, 80-307, Gdańsk, Poland
| | - Karolina Ziȩba
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308, Gdańsk, Poland
|
14
|
Mortuza SM, Zheng W, Zhang C, Li Y, Pearce R, Zhang Y. Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions. Nat Commun 2021; 12:5011. [PMID: 34408149 PMCID: PMC8373938 DOI: 10.1038/s41467-021-25316-w] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2021] [Accepted: 08/04/2021] [Indexed: 11/28/2022] Open
Abstract
Sequence-based contact prediction has shown considerable promise in assisting non-homologous structure modeling, but it often requires many homologous sequences and a sufficient number of correct contacts to achieve correct folds. Here, we developed a method, C-QUARK, that integrates multiple deep-learning and coevolution-based contact-maps to guide the replica-exchange Monte Carlo fragment assembly simulations. The method was tested on 247 non-redundant proteins, where C-QUARK could fold 75% of the cases with TM-scores (template-modeling scores) ≥0.5, which was 2.6 times more than that achieved by QUARK. For the 59 cases that had either low contact accuracy or few homologous sequences, C-QUARK correctly folded 6 times more proteins than other contact-based folding methods. C-QUARK was also tested on 64 free-modeling targets from the 13th CASP (critical assessment of protein structure prediction) experiment and had an average GDT_TS (global distance test) score that was 5% higher than the best CASP predictors. These data demonstrate, in a robust manner, the progress in modeling non-homologous protein structures using low-accuracy and sparse contact-map predictions.
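The TM-score threshold used in this abstract can be made concrete with a short calculation. The sketch below assumes the standard TM-score definition, a sum of 1 / (1 + (d_i / d0)^2) over aligned residue pairs divided by the target length, with d0 = 1.24 · (L − 15)^(1/3) − 1.8. It also assumes the two C-alpha traces are already superposed and aligned one-to-one, whereas the full TM-score maximizes over superpositions. The function name `tm_score_fixed_superposition` and the 0.5 Å floor on d0 are assumptions for illustration.

```python
import numpy as np

def tm_score_fixed_superposition(model_ca, native_ca):
    """TM-score term for two equal-length, already superposed C-alpha traces.

    Assumptions: residues are aligned one-to-one, and no search over
    superpositions is performed, so the value is a lower bound on the
    true (maximized) TM-score.
    """
    model = np.asarray(model_ca, dtype=float)
    native = np.asarray(native_ca, dtype=float)
    length = len(native)
    # Length-dependent distance scale; the 0.5 floor (an assumption here)
    # keeps d0 positive for very short chains.
    d0 = max(1.24 * np.cbrt(length - 15.0) - 1.8, 0.5)
    distances = np.linalg.norm(model - native, axis=1)
    return float(np.sum(1.0 / (1.0 + (distances / d0) ** 2)) / length)
```

With this definition, identical structures score 1.0, and a model whose residues all sit exactly d0 away from their native positions scores 0.5, which is the fold-level cutoff quoted in the abstract.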
Affiliation(s)
- S M Mortuza
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Yang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Robin Pearce
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA.
|
15
|
A physiologic rise in cytoplasmic calcium ion signal increases pannexin1 channel activity via a C-terminus phosphorylation by CaMKII. Proc Natl Acad Sci U S A 2021; 118:2108967118. [PMID: 34301850 DOI: 10.1073/pnas.2108967118] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022] Open
Abstract
Pannexin1 (Panx1) channels are ubiquitously expressed in vertebrate cells and are widely accepted as adenosine triphosphate (ATP)-releasing membrane channels. Activation of Panx1 has been associated with phosphorylation in a specific tyrosine residue or cleavage of its C-terminal domains. In the present work, we identified a residue (S394) as a putative phosphorylation site by Ca2+/calmodulin-dependent kinase II (CaMKII). In HeLa cells transfected with rat Panx1 (rPanx1), membrane stretch (MS)-induced activation, measured by changes in DAPI uptake rate, was drastically reduced by either knockdown of Piezo1 or pharmacological inhibition of calmodulin or CaMKII. By site-directed mutagenesis we generated rPanx1S394A-EGFP (enhanced green fluorescent protein), which lost its sensitivity to MS, and rPanx1S394D-EGFP, mimicking phosphorylation, which shows high DAPI uptake rate without MS stimulation or cleavage of the C terminus. Using whole-cell patch-clamp and outside-out excised patch configurations, we found that rPanx1-EGFP and rPanx1S394D-EGFP channels showed current at all voltages between ±100 mV, similar single channel currents with outward rectification, and unitary conductance (∼30 to 70 pS). However, using cell-attached configuration we found that rPanx1S394D-EGFP channels show increased spontaneous unitary events independent of MS stimulation. In silico studies revealed that phosphorylation of S394 caused conformational changes in the selectivity filter and increased the average volume of lateral tunnels, allowing ATP to be released via these conduits and DAPI uptake directly from the channel mouth to the cytoplasmic space. These results could explain one possible mechanism for activation of rPanx1 upon increase in cytoplasmic Ca2+ signal elicited by diverse physiological conditions in which the C-terminal domain is not cleaved.
|
16
|
Bagci EZ, Senguler-Ciftci F, Ciftci U, Demir A. A novel measure to analyze protein structures: Aspect ratio in protein alpha shapes. Proteins 2021; 89:1270-1276. [PMID: 33993533 DOI: 10.1002/prot.26148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2020] [Revised: 03/01/2021] [Accepted: 05/03/2021] [Indexed: 11/10/2022]
Abstract
Proteins' three-dimensional (3D) structures are analyzed traditionally using geometric descriptors such as torsional angles and inter-atomic distances. In this study a measure that is borrowed from computational geometry, aspect ratio of each tetrahedron in alpha shapes of proteins, is utilized. This geometric descriptor differentiates alpha and beta structural classes of proteins when combined with principal components analysis. The method converts the structures of individual proteins, 3D coordinates of the atoms, to points on a plane. It has a high degree of accuracy in differentiating R and T structures of hemoglobin. Therefore, it is anticipated that the geometric measure can be used successfully in a method that is extended to solve classification problems in machine learning.
Affiliation(s)
- Elife Z Bagci
- Department of Biology, Tekirdag Namik Kemal University, Tekirdag, Turkey
| | | | - Unver Ciftci
- Department of Mathematics, Tekirdag Namik Kemal University, Tekirdag, Turkey
| | - Ayhan Demir
- Department of Projects Management and Support, Turkish Health Institutes (TÜSEB), Ankara, Turkey
|
17
|
Protein Structure Prediction: Conventional and Deep Learning Perspectives. Protein J 2021; 40:522-544. [PMID: 34050498 DOI: 10.1007/s10930-021-10003-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/21/2021] [Indexed: 10/21/2022]
Abstract
Protein structure prediction is a way to bridge the sequence-structure gap, one of the main challenges in computational biology and chemistry. Predicting any protein's accurate structure is of paramount importance for the scientific community, as these structures govern their function. Moreover, this is one of the most complicated optimization problems that computational biologists have ever faced. Experimental protein structure determination methods include X-ray crystallography, Nuclear Magnetic Resonance Spectroscopy and Electron Microscopy. All of these are tedious and time-consuming procedures that require expertise. To make the process less cumbersome, scientists use predictive tools as part of computational methods, using data consolidated in the protein repositories. In recent years, machine learning approaches have raised the interest of the structure prediction community. Most of the machine learning approaches for protein structure prediction are centred on co-evolution based methods. The accuracy of these approaches depends on the number of homologous protein sequences available in the databases. The prediction problem becomes challenging for many proteins, especially those without enough sequence homologs. Deep learning methods allow for the extraction of intricate features from protein sequence data without relying on prior intuition. Accurately predicted protein structures are employed for drug discovery, antibody designs, understanding protein-protein interactions, and interactions with other molecules. This article provides a review of conventional and deep learning approaches in protein structure prediction. We conclude this review by outlining a few publicly available datasets and deep learning architectures currently employed for protein structure prediction tasks.
|
18
|
Prediction of LncRNA-encoded small peptides in glioma and oligomer channel functional analysis using in silico approaches. PLoS One 2021; 16:e0248634. [PMID: 33735310 PMCID: PMC7971536 DOI: 10.1371/journal.pone.0248634] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/26/2020] [Accepted: 03/03/2021] [Indexed: 12/13/2022] Open
Abstract
Glioma is a lethal malignant brain cancer, and many reports have shown that abnormalities in the behavior of water and ion channels play an important role in regulating tumor proliferation, migration, apoptosis, and differentiation. Recently, new studies have suggested that some long noncoding RNAs containing small open reading frames can encode small peptides and form oligomers for water or ion regulation. However, because the peptides are difficult to identify, their functional mechanisms are far from being clearly understood. In this study, we used bioinformatics methods to identify and evaluate lncRNAs, which may encode small transmembrane peptides in gliomas. Combining ab initio homology modeling, molecular dynamics simulations, and free energy calculations, we constructed a predictive model and predicted the oligomer channel activity of peptides by identifying the lncRNA ORFs. We found that one key hub lncRNA, namely, DLEU1, which contains two smORFs (ORF1 and ORF8), encodes small peptides that form pentameric channels. The mechanics of water and ion (Na+ and Cl-) transport through this pentameric channel were simulated. The potential of mean force of the H2O molecules along the two ORF-encoded peptide channels indicated that the energy barrier was different between ORF1 and ORF8. The ORF1-encoded peptide pentamer acted as a self-assembled water channel but not as an ion channel, and the ORF8-encoded pentamer was permeable to neither ions nor water. This work provides new methods and theoretical support for further elucidation of the function of lncRNA-encoded small peptides and their role in cancer. Additionally, this study provides a theoretical basis for drug development.
|
19
|
Cryo-EM structures of engineered active bc1-cbb3 type CIII2CIV super-complexes and electronic communication between the complexes. Nat Commun 2021; 12:929. [PMID: 33568648 PMCID: PMC7876108 DOI: 10.1038/s41467-021-21051-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2020] [Accepted: 01/06/2021] [Indexed: 01/30/2023] Open
Abstract
Respiratory electron transport complexes are organized as individual entities or combined as large supercomplexes (SC). Gram-negative bacteria deploy a mitochondrial-like cytochrome (cyt) bc1 (Complex III, CIII2), and may have specific cbb3-type cyt c oxidases (Complex IV, CIV) instead of the canonical aa3-type CIV. Electron transfer between these complexes is mediated by soluble (c2) and membrane-anchored (cy) cyts. Here, we report the structure of an engineered bc1-cbb3 type SC (CIII2CIV, 5.2 Å resolution) and three conformers of native CIII2 (3.3 Å resolution). The SC is active in vivo and in vitro, contains all catalytic subunits and cofactors, and two extra transmembrane helices attributed to cyt cy and the assembly factor CcoH. The cyt cy is integral to SC, its cyt domain is mobile and it conveys electrons to CIV differently than cyt c2. The successful production of a native-like functional SC and determination of its structure illustrate the characteristics of membrane-confined and membrane-external respiratory electron transport pathways in Gram-negative bacteria.
|
20
|
Zhang GJ, Wang XQ, Ma LF, Wang LJ, Hu J, Zhou XG. Two-Stage Distance Feature-based Optimization Algorithm for De novo Protein Structure Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:2119-2130. [PMID: 31107659 DOI: 10.1109/tcbb.2019.2917452] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
De novo protein structure prediction can be treated as a conformational space optimization problem under the guidance of an energy function. However, it remains a challenge to design an accurate energy function that ensures low-energy conformations are close to native structures. Fortunately, recent studies have shown that the accuracy of de novo protein structure prediction can be significantly improved by integrating residue-residue distance information. In this paper, a two-stage distance feature-based optimization algorithm (TDFO) for de novo protein structure prediction is proposed within the framework of an evolutionary algorithm. In TDFO, a similarity model is first designed using feature information extracted from distance profiles by the bisecting K-means algorithm. A similarity model-based selection strategy is then developed to guide the conformation search and thus improve the quality of the predicted models. Moreover, global and local mutation strategies are designed, and a state estimation strategy is proposed to strike a trade-off between exploration and exploitation of the search space. Experimental results on 35 benchmark proteins show that the proposed TDFO can improve prediction accuracy for a large portion of the test proteins.
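The bisecting K-means step named in this abstract can be sketched generically. The code below is a minimal illustration of bisecting K-means itself, not the TDFO implementation: the abstract does not say how distance profiles are turned into feature vectors, so the input is simply an array of points, and the names `two_means` and `bisecting_kmeans` are hypothetical.

```python
import numpy as np

def two_means(points, rng, n_iter=20):
    """Split points into two groups with plain Lloyd iterations (k = 2)."""
    centers = points[rng.choice(len(points), size=2, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = points[labels == k].mean(axis=0)
    return labels

def bisecting_kmeans(points, n_clusters, seed=0):
    """Repeatedly bisect the cluster with the largest sum of squared errors."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    clusters = [np.arange(len(points))]  # each cluster is an array of row indices
    while len(clusters) < n_clusters:
        sse = [((points[idx] - points[idx].mean(axis=0)) ** 2).sum()
               for idx in clusters]
        target = clusters.pop(int(np.argmax(sse)))
        if len(target) < 2:
            raise ValueError("cannot split a cluster with fewer than two points")
        labels = two_means(points[target], rng)
        left, right = target[labels == 0], target[labels == 1]
        if len(left) == 0 or len(right) == 0:
            raise ValueError("degenerate split; points may be duplicates")
        clusters.extend([left, right])
    return clusters

# Toy feature vectors standing in for distance-profile features (assumption).
toy_points = [[0, 0], [1, 0], [10, 0], [11, 0], [30, 0], [31, 0]]
toy_clusters = bisecting_kmeans(toy_points, n_clusters=3)
```

Splitting the cluster with the largest within-cluster error is one common selection rule; splitting the largest cluster by size is another, and the abstract does not say which rule TDFO uses.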
|
21
|
Dos Santos-Silva CA, Zupin L, Oliveira-Lima M, Vilela LMB, Bezerra-Neto JP, Ferreira-Neto JR, Ferreira JDC, de Oliveira-Silva RL, Pires CDJ, Aburjaile FF, de Oliveira MF, Kido EA, Crovella S, Benko-Iseppon AM. Plant Antimicrobial Peptides: State of the Art, In Silico Prediction and Perspectives in the Omics Era. Bioinform Biol Insights 2020; 14:1177932220952739. [PMID: 32952397 PMCID: PMC7476358 DOI: 10.1177/1177932220952739] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2020] [Accepted: 07/30/2020] [Indexed: 12/14/2022] Open
Abstract
Even before the perception of or interaction with pathogens, plants rely on constitutively expressed guardian molecules, often specific to tissue or stage, with further expression after contact with the pathogen. These guardians include small molecules such as antimicrobial peptides (AMPs), generally cysteine-rich, functioning to prevent pathogen establishment. Some of these AMPs are shared among eukaryotes (eg, defensins and cyclotides), others are plant specific (eg, snakins), while some are specific to certain plant families (such as heveins). When compared with other organisms, plants tend to present a higher number of AMP isoforms due to gene duplications or polyploidy, an occurrence possibly also associated with the sessile habit of plants, which prevents them from evading biotic and environmental stresses. Therefore, plants arise as a rich resource for new AMPs. As these molecules are difficult to retrieve from databases using simple sequence alignments, a description of their characteristics and in silico (bioinformatics) approaches used to retrieve them is provided, considering resources and databases available. The possibilities and applications based on tools versus database approaches are considerable and have been so far underestimated.
Affiliation(s)
| | - Luisa Zupin
- Genetic Immunology laboratory, Institute for Maternal and Child Health-IRCCS, Burlo Garofolo, Trieste, Italy
| | - Marx Oliveira-Lima
- Departamento de Genética, Universidade Federal de Pernambuco, Recife, Brazil
| | | | | | | | - José Diogo Cavalcanti Ferreira
- Departamento de Genética, Universidade Federal de Pernambuco, Recife, Brazil.,Departamento de Genética, Instituto Federal de Pernambuco, Pesqueira, Brazil
| | | | | | | | | | - Ederson Akio Kido
- Departamento de Genética, Universidade Federal de Pernambuco, Recife, Brazil
| | - Sergio Crovella
- Genetic Immunology laboratory, Institute for Maternal and Child Health-IRCCS, Burlo Garofolo, Trieste, Italy.,Department of Medicine, Surgery and Health Sciences, University of Trieste, Trieste, Italy
|
22
|
Kashani-Amin E, Tabatabaei-Malazy O, Sakhteman A, Larijani B, Ebrahim-Habibi A. A Systematic Review on Popularity, Application and Characteristics of Protein Secondary Structure Prediction Tools. Curr Drug Discov Technol 2020; 16:159-172. [PMID: 29493456 DOI: 10.2174/1570163815666180227162157] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2017] [Revised: 02/15/2018] [Accepted: 02/22/2018] [Indexed: 01/22/2023]
Abstract
BACKGROUND Prediction of proteins' secondary structure is one of the major steps in the generation of homology models. These models provide structural information which is used to design suitable ligands for potential medicinal targets. However, selecting a proper tool between multiple Secondary Structure Prediction (SSP) options is challenging. The current study is an insight into currently favored methods and tools, within various contexts. OBJECTIVE A systematic review was performed for a comprehensive access to recent (2013-2016) studies which used or recommended protein SSP tools. METHODS Three databases, Web of Science, PubMed and Scopus were systematically searched and 99 out of the 209 studies were finally found eligible to extract data. RESULTS Four categories of applications for 59 retrieved SSP tools were: (I) prediction of structural features of a given sequence, (II) evaluation of a method, (III) providing input for a new SSP method and (IV) integrating an SSP tool as a component for a program. PSIPRED was found to be the most popular tool in all four categories. JPred and tools utilizing PHD (Profile network from HeiDelberg) method occupied second and third places of popularity in categories I and II. JPred was only found in the two first categories, while PHD was present in three fields. CONCLUSION This study provides a comprehensive insight into the recent usage of SSP tools which could be helpful for selecting a proper tool.
Affiliation(s)
- Elaheh Kashani-Amin
- Biosensor Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
| | - Ozra Tabatabaei-Malazy
- Non-Communicable Diseases Research Center, Endocrinology and Metabolism Population Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran.,Endocrinology and Metabolism Research Center, Endocrinology and Metabolism Clinical Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
| | - Amirhossein Sakhteman
- Department of Medicinal Chemistry, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran.,Medicinal Chemistry and Natural Products Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Bagher Larijani
- Endocrinology and Metabolism Research Center, Endocrinology and Metabolism Clinical Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
| | - Azadeh Ebrahim-Habibi
- Biosensor Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
|
23
|
Bendahou MA, Arrouchi H, Lakhlili W, Allam L, Aanniz T, Cherradi N, Ibrahimi A, Boutarbouch M. Computational Analysis of IDH1, IDH2, and TP53 Mutations in Low-Grade Gliomas Including Oligodendrogliomas and Astrocytomas. Cancer Inform 2020; 19:1176935120915839. [PMID: 32313423 PMCID: PMC7160765 DOI: 10.1177/1176935120915839] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2019] [Accepted: 03/09/2020] [Indexed: 12/18/2022] Open
Abstract
Introduction: The emergence of new omics approaches, such as genomic algorithms to identify tumor mutations and molecular modeling tools to predict the three-dimensional structure of proteins, has facilitated the understanding of the dynamic mechanisms involved in the pathogenesis of low-grade gliomas including oligodendrogliomas and astrocytomas. Methods: In this study, we targeted known mutations involved in low-grade gliomas, starting with the sequencing of genomic regions encompassing exon 4 of isocitrate dehydrogenase 1 (IDH1) and isocitrate dehydrogenase 2 (IDH2) and the four exons (5-6 and 7-8) of TP53 from 32 samples, followed by computational analysis to study the impact of these mutations on the structure and function of 3 proteins IDH1, IDH2, and p53. Results: We obtain a mutation that has an effect on the catalytic site of the protein IDH1 as R132H and on the catalytic site of the protein IDH2 as R172M. Other mutations at p53 have been identified as K305N, which is a pathogenic mutation; R175H, which is a benign mutation; and R158G, which disrupts the structural conformation of the tumor suppressor protein. Conclusion: In low-grade gliomas, mutations in IDH1, IDH2, and TP53 may be the key to tumor progression because they have an effect on the function of the protein such as mutations R132H in IDH1 and R172M in IDH2, which change the function of the enzyme alpha-ketoglutarate, or R158G in TP53, which affects the structure of the generated protein, thus their importance in understanding gliomagenesis and for more accurate diagnosis complementary to the anatomical pathology tests.
Affiliation(s)
- Mohammed Amine Bendahou
- Medical Biotechnology Laboratory (MedBiotech), BioInova Research Center, Medical and Pharmacy School, Mohammed V University Rabat, Morocco
| | - Housna Arrouchi
- Medical Biotechnology Laboratory (MedBiotech), BioInova Research Center, Medical and Pharmacy School, Mohammed V University Rabat, Morocco
| | - Wiame Lakhlili
- Medical Biotechnology Laboratory (MedBiotech), BioInova Research Center, Medical and Pharmacy School, Mohammed V University Rabat, Morocco
| | - Loubna Allam
- Medical Biotechnology Laboratory (MedBiotech), BioInova Research Center, Medical and Pharmacy School, Mohammed V University Rabat, Morocco
| | - Tarik Aanniz
- Medical Biotechnology Laboratory (MedBiotech), BioInova Research Center, Medical and Pharmacy School, Mohammed V University Rabat, Morocco
| | - Nadia Cherradi
- Department of Pathological Anatomy, Hospital of Specialties, CHU Ibn Sina, Rabat, Medical and Pharmacy School, Mohammed V University Rabat, Morocco
| | - Azeddine Ibrahimi
- Medical Biotechnology Laboratory (MedBiotech), BioInova Research Center, Medical and Pharmacy School, Mohammed V University Rabat, Morocco
| | - Mahjouba Boutarbouch
- Department of Neurosurgery, Hospital of Specialties, CHU Ibn Sina, Rabat, Medical and Pharmacy School, Mohammed V University Rabat, Morocco
|
24
|
Karczyńska AS, Ziȩba K, Uciechowska U, Mozolewska MA, Krupa P, Lubecka EA, Lipska AG, Sikorska C, Samsonov SA, Sieradzan AK, Giełdoń A, Liwo A, Ślusarz R, Ślusarz M, Lee J, Joo K, Czaplewski C. Improved Consensus-Fragment Selection in Template-Assisted Prediction of Protein Structures with the UNRES Force Field in CASP13. J Chem Inf Model 2020; 60:1844-1864. [PMID: 31999919 PMCID: PMC7588044 DOI: 10.1021/acs.jcim.9b00864] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
Abstract
The method for protein-structure prediction, which combines the physics-based coarse-grained UNRES force field with knowledge-based modeling, has been developed further and tested in the 13th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP13). The method implements restraints from the consensus fragments common to server models. In this work, the server models to derive fragments have been chosen on the basis of quality assessment; a fully automatic fragment-selection procedure has been introduced, and Dynamic Fragment Assembly pseudopotentials have been fully implemented. The Global Distance Test Score (GDT_TS), averaged over our “Model 1” predictions, increased by over 10 units with respect to CASP12 for the free-modeling category to reach 40.82. Our “Model 1” predictions ranked 20 and 14 for all and free-modeling targets, respectively (upper 20.2% and 14.3% of all models submitted to CASP13 in these categories, respectively), compared to 27 (upper 21.1%) and 24 (upper 18.9%) in CASP12, respectively. For oligomeric targets, the Interface Patch Similarity (IPS) and Interface Contact Similarity (ICS) averaged over our best oligomer models increased from 0.28 to 0.36 and from 12.4 to 17.8, respectively, from CASP12 to CASP13, and top-ranking models of 2 targets (H0968 and T0997o) were obtained (none in CASP12). The improvement of our method in CASP13 over CASP12 was ascribed to the combined effect of the overall enhancement of server-model quality, our success in selecting server models and fragments to derive restraints, and improvements of the restraint and potential-energy functions.
Affiliation(s)
| | - Karolina Ziȩba
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland
| | - Urszula Uciechowska
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland
| | - Magdalena A Mozolewska
- Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5, Warsaw PL-02668, Poland
| | - Paweł Krupa
- Institute of Physics, Polish Academy of Sciences, Aleja Lotników 32/46, Warsaw PL-02668, Poland
| | - Emilia A Lubecka
- Institute of Informatics, Faculty of Mathematics, Physics, and Informatics, University of Gdańsk, Wita Stwosza 57, Gdańsk 80-308, Poland
| | - Agnieszka G Lipska
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland
| | - Celina Sikorska
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland
| | - Sergey A Samsonov
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland
| | - Adam K Sieradzan
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland; School of Computational Sciences, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
| | - Artur Giełdoń
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland
| | - Adam Liwo
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland; School of Computational Sciences, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
| | - Rafał Ślusarz
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland
| | - Magdalena Ślusarz
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland
| | - Jooyoung Lee
- School of Computational Sciences, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
| | - Keehyoung Joo
- Center for Advanced Computation, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
| | - Cezary Czaplewski
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, Gdańsk 80-308, Poland
|
25
|
Mulnaes D, Porta N, Clemens R, Apanasenko I, Reiners J, Gremer L, Neudecker P, Smits SHJ, Gohlke H. TopModel: Template-Based Protein Structure Prediction at Low Sequence Identity Using Top-Down Consensus and Deep Neural Networks. J Chem Theory Comput 2020; 16:1953-1967. [DOI: 10.1021/acs.jctc.9b00825] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 12/12/2022]
Affiliation(s)
- Daniel Mulnaes
- Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
| | - Nicola Porta
- Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
| | - Rebecca Clemens
- Institute für Biochemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
| | - Irina Apanasenko
- Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) & JuStruct, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
| | - Jens Reiners
- Institute für Biochemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
- Center for Structural Studies Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
| | - Lothar Gremer
- Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) & JuStruct, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
| | - Philipp Neudecker
- Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) & JuStruct, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
| | - Sander H. J. Smits
- Institute für Biochemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
- Center for Structural Studies Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
| | - Holger Gohlke
- Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
- Institute of Biological Information Processing (IBI-7: Structural Biochemistry) & JuStruct, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
- John von Neumann Institute for Computing (NIC) & Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
|
26
|
Soules KR, Dmitriev A, LaBrie SD, Dimond ZE, May BH, Johnson DK, Zhang Y, Battaile KP, Lovell S, Hefty PS. Structural and ligand binding analyses of the periplasmic sensor domain of RsbU in Chlamydia trachomatis support a role in TCA cycle regulation. Mol Microbiol 2020; 113:68-88. [PMID: 31637787 PMCID: PMC7007330 DOI: 10.1111/mmi.14401] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Accepted: 10/03/2019] [Indexed: 12/17/2022]
Abstract
Chlamydia trachomatis is an obligate intracellular bacterium that undergoes dynamic morphologic and physiologic conversions upon gaining access to a eukaryotic cell. These conversions likely require the detection of key environmental conditions and regulation of metabolic activity. Chlamydia encodes homologs to proteins in the Rsb phosphoregulatory partner-switching pathway, best described in Bacillus subtilis. ORF CT588 has strong sequence similarity to the RsbU cytoplasmic phosphatase domain but also contains a unique periplasmic sensor domain that is expected to control the phosphatase activity. A 1.7 Å crystal structure of the periplasmic domain of the RsbU protein from C. trachomatis (PDB 6MAB) displays close structural similarity to DctB from Vibrio and Sinorhizobium. DctB has been shown, both structurally and functionally, to specifically bind to the tricarboxylic acid (TCA) cycle intermediate succinate. Surface plasmon resonance and differential scanning fluorimetry of TCA intermediates and potential metabolites from a virtual screen of RsbU revealed that alpha-ketoglutarate, malate and oxaloacetate bound to the RsbU periplasmic domain. Substitutions in the putative binding site resulted in reduced binding capabilities. An RsbU null mutant showed severe growth defects which could be restored through genetic complementation. Chemical inhibition of ATP synthesis by oxidative phosphorylation phenocopied the growth defect observed in the RsbU null strain. Altogether, these data support a model with the Rsb system responding differentially to TCA cycle intermediates to regulate metabolism and key differentiation processes.
Affiliation(s)
- Katelyn R Soules
- Department of Molecular Biosciences, University of Kansas, Lawrence, KS, 66045, USA
| | - Aidan Dmitriev
- Department of Molecular Biosciences, University of Kansas, Lawrence, KS, 66045, USA
| | - Scott D LaBrie
- Department of Molecular Biosciences, University of Kansas, Lawrence, KS, 66045, USA
| | - Zoë E Dimond
- Department of Molecular Biosciences, University of Kansas, Lawrence, KS, 66045, USA
| | - Benjamin H May
- Department of Molecular Biosciences, University of Kansas, Lawrence, KS, 66045, USA
| | - David K Johnson
- Computational Chemical Biology Core Facility, Del Shankel Structural Biology Center, University of Kansas, Lawrence, KS, 66047, USA
| | - Yang Zhang
- Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
| | - Kevin P Battaile
- IMCA-CAT, Hauptman-Woodward Medical Research Institute, Argonne, IL, 60439, USA
| | - Scott Lovell
- Protein Structure Laboratory, Del Shankel Structural Biology Center, University of Kansas, Lawrence, KS, 66047, USA
| | - P Scott Hefty
- Department of Molecular Biosciences, University of Kansas, Lawrence, KS, 66045, USA
|
27
|
Zheng W, Li Y, Zhang C, Pearce R, Mortuza SM, Zhang Y. Deep-learning contact-map guided protein structure prediction in CASP13. Proteins 2019; 87:1149-1164. [PMID: 31365149 PMCID: PMC6851476 DOI: 10.1002/prot.25792] [Citation(s) in RCA: 131] [Impact Index Per Article: 26.2] [Received: 04/30/2019] [Revised: 07/14/2019] [Accepted: 07/27/2019] [Indexed: 12/28/2022]
Abstract
We report the results of two fully automated structure prediction pipelines, "Zhang-Server" and "QUARK", in CASP13. The pipelines were built upon the C-I-TASSER and C-QUARK programs, which in turn are based on I-TASSER and QUARK but with three new modules: (a) a novel multiple sequence alignment (MSA) generation protocol to construct deep sequence-profiles for contact prediction; (b) an improved meta-method, NeBcon, which combines multiple contact predictors, including ResPRE that predicts contact-maps by coupling precision-matrices with deep residual convolutional neural-networks; and (c) an optimized contact potential to guide structure assembly simulations. For 50 CASP13 FM domains that lacked homologous templates, average TM-scores of the first models produced by C-I-TASSER and C-QUARK were 28% and 56% higher than those constructed by I-TASSER and QUARK, respectively. For the first time, contact-map predictions demonstrated usefulness on TBM domains with close homologous templates, where TM-scores of C-I-TASSER models were significantly higher than those of I-TASSER models with a P-value <.05. Detailed data analyses showed that the success of C-I-TASSER and C-QUARK was mainly due to the increased accuracy of deep-learning-based contact-maps, as well as the careful balance between sequence-based contact restraints, threading templates, and generic knowledge-based potentials. Nevertheless, challenges still remain for predicting quaternary structure of multi-domain proteins, due to the difficulties in domain partitioning and domain reassembly. In addition, contact prediction in terminal regions was often unsatisfactory due to the sparsity of MSAs. Development of new contact-based domain partitioning and assembly methods and training contact models on sparse MSAs may help address these issues.
Affiliation(s)
- Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - Yang Li
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
| | - Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - Robin Pearce
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - S M Mortuza
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan
|
28
|
Arifuzzaman M, Mitra S, Das R, Hamza A, Absar N, Dash R. In silico analysis of nonsynonymous single-nucleotide polymorphisms (nsSNPs) of the SMPX gene. Ann Hum Genet 2019; 84:54-71. [PMID: 31583691 DOI: 10.1111/ahg.12350] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/22/2019] [Revised: 07/06/2019] [Accepted: 08/08/2019] [Indexed: 02/06/2023]
Abstract
Mutations in the SMPX gene can disrupt the regular activity of the SMPX protein, which is involved in the hearing process. Recent reports show a link between nonsynonymous single-nucleotide polymorphisms (nsSNPs) in SMPX and hearing loss; thus, classifying deleterious SNPs in SMPX will be an uphill task before designing a more extensive population study. In this study, damaging nsSNPs of SMPX from the dbSNP database were identified by using 13 bioinformatics tools. Initially, the impact of nsSNPs in the SMPX gene was evaluated through different in silico predictors, and the deleterious convergent changes were analyzed by energy-minimization-guided residual network analysis. In addition, the pathogenic effects of mutations in SMPX-mediated protein-protein interactions were also characterized by structural modeling and binding energy calculations. A total of four mutations (N19D, A29T, K54N, and S71L), all located in highly conserved regions, were found to be highly deleterious by all the tools. Furthermore, all four mutants showed structural alterations, and the communities of amino acids for mutant proteins were readily changed, compared to the wild-type. Among them, A29T (rs772775896) was revealed as the most damaging nsSNP, which caused significant structural deviation of the SMPX protein, as a result reducing the binding affinity to other functional partners. These findings reflect the computational insights into the deleterious role of nsSNPs in SMPX, which might be helpful for wet-lab confirmatory analysis.
Affiliation(s)
- Md Arifuzzaman
- College of Pharmacy, Yeungnam University, Gyeongbuk, Republic of Korea
| | - Sarmistha Mitra
- Plasma Bioscience Research Center, Plasma-Bio Display, Kwangwoon University, Seoul, Republic of Korea
| | - Raju Das
- Department of Biochemistry and Biotechnology, University of Science & Technology Chittagong, Chittagong, Bangladesh
| | - Amir Hamza
- Department of Biochemistry, Hallym University, Gangwon, Republic of Korea
| | - Nurul Absar
- Department of Biochemistry and Biotechnology, University of Science & Technology Chittagong, Chittagong, Bangladesh
| | - Raju Dash
- Department of Anatomy, Dongguk University Graduate School of Medicine, Gyeongju, Republic of Korea
|
29
|
Sieradzan AK, Bogunia M, Mech P, Ganzynkowicz R, Giełdoń A, Liwo A, Makowski M. Introduction of Phosphorylated Residues into the UNRES Coarse-Grained Model: Toward Modeling of Signaling Processes. J Phys Chem B 2019; 123:5721-5729. [PMID: 31194908 DOI: 10.1021/acs.jpcb.9b03799] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/05/2023]
Abstract
Phosphorylated proteins take part in many signaling pathways and play a key role in homeostasis regulation. The all-atom force fields enable us to study the systems containing phosphorylated proteins, but they are limited to short time scales. In this paper, we report the extension of the physics-based coarse-grained UNRES force field to treat systems with phosphorylated amino-acid residues. To derive the respective potentials, appropriate physics-based analytical expressions were fitted to the potentials of mean force of systems modeling phosphorylated amino-acid residues computed in our previous work and implemented in UNRES. The extended UNRES performed well in ab initio simulations of two miniproteins containing phosphorylated residues, strongly suggesting that realistic large-scale simulations of processes involving phosphorylated proteins, especially signaling processes, are now possible.
Affiliation(s)
- Adam K Sieradzan
- Faculty of Chemistry, University of Gdańsk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
| | - Małgorzata Bogunia
- Faculty of Chemistry, University of Gdańsk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
| | - Paulina Mech
- Faculty of Chemistry, University of Gdańsk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
| | - Robert Ganzynkowicz
- Faculty of Chemistry, University of Gdańsk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
| | - Artur Giełdoń
- Faculty of Chemistry, University of Gdańsk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
| | - Adam Liwo
- Faculty of Chemistry, University of Gdańsk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
| | - Mariusz Makowski
- Faculty of Chemistry, University of Gdańsk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
|
30
|
Lubecka EA, Liwo A. Introduction of a bounded penalty function in contact-assisted simulations of protein structures to omit false restraints. J Comput Chem 2019; 40:2164-2178. [PMID: 31037754 DOI: 10.1002/jcc.25847] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 02/14/2019] [Revised: 03/29/2019] [Accepted: 04/14/2019] [Indexed: 12/26/2022]
Abstract
Contact-assisted simulations, the contacts being predicted or determined experimentally, have become very important in the determination of the structures of proteins and other biological macromolecules. In this work, the effect of contact-distance restraints on the simulated structures was investigated with the use of multiplexed replica exchange simulations with the coarse-grained UNRES force field. A modified bounded flat-bottom restraint function that does not generate a gradient when a restraint cannot be satisfied was implemented. Calculations were run with (i) a set of four small proteins, with contact restraints derived from experimental structures, and (ii) selected CASP11 and CASP12 targets, with restraints as used at prediction time. The bounded penalty function largely omitted false contacts, which were usually inconsistent. It was found that at least 20% of correct contacts must be present in the restraint set to improve model quality with respect to unrestrained simulations. © 2019 Wiley Periodicals, Inc.
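The key property described in this abstract, a flat-bottom restraint whose penalty is bounded so that its gradient vanishes when a restraint is badly violated, can be illustrated with a small sketch. The functional form below (a quartic term saturating at `depth`) and the names `bounded_flat_bottom`, `d_lo`, `d_hi`, `sigma`, and `depth` are assumptions chosen for illustration; they are not taken from the paper or from the UNRES code.

```python
def _violation(d, d_lo, d_hi, sigma):
    """Return (scaled violation x >= 0, sign of dx/dd) for distance d."""
    if d < d_lo:
        return (d_lo - d) / sigma, -1.0
    if d > d_hi:
        return (d - d_hi) / sigma, 1.0
    return 0.0, 0.0  # inside the flat bottom: no violation


def bounded_flat_bottom(d, d_lo, d_hi, sigma=1.0, depth=1.0):
    """Illustrative bounded flat-bottom penalty (assumed form).

    Zero for d_lo <= d <= d_hi; outside that interval it rises smoothly
    and saturates at `depth`, so it never exceeds `depth`.
    """
    x, _ = _violation(d, d_lo, d_hi, sigma)
    x4 = x ** 4
    return depth * x4 / (1.0 + x4)


def bounded_flat_bottom_gradient(d, d_lo, d_hi, sigma=1.0, depth=1.0):
    """Derivative of the penalty with respect to d.

    Uses d/dx [x^4 / (1 + x^4)] = 4 x^3 / (1 + x^4)^2, which tends to
    zero for large x: a restraint that cannot be satisfied stops
    pulling on the structure.
    """
    x, sign = _violation(d, d_lo, d_hi, sigma)
    return sign * depth * 4.0 * x ** 3 / (sigma * (1.0 + x ** 4) ** 2)


if __name__ == "__main__":
    for d in (3.0, 6.0, 9.0, 20.0, 100.0):
        print(d, bounded_flat_bottom(d, 4.0, 8.0),
              bounded_flat_bottom_gradient(d, 4.0, 8.0))
```

With an unbounded (for example harmonic) restraint the force grows with the violation, so a false contact keeps distorting the model; with a saturating form such as this one, the force on a hopeless restraint fades away instead.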
Affiliation(s)
- Emilia A Lubecka
- Institute of Informatics, Faculty of Mathematics, Physics and Informatics, University of Gdańsk, Wita Stwosza 57, 80-308 Gdańsk, Poland; Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
| | - Adam Liwo
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
|
31
|
Xu G, Ma T, Wang Q, Ma J. OPUS-SSF: A side-chain-inclusive scoring function for ranking protein structural models. Protein Sci 2019; 28:1157-1162. [PMID: 30919509 DOI: 10.1002/pro.3608] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/26/2018] [Revised: 03/21/2019] [Accepted: 03/27/2019] [Indexed: 12/21/2022]
Abstract
We introduce a side-chain-inclusive scoring function, named OPUS-SSF, for ranking protein structural models. The method builds a scoring function based on the native distributions of the coordinate components of certain anchoring points in a local molecular system for peptide segments of 5, 7, 9, and 11 residues in length. Differing from our previous OPUS-CSF [Xu et al., Protein Sci. 2018; 27: 286-292], which exclusively uses main chain information, OPUS-SSF employs anchoring points on side chains so that the effect of side chains is taken into account. The performance of OPUS-SSF was tested on 15 decoy sets containing a total of 603 proteins, and 571 of them had their native structures recognized from their decoys. Similar to OPUS-CSF, OPUS-SSF does not employ the Boltzmann formula in constructing scoring functions. The results indicate that OPUS-SSF has achieved a significant improvement on decoy recognition and it should be a very useful tool for protein structural prediction and modeling.
Affiliation(s)
- Gang Xu
- School of Life Sciences, Tsinghua University, Beijing 100084, People's Republic of China
| | - Tianqi Ma
- Applied Physics Program, Rice University, Houston, Texas 77005; Department of Bioengineering, Rice University, Houston, Texas 77005
| | - Qinghua Wang
- Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030
| | - Jianpeng Ma
- School of Life Sciences, Tsinghua University, Beijing 100084, People's Republic of China; Applied Physics Program, Rice University, Houston, Texas 77005; Department of Bioengineering, Rice University, Houston, Texas 77005; Verna and Marrs Mclean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030
|
32
|
Adhikari B, Hou J, Cheng J. DNCON2: improved protein contact prediction using two-level deep convolutional neural networks. Bioinformatics 2019; 34:1466-1472. [PMID: 29228185 PMCID: PMC5925776 DOI: 10.1093/bioinformatics/btx781] [Citation(s) in RCA: 105] [Impact Index Per Article: 21.0] [Received: 06/30/2017] [Accepted: 12/07/2017] [Indexed: 12/14/2022] Open
Abstract
Motivation: Significant improvements in the prediction of protein residue–residue contacts have been observed in recent years. These contacts, predicted using a variety of coevolution-based and machine learning methods, are the key contributors to the recent progress in ab initio protein structure prediction, as demonstrated in the recent CASP experiments. Continuing the development of new methods to reliably predict contact maps is essential to further improve ab initio structure prediction. Results: In this paper we discuss DNCON2, an improved protein contact map predictor based on two-level deep convolutional neural networks. It consists of six convolutional neural networks—the first five predict contacts at 6, 7.5, 8, 8.5 and 10 Å distance thresholds, and the last one uses these five predictions as additional features to predict final contact maps. On the free-modeling datasets in CASP10, 11 and 12 experiments, DNCON2 achieves mean precisions of 35, 50 and 53.4%, respectively, higher than 30.6% by MetaPSICOV on CASP10 dataset, 34% by MetaPSICOV on CASP11 dataset and 46.3% by Raptor-X on CASP12 dataset, when top L/5 long-range contacts are evaluated. We attribute the improved performance of DNCON2 to the inclusion of short- and medium-range contacts into training, two-level approach to prediction, use of the state-of-the-art optimization and activation functions, and a novel deep learning architecture that allows each filter in a convolutional layer to access all the input features of a protein of arbitrary length. Availability and implementation: The web server of DNCON2 is at http://sysbio.rnet.missouri.edu/dncon2/ where training and testing datasets as well as the predictions for CASP10, 11 and 12 free-modeling datasets can also be downloaded. Its source code is available at https://github.com/multicom-toolbox/DNCON2/. Supplementary information: Supplementary data are available at Bioinformatics online.
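The "top L/5 long-range" precision quoted in this abstract is a standard contact-prediction metric, and a minimal sketch of how it is commonly computed may help readers compare the numbers. The function name, the dictionary-based input format, and the choice to divide by the number of pairs actually selected are assumptions made for this illustration; the sequence-separation cutoff of 24 residues follows the usual CASP convention for long-range contacts, and nothing here is taken from the DNCON2 source code.

```python
def top_l5_long_range_precision(pred, true_contacts, seq_len, min_sep=24):
    """Precision of the top L/5 predicted long-range contacts.

    pred          : dict mapping residue pairs (i, j), i < j, to a
                    predicted contact probability (hypothetical format)
    true_contacts : set of (i, j) pairs that are in contact in the
                    native structure
    seq_len       : sequence length L
    min_sep       : minimum sequence separation for "long-range"
    """
    # Keep only long-range pairs, then rank by predicted probability.
    long_range = [(p, pair) for pair, p in pred.items()
                  if pair[1] - pair[0] >= min_sep]
    long_range.sort(key=lambda item: item[0], reverse=True)

    # Select the L/5 most confident predictions (at least one).
    k = max(1, seq_len // 5)
    top = long_range[:k]
    if not top:
        return 0.0

    hits = sum(1 for _, pair in top if pair in true_contacts)
    return hits / len(top)


if __name__ == "__main__":
    predictions = {
        (0, 30): 0.9,   # long-range, true contact
        (1, 40): 0.8,   # long-range, false
        (2, 45): 0.7,   # long-range, true contact
        (3, 49): 0.1,   # long-range, false
        (0, 5): 0.99,   # short-range, ignored by the metric
    }
    native = {(0, 30), (2, 45), (0, 5)}
    print(top_l5_long_range_precision(predictions, native, seq_len=50))
```

Some evaluations divide by L/5 even when fewer long-range pairs were predicted; this sketch divides by the number of pairs actually selected, which gives the same value whenever at least L/5 long-range predictions exist.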
Affiliation(s)
- Badri Adhikari
- Department of Mathematics and Computer Science, University of Missouri-St. Louis, St. Louis, MO 63121, USA
| | - Jie Hou
- Department of Mathematics and Computer Science, University of Missouri-St. Louis, St. Louis, MO 63121, USA
| | - Jianlin Cheng
- Department of Mathematics and Computer Science, University of Missouri-St. Louis, St. Louis, MO 63121, USA; Informatics Institute, University of Missouri, Columbia, MO 65211, USA
|
33
|
Olaya C, Adhikari B, Raikhy G, Cheng J, Pappu HR. Identification and localization of Tospovirus genus-wide conserved residues in 3D models of the nucleocapsid and the silencing suppressor proteins. Virol J 2019; 16:7. [PMID: 30634979 PMCID: PMC6330412 DOI: 10.1186/s12985-018-1106-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/28/2018] [Accepted: 10/16/2018] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND: Tospoviruses (genus Tospovirus, family Peribunyaviridae, order Bunyavirales) cause significant losses to a wide range of agronomic and horticultural crops worldwide. Identification and characterization of specific sequences and motifs that are critical for virus infection and pathogenicity could provide useful insights and targets for engineering virus resistance that is potentially both broad spectrum and durable. Tomato spotted wilt virus (TSWV), the most prolific member of the group, was used to better understand the structure-function relationships of the nucleocapsid gene (N), and the silencing suppressor gene (NSs), coded by the TSWV small RNA. METHODS: Using a global collection of orthotospoviral sequences, several amino acids that were conserved across the genus and the potential location of these conserved amino acid motifs in these proteins were determined. We used state of the art 3D modeling algorithms, MULTICOM-CLUSTER, MULTICOM-CONSTRUCT, MULTICOM-NOVEL, I-TASSER, ROSETTA and CONFOLD to predict the secondary and tertiary structures of the N and the NSs proteins. RESULTS: We identified nine amino acid residues in the N protein among 31 known tospoviral species, and ten amino acid residues in NSs protein among 27 tospoviral species that were conserved across the genus. For the N protein, all three algorithms gave nearly identical tertiary models. While the conserved residues were distributed throughout the protein on a linear scale, at the tertiary level, three residues were consistently located in the coil in all the models. For NSs protein models, there was no agreement among the three algorithms. However, with respect to the localization of the conserved motifs, G18 was consistently located in coil, while H115 was localized in the coil in three models. CONCLUSIONS: This is the first report of predicting the 3D structure of any tospoviral NSs protein and revealed a consistent location for two of the ten conserved residues. The modelers used gave accurate prediction for N protein allowing the localization of the conserved residues. Results form the basis for further work on the structure-function relationships of tospoviral proteins and could be useful in developing novel virus control strategies targeting the conserved residues.
Affiliation(s)
- Cristian Olaya
- Department of Plant Pathology, Washington State University, Pullman, WA, 99164, USA
| | - Badri Adhikari
- Department of Mathematics and Computer Science, University of Missouri, St. Louis, MO, 63121, USA
| | - Gaurav Raikhy
- Department of Microbiology and Immunology, Louisiana State University, Shreveport, LA, 71101, USA
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, 65211, USA
| | - Hanu R Pappu
- Department of Plant Pathology, Washington State University, Pullman, WA, 99164, USA.
|
34
|
Oldfield CJ, Chen K, Kurgan L. Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences. Methods Mol Biol 2019; 1958:73-100. [PMID: 30945214 DOI: 10.1007/978-1-4939-9161-7_4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/12/2022]
Abstract
Many new methods for the sequence-based prediction of the secondary and supersecondary structures have been developed over the last several years. These and older sequence-based predictors are widely applied for the characterization and prediction of protein structure and function. These efforts have produced countless accurate predictors, many of which rely on state-of-the-art machine learning models and evolutionary information generated from multiple sequence alignments. We describe and motivate both types of predictions. We introduce concepts related to the annotation and computational prediction of the three-state and eight-state secondary structure as well as several types of supersecondary structures, such as β hairpins, coiled coils, and α-turn-α motifs. We review 34 predictors focusing on recent tools and provide detailed information for a selected set of 14 secondary structure and 3 supersecondary structure predictors. We conclude with several practical notes for the end users of these predictive methods.
Affiliation(s)
- Christopher J Oldfield
- Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA
| | - Ke Chen
- School of Computer Science and Software Engineering, Tianjin Polytechnic University, Tianjin, People's Republic of China
| | - Lukasz Kurgan
- Department of Computer Science, College of Engineering, Virginia Commonwealth University, Richmond, VA, USA.
| |
|
35
|
Cheung NJ, Yu W. De novo protein structure prediction using ultra-fast molecular dynamics simulation. PLoS One 2018; 13:e0205819. [PMID: 30458007 PMCID: PMC6245515 DOI: 10.1371/journal.pone.0205819] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2018] [Accepted: 10/02/2018] [Indexed: 11/19/2022] Open
Abstract
Modern genomics sequencing techniques have provided a massive amount of protein sequences, but experimental endeavor in determining protein structures is largely lagging far behind the vast and unexplored sequences. Apparently, computational biology is playing a more important role in protein structure prediction than ever. Here, we present a system of de novo predictor, termed NiDelta, building on a deep convolutional neural network and statistical potential enabling molecular dynamics simulation for modeling protein tertiary structure. Combining with evolutionary-based residue-contacts, the presented predictor can predict the tertiary structures of a number of target proteins with remarkable accuracy. The proposed approach is demonstrated by calculations on a set of eighteen large proteins from different fold classes. The results show that the ultra-fast molecular dynamics simulation could dramatically reduce the gap between the sequence and its structure at atom level, and it could also present high efficiency in protein structure determination if sparse experimental data is available.
Affiliation(s)
- Ngaam J. Cheung
- Department of Brain and Cognitive Science, DGIST, Daegu, South Korea
- Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge, United Kingdom
| | - Wookyung Yu
- Department of Brain and Cognitive Science, DGIST, Daegu, South Korea
- Core Protein Resources Center, DGIST, Daegu, South Korea
| |
|
36
|
Kc DB. Recent advances in sequence-based protein structure prediction. Brief Bioinform 2018; 18:1021-1032. [PMID: 27562963 DOI: 10.1093/bib/bbw070] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2016] [Indexed: 11/13/2022] Open
Abstract
The most accurate characterizations of the structure of proteins are provided by structural biology experiments. However, because of the high cost and labor-intensive nature of the structural experiments, the gap between the number of protein sequences and solved structures is widening rapidly. Development of computational methods to accurately model protein structures from sequences is becoming increasingly important to the biological community. In this article, we highlight some important progress in the field of protein structure prediction, especially those related to free modeling (FM) methods that generate structure models without using homologous templates. We also provide a short synopsis of some of the recent advances in FM approaches as demonstrated in the recent Computational Assessment of Structure Prediction competition as well as recent trends and outlook for FM approaches in protein structure prediction.
|
37
|
Use of the UNRES force field in template-assisted prediction of protein structures and the refinement of server models: Test with CASP12 targets. J Mol Graph Model 2018; 83:92-99. [DOI: 10.1016/j.jmgm.2018.05.008] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/28/2017] [Revised: 05/18/2018] [Accepted: 05/20/2018] [Indexed: 11/22/2022]
|
38
|
Interaction of N-terminal peptide analogues of the Na+,K+-ATPase with membranes. BIOCHIMICA ET BIOPHYSICA ACTA-BIOMEMBRANES 2018. [DOI: 10.1016/j.bbamem.2018.03.002] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
|
39
|
Correa L, Borguesan B, Farfan C, Inostroza-Ponta M, Dorn M. A Memetic Algorithm for 3-D Protein Structure Prediction Problem. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:690-704. [PMID: 27925594 DOI: 10.1109/tcbb.2016.2635143] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Memetic Algorithms are population-based metaheuristics intrinsically concerned with exploiting all available knowledge about the problem under study. The incorporation of problem domain knowledge is not an optional mechanism, but a fundamental feature of the Memetic Algorithms. In this paper, we present a Memetic Algorithm to tackle the three-dimensional protein structure prediction problem. The method uses a structured population and incorporates a Simulated Annealing algorithm as a local search strategy, as well as ad-hoc crossover and mutation operators to deal with the problem. It takes advantage of structural knowledge stored in the Protein Data Bank, by using an Angle Probability List that helps to reduce the search space and to guide the search strategy. The proposed algorithm was tested on nineteen protein sequences of amino acid residues, and the results show the ability of the algorithm to find native-like protein structures. Experimental results have revealed that the proposed algorithm can find good solutions regarding root-mean-square deviation and global distance total score test in comparison with the experimental protein structures. We also show that our results are comparable in terms of folding organization with state-of-the-art prediction methods, corroborating the effectiveness of our proposal.
|
40
|
Kozic M, Fox SJ, Thomas JM, Verma CS, Rigden DJ. Large scale ab initio modeling of structurally uncharacterized antimicrobial peptides reveals known and novel folds. Proteins 2018; 86:548-565. [DOI: 10.1002/prot.25473] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/02/2017] [Revised: 01/16/2018] [Accepted: 01/29/2018] [Indexed: 12/20/2022]
Affiliation(s)
- Mara Kozic
- Institute of Integrative Biology, University of Liverpool; Liverpool L69 7ZB U.K
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute; Singapore
| | - Stephen J. Fox
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute; Singapore
| | - Jens M. Thomas
- Institute of Integrative Biology, University of Liverpool; Liverpool L69 7ZB U.K
| | - Chandra S. Verma
- Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute; Singapore
- Department of Biological Sciences; National University of Singapore; Singapore
- School of Biological Sciences; Nanyang Technological University; Singapore
| | - Daniel J. Rigden
- Institute of Integrative Biology, University of Liverpool; Liverpool L69 7ZB U.K
| |
|
41
|
Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018; 53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]
Abstract
Prediction of protein tertiary structures from amino acid sequence and understanding the mechanisms of how proteins fold, collectively known as "the protein folding problem," has been a grand challenge in molecular biology for over half a century. Theories have been developed that provide us with an unprecedented understanding of protein folding mechanisms. However, computational simulation of protein folding is still difficult, and prediction of protein tertiary structure from amino acid sequence is an unsolved problem. Progress toward a satisfying solution has been slow due to challenges in sampling the vast conformational space and deriving sufficiently accurate energy functions. Nevertheless, several techniques and algorithms have been adopted to overcome these challenges, and the last two decades have seen exciting advances in enhanced sampling algorithms, computational power and tertiary structure prediction methodologies. This review aims at summarizing these computational techniques, specifically conformational sampling algorithms and energy approximations that have been frequently used to study protein-folding mechanisms or to de novo predict protein tertiary structures. We hope that this review can serve as an overview on how the protein-folding problem can be studied computationally and, in cases where experimental approaches are prohibitive, help the researcher choose the most relevant computational approach for the problem at hand. We conclude with a summary of current challenges faced and an outlook on potential future directions.
Affiliation(s)
- Bian Li
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
| | - Michaela Fooksa
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, USA
| | - Sten Heinze
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
| | - Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
| |
|
42
|
Pourseif MM, Moghaddam G, Daghighkia H, Nematollahi A, Omidi Y. A novel B- and helper T-cell epitopes-based prophylactic vaccine against Echinococcus granulosus. Bioimpacts 2017; 8:39-52. [PMID: 29713601 PMCID: PMC5915707 DOI: 10.15171/bi.2018.06] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2017] [Revised: 12/02/2017] [Accepted: 12/03/2017] [Indexed: 12/17/2022]
Abstract
Introduction:
In this study, we targeted the worm stage of Echinococcus granulosus to design a novel multi-epitope B- and helper T-cell based vaccine construct for immunization of dogs against this multi-host parasite.
Methods:
The vaccine was designed based on the local Eg14-3-3 antigen (Ag). DNA samples were extracted from the protoscoleces of the infected sheep’s liver, and then subjected to the polymerase chain reaction (PCR) with 14-3-3 specific forward and reverse primers. For the vaccine designing, several in silico steps were undertaken. Three-dimensional (3D) structure of the local Eg14-3-3 Ag was modeled by EasyModeller software. The protein modeling accuracy was then analyzed via various validation assays. Potential transmembrane helix, signal peptide, post-translational modifications and allergenicity of Eg14-3-3 were evaluated as the preliminary measures of B-cell epitopes (BEs) prediction. Having used many web-servers, a well-designed process was carried out for improved prediction of BEs. High ranked linear and conformational BEs were utilized for engineering the final vaccine construct. Possible T-helper epitopes (TEs) were identified by the molecular docking between 13-mer fragments of the Eg14-3-3 Ag and two high frequent dog class II MHC alleles (i.e., DLA-DRB1*01101 and DRB1*01501). The epitopes coverage was evaluated by Shannon’s variability plot.
Results:
The final designed construct was analyzed based on different physicochemical properties, which was then codon optimized for high-level expression in Escherichia coli k12. This minigene construct is the first dog-specific epitopic vaccine construct that is established based on TEs with high-binding affinity to canine MHC alleles.
Conclusion:
This in silico study is the first part of a multi-antigenic vaccine designing work that represents as a novel dog-specific vaccine against E. granulosus. Here, we present key data on the step-by-step methodologies used for designing this de novo vaccine, which is under comprehensive in vivo investigations.
Affiliation(s)
- Mohammad M Pourseif
- Department of Animal Sciences, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Gholamali Moghaddam
- Department of Animal Sciences, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
| | - Hossein Daghighkia
- Department of Animal Sciences, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
| | - Ahmad Nematollahi
- Department of Pathobiology, Veterinary College, University of Tabriz, Tabriz, Iran
| | - Yadollah Omidi
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
- Department of Pharmaceutics, Faculty of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
| |
|
43
|
Zhang C, Mortuza SM, He B, Wang Y, Zhang Y. Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12. Proteins 2017; 86 Suppl 1:136-151. [PMID: 29082551 DOI: 10.1002/prot.25414] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2017] [Revised: 10/09/2017] [Accepted: 10/27/2017] [Indexed: 12/26/2022]
Abstract
We develop two complementary pipelines, "Zhang-Server" and "QUARK", based on I-TASSER and QUARK pipelines for template-based modeling (TBM) and free modeling (FM), and test them in the CASP12 experiment. The combination of I-TASSER and QUARK successfully folds three medium-size FM targets that have more than 150 residues, even though the interplay between the two pipelines still awaits further optimization. Newly developed sequence-based contact prediction by NeBcon plays a critical role to enhance the quality of models, particularly for FM targets, by the new pipelines. The inclusion of NeBcon predicted contacts as restraints in the QUARK simulations results in an average TM-score of 0.41 for the best in top five predicted models, which is 37% higher than that by the QUARK simulations without contacts. In particular, there are seven targets that are converted from non-foldable to foldable (TM-score >0.5) due to the use of contact restraints in the simulations. Another additional feature in the current pipelines is the local structure quality prediction by ResQ, which provides a robust residue-level modeling error estimation. Despite the success, significant challenges still remain in ab initio modeling of multi-domain proteins and folding of β-proteins with complicated topologies bound by long-range strand-strand interactions. Improvements on domain boundary and long-range contact prediction, as well as optimal use of the predicted contacts and multiple threading alignments, are critical to address these issues seen in the CASP12 experiment.
Affiliation(s)
- Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - S M Mortuza
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
| | - Baoji He
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, China
| | - Yanting Wang
- Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, China
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan
| |
|
44
|
Adhikari B, Hou J, Cheng J. Protein contact prediction by integrating deep multiple sequence alignments, coevolution and machine learning. Proteins 2017; 86 Suppl 1:84-96. [PMID: 29047157 DOI: 10.1002/prot.25405] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2017] [Revised: 09/08/2017] [Accepted: 10/16/2017] [Indexed: 12/14/2022]
Abstract
In this study, we report the evaluation of the residue-residue contacts predicted by our three different methods in the CASP12 experiment, focusing on studying the impact of multiple sequence alignment, residue coevolution, and machine learning on contact prediction. The first method (MULTICOM-NOVEL) uses only traditional features (sequence profile, secondary structure, and solvent accessibility) with deep learning to predict contacts and serves as a baseline. The second method (MULTICOM-CONSTRUCT) uses our new alignment algorithm to generate deep multiple sequence alignment to derive coevolution-based features, which are integrated by a neural network method to predict contacts. The third method (MULTICOM-CLUSTER) is a consensus combination of the predictions of the first two methods. We evaluated our methods on 94 CASP12 domains. On a subset of 38 free-modeling domains, our methods achieved an average precision of up to 41.7% for top L/5 long-range contact predictions. The comparison of the three methods shows that the quality and effective depth of multiple sequence alignments, coevolution-based features, and machine learning integration of coevolution-based features and traditional features drive the quality of predicted protein contacts. On the full CASP12 dataset, the coevolution-based features alone can improve the average precision from 28.4% to 41.6%, and the machine learning integration of all the features further raises the precision to 56.3%, when top L/5 predicted long-range contacts are evaluated. And the correlation between the precision of contact prediction and the logarithm of the number of effective sequences in alignments is 0.66.
Affiliation(s)
- Badri Adhikari
- Department of Mathematics and Computer Science, University of Missouri-St. Louis, St. Louis, Missouri
| | - Jie Hou
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri
| | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, Missouri
| |
|
45
|
Karczyńska AS, Czaplewski C, Krupa P, Mozolewska MA, Joo K, Lee J, Liwo A. Ergodicity and model quality in template-restrained canonical and temperature/Hamiltonian replica exchange coarse-grained molecular dynamics simulations of proteins. J Comput Chem 2017; 38:2730-2746. [DOI: 10.1002/jcc.25070] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2017] [Revised: 07/10/2017] [Accepted: 09/01/2017] [Indexed: 01/22/2023]
Affiliation(s)
- Agnieszka S. Karczyńska
- Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
- Center for In Silico Protein Science; Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
- School of Computational Sciences; Korea Institute for Advanced Study, 85 Hoegiro Dongdaemun-gu; Seoul 02455 Republic of Korea
| | - Cezary Czaplewski
- Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
| | - Paweł Krupa
- Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
- Institute of Physics, Polish Academy of Sciences, Aleja Lotników 32/46; Warsaw PL 02668 Poland
| | - Magdalena A. Mozolewska
- Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
- Institute of Computer Science, Polish Academy of Sciences, ul. Jana Kazimierza 5; Warsaw 01-248 Poland
| | - Keehyoung Joo
- School of Computational Sciences; Korea Institute for Advanced Study, 85 Hoegiro Dongdaemun-gu; Seoul 02455 Republic of Korea
- Center for Advanced Computation, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
| | - Jooyoung Lee
- Center for In Silico Protein Science; Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
- School of Computational Sciences; Korea Institute for Advanced Study, 85 Hoegiro Dongdaemun-gu; Seoul 02455 Republic of Korea
- Center for Advanced Computation, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
| | - Adam Liwo
- Faculty of Chemistry; University of Gdańsk, ul. Wita Stwosza 63; Gdańsk 80-308 Poland
- Center for In Silico Protein Science; Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu; Seoul 02455 Republic of Korea
- School of Computational Sciences; Korea Institute for Advanced Study, 85 Hoegiro Dongdaemun-gu; Seoul 02455 Republic of Korea
| |
|
46
|
|
47
|
Mozolewska MA, Krupa P, Zaborowski B, Liwo A, Lee J, Joo K, Czaplewski C. Use of Restraints from Consensus Fragments of Multiple Server Models To Enhance Protein-Structure Prediction Capability of the UNRES Force Field. J Chem Inf Model 2016; 56:2263-2279. [DOI: 10.1021/acs.jcim.6b00189] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Affiliation(s)
| | - Paweł Krupa
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
| | | | - Adam Liwo
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
- Center for In Silico Protein Structure and School of Computational Sciences, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
| | - Jooyoung Lee
- Center for In Silico Protein Structure and School of Computational Sciences, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
| | - Keehyoung Joo
- Center for Advanced Computation, Korea Institute for Advanced Study, 85 Hoegiro, Dongdaemun-gu, Seoul 130-722, Republic of Korea
| | - Cezary Czaplewski
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
| |
|
48
|
Kmiecik S, Gront D, Kolinski M, Wieteska L, Dawid AE, Kolinski A. Coarse-Grained Protein Models and Their Applications. Chem Rev 2016; 116:7898-936. [DOI: 10.1021/acs.chemrev.6b00163] [Citation(s) in RCA: 555] [Impact Index Per Article: 69.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
Affiliation(s)
- Sebastian Kmiecik
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
| | - Dominik Gront
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
| | - Michal Kolinski
- Bioinformatics Laboratory, Mossakowski Medical Research Center of the Polish Academy of Sciences, Pawinskiego 5, 02-106 Warsaw, Poland
| | - Lukasz Wieteska
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Department of Medical Biochemistry, Medical University of Lodz, Mazowiecka 6/8, 92-215 Lodz, Poland
| | | | - Andrzej Kolinski
- Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
| |
|
49
|
Bhattacharya D, Cao R, Cheng J. UniCon3D: de novo protein structure prediction using united-residue conformational search via stepwise, probabilistic sampling. Bioinformatics 2016; 32:2791-9. [PMID: 27259540 PMCID: PMC5018369 DOI: 10.1093/bioinformatics/btw316] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/13/2016] [Accepted: 05/15/2016] [Indexed: 12/20/2022] Open
Abstract
Motivation:
Recent experimental studies have suggested that proteins fold via stepwise assembly of structural units named 'foldons' through the process of sequential stabilization. Alongside, latest developments on computational side based on probabilistic modeling have shown promising direction to perform de novo protein conformational sampling from continuous space. However, existing computational approaches for de novo protein structure prediction often randomly sample protein conformational space as opposed to experimentally suggested stepwise sampling.
Results:
Here, we develop a novel generative, probabilistic model that simultaneously captures local structural preferences of backbone and side chain conformational space of polypeptide chains in a united-residue representation and performs experimentally motivated conditional conformational sampling via stepwise synthesis and assembly of foldon units that minimizes a composite physics and knowledge-based energy function for de novo protein structure prediction. The proposed method, UniCon3D, has been found to (i) sample lower energy conformations with higher accuracy than traditional random sampling in a small benchmark of 6 proteins; (ii) perform comparably with the top five automated methods on 30 difficult target domains from the 11th Critical Assessment of Protein Structure Prediction (CASP) experiment and on 15 difficult target domains from the 10th CASP experiment; and (iii) outperform two state-of-the-art approaches and a baseline counterpart of UniCon3D that performs traditional random sampling for protein modeling aided by predicted residue-residue contacts on 45 targets from the 10th edition of CASP.
Availability and implementation:
Source code, executable versions, manuals and example data of UniCon3D for Linux and OSX are freely available to non-commercial users at http://sysbio.rnet.missouri.edu/UniCon3D/
Contact:
chengji@missouri.edu
Supplementary information:
Supplementary data are available at Bioinformatics online.
Affiliation(s)
| | | | - Jianlin Cheng
- Department of Computer Science Informatics Institute, University of Missouri, Columbia, MO 65211, USA
| |
|
50
|
Fischer AW, Heinze S, Putnam DK, Li B, Pino JC, Xia Y, Lopez CF, Meiler J. CASP11--An Evaluation of a Modular BCL::Fold-Based Protein Structure Prediction Pipeline. PLoS One 2016; 11:e0152517. [PMID: 27046050 PMCID: PMC4821492 DOI: 10.1371/journal.pone.0152517] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2015] [Accepted: 03/15/2016] [Indexed: 11/18/2022] Open
Abstract
In silico prediction of a protein's tertiary structure remains an unsolved problem. The community-wide Critical Assessment of Protein Structure Prediction (CASP) experiment provides a double-blind study to evaluate improvements in protein structure prediction algorithms. We developed a protein structure prediction pipeline employing a three-stage approach, consisting of low-resolution topology search, high-resolution refinement, and molecular dynamics simulation to predict the tertiary structure of proteins from the primary structure alone or including distance restraints either from predicted residue-residue contacts, nuclear magnetic resonance (NMR) nuclear overhauser effect (NOE) experiments, or mass spectroscopy (MS) cross-linking (XL) data. The protein structure prediction pipeline was evaluated in the CASP11 experiment on twenty regular protein targets as well as thirty-three 'assisted' protein targets, which also had distance restraints available. Although the low-resolution topology search module was able to sample models with a global distance test total score (GDT_TS) value greater than 30% for twelve out of twenty proteins, frequently it was not possible to select the most accurate models for refinement, resulting in a general decay of model quality over the course of the prediction pipeline. In this study, we provide a detailed overall analysis, study one target protein in more detail as it travels through the protein structure prediction pipeline, and evaluate the impact of limited experimental data.
Affiliation(s)
- Axel W. Fischer
- Department of Chemistry, Vanderbilt University, Nashville, TN, 37232, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, TN, 37232, United States of America
| | - Sten Heinze
- Department of Chemistry, Vanderbilt University, Nashville, TN, 37232, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, TN, 37232, United States of America
| | - Daniel K. Putnam
- Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, 37232, United States of America
| | - Bian Li
- Department of Chemistry, Vanderbilt University, Nashville, TN, 37232, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, TN, 37232, United States of America
| | - James C. Pino
- Chemical and Physical Biology Graduate Program, Vanderbilt University, Nashville, TN, 37232, United States of America
| | - Yan Xia
- Department of Chemistry, Vanderbilt University, Nashville, TN, 37232, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, TN, 37232, United States of America
| | - Carlos F. Lopez
- Center for Structural Biology, Vanderbilt University, Nashville, TN, 37232, United States of America
- Department of Cancer Biology and Center for Quantitative Sciences, Vanderbilt University, Nashville, TN, 37232, United States of America
| | - Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, TN, 37232, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, TN, 37232, United States of America
| |
|