Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wu S, Zhang Y. A comprehensive assessment of sequence-based and template-based methods for protein contact prediction. ACTA ACUST UNITED AC 2008;24:924-31. [PMID: 18296462 DOI: 10.1093/bioinformatics/btn069] [Citation(s) in RCA: 151] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

For:	Wu S, Zhang Y. A comprehensive assessment of sequence-based and template-based methods for protein contact prediction. ACTA ACUST UNITED AC 2008;24:924-31. [PMID: 18296462 DOI: 10.1093/bioinformatics/btn069] [Citation(s) in RCA: 151] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Number

Cited by Other Article(s)

Hasan AH, Shakya S, Hussain FHS, Murugesan S, Chander S, Pratama MRF, Jamil S, Das B, Biswas S, Jamalis J. Design, synthesis, anti-acetylcholinesterase evaluation and molecular modelling studies of novel coumarin-chalcone hybrids. J Biomol Struct Dyn 2023;41:11450-11462. [PMID: 36591704 DOI: 10.1080/07391102.2022.2162583] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/01/2022] [Accepted: 12/19/2022] [Indexed: 01/03/2023]

An J, Weng X. Collectively encoding protein properties enriches protein language models. BMC Bioinformatics 2022;23:467. [DOI: 10.1186/s12859-022-05031-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2022] [Accepted: 10/31/2022] [Indexed: 11/10/2022] Open

Barger J, Adhikari B. New Labeling Methods for Deep Learning Real-Valued Inter-Residue Distance Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022;19:3586-3594. [PMID: 34559660 DOI: 10.1109/tcbb.2021.3115053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]

Abstract

BACKGROUND

Much of the recent success in protein structure prediction has been a result of accurate protein contact prediction-a binary classification problem. Dozens of methods, built from various types of machine learning and deep learning algorithms, have been published over the last two decades for predicting contacts. Recently, many groups, including Google DeepMind, have demonstrated that reformulating the problem as a multi-class classification problem is a more promising direction to pursue. As an alternative approach, we recently proposed real-valued distance predictions, formulating the problem as a regression problem. The nuances of protein 3D structures make this formulation appropriate, allowing predictions to reflect inter-residue distances in nature. Despite these promises, the accurate prediction of real-valued distances remains relatively unexplored; possibly due to classification being better suited to machine and deep learning algorithms.

METHODS

Can regression methods be designed to predict real-valued distances as precise as binary contacts? To investigate this, we propose multiple novel methods of input label engineering, which is different from feature engineering, with the goal of optimizing the distribution of distances to cater to the loss function of the deep-learning model. Since an important utility of predicted contacts or distances is to build three-dimensional models, we also tested if predicted distances can reconstruct more accurate models than contacts.

RESULTS

Our results demonstrate, for the first time, that deep learning methods for real-valued protein distance prediction can deliver distances as precise as binary classification methods. When using an optimal distance transformation function on the standard PSICOV dataset consisting of 150 representative proteins, the precision of 'top-all' long-range contacts improves from 60.9% to 61.4% when predicting real-valued distances instead of contacts. When building three-dimensional models we observed an average TM-score increase from 0.61 to 0.72, highlighting the advantage of predicting real-valued distances.

Collapse

Potential inhibitory activity of phytoconstituents against black fungus: In silico ADMET, molecular docking and MD simulation studies. COMPUTATIONAL TOXICOLOGY 2022;24:100247. [PMID: 36193218 PMCID: PMC9508704 DOI: 10.1016/j.comtox.2022.100247] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/07/2022] [Revised: 09/17/2022] [Accepted: 09/20/2022] [Indexed: 11/30/2022]

Gaber A, Alsanie WF, Alhomrani M, Alamri AS, Alyami H, Shakya S, Habeeballah H, Alkhatabi HA, Felimban RI, Alamri A, Alhabeeb AA, Raafat BM, Refat MS. Multispectral and Molecular Docking Studies Reveal Potential Effectiveness of Antidepressant Fluoxetine by Forming π-Acceptor Complexes. Molecules 2022;27:molecules27185883. [PMID: 36144618 PMCID: PMC9505585 DOI: 10.3390/molecules27185883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2022] [Revised: 09/05/2022] [Accepted: 09/06/2022] [Indexed: 11/26/2022] Open

Affiliation(s)

Ahmed Gaber Department of Biology, College of Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia Correspondence: (A.G.); (M.S.R.)
Walaa F. Alsanie Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia Department of Clinical Laboratories Sciences, The Faculty of Applied Medical Sciences, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
Majid Alhomrani Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia Department of Clinical Laboratories Sciences, The Faculty of Applied Medical Sciences, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
Abdulhakeem S. Alamri Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia Department of Clinical Laboratories Sciences, The Faculty of Applied Medical Sciences, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
Hussain Alyami College of Medicine, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
Sonam Shakya Department of Chemistry, Faculty of Science, Aligarh Muslim University, Aligarh 202002, India
Hamza Habeeballah Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences in Rabigh, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia
Heba A. Alkhatabi Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia King Fahd Medical Research Centre, Hematology Research Unit, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia
Raed I. Felimban Department of Medical Laboratory Sciences, Faculty of Applied Medical Sciences, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia Center of Innovation in Personalized Medicine (CIPM), 3D Bioprinting Unit, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia
Abdulwahab Alamri Department of Pharmacology and Toxicology, College of Pharmacy, University of Hail, P.O. Box 2240, Hail 55476, Saudi Arabia
Abdulhameed Abdullah Alhabeeb National Centre for Mental Health Promotion, P.O. Box 95459, Riyadh 11525, Saudi Arabia
Bassem M. Raafat Department of Radiological Sciences, College of Applied Medical Sciences, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
Moamen S. Refat Department of Chemistry, College of Science, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia Correspondence: (A.G.); (M.S.R.)

Collapse

Alsanie WF, Alhomrani M, Alamri AS, Alyami H, Shakya S, Habeeballah H, Alkhatabi HA, Felimban RI, Alamri A, Alhabeeb AA, Raafat BM, Refat MS, Gaber A. Attempting to Increase the Effectiveness of the Antidepressant Trazodone Hydrochloride Drug Using π-Acceptors. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2022;19:11281. [PMID: 36141553 PMCID: PMC9517268 DOI: 10.3390/ijerph191811281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 07/05/2022] [Revised: 09/02/2022] [Accepted: 09/05/2022] [Indexed: 06/16/2023]

Affiliation(s)

Walaa F. Alsanie Department of Clinical Laboratories Sciences, The Faculty of Applied Medical Sciences, Taif University, Taif 21944, Saudi Arabia Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, Taif 21944, Saudi Arabia
Majid Alhomrani Department of Clinical Laboratories Sciences, The Faculty of Applied Medical Sciences, Taif University, Taif 21944, Saudi Arabia Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, Taif 21944, Saudi Arabia
Abdulhakeem S. Alamri Department of Clinical Laboratories Sciences, The Faculty of Applied Medical Sciences, Taif University, Taif 21944, Saudi Arabia Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, Taif 21944, Saudi Arabia
Hussain Alyami College of Medicine, Taif University, Taif 21944, Saudi Arabia
Sonam Shakya Department of Chemistry, Faculty of Science, Aligarh Muslim University, Aligarh 202002, India
Hamza Habeeballah Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences in Rabigh, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Heba A. Alkhatabi Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah 21589, Saudi Arabia King Fahd Medical Research Centre, Hematology Research Unit, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Raed I. Felimban Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia Center of Innovation in Personalized Medicine (CIPM), 3D Bioprinting Unit, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Abdulwahab Alamri Department of Pharmacology and Toxicology, College of Pharmacy, University of Hail, Hail 81442, Saudi Arabia
Abdulhameed Abdullah Alhabeeb National Centre for Mental Health Promotion, Riyadh 11525, Saudi Arabia
Bassem M. Raafat Department of Radiological Sciences, College of Applied Medical Sciences, Taif University, Taif 21944, Saudi Arabia
Moamen S. Refat Department of Chemistry, College of Science, Taif University, Taif 21944, Saudi Arabia
Ahmed Gaber Centre of Biomedical Sciences Research (CBSR), Deanship of Scientific Research, Taif University, Taif 21944, Saudi Arabia Department of Biology, College of Science, Taif University, Taif 21944, Saudi Arabia

Collapse

Pearce R, Li Y, Omenn GS, Zhang Y. Fast and accurate Ab Initio Protein structure prediction using deep learning potentials. PLoS Comput Biol 2022;18:e1010539. [PMID: 36112717 PMCID: PMC9518900 DOI: 10.1371/journal.pcbi.1010539] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/29/2021] [Revised: 09/28/2022] [Accepted: 09/03/2022] [Indexed: 01/05/2023] Open

Villalobos-Alva J, Ochoa-Toledo L, Villalobos-Alva MJ, Aliseda A, Pérez-Escamirosa F, Altamirano-Bustamante NF, Ochoa-Fernández F, Zamora-Solís R, Villalobos-Alva S, Revilla-Monsalve C, Kemper-Valverde N, Altamirano-Bustamante MM. Protein Science Meets Artificial Intelligence: A Systematic Review and a Biochemical Meta-Analysis of an Inter-Field. Front Bioeng Biotechnol 2022;10:788300. [PMID: 35875501 PMCID: PMC9301016 DOI: 10.3389/fbioe.2022.788300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2021] [Accepted: 05/25/2022] [Indexed: 11/23/2022] Open

Abstract

Proteins are some of the most fascinating and challenging molecules in the universe, and they pose a big challenge for artificial intelligence. The implementation of machine learning/AI in protein science gives rise to a world of knowledge adventures in the workhorse of the cell and proteome homeostasis, which are essential for making life possible. This opens up epistemic horizons thanks to a coupling of human tacit-explicit knowledge with machine learning power, the benefits of which are already tangible, such as important advances in protein structure prediction. Moreover, the driving force behind the protein processes of self-organization, adjustment, and fitness requires a space corresponding to gigabytes of life data in its order of magnitude. There are many tasks such as novel protein design, protein folding pathways, and synthetic metabolic routes, as well as protein-aggregation mechanisms, pathogenesis of protein misfolding and disease, and proteostasis networks that are currently unexplored or unrevealed. In this systematic review and biochemical meta-analysis, we aim to contribute to bridging the gap between what we call binomial artificial intelligence (AI) and protein science (PS), a growing research enterprise with exciting and promising biotechnological and biomedical applications. We undertake our task by exploring "the state of the art" in AI and machine learning (ML) applications to protein science in the scientific literature to address some critical research questions in this domain, including What kind of tasks are already explored by ML approaches to protein sciences? What are the most common ML algorithms and databases used? What is the situational diagnostic of the AI-PS inter-field? What do ML processing steps have in common? We also formulate novel questions such as Is it possible to discover what the rules of protein evolution are with the binomial AI-PS? How do protein folding pathways evolve? What are the rules that dictate the folds? What are the minimal nuclear protein structures? How do protein aggregates form and why do they exhibit different toxicities? What are the structural properties of amyloid proteins? How can we design an effective proteostasis network to deal with misfolded proteins? We are a cross-functional group of scientists from several academic disciplines, and we have conducted the systematic review using a variant of the PICO and PRISMA approaches. The search was carried out in four databases (PubMed, Bireme, OVID, and EBSCO Web of Science), resulting in 144 research articles. After three rounds of quality screening, 93 articles were finally selected for further analysis. A summary of our findings is as follows: regarding AI applications, there are mainly four types: 1) genomics, 2) protein structure and function, 3) protein design and evolution, and 4) drug design. In terms of the ML algorithms and databases used, supervised learning was the most common approach (85%). As for the databases used for the ML models, PDB and UniprotKB/Swissprot were the most common ones (21 and 8%, respectively). Moreover, we identified that approximately 63% of the articles organized their results into three steps, which we labeled pre-process, process, and post-process. A few studies combined data from several databases or created their own databases after the pre-process. Our main finding is that, as of today, there are no research road maps serving as guides to address gaps in our knowledge of the AI-PS binomial. All research efforts to collect, integrate multidimensional data features, and then analyze and validate them are, so far, uncoordinated and scattered throughout the scientific literature without a clear epistemic goal or connection between the studies. Therefore, our main contribution to the scientific literature is to offer a road map to help solve problems in drug design, protein structures, design, and function prediction while also presenting the "state of the art" on research in the AI-PS binomial until February 2021. Thus, we pave the way toward future advances in the synthetic redesign of novel proteins and protein networks and artificial metabolic pathways, learning lessons from nature for the welfare of humankind. Many of the novel proteins and metabolic pathways are currently non-existent in nature, nor are they used in the chemical industry or biomedical field.

Collapse

Affiliation(s)

Jalil Villalobos-Alva Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
Luis Ochoa-Toledo Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
Mario Javier Villalobos-Alva Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
Atocha Aliseda Instituto de Investigaciones Filosóficas, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
Fernando Pérez-Escamirosa Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
Nelly F. Altamirano-Bustamante Instituto Nacional de Pediatría, Mexico City, Mexico
Francine Ochoa-Fernández Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
Ricardo Zamora-Solís Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
Sebastián Villalobos-Alva Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
Cristina Revilla-Monsalve Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
Nicolás Kemper-Valverde Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
Myriam M. Altamirano-Bustamante Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico

Collapse

In silico discovery of multi-targeting inhibitors for the COVID-19 treatment by molecular docking, molecular dynamics simulation studies, and ADMET predictions. Struct Chem 2022. [DOI: 10.1007/s11224-022-01996-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]

Jiang Y, Luo J, Huang D, Liu Y, Li DD. Machine Learning Advances in Microbiology: A Review of Methods and Applications. Front Microbiol 2022;13:925454. [PMID: 35711777 PMCID: PMC9196628 DOI: 10.3389/fmicb.2022.925454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2022] [Accepted: 05/09/2022] [Indexed: 12/18/2022] Open

Enhancement of Haloperidol Binding Affinity to Dopamine Receptor Via Forming a Charge-Transfer Complex with Picric Acid and 7,7,8,8-Tetracyanoquinodimethane for Improvement of the Antipsychotic Efficacy. Molecules 2022;27:molecules27103295. [PMID: 35630772 PMCID: PMC9146347 DOI: 10.3390/molecules27103295] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2022] [Revised: 05/13/2022] [Accepted: 05/18/2022] [Indexed: 02/04/2023] Open

Zhang H, Huang Y, Bei Z, Ju Z, Meng J, Hao M, Zhang J, Zhang H, Xi W. Inter-Residue Distance Prediction From Duet Deep Learning Models. Front Genet 2022;13:887491. [PMID: 35651930 PMCID: PMC9148999 DOI: 10.3389/fgene.2022.887491] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2022] [Accepted: 03/30/2022] [Indexed: 12/04/2022] Open

Newton MAH, Rahman J, Zaman R, Sattar A. Enhancing Protein Contact Map Prediction Accuracy via Ensembles of Inter-Residue Distance Predictors. Comput Biol Chem 2022;99:107700. [DOI: 10.1016/j.compbiolchem.2022.107700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2022] [Revised: 05/19/2022] [Accepted: 05/19/2022] [Indexed: 11/03/2022]

Santra S, Jana M. Predicting the evolution of number of native contacts of a small protein by using deep learning approach. Comput Biol Chem 2022;97:107625. [DOI: 10.1016/j.compbiolchem.2022.107625] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Revised: 01/07/2022] [Accepted: 01/09/2022] [Indexed: 11/28/2022]

Enhancing the Antipsychotic Effect of Risperidone by Increasing Its Binding Affinity to Serotonin Receptor via Picric Acid: A Molecular Dynamics Simulation. Pharmaceuticals (Basel) 2022;15:ph15030285. [PMID: 35337083 PMCID: PMC8952232 DOI: 10.3390/ph15030285] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/06/2022] [Revised: 02/17/2022] [Accepted: 02/23/2022] [Indexed: 02/04/2023] Open

Yadav NS, Kumar P, Singh I. Structural and functional analysis of protein. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00026-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022] Open

Novel thiophene Chalcones-Coumarin as acetylcholinesterase inhibitors: Design, synthesis, biological evaluation, molecular docking, ADMET prediction and molecular dynamics simulation. Bioorg Chem 2021;119:105572. [PMID: 34971946 DOI: 10.1016/j.bioorg.2021.105572] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2021] [Revised: 12/04/2021] [Accepted: 12/15/2021] [Indexed: 11/24/2022]

AboElkhair MA, Hasan ME, Mousa A, Moharam I, Sultan H, Malik Y, Sakr MA. In-silico evidence for enhancement of avian influenza virus H9N2 virulence by modulation of its hemagglutinin (HA) antigen function and stability during co-infection with infectious bronchitis virus in chickens. Virusdisease 2021;32:548-558. [PMID: 34631979 DOI: 10.1007/s13337-021-00688-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2021] [Accepted: 04/08/2021] [Indexed: 10/20/2022] Open

Mortuza SM, Zheng W, Zhang C, Li Y, Pearce R, Zhang Y. Improving fragment-based ab initio protein structure assembly using low-accuracy contact-map predictions. Nat Commun 2021;12:5011. [PMID: 34408149 PMCID: PMC8373938 DOI: 10.1038/s41467-021-25316-w] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2021] [Accepted: 08/04/2021] [Indexed: 11/28/2022] Open

Pearce R, Zhang Y. Toward the solution of the protein structure prediction problem. J Biol Chem 2021;297:100870. [PMID: 34119522 PMCID: PMC8254035 DOI: 10.1016/j.jbc.2021.100870] [Citation(s) in RCA: 61] [Impact Index Per Article: 20.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2021] [Revised: 06/07/2021] [Accepted: 06/09/2021] [Indexed: 11/20/2022] Open

Reza MS, Zhang H, Hossain MT, Jin L, Feng S, Wei Y. COMTOP: Protein Residue-Residue Contact Prediction through Mixed Integer Linear Optimization. MEMBRANES 2021;11:membranes11070503. [PMID: 34209399 PMCID: PMC8305966 DOI: 10.3390/membranes11070503] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/25/2021] [Revised: 06/24/2021] [Accepted: 06/25/2021] [Indexed: 11/17/2022]

Abstract

Protein contact prediction helps reconstruct the tertiary structure that greatly determines a protein’s function; therefore, contact prediction from the sequence is an important problem. Recently there has been exciting progress on this problem, but many of the existing methods are still low quality of prediction accuracy. In this paper, we present a new mixed integer linear programming (MILP)-based consensus method: a Consensus scheme based On a Mixed integer linear opTimization method for prOtein contact Prediction (COMTOP). The MILP-based consensus method combines the strengths of seven selected protein contact prediction methods, including CCMpred, EVfold, DeepCov, NNcon, PconsC4, plmDCA, and PSICOV, by optimizing the number of correctly predicted contacts and achieving a better prediction accuracy. The proposed hybrid protein residue–residue contact prediction scheme was tested in four independent test sets. For 239 highly non-redundant proteins, the method showed a prediction accuracy of 59.68%, 70.79%, 78.86%, 89.04%, 94.51%, and 97.35% for top-5L, top-3L, top-2L, top-L, top-L/2, and top-L/5 contacts, respectively. When tested on the CASP13 and CASP14 test sets, the proposed method obtained accuracies of 75.91% and 77.49% for top-L/5 predictions, respectively. COMTOP was further tested on 57 non-redundant α-helical transmembrane proteins and achieved prediction accuracies of 64.34% and 73.91% for top-L/2 and top-L/5 predictions, respectively. For all test datasets, the improvement of COMTOP in accuracy over the seven individual methods increased with the increasing number of predicted contacts. For example, COMTOP performed much better for large number of contact predictions (such as top-5L and top-3L) than for small number of contact predictions such as top-L/2 and top-L/5. The results and analysis demonstrate that COMTOP can significantly improve the performance of the individual methods; therefore, COMTOP is more robust against different types of test sets. COMTOP also showed better/comparable predictions when compared with the state-of-the-art predictors.

Collapse

Affiliation(s)

Md. Selim Reza School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; (M.S.R.); (H.Z.); (M.T.H.) Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China;
Huiling Zhang School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; (M.S.R.); (H.Z.); (M.T.H.) Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China;
Md. Tofazzal Hossain School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; (M.S.R.); (H.Z.); (M.T.H.) Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China;
Langxi Jin Department of Computer Science and Technology, School of Computer Science and Technology, Harbin University of Science and Technology, 52 Xuefu Road, Nangang District, Harbin 150080, China;
Shengzhong Feng Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China;
Yanjie Wei School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; (M.S.R.); (H.Z.); (M.T.H.) Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Correspondence:

Collapse

Bahai A, Asgari E, Mofrad MRK, Kloetgen A, McHardy AC. EpitopeVec: Linear Epitope Prediction Using Deep Protein Sequence Embeddings. Bioinformatics 2021;37:4517-4525. [PMID: 34180989 PMCID: PMC8652027 DOI: 10.1093/bioinformatics/btab467] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2020] [Revised: 05/28/2021] [Accepted: 06/25/2021] [Indexed: 11/19/2022] Open

Zhang H, Bei Z, Xi W, Hao M, Ju Z, Saravanan KM, Zhang H, Guo N, Wei Y. Evaluation of residue-residue contact prediction methods: From retrospective to prospective. PLoS Comput Biol 2021;17:e1009027. [PMID: 34029314 PMCID: PMC8177648 DOI: 10.1371/journal.pcbi.1009027] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 06/04/2021] [Accepted: 04/28/2021] [Indexed: 12/31/2022] Open

Abstract

Sequence-based residue contact prediction plays a crucial role in protein structure reconstruction. In recent years, the combination of evolutionary coupling analysis (ECA) and deep learning (DL) techniques has made tremendous progress for residue contact prediction, thus a comprehensive assessment of current methods based on a large-scale benchmark data set is very needed. In this study, we evaluate 18 contact predictors on 610 non-redundant proteins and 32 CASP13 targets according to a wide range of perspectives. The results show that different methods have different application scenarios: (1) DL methods based on multi-categories of inputs and large training sets are the best choices for low-contact-density proteins such as the intrinsically disordered ones and proteins with shallow multi-sequence alignments (MSAs). (2) With at least 5L (L is sequence length) effective sequences in the MSA, all the methods show the best performance, and methods that rely only on MSA as input can reach comparable achievements as methods that adopt multi-source inputs. (3) For top L/5 and L/2 predictions, DL methods can predict more hydrophobic interactions while ECA methods predict more salt bridges and disulfide bonds. (4) ECA methods can detect more secondary structure interactions, while DL methods can accurately excavate more contact patterns and prune isolated false positives. In general, multi-input DL methods with large training sets dominate current approaches with the best overall performance. Despite the great success of current DL methods must be stated the fact that there is still much room left for further improvement: (1) With shallow MSAs, the performance will be greatly affected. (2) Current methods show lower precisions for inter-domain compared with intra-domain contact predictions, as well as very high imbalances in precisions between intra-domains. (3) Strong prediction similarities between DL methods indicating more feature types and diversified models need to be developed. (4) The runtime of most methods can be further optimized.

The amino acid sequence of a protein ultimately determines its tertiary structure, and the tertiary structure determines its function(s) and plays a key role in understanding biological processes and disease pathogenesis. Protein tertiary structure can be determined using experimental techniques such as cryo-electron microscopy, nuclear magnetic resonance and X-ray crystallography, which are very expensive and time-consuming. As an alternative, researchers are trying to use in silico methods to predict the 3D structures. Residue contact-assisted protein folding paves an avenue for sequence-based protein structure prediction and therefore has become one of the most challenging and promising problems in structural bioinformatics. Over the past years, contact prediction has undergone continuous evolution in techniques. Through a retrospective analysis of traditional machine learning /evolutionary coupling analysis methods/ consensus machine learning methods and a multi-perspective study on recently developed deep learning methods, we explore the most advanced contact predictors, pursue application scenarios for different methods, and seek prospective directions for further improvement. We anticipate that our study will serve as a practical and useful guide for the development of future approaches to contact prediction.

Collapse

Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks. PLoS Comput Biol 2021;17:e1008865. [PMID: 33770072 PMCID: PMC8026059 DOI: 10.1371/journal.pcbi.1008865] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2020] [Revised: 04/07/2021] [Accepted: 03/10/2021] [Indexed: 12/24/2022] Open

Abstract

The topology of protein folds can be specified by the inter-residue contact-maps and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural-networks. Compared to previous approaches, the major advantage of TripletRes is in its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from the whole-genome and metagenome databases and therefore minimize the information loss during the course of contact model training. TripletRes was tested on a large set of 245 non-homologous proteins from CASP 11&12 and CAMEO experiments and outperformed other top methods from CASP12 by at least 58.4% for the CASP 11&12 targets and 44.4% for the CAMEO targets in the top-L long-range contact precision. On the 31 FM targets from the latest CASP13 challenge, TripletRes achieved the highest precision (71.6%) for the top-L/5 long-range contact predictions. It was also shown that a simple re-training of the TripletRes model with more proteins can lead to further improvement with precisions comparable to state-of-the-art methods developed after CASP13. These results demonstrate a novel efficient approach to extend the power of deep convolutional networks for high-accuracy medium- and long-range protein contact-map predictions starting from primary sequences, which are critical for constructing 3D structure of proteins that lack homologous templates in the PDB library.

Ab initio protein folding has been a major unsolved problem in computational biology for more than half a century. Recent community-wide Critical Assessment of Structure Prediction (CASP) experiments have witnessed exciting progress on ab initio structure prediction, which was mainly powered by the boosting of contact-map prediction as the latter can be used as constraints to guide ab initio folding simulations. In this work, we proposed a new open-source deep-learning architecture, TripletRes, built on the residual convolutional neural networks for high-accuracy contact prediction. The large-scale benchmark and blind test results demonstrate competitive performance of the proposed methods to other top approaches in predicting medium- and long-range contact-maps that are critical for guiding protein folding simulations. Detailed data analyses showed that the major advantage of TripletRes lies in the unique protocol to fuse multiple evolutionary feature matrices which are directly extracted from whole-genome and metagenome databases and therefore minimize the information loss during the contact model training.

Collapse

Zhang GJ, Wang XQ, Ma LF, Wang LJ, Hu J, Zhou XG. Two-Stage Distance Feature-based Optimization Algorithm for De novo Protein Structure Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020;17:2119-2130. [PMID: 31107659 DOI: 10.1109/tcbb.2019.2917452] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

Sun D, Gong X. Tetramer protein complex interface residue pairs prediction with LSTM combined with graph representations. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2020;1868:140504. [PMID: 32717382 DOI: 10.1016/j.bbapap.2020.140504] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/28/2020] [Revised: 06/30/2020] [Accepted: 07/16/2020] [Indexed: 10/23/2022]

Abstract

MOTIVATION

Protein-protein interactions are important for many biological processes. Theoretical understanding of the structurally determining factors of interaction sites will help to understand the underlying mechanism of protein-protein interactions. Taking advantage of advanced mathematical methods to correctly predict interaction sites will be useful. Although some previous studies have been devoted to the interaction interface of protein monomer and the interface residues between chains of protein dimers, very few studies about the interface residues prediction of protein multimers, including trimers, tetramer and even more monomers in a large protein complex. As we all know, a large number of proteins function with the form of multibody protein complexes. And the complexity of the protein multimers structure causes the difficulty of interface residues prediction on them. So, we hope to build a method for the prediction of protein tetramer interface residue pairs.

RESULTS

Here, we developed a new deep network based on LSTM network combining with graph to predict protein tetramers interaction interface residue pairs. On account of the protein structure data is not the same as the image or video data which is well-arranged matrices, namely the Euclidean Structure mentioned in many researches. Because the Non-Euclidean Structure data can't keep the translation invariance, and we hope to extract some spatial features from this kind of data applying on deep learning, an algorithm combining with graph was developed to predict the interface residue pairs of protein interactions based on a topological graph building a relationship between vertexes and edges in graph theory combining multilayer Long Short-Term Memory network. First, selecting the training and test samples from the Protein Data Bank, and then extracting the physicochemical property features and the geometric features of surface residue associated with interfacial properties. Subsequently, we transform the protein multimers data to topological graphs and predict protein interaction interface residue pairs using the model. In addition, different types of evaluation indicators verified its validity.

Collapse

Sun J, Frishman D. DeepHelicon: Accurate prediction of inter-helical residue contacts in transmembrane proteins by residual neural networks. J Struct Biol 2020;212:107574. [PMID: 32663598 DOI: 10.1016/j.jsb.2020.107574] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2020] [Revised: 07/03/2020] [Accepted: 07/07/2020] [Indexed: 01/16/2023]

Feng J, Shukla D. FingerprintContacts: Predicting Alternative Conformations of Proteins from Coevolution. J Phys Chem B 2020;124:3605-3615. [PMID: 32283936 DOI: 10.1021/acs.jpcb.9b11869] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Chen MC, Li Y, Zhu YH, Ge F, Yu DJ. SSCpred: Single-Sequence-Based Protein Contact Prediction Using Deep Fully Convolutional Network. J Chem Inf Model 2020;60:3295-3303. [DOI: 10.1021/acs.jcim.9b01207] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]

Lemke T, Berg A, Jain A, Peter C. EncoderMap(II): Visualizing Important Molecular Motions with Improved Generation of Protein Conformations. J Chem Inf Model 2019;59:4550-4560. [PMID: 31647645 DOI: 10.1021/acs.jcim.9b00675] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Abstract

Dimensionality reduction can be used to project high-dimensional molecular data into a simplified, low-dimensional map. One feature of our recently introduced dimensionality reduction technique EncoderMap, which relies on the combination of an autoencoder with multidimensional scaling, is its ability to do the reverse. It is able to generate conformations for any selected points in the low-dimensional map. This transfers the simplified, low-dimensional map back into the high-dimensional conformational space. Although the output is again high-dimensional, certain aspects of the simplification are preserved. The generated conformations only mirror the most dominant conformational differences that determine the positions of conformational states in the low-dimensional map. This allows depicting such differences and-in consequence-visualizing molecular motions and gives a unique perspective on high-dimensional conformational data. In our previous work, protein conformations described in backbone dihedral angle space were used as the input for EncoderMap, and conformations were also generated in this space. For large proteins, however, the generation of conformations is inaccurate with this approach due to the local character of backbone dihedral angles. Here, we present an improved variant of EncoderMap which is able to generate large protein conformations that are accurate in short-range and long-range orders. This is achieved by differentiable reconstruction of Cartesian coordinates from the generated dihedrals, which allows adding a contribution to the cost function that monitors the accuracy of all pairwise distances between the C_α-atoms of the generated conformations. The improved capabilities to generate conformations of large, even multidomain, proteins are demonstrated for two examples: diubiquitin and a part of the Ssa1 Hsp70 yeast chaperone. We show that the improved variant of EncoderMap can nicely visualize motions of protein domains relative to each other but is also able to highlight important conformational changes within the individual domains.

Collapse

Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y. Accurate prediction of protein contact maps by coupling residual two-dimensional bidirectional long short-term memory with convolutional neural networks. Bioinformatics 2019;34:4039-4045. [PMID: 29931279 DOI: 10.1093/bioinformatics/bty481] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2018] [Accepted: 06/13/2018] [Indexed: 11/12/2022] Open

Abstract

Motivation

Accurate prediction of a protein contact map depends greatly on capturing as much contextual information as possible from surrounding residues for a target residue pair. Recently, ultra-deep residual convolutional networks were found to be state-of-the-art in the latest Critical Assessment of Structure Prediction techniques (CASP12) for protein contact map prediction by attempting to provide a protein-wide context at each residue pair. Recurrent neural networks have seen great success in recent protein residue classification problems due to their ability to propagate information through long protein sequences, especially Long Short-Term Memory (LSTM) cells. Here, we propose a novel protein contact map prediction method by stacking residual convolutional networks with two-dimensional residual bidirectional recurrent LSTM networks, and using both one-dimensional sequence-based and two-dimensional evolutionary coupling-based information.

Results

We show that the proposed method achieves a robust performance over validation and independent test sets with the Area Under the receiver operating characteristic Curve (AUC) > 0.95 in all tests. When compared to several state-of-the-art methods for independent testing of 228 proteins, the method yields an AUC value of 0.958, whereas the next-best method obtains an AUC of 0.909. More importantly, the improvement is over contacts at all sequence-position separations. Specifically, a 8.95%, 5.65% and 2.84% increase in precision were observed for the top L∕10 predictions over the next best for short, medium and long-range contacts, respectively. This confirms the usefulness of ResNets to congregate the short-range relations and 2D-BRLSTM to propagate the long-range dependencies throughout the entire protein contact map 'image'.

Availability and implementation

SPOT-Contact server url: http://sparks-lab.org/jack/server/SPOT-Contact/.

Supplementary information

Supplementary data are available at Bioinformatics online.

Collapse

Zheng W, Wuyun Q, Li Y, Mortuza SM, Zhang C, Pearce R, Ruan J, Zhang Y. Detecting distant-homology protein structures by aligning deep neural-network based contact maps. PLoS Comput Biol 2019;15:e1007411. [PMID: 31622328 PMCID: PMC6818797 DOI: 10.1371/journal.pcbi.1007411] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2019] [Revised: 10/29/2019] [Accepted: 09/21/2019] [Indexed: 12/31/2022] Open

Abstract

Accurate prediction of atomic-level protein structure is important for annotating the biological functions of protein molecules and for designing new compounds to regulate the functions. Template-based modeling (TBM), which aims to construct structural models by copying and refining the structural frameworks of other known proteins, remains the most accurate method for protein structure prediction. Due to the difficulty in recognizing distant-homology templates, however, the accuracy of TBM decreases rapidly when the evolutionary relationship between the query and template vanishes. In this study, we propose a new method, CEthreader, which first predicts residue-residue contacts by coupling evolutionary precision matrices with deep residual convolutional neural-networks. The predicted contact maps are then integrated with sequence profile alignments to recognize structural templates from the PDB. The method was tested on two independent benchmark sets consisting collectively of 1,153 non-homologous protein targets, where CEthreader detected 176% or 36% more correct templates with a TM-score >0.5 than the best state-of-the-art profile- or contact-based threading methods, respectively, for the Hard targets that lacked homologous templates. Moreover, CEthreader was able to identify 114% or 20% more correct templates with the same Fold as the query, after excluding structures from the same SCOPe Superfamily, than the best profile- or contact-based threading methods. Detailed analyses show that the major advantage of CEthreader lies in the efficient coupling of contact maps with profile alignments, which helps recognize global fold of protein structures when the homologous relationship between the query and template is weak. These results demonstrate an efficient new strategy to combine ab initio contact map prediction with profile alignments to significantly improve the accuracy of template-based structure prediction, especially for distant-homology proteins.

Despite decades of effort in computational method development, template-based modeling (TBM) still remains the most reliable approach to high-resolution protein structure prediction. Previous studies have shown that the PDB library is complete for single-domain proteins and TBM is in principle sufficient to solve the structure prediction problem if the most similar structure in the PDB could be reliably identified and used as template for model reconstruction. But in reality, the success of TBM depends on the availability of closely-homologous templates, where its accuracy and reliability decrease sharply when the evolutionary relationship between query and template becomes more distant. We developed a new threading approach, CEthreader, which allows for dynamic programing alignments of predicted contact-maps through eigen-decomposition. The large-scale benchmark tests show that the coupling of contact map with profile and secondary structure alignments through the proposed protocol can significantly improve the accuracy of template recognition for distantly-homologous protein targets.

Collapse

Zheng W, Zhang C, Bell EW, Zhang Y. I-TASSER gateway: A protein structure and function prediction server powered by XSEDE. FUTURE GENERATIONS COMPUTER SYSTEMS : FGCS 2019;99:73-85. [PMID: 31427836 PMCID: PMC6699767 DOI: 10.1016/j.future.2019.04.011] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]

Wu Q, Peng Z, Anishchenko I, Cong Q, Baker D, Yang J. Protein contact prediction using metagenome sequence data and residual neural networks. Bioinformatics 2019;36:41-48. [PMID: 31173061 PMCID: PMC8792440 DOI: 10.1093/bioinformatics/btz477] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2018] [Revised: 05/30/2019] [Accepted: 06/04/2019] [Indexed: 01/31/2023] Open

El Hefnawi MM, Hasan ME, Mahmoud A, Khidr YA, El Behaidy WH, El-Absawy ESA, Hemeida AA. Prediction and Analysis of Three-Dimensional Structure of the p7- Transactivated Protein1 of Hepatitis C Virus. Infect Disord Drug Targets 2019;19:55-66. [PMID: 29243584 DOI: 10.2174/1871526518666171215123214] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2017] [Revised: 06/07/2017] [Accepted: 06/11/2017] [Indexed: 01/06/2023]

Malik A, Afaq S, Gamal BE, Ellatif MA, Hassan WN, Dera A, Noor R, Tarique M. Molecular docking and pharmacokinetic evaluation of natural compounds as targeted inhibitors against Crz1 protein in Rhizoctonia solani. Bioinformation 2019;15:277-286. [PMID: 31285645 PMCID: PMC6599437 DOI: 10.6026/97320630015277] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2019] [Accepted: 03/27/2019] [Indexed: 11/29/2022] Open

Alwabli AS, Alattas SG, Alhebshi AM, Zabermawi NM, Alkenani N, ghmady KA, Qadri I. Molecular docking analysis of netropsin and novobiocin with the viral protein targets HABD, MTD and RCD. Bioinformation 2019;15:233-239. [PMID: 31285639 PMCID: PMC6599439 DOI: 10.6026/97320630015233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2019] [Revised: 03/03/2019] [Accepted: 03/09/2019] [Indexed: 12/02/2022] Open

Luttrell J, Liu T, Zhang C, Wang Z. Predicting protein residue-residue contacts using random forests and deep networks. BMC Bioinformatics 2019;20:100. [PMID: 30871477 PMCID: PMC6419322 DOI: 10.1186/s12859-019-2627-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

The ability to predict which pairs of amino acid residues in a protein are in contact with each other offers many advantages for various areas of research that focus on proteins. For example, contact prediction can be used to reduce the computational complexity of predicting the structure of proteins and even to help identify functionally important regions of proteins. These predictions are becoming especially important given the relatively low number of experimentally determined protein structures compared to the amount of available protein sequence data.

RESULTS

Here we have developed and benchmarked a set of machine learning methods for performing residue-residue contact prediction, including random forests, direct-coupling analysis, support vector machines, and deep networks (stacked denoising autoencoders). These methods are able to predict contacting residue pairs given only the amino acid sequence of a protein. According to our own evaluations performed at a resolution of +/- two residues, the predictors we trained with the random forest algorithm were our top performing methods with average top 10 prediction accuracy scores of 85.13% (short range), 74.49% (medium range), and 54.49% (long range). Our ensemble models (stacked denoising autoencoders combined with support vector machines) were our best performing deep network predictors and achieved top 10 prediction accuracy scores of 75.51% (short range), 60.26% (medium range), and 43.85% (long range) using the same evaluation. These tests were blindly performed on targets from the CASP11 dataset; and the results suggested that our models achieved comparable performance to contact predictors developed by groups that participated in CASP11.

CONCLUSIONS

Due to the challenging nature of contact prediction, it is beneficial to develop and benchmark a variety of different prediction methods. Our work has produced useful tools with a simple interface that can provide contact predictions to users without requiring a lengthy installation process. In addition to this, we have released our C++ implementation of the direct-coupling analysis method as a standalone software package. Both this tool and our RFcon web server are freely available to the public at http://dna.cs.miami.edu/RFcon /.

Collapse

Jing X, Dong Q, Lu R, Dong Q. Protein Inter-Residue Contacts Prediction: Methods, Performances and Applications. Curr Bioinform 2019. [DOI: 10.2174/1574893613666181109130430] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]

DESTINI: A deep-learning approach to contact-driven protein structure prediction. Sci Rep 2019;9:3514. [PMID: 30837676 PMCID: PMC6401133 DOI: 10.1038/s41598-019-40314-1] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2018] [Accepted: 02/12/2019] [Indexed: 11/09/2022] Open

Wuyun Q, Zheng W, Peng Z, Yang J. A large-scale comparative assessment of methods for residue-residue contact prediction. Brief Bioinform 2019;19:219-230. [PMID: 27802931 DOI: 10.1093/bib/bbw106] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2016] [Indexed: 11/14/2022] Open

Ding W, Mao W, Shao D, Zhang W, Gong H. DeepConPred2: An Improved Method for the Prediction of Protein Residue Contacts. Comput Struct Biotechnol J 2018;16:503-510. [PMID: 30505403 PMCID: PMC6247404 DOI: 10.1016/j.csbj.2018.10.009] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2018] [Revised: 10/16/2018] [Accepted: 10/18/2018] [Indexed: 12/18/2022] Open

Kc DB. Recent advances in sequence-based protein structure prediction. Brief Bioinform 2018;18:1021-1032. [PMID: 27562963 DOI: 10.1093/bib/bbw070] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/24/2016] [Indexed: 11/13/2022] Open

Correa L, Borguesan B, Farfan C, Inostroza-Ponta M, Dorn M. A Memetic Algorithm for 3-D Protein Structure Prediction Problem. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018;15:690-704. [PMID: 27925594 DOI: 10.1109/tcbb.2016.2635143] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]

He B, Mortuza SM, Wang Y, Shen HB, Zhang Y. NeBcon: protein contact map prediction using neural network training coupled with naïve Bayes classifiers. Bioinformatics 2018;33:2296-2306. [PMID: 28369334 DOI: 10.1093/bioinformatics/btx164] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2016] [Accepted: 03/21/2017] [Indexed: 12/12/2022] Open

Abstract

Motivation

Recent CASP experiments have witnessed exciting progress on folding large-size non-humongous proteins with the assistance of co-evolution based contact predictions. The success is however anecdotal due to the requirement of the contact prediction methods for the high volume of sequence homologs that are not available to most of the non-humongous protein targets. Development of efficient methods that can generate balanced and reliable contact maps for different type of protein targets is essential to enhance the success rate of the ab initio protein structure prediction.

Results

We developed a new pipeline, NeBcon, which uses the naïve Bayes classifier (NBC) theorem to combine eight state of the art contact methods that are built from co-evolution and machine learning approaches. The posterior probabilities of the NBC model are then trained with intrinsic structural features through neural network learning for the final contact map prediction. NeBcon was tested on 98 non-redundant proteins, which improves the accuracy of the best co-evolution based meta-server predictor by 22%; the magnitude of the improvement increases to 45% for the hard targets that lack sequence and structural homologs in the databases. Detailed data analysis showed that the major contribution to the improvement is due to the optimized NBC combination of the complementary information from both co-evolution and machine learning predictions. The neural network training also helps to improve the coupling of the NBC posterior probability and the intrinsic structural features, which were found particularly important for the proteins that do not have sufficient number of homologous sequences to derive reliable co-evolution profiles.

Availiablity and Implementation

On-line server and standalone package of the program are available at http://zhanglab.ccmb.med.umich.edu/NeBcon/ .

Contact

zhng@umich.edu.

Supplementary information

Supplementary data are available at Bioinformatics online.

Collapse

Schaarschmidt J, Monastyrskyy B, Kryshtafovych A, Bonvin AM. Assessment of contact predictions in CASP12: Co-evolution and deep learning coming of age. Proteins 2018;86 Suppl 1:51-66. [PMID: 29071738 PMCID: PMC5820169 DOI: 10.1002/prot.25407] [Citation(s) in RCA: 126] [Impact Index Per Article: 21.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2017] [Revised: 10/06/2017] [Accepted: 10/24/2017] [Indexed: 12/20/2022]

Li B, Fooksa M, Heinze S, Meiler J. Finding the needle in the haystack: towards solving the protein-folding problem computationally. Crit Rev Biochem Mol Biol 2018;53:1-28. [PMID: 28976219 PMCID: PMC6790072 DOI: 10.1080/10409238.2017.1380596] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2017] [Revised: 08/22/2017] [Accepted: 09/13/2017] [Indexed: 12/22/2022]

Mandalaparthy V, Sanaboyana VR, Rafalia H, Gosavi S. Exploring the effects of sparse restraints on protein structure prediction. Proteins 2017;86:248-262. [DOI: 10.1002/prot.25438] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2017] [Revised: 11/20/2017] [Accepted: 11/29/2017] [Indexed: 01/06/2023]

Zhang C, Mortuza SM, He B, Wang Y, Zhang Y. Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12. Proteins 2017;86 Suppl 1:136-151. [PMID: 29082551 DOI: 10.1002/prot.25414] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/10/2017] [Revised: 10/09/2017] [Accepted: 10/27/2017] [Indexed: 12/26/2022]

Wang S, Li Z, Yu Y, Xu J. Folding Membrane Proteins by Deep Transfer Learning. Cell Syst 2017;5:202-211.e3. [PMID: 28957654 PMCID: PMC5637520 DOI: 10.1016/j.cels.2017.09.001] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2017] [Revised: 06/01/2017] [Accepted: 08/29/2017] [Indexed: 01/02/2023]