1
Dahlström KM, Salminen TA. Apprehensions and emerging solutions in ML-based protein structure prediction. Curr Opin Struct Biol 2024; 86:102819. [PMID: 38631107 DOI: 10.1016/j.sbi.2024.102819] [Received: 01/19/2024] [Revised: 03/05/2024] [Accepted: 03/31/2024] [Indexed: 04/19/2024]
Abstract
The three-dimensional structure of proteins determines their function in vital biological processes. Thus, when the structure is known, the molecular mechanism of protein function can be understood in more detail and obtained information utilized in biotechnological, diagnostics, and therapeutic applications. Over the past five years, machine learning (ML)-based modeling has pushed protein structure prediction to the next level with AlphaFold in the front line, predicting the structure for hundreds of millions of proteins. Further advances recently report promising ML-based approaches for solving remaining challenges by incorporating functionally important metals, co-factors, post-translational modifications, structural dynamics, and interdomain and multimer interactions in the structure prediction process.
Affiliation(s)
- Käthe M Dahlström
- Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering, Åbo Akademi University, Tykistökatu 6A, 20520 Turku, Finland; InFLAMES Research Flagship Center, Åbo Akademi University, 20520 Turku, Finland
- Tiina A Salminen
- Structural Bioinformatics Laboratory, Biochemistry, Faculty of Science and Engineering, Åbo Akademi University, Tykistökatu 6A, 20520 Turku, Finland; InFLAMES Research Flagship Center, Åbo Akademi University, 20520 Turku, Finland
2
Werren EA, Peirent ER, Jantti H, Guxholli A, Srivastava KR, Orenstein N, Narayanan V, Wiszniewski W, Dawidziuk M, Gawlinski P, Umair M, Khan A, Khan SN, Geneviève D, Lehalle D, van Gassen KLI, Giltay JC, Oegema R, van Jaarsveld RH, Rafiullah R, Rappold GA, Rabin R, Pappas JG, Wheeler MM, Bamshad MJ, Tsan YC, Johnson MB, Keegan CE, Srivastava A, Bielas SL. Biallelic variants in CSMD1 are implicated in a neurodevelopmental disorder with intellectual disability and variable cortical malformations. Cell Death Dis 2024; 15:379. [PMID: 38816421 PMCID: PMC11140003 DOI: 10.1038/s41419-024-06768-6] [Received: 07/14/2023] [Revised: 05/03/2024] [Accepted: 05/22/2024] [Indexed: 06/01/2024]
Abstract
CSMD1 (Cub and Sushi Multiple Domains 1) is a well-recognized regulator of the complement cascade, an important component of the innate immune response. CSMD1 is highly expressed in the central nervous system (CNS) where emergent functions of the complement pathway modulate neural development and synaptic activity. While a genetic risk factor for neuropsychiatric disorders, the role of CSMD1 in neurodevelopmental disorders is unclear. Through international variant sharing, we identified inherited biallelic CSMD1 variants in eight individuals from six families of diverse ancestry who present with global developmental delay, intellectual disability, microcephaly, and polymicrogyria. We modeled CSMD1 loss-of-function (LOF) pathogenesis in early-stage forebrain organoids differentiated from CSMD1 knockout human embryonic stem cells (hESCs). We show that CSMD1 is necessary for neuroepithelial cytoarchitecture and synchronous differentiation. In summary, we identified a critical role for CSMD1 in brain development and biallelic CSMD1 variants as the molecular basis of a previously undefined neurodevelopmental disorder.
Affiliation(s)
- Elizabeth A Werren
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Advanced Precision Medicine Laboratory, The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Emily R Peirent
- Neuroscience Graduate Program, University of Michigan, Ann Arbor, MI, 48109, USA
- Henna Jantti
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Alba Guxholli
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Department of Pediatrics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Kinshuk Raj Srivastava
- Medicinal and Process Chemistry Division, CSIR-Central Drug Research Institute, Lucknow, 226031, India
- Naama Orenstein
- Schneider Children's Medical Center of Israel, Petah Tikva, 4920235, Israel
- Vinodh Narayanan
- Center for Rare Childhood Disorders, Translational Genomics Research Institute, Phoenix, AZ, 85004, USA
- Wojciech Wiszniewski
- Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, OR, 97239, USA
- Mateusz Dawidziuk
- Department of Medical Genetics, Institute of Mother and Child, Warsaw, 01-211, Poland
- Pawel Gawlinski
- Department of Medical Genetics, Institute of Mother and Child, Warsaw, 01-211, Poland
- Muhammad Umair
- Medical Genomics Research Department, King Abdullah International Medical Research Center, King Saud Bin Abdulaziz University for Health Sciences, Ministry of National Guard Health Affairs, Riyadh, 11481, Saudi Arabia
- Department of Life Sciences, School of Science, University of Management and Technology, Lahore, Punjab, 54770, Pakistan
- Amjad Khan
- Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, OR, 97239, USA
- Department of Zoology, University of Lakki Marwat, Lakki Marwat, Khyber Pakhtunkhwa, 28420, Pakistan
- Shahid Niaz Khan
- Department of Zoology, Kohat University of Science and Technology, Kohat, Pakistan
- David Geneviève
- Montpellier University, Inserm Unit U1183, Reference Center for Rare Diseases and Developmental Anomalies, CHU, 34000, Montpellier, France
- Daphné Lehalle
- Sorbonne University, Department of Medical Genetics, Hospital Armand Trousseau, 75012, Paris, France
- K L I van Gassen
- Department of Genetics, University Medical Centre Utrecht, Utrecht University, Utrecht, 3584 EA, The Netherlands
- Jacques C Giltay
- Department of Genetics, University Medical Centre Utrecht, Utrecht University, Utrecht, 3584 EA, The Netherlands
- Renske Oegema
- Department of Genetics, University Medical Centre Utrecht, Utrecht University, Utrecht, 3584 EA, The Netherlands
- Richard H van Jaarsveld
- Department of Genetics, University Medical Centre Utrecht, Utrecht University, Utrecht, 3584 EA, The Netherlands
- Rafiullah Rafiullah
- Department of Biotechnology, Faculty of Life Sciences, BUITEMS, Quetta, 87300, Pakistan
- Gudrun A Rappold
- Department of Human Molecular Genetics, Institute of Human Genetics, Ruprecht-Karls-University, Heidelberg, 69120, Germany
- Rachel Rabin
- Department of Pediatrics, NYU Grossman School of Medicine, New York, NY, 10016, USA
- John G Pappas
- Department of Pediatrics, NYU Grossman School of Medicine, New York, NY, 10016, USA
- Marsha M Wheeler
- Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
- Michael J Bamshad
- Department of Pediatrics, University of Washington, Seattle, WA, 98195, USA
- Brotman Baty Institute, Washington, 98195, USA
- Yao-Chang Tsan
- Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, MI, 48109, USA
- Matthew B Johnson
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Catherine E Keegan
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Department of Pediatrics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Anshika Srivastava
- Department of Medical Genetics, Sanjay Gandhi Postgraduate Institute of Medical Sciences, Lucknow, Uttar Pradesh, 226014, India
- Stephanie L Bielas
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
- Department of Pediatrics, University of Michigan Medical School, Ann Arbor, MI, 48109, USA
3
Wuyun Q, Chen Y, Shen Y, Cao Y, Hu G, Cui W, Gao J, Zheng W. Recent Progress of Protein Tertiary Structure Prediction. Molecules 2024; 29:832. [PMID: 38398585 PMCID: PMC10893003 DOI: 10.3390/molecules29040832] [Received: 12/30/2023] [Revised: 02/06/2024] [Accepted: 02/08/2024] [Indexed: 02/25/2024]
Abstract
The prediction of three-dimensional (3D) protein structure from amino acid sequences has stood as a significant challenge in computational and structural bioinformatics for decades. Recently, the widespread integration of artificial intelligence (AI) algorithms has substantially expedited advancements in protein structure prediction, yielding numerous significant milestones. In particular, the end-to-end deep learning method AlphaFold2 has facilitated the rise of structure prediction performance to new heights, regularly competitive with experimental structures in the 14th Critical Assessment of Protein Structure Prediction (CASP14). To provide a comprehensive understanding and guide future research in the field of protein structure prediction for researchers, this review describes various methodologies, assessments, and databases in protein structure prediction, including traditionally used protein structure prediction methods, such as template-based modeling (TBM) and template-free modeling (FM) approaches; recently developed deep learning-based methods, such as contact/distance-guided methods, end-to-end folding methods, and protein language model (PLM)-based methods; multi-domain protein structure prediction methods; the CASP experiments and related assessments; and the recently released AlphaFold Protein Structure Database (AlphaFold DB). We discuss their advantages, disadvantages, and application scopes, aiming to provide researchers with insights through which to understand the limitations, contexts, and effective selections of protein structure prediction methods in protein-related fields.
Affiliation(s)
- Qiqige Wuyun
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Yihan Chen
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Yifeng Shen
- Faculty of Environment and Information Studies, Keio University, Fujisawa 252-0882, Kanagawa, Japan
- Yang Cao
- College of Life Sciences, Sichuan University, Chengdu 610065, China
- Gang Hu
- NITFID, School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin 300071, China
- Wei Cui
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Jianzhao Gao
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, China
- Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
4
Peng CX, Liang F, Xia YH, Zhao KL, Hou MH, Zhang GJ. Recent Advances and Challenges in Protein Structure Prediction. J Chem Inf Model 2024; 64:76-95. [PMID: 38109487 DOI: 10.1021/acs.jcim.3c01324] [Indexed: 12/20/2023]
Abstract
Artificial intelligence has made significant advances in the field of protein structure prediction in recent years. In particular, DeepMind's end-to-end model, AlphaFold2, has demonstrated the capability to predict three-dimensional structures of numerous unknown proteins with accuracy levels comparable to those of experimental methods. This breakthrough has opened up new possibilities for understanding protein structure and function as well as accelerating drug discovery and other applications in the field of biology and medicine. Despite the remarkable achievements of artificial intelligence in the field, there are still some challenges and limitations. In this Review, we discuss the recent progress and some of the challenges in protein structure prediction. These challenges include predicting multidomain protein structures, protein complex structures, multiple conformational states of proteins, and protein folding pathways. Furthermore, we highlight directions in which further improvements can be conducted.
Affiliation(s)
- Chun-Xiang Peng
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Fang Liang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Yu-Hao Xia
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Kai-Long Zhao
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Ming-Hua Hou
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
- Gui-Jun Zhang
- College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China