1
Tsai YX, Chien YC, Hsu MF, Khoo KH, Hsu STD. Molecular basis of host recognition of human coronavirus 229E. Nat Commun 2025; 16:2045. [PMID: 40016196 DOI: 10.1038/s41467-025-57359-8] [Received: 10/28/2024] [Accepted: 02/20/2025] [Indexed: 03/01/2025] Open
Abstract
Human coronavirus 229E (HCoV-229E) is the earliest CoV found to infect humans. It binds to the human aminopeptidase N (hAPN) through the receptor binding domain (RBD) of its spike (S) protein to achieve host recognition. We present the cryo-electron microscopy structure of two HCoV-229E S proteins in complex with a dimeric hAPN to provide structural insights into how the HCoV-229E S protein opens up its RBD to engage with its host receptor, information that is currently missing among alphacoronaviruses, to which HCoV-229E belongs. We quantitatively profile the glycosylation of HCoV-229E S protein and hAPN to deduce the glyco-shielding effects pertinent to antigenicity and host recognition. Finally, we present an atomic model of fully glycosylated HCoV-229E S in complex with hAPN anchored on their respective membrane bilayers to recapitulate the structural basis of the first step of host infection by HCoV-229E.
Affiliation(s)
- Yu-Xi Tsai
  - Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan
  - Institute of Biochemical Sciences, National Taiwan University, Taipei, 10617, Taiwan
- Yu-Chun Chien
  - Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan
  - Institute of Biochemical Sciences, National Taiwan University, Taipei, 10617, Taiwan
- Min-Feng Hsu
  - Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan
- Kay-Hooi Khoo
  - Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan
  - Institute of Biochemical Sciences, National Taiwan University, Taipei, 10617, Taiwan
- Shang-Te Danny Hsu
  - Institute of Biological Chemistry, Academia Sinica, Taipei, 11529, Taiwan
  - Institute of Biochemical Sciences, National Taiwan University, Taipei, 10617, Taiwan
  - International Institute for Sustainability with Knotted Chiral Meta Matter (WPI-SKCM²), Hiroshima University, 1-3-1 Kagamiyama, Higashi-Hiroshima, Hiroshima, 739-8526, Japan
2
Montanucci L, Brünger T, Bhattarai N, Boßelmann CM, Kim S, Allen JP, Zhang J, Klöckner C, Krey I, Fariselli P, May P, Lemke JR, Myers SJ, Yuan H, Traynelis SF, Lal D. Ligand distances as key predictors of pathogenicity and function in NMDA receptors. Hum Mol Genet 2025; 34:128-139. [PMID: 39535073 PMCID: PMC11780861 DOI: 10.1093/hmg/ddae156] [Received: 05/25/2024] [Revised: 10/10/2024] [Accepted: 10/30/2024] [Indexed: 11/16/2024] Open
Abstract
Genetic variants in the genes GRIN1, GRIN2A, GRIN2B, and GRIN2D, which encode subunits of the N-methyl-D-aspartate receptor (NMDAR), have been associated with severe and heterogeneous neurologic and neurodevelopmental disorders, including early onset epilepsy, developmental and epileptic encephalopathy, intellectual disability, and autism spectrum disorders. Missense variants in these genes can result in gain or loss of the NMDAR function, requiring opposite therapeutic treatments. Computational methods that predict pathogenicity and molecular functional effects of missense variants are therefore crucial for therapeutic applications. We assembled 223 missense variants from patients, 631 control variants from the general population, and 160 missense variants characterized by electrophysiological readouts that show whether they can enhance or reduce the function of the receptor. This includes new functional data from 33 variants reported here, for the first time. By mapping these variants onto the NMDAR protein structures, we found that pathogenic/benign variants and variants that increase/decrease the channel function were distributed unevenly on the protein structure, with spatial proximity to ligands bound to the agonist and antagonist binding sites being a key predictive feature for both variant pathogenicity and molecular functional consequences. Leveraging distances from ligands, we developed two machine-learning based predictors for NMDA variants: a pathogenicity predictor which outperforms currently available predictors and the first molecular function (increase/decrease) predictor. Our findings can have direct application to patient care by improving diagnostic yield for genetic neurodevelopmental disorders and by guiding personalized treatment informed by the knowledge of the molecular disease mechanism.
Affiliation(s)
- Ludovica Montanucci
  - Department of Neurology, McGovern Medical School, The University of Texas Health Science Center at Houston, 1133 John Freeman Blvd, Houston, TX 77030, United States
- Tobias Brünger
  - Cologne Center for Genomics, University of Cologne, University Hospital Cologne, Weyertal 115b, Cologne 50937, Germany
- Nisha Bhattarai
  - Epilepsy Center, Neurological Institute, Cleveland Clinic, 9500 Euclid Ave, Cleveland, OH 44106, United States
- Christian M Boßelmann
  - Epilepsy Center, Neurological Institute, Cleveland Clinic, 9500 Euclid Ave, Cleveland, OH 44106, United States
- Sukhan Kim
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
  - Center for Functional Evaluation of Rare Variants (CFERV), Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
- James P Allen
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
- Jing Zhang
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
- Chiara Klöckner
  - Institute of Human Genetics, University of Leipzig Hospitals and Clinics, Philipp-Rosenthal-street 55, Leipzig 04103, Germany
- Ilona Krey
  - Institute of Human Genetics, University of Leipzig Hospitals and Clinics, Philipp-Rosenthal-street 55, Leipzig 04103, Germany
- Piero Fariselli
  - Department of Medical Sciences, University of Torino, Via Santena 19, Torino, 10123, Italy
- Patrick May
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 7 Av. des Hauts-Fourneaux, Esch-sur-Alzette, 4362, Luxembourg
- Johannes R Lemke
  - Institute of Human Genetics, University of Leipzig Hospitals and Clinics, Philipp-Rosenthal-street 55, Leipzig 04103, Germany
- Scott J Myers
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
  - Center for Functional Evaluation of Rare Variants (CFERV), Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
- Hongjie Yuan
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
  - Center for Functional Evaluation of Rare Variants (CFERV), Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
- Stephen F Traynelis
  - Department of Pharmacology and Chemical Biology, Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
  - Center for Functional Evaluation of Rare Variants (CFERV), Emory University School of Medicine, 100 Woodruff Circle, Atlanta, GA 30322, United States
- Dennis Lal
  - Department of Neurology, McGovern Medical School, The University of Texas Health Science Center at Houston, 1133 John Freeman Blvd, Houston, TX 77030, United States
  - Epilepsy Center, Neurological Institute, Cleveland Clinic, 9500 Euclid Ave, Cleveland, OH 44106, United States
  - Program in Medical and Population Genetics, Broad Institute of Massachusetts Institute of Technology (M.I.T.) and Harvard, 415 Main St, Cambridge, MA 02142, United States
  - Stanley Center for Psychiatric Research, Broad Institute of Harvard and M.I.T, 415 Main St., Cambridge, MA 02142, United States
3
Hermans P, Tsishyn M, Schwersensky M, Rooman M, Pucci F. Exploring Evolution to Uncover Insights Into Protein Mutational Stability. Mol Biol Evol 2025; 42:msae267. [PMID: 39786559 PMCID: PMC11721782 DOI: 10.1093/molbev/msae267] [Received: 05/28/2024] [Revised: 11/27/2024] [Accepted: 11/28/2024] [Indexed: 01/12/2025] Open
Abstract
Determining the impact of mutations on the thermodynamic stability of proteins is essential for a wide range of applications such as rational protein design and genetic variant interpretation. Since protein stability is a major driver of evolution, evolutionary data are often used to guide stability predictions. Many state-of-the-art stability predictors extract evolutionary information from multiple sequence alignments of proteins homologous to a query protein, and leverage it to predict the effects of mutations on protein stability. To evaluate the power and the limitations of such methods, we used the massive amount of stability data recently obtained by deep mutational scanning to study how best to construct multiple sequence alignments and optimally extract evolutionary information from them. We tested different evolutionary models and found that, unexpectedly, independent-site models achieve similar accuracy to more complex epistatic models. A detailed analysis of the latter models suggests that their inference often results in noisy couplings, which do not appear to add predictive power over the independent-site contribution, at least in the context of stability prediction. Interestingly, by combining any of the evolutionary features with a simple structural feature, the relative solvent accessibility of the mutated residue, we achieved similar prediction accuracy to supervised, machine learning-based, protein stability change predictors. Our results provide new insights into the relationship between protein evolution and stability, and show how evolutionary information can be exploited to improve the performance of mutational stability prediction.
Affiliation(s)
- Pauline Hermans
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels 1050, Belgium
- Matsvei Tsishyn
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels 1050, Belgium
- Martin Schwersensky
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels 1050, Belgium
- Marianne Rooman
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels 1050, Belgium
- Fabrizio Pucci
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, Brussels 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Brussels 1050, Belgium
4
Sun J, Zhu T, Cui Y, Wu B. Structure-based self-supervised learning enables ultrafast protein stability prediction upon mutation. Innovation (N Y) 2025; 6:100750. [PMID: 39872490 PMCID: PMC11763918 DOI: 10.1016/j.xinn.2024.100750] [Received: 02/04/2024] [Accepted: 12/02/2024] [Indexed: 01/30/2025] Open
Abstract
Predicting free energy changes (ΔΔG) is essential for enhancing our understanding of protein evolution and plays a pivotal role in protein engineering and pharmaceutical development. While traditional methods offer valuable insights, they are often constrained by computational speed and reliance on biased training datasets. These constraints become particularly evident when aiming for accurate ΔΔG predictions across a diverse array of protein sequences. Herein, we introduce Pythia, a self-supervised graph neural network specifically designed for zero-shot ΔΔG predictions. Our comparative benchmarks demonstrate that Pythia outperforms other self-supervised pretraining models and force field-based approaches while also exhibiting competitive performance with fully supervised models. Notably, Pythia shows strong correlations and achieves a remarkable increase in computational speed of up to 10^5-fold. We further validated Pythia's performance in predicting the thermostabilizing mutations of limonene epoxide hydrolase, leading to higher experimental success rates. This exceptional efficiency has enabled us to explore 26 million high-quality protein structures, marking a significant advancement in our ability to navigate the protein sequence space and enhance our understanding of the relationships between protein genotype and phenotype. In addition, we established a web server at https://pythia.wulab.xyz to allow users to easily perform such predictions.
Affiliation(s)
- Jinyuan Sun
  - AIM Center, College of Life Sciences and Technology, Beijing University of Chemical Technology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Tong Zhu
  - AIM Center, College of Life Sciences and Technology, Beijing University of Chemical Technology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yinglu Cui
  - AIM Center, College of Life Sciences and Technology, Beijing University of Chemical Technology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
- Bian Wu
  - AIM Center, College of Life Sciences and Technology, Beijing University of Chemical Technology, Institute of Microbiology, Chinese Academy of Sciences, Beijing, China
5
Dieckhaus H, Kuhlman B. Protein stability models fail to capture epistatic interactions of double point mutations. Protein Sci 2025; 34:e70003. [PMID: 39704075 DOI: 10.1002/pro.70003] [Received: 08/26/2024] [Revised: 11/06/2024] [Accepted: 12/05/2024] [Indexed: 12/21/2024]
Abstract
There is strong interest in accurate methods for predicting changes in protein stability resulting from amino acid mutations to the protein sequence. Recombinant proteins must often be stabilized to be used as therapeutics or reagents, and destabilizing mutations are implicated in a variety of diseases. Due to increased data availability and improved modeling techniques, recent studies have shown advancements in predicting changes in protein stability when a single-point mutation is made. Less focus has been directed toward predicting changes in protein stability when there are two or more mutations. Here, we analyze the largest available dataset of double point mutation stability and benchmark several widely used protein stability models on this and other datasets. We find that additive models of protein stability perform surprisingly well on this task, achieving similar performance to comparable non-additive predictors according to most metrics. Accordingly, we find that neither artificial intelligence-based nor physics-based protein stability models consistently capture epistatic interactions between single mutations. We observe one notable deviation from this trend, which is that epistasis-aware models provide marginally better predictions than additive models on stabilizing double point mutations. We develop an extension of the ThermoMPNN framework for double mutant modeling, as well as a novel data augmentation scheme, which mitigates some of the limitations in currently available datasets. Collectively, our findings indicate that current protein stability models fail to capture the nuanced epistatic interactions between concurrent mutations due to several factors, including training dataset limitations and insufficient model sensitivity.
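As a rough illustration of the additive baseline this abstract refers to, the sketch below predicts the stability change of a double mutant as the sum of the two single-mutant ΔΔG values and reports the residual as an epistasis term. The mutation labels, the numeric values, and the sign convention are hypothetical placeholders chosen for the example; they are not taken from the paper or its dataset.

```python
# Minimal sketch of an additive double-mutant baseline (illustrative only).
# All mutation names and ddG values below are hypothetical.

def additive_ddg(ddg_a: float, ddg_b: float) -> float:
    """Additive prediction: the double-mutant ddG is the sum of the singles."""
    return ddg_a + ddg_b


def epistasis(ddg_double: float, ddg_a: float, ddg_b: float) -> float:
    """Deviation of the observed double-mutant ddG from the additive prediction."""
    return ddg_double - additive_ddg(ddg_a, ddg_b)


# Hypothetical single-mutant measurements (kcal/mol).
single_ddg = {"A10G": 1.2, "L25V": 0.5}
# Hypothetical measurement for the corresponding double mutant.
observed_double = 2.5

predicted = additive_ddg(single_ddg["A10G"], single_ddg["L25V"])
eps = epistasis(observed_double, single_ddg["A10G"], single_ddg["L25V"])
print(f"additive prediction: {predicted:.2f}, epistasis: {eps:.2f}")
```

An epistasis value near zero means the two mutations behave independently under this simple model; a model that only ever reproduces the additive prediction carries no information about the residual.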
Affiliation(s)
- Henry Dieckhaus
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Division of Chemical Biology and Medicinal Chemistry, University of North Carolina Eshelman School of Pharmacy, Chapel Hill, North Carolina, USA
- Brian Kuhlman
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Lineberger Comprehensive Cancer Center, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
6
Savojardo C, Manfredi M, Martelli PL, Casadio R. DDGemb: predicting protein stability change upon single- and multi-point variations with embeddings and deep learning. Bioinformatics 2025; 41:btaf019. [PMID: 39799516 PMCID: PMC11783275 DOI: 10.1093/bioinformatics/btaf019] [Received: 11/14/2024] [Revised: 11/14/2024] [Accepted: 01/10/2025] [Indexed: 01/15/2025] Open
Abstract
MOTIVATION: The knowledge of protein stability upon residue variation is an important step for functional protein design and for understanding how protein variants can promote disease onset. Computational methods are important to complement experimental approaches and allow a fast screening of large datasets of variations. RESULTS: In this work, we present DDGemb, a novel method combining protein language model embeddings and transformer architectures to predict protein ΔΔG upon both single- and multi-point variations. DDGemb has been trained on a high-quality dataset derived from literature and tested on available benchmark datasets of single- and multi-point variations. DDGemb performs at the state of the art in both single- and multi-point variations. AVAILABILITY AND IMPLEMENTATION: DDGemb is available as web server at https://ddgemb.biocomp.unibo.it. Datasets used in this study are available at https://ddgemb.biocomp.unibo.it/datasets.
Affiliation(s)
- Castrense Savojardo
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Via San Giacomo 9/2, Bologna, 40126, Italy
- Matteo Manfredi
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Via San Giacomo 9/2, Bologna, 40126, Italy
- Pier Luigi Martelli
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Via San Giacomo 9/2, Bologna, 40126, Italy
- Rita Casadio
  - Biocomputing Group, Department of Pharmacy and Biotechnology, University of Bologna, Via San Giacomo 9/2, Bologna, 40126, Italy
  - The Alma Climate Institute, Interdepartmental Center, University of Bologna, Bologna, 40100, Italy
7
Peka M, Balatsky V. Bioinformatic approach to identifying causative missense polymorphisms in animal genomes. BMC Genomics 2024; 25:1226. [PMID: 39701989 DOI: 10.1186/s12864-024-11126-z] [Received: 08/30/2024] [Accepted: 12/05/2024] [Indexed: 12/21/2024] Open
Abstract
BACKGROUND: Trends in the development of genetic markers for the purposes of genomic and marker-assisted selection primarily focus on identifying causative polymorphisms. Using these polymorphisms as markers enables a more accurate association between genotype and phenotype. Bioinformatic analysis allows calculating the impact of missense polymorphisms on the structural and functional characteristics of proteins, which makes it promising for identifying causative polymorphisms. In this study, a bioinformatic approach is applied to evaluate and differentiate polymorphisms based on their causality in genes that affect the production traits of pigs and cows, which are two important livestock species. RESULTS: The influence of both known causative and candidate missense polymorphisms in the MC4R, NR6A1, PRKAG3, RYR1, and SYNGR2 genes of pigs, as well as the ABCG2, DGAT1, GHR, and MSTN genes of cows, was assessed. The study included an evaluation of the effect of polymorphisms on protein functions, considering the evolutionary and physicochemical characteristics of amino acids at polymorphic sites. Additionally, it examined the impact of polymorphisms on the stability of tertiary protein structures, including changes in folding, binding of protein monomers, and interaction with ligands. CONCLUSIONS: The comprehensive bioinformatic analysis used in this study enables the differentiation of polymorphisms into neutral, where both amino acids in the polymorphic site do not significantly affect the structure and function of the protein, and causative, where one of the amino acids significantly impacts the protein's properties. This approach can be employed in future research to screen extensive sets of polymorphisms in animal genomes, identifying the most promising polymorphisms for further investigation in association studies.
Affiliation(s)
- Mykyta Peka
  - Institute of Pig Breeding and Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, 1 Shvedska Mohyla St, Poltava, 36013, Ukraine
  - V. N. Karazin Kharkiv National University, 4 Svobody Sq, Kharkiv, 61022, Ukraine
- Viktor Balatsky
  - Institute of Pig Breeding and Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, 1 Shvedska Mohyla St, Poltava, 36013, Ukraine
8
Alzahrani AK, Imran M, Alshrari AS. Investigating the impact of SOD1 mutations on amyotrophic lateral sclerosis progression and potential drug repurposing through in silico analysis. J Biomol Struct Dyn 2024:1-16. [PMID: 39673548 DOI: 10.1080/07391102.2024.2439577] [Received: 02/18/2024] [Accepted: 05/29/2024] [Indexed: 12/16/2024]
Abstract
Superoxide dismutase 1 (SOD1) is a vital enzyme responsible for attenuating oxidative stress through its ability to facilitate the dismutation of the superoxide radical into oxygen and hydrogen peroxide. The progressive loss of motor neurons characterizes amyotrophic lateral sclerosis (ALS), a crippling neurodegenerative disease that is caused by mutations in the SOD1 gene. In this study, in silico mutational analysis was performed to study the various mutations, the pathogenicity and stability ΔΔG (binding free energy) of the variant of SOD1. x in the protein variant analysis showed a considerable destabilizing effect with a ΔΔG value of -4.2 kcal/mol, signifying a notable impact on protein stability. Molecular dynamics simulations were conducted on both wild-type and C146R mutant SOD1. RMSD profiles indicated that both maintained consistent structural conformation over time. Additionally, virtual screening of 3067 FDA-approved drugs against the mutant SOD1 identified two potential binders, Tucatinib (51039094) and Regorafenib (11167602), which interacted with Leu106, similar to the control drug, Ebselen. Further simulations assessed the dynamic properties of SOD1 in monomeric and dimeric forms while bound to these compounds. 11167602 maintained stable interaction with the monomeric SOD1 mutant, whereas 51039094 and Ebselen dissociated from the monomeric protein's binding site. However, all three compounds were stably bound to the dimeric SOD1. MM/GBSA analysis revealed similar negative binding free energies for 11167602 and 51039094, identifying them as strong binders due to their interaction with Cys111. Experimental validation, including in vitro, cell-based, and in vivo assays, is essential to confirm these candidates before advancing to clinical trials.
Affiliation(s)
- A Khuzaim Alzahrani
  - Department of Medical Laboratory Technology, Faculty of Medical Applied Science, Northern Border University, Arar, Saudi Arabia
- Mohd Imran
  - Department of Pharmaceutical Chemistry, College of Pharmacy, Northern Border University, Rafha, Saudi Arabia
- Ahmed S Alshrari
  - Department of Medical Laboratory Technology, Faculty of Medical Applied Science, Northern Border University, Arar, Saudi Arabia
9
Gebbia M, Zimmerman D, Jiang R, Nguyen M, Weile J, Li R, Gavac M, Kishore N, Sun S, Boonen RA, Hamilton R, Dines JN, Wahl A, Reuter J, Johnson B, Fowler DM, Couch FJ, van Attikum H, Roth FP. A missense variant effect map for the human tumor-suppressor protein CHK2. Am J Hum Genet 2024; 111:2675-2692. [PMID: 39642869 PMCID: PMC11639082 DOI: 10.1016/j.ajhg.2024.10.013] [Received: 02/29/2024] [Revised: 10/18/2024] [Accepted: 10/22/2024] [Indexed: 12/09/2024] Open
Abstract
The tumor suppressor CHEK2 encodes the serine/threonine protein kinase CHK2 which, upon DNA damage, is important for pausing the cell cycle, initiating DNA repair, and inducing apoptosis. CHK2 phosphorylation of the tumor suppressor BRCA1 is also important for mitotic spindle assembly and chromosomal stability. Consistent with its cell-cycle checkpoint role, both germline and somatic variants in CHEK2 have been linked to breast and other cancers. Over 90% of clinical germline CHEK2 missense variants are classified as variants of uncertain significance, complicating diagnosis of CHK2-dependent cancer. We therefore sought to test the functional impact of all possible missense variants in CHK2. Using a scalable multiplexed assay based on the ability of human CHK2 to complement DNA sensitivity of Saccharomyces cerevisiae cells lacking the CHEK2 ortholog, RAD53, we generated a systematic "missense variant effect map" for CHEK2 missense variation. The map reflects known biochemical features of CHK2 while offering new biological insights. It also provides strong evidence toward pathogenicity for some clinical missense variants and supporting evidence toward benignity for others. Overall, this comprehensive missense variant effect map contributes to understanding of both known and yet-to-be-observed CHK2 variants.
Affiliation(s)
- Marinella Gebbia
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Daniel Zimmerman
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Rosanna Jiang
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Maria Nguyen
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Jochen Weile
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Roujia Li
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Michelle Gavac
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Nishka Kishore
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Song Sun
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Rick A Boonen
  - Leiden University Medical Center, Leiden, the Netherlands
- Rayna Hamilton
  - Woods Hole Oceanographic Institution, Woods Hole, MA, USA
- Jennifer N Dines
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Douglas M Fowler
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA; Department of Bioengineering, University of Washington, Seattle, WA, USA
- Frederick P Roth
  - The Donnelly Centre, University of Toronto, Toronto, ON, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada; Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
10
Xu Y, Liu D, Gong H. Improving the prediction of protein stability changes upon mutations by geometric learning and a pre-training strategy. Nat Comput Sci 2024; 4:840-850. [PMID: 39455825 DOI: 10.1038/s43588-024-00716-2] [Received: 08/29/2023] [Accepted: 10/03/2024] [Indexed: 10/28/2024]
Abstract
Accurate prediction of protein mutation effects is of great importance in protein engineering and design. Here we propose GeoStab-suite, a suite of three geometric learning-based models (GeoFitness, GeoDDG and GeoDTm) for the prediction of fitness score, ΔΔG and ΔTm of a protein upon mutations, respectively. GeoFitness engages a specialized loss function to allow supervised training of a unified model using the large amount of multi-labeled fitness data in the deep mutational scanning database. To further improve the downstream tasks of ΔΔG and ΔTm prediction, the encoder of GeoFitness is reutilized as a pre-trained module in GeoDDG and GeoDTm to overcome the challenge of lacking sufficient labeled data. This pre-training strategy, in combination with data expansion, markedly improves model performance and generalizability. In the benchmark test, GeoDDG and GeoDTm outperform the other state-of-the-art methods by at least 30% and 70%, respectively, in terms of the Spearman correlation coefficient.
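For readers unfamiliar with the benchmark metric named in this abstract, the following is a minimal sketch of the Spearman correlation coefficient, computed as the Pearson correlation of average ranks so that ties are handled. The function names and the example ΔΔG values are assumptions made for illustration; this is not the evaluation code used by the authors.

```python
import math

# Minimal sketch of the Spearman rank correlation (illustrative only).
# The experimental and predicted values at the bottom are hypothetical.

def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical experimental and predicted ddG values for five mutations.
experimental = [0.3, 1.8, -0.5, 2.4, 0.9]
predicted = [0.1, 1.2, -0.2, 3.0, 1.5]
print(f"Spearman correlation: {spearman(experimental, predicted):.3f}")
```

Because the metric depends only on ranks, a predictor can score well while being off in absolute scale, which is worth keeping in mind when comparing reported improvements.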
Collapse
Affiliation(s)
- Yunxin Xu
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China
| | - Di Liu
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China
| | - Haipeng Gong
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China.
- Beijing Frontier Research Center for Biological Structure, Tsinghua University, Beijing, China.
|
11
|
Tian R, Nájera-González HR, Nigam D, Khan A, Chen J, Xin Z, Herrera-Estrella L, Jiao Y. Leucine-rich repeat receptor kinase BM41 regulates cuticular wax deposition in sorghum. JOURNAL OF EXPERIMENTAL BOTANY 2024; 75:6331-6345. [PMID: 39041593 DOI: 10.1093/jxb/erae319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Accepted: 07/22/2024] [Indexed: 07/24/2024]
Abstract
Cuticular wax (CW) is the first defensive barrier of plants: it forms a waterproof layer, protects the plant from desiccation, and defends against insects, pathogens, and UV radiation. Sorghum, an important grass crop with high heat and drought tolerance, exhibits a much higher wax load than other grasses and the model plant Arabidopsis. In this study, we explored the regulation of sorghum CW biosynthesis using a bloomless mutant. The CW on leaf sheaths of the bloomless 41 (bm41) mutant showed significantly reduced very long-chain fatty acids (VLCFAs), triterpenoids, alcohols, and other wax components, with an overall 86% decrease in total wax content compared with the wild type. Notably, the 28-carbon and 30-carbon VLCFAs were decreased in the mutants. Using bulk segregant analysis, we identified the causal gene of the bloomless phenotype as a leucine-rich repeat transmembrane protein kinase. Transcriptome analysis of the wild-type and bm41 mutant leaf sheaths revealed BM41 as a positive regulator of lipid biosynthesis and steroid metabolism. BM41 may regulate CW biosynthesis by regulating the expression of the gene encoding 3-ketoacyl-CoA synthase 6. Identification of BM41 as a new regulator of CW biosynthesis provides fundamental knowledge for improving grass crops' heat and drought tolerance by increasing CW.
Affiliation(s)
- Ran Tian
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX 79409, USA
| | - Héctor-Rogelio Nájera-González
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX 79409, USA
| | - Deepti Nigam
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX 79409, USA
| | - Adil Khan
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX 79409, USA
| | - Junping Chen
- Plant Stress and Germplasm Development Unit, Crop Systems Research Laboratory, USDA-ARS, 3810, 4th Street, Lubbock, TX 79415, USA
| | - Zhanguo Xin
- Plant Stress and Germplasm Development Unit, Crop Systems Research Laboratory, USDA-ARS, 3810, 4th Street, Lubbock, TX 79415, USA
| | - Luis Herrera-Estrella
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX 79409, USA
| | - Yinping Jiao
- Department of Plant and Soil Science, Institute of Genomics for Crop Abiotic Stress Tolerance (IGCAST), Texas Tech University, Lubbock, TX 79409, USA
|
12
|
AlSaeed MJ, Ramdhan P, Malave JG, Eljilany I, Langaee T, McDonough CW, Seabra G, Li C, Cavallari LH. Assessing the Performance of In silico Tools and Molecular Dynamics Simulations for Predicting Pharmacogenetic Variant Impact. Clin Pharmacol Ther 2024; 116:1082-1089. [PMID: 38894625 DOI: 10.1002/cpt.3348] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 06/02/2024] [Indexed: 06/21/2024]
Abstract
The ability of freely available in silico tools to predict the effect of non-synonymous single nucleotide polymorphisms (nsSNPs) in pharmacogenes on protein function is not well defined. We assessed the performance of seven sequence-based (SIFT, PolyPhen2, Mutation Assessor, FATHMM, PhD-SNP, MutPred2, and SNPs&GO) and five structure-based (mCSM, SDM, DDGun, CUPSAT, and MAESTROweb) tools in predicting the impact of 118 nsSNPs in the CYP2C19, CYP2C9, CYP2B6, CYP2D6, and DPYD genes with known function (24 normal, one increased, 42 decreased, and 51 no-function). Sequence-based tools had a higher median (IQR) positive predictive value (89% [89-94%] vs. 12% [10-15%], P < 0.001) and lower negative predictive value (30% [24-34%] vs. 90% [80-93%], P < 0.001) than structure-based tools. Accuracy did not significantly differ between sequence-based (59% [37-67%]) and structure-based (34% [23-44%]) tools (P = 0.070). Notably, the no-function CYP2C9*3 allele and decreased-function CYP2C9*8 allele were predicted incorrectly as tolerated by 100% of sequence-based tools and as stabilizing by 60% and 20% of structure-based tools, respectively. As a case study, we performed mutational analysis for the CYP2C9*1, *3 (I359L), and *8 (R150H) proteins through molecular dynamics (MD) simulations using S-warfarin as the substrate. The I359L variant increased the distance of the major metabolic site of S-warfarin to the oxy-ferryl center of CYP2C9, and I359L and R150H caused shifts in the conformation of S-warfarin to a position less favorable for metabolism. These data suggest that MD simulations may better capture the impact of nsSNPs in pharmacogenes than other tools.
Affiliation(s)
- Maryam Jamal AlSaeed
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics and Precision Medicine, College of Pharmacy, University of Florida, Gainesville, Florida, USA
- Department of Pharmacy Practice, College of Clinical Pharmacy, King Faisal University, Al Hofuf, Saudi Arabia
| | - Peter Ramdhan
- Department of Medicinal Chemistry, College of Pharmacy, University of Florida, Gainesville, Florida, USA
| | - Jean Gabriel Malave
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics and Precision Medicine, College of Pharmacy, University of Florida, Gainesville, Florida, USA
| | - Islam Eljilany
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics and Precision Medicine, College of Pharmacy, University of Florida, Gainesville, Florida, USA
| | - Taimour Langaee
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics and Precision Medicine, College of Pharmacy, University of Florida, Gainesville, Florida, USA
| | - Caitrin W McDonough
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics and Precision Medicine, College of Pharmacy, University of Florida, Gainesville, Florida, USA
| | - Gustavo Seabra
- Department of Medicinal Chemistry, College of Pharmacy, University of Florida, Gainesville, Florida, USA
| | - Chenglong Li
- Department of Medicinal Chemistry, College of Pharmacy, University of Florida, Gainesville, Florida, USA
| | - Larisa H Cavallari
- Department of Pharmacotherapy and Translational Research and Center for Pharmacogenomics and Precision Medicine, College of Pharmacy, University of Florida, Gainesville, Florida, USA
|
13
|
Jakhan J, Kojom Foko LP, Narang G, Singh V. Glucose-6-phosphate Dehydrogenase Variants: Analysing in Indian Plasmodium vivax Patients. Acta Parasitol 2024; 69:1522-1529. [PMID: 39164542 DOI: 10.1007/s11686-024-00883-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 07/30/2024] [Indexed: 08/22/2024]
Abstract
PURPOSE Primaquine (PQ) is recommended for radical cure of Plasmodium vivax (Pv) malaria, but its utilization is still limited due to the high risk of severe haemolytic anaemia in patients with glucose-6-phosphate dehydrogenase deficiency (G6PD-d). The aim of the present study is to assess the different genotypic variants leading to G6PD-d in the Delhi and Goa regions of India. METHODS A total of 46 samples (34 retrospective Pv-mono-infected samples and 12 Pv-uninfected samples) were included in the study. Various genetic variants leading to G6PD-d were analysed by PCR amplification and DNA sequencing of different targeted exons of the G6PD gene. RESULTS Molecular analysis showed the presence of four mutations in the study population: 1311 C > T (34.1%), IVSXI 93T > C (45.5%), and two novel mutations, 1388G > T (2.3%) and 1398 C > T (2.3%; silent mutation). The bioinformatics and computational analysis demonstrate that the slight conformational changes caused by the R643L mutation in the protein are deleterious in nature. CONCLUSION The observed mutations do not clarify the role or association between G6PD-d and Pv-infected cases. Further investigation is required in order to fully comprehend and analyse the precise role of these mutations in the context of malaria infections.
Affiliation(s)
- Jahnvi Jakhan
- ICMR-National Institute of Malaria Research (NIMR), Dwarka, Sector-8, New Delhi, 110077, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
| | - Loick Pradel Kojom Foko
- ICMR-National Institute of Malaria Research (NIMR), Dwarka, Sector-8, New Delhi, 110077, India
| | - Geetika Narang
- ICMR-National Institute of Malaria Research (NIMR), Dwarka, Sector-8, New Delhi, 110077, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
| | - Vineeta Singh
- ICMR-National Institute of Malaria Research (NIMR), Dwarka, Sector-8, New Delhi, 110077, India.
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India.
|
14
|
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors. Hum Genomics 2024; 18:90. [PMID: 39198917 PMCID: PMC11360829 DOI: 10.1186/s40246-024-00663-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2024] [Accepted: 08/19/2024] [Indexed: 09/01/2024] Open
Abstract
BACKGROUND Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). RESULTS The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past three decades, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 190 VIPs, resulting in a total of 407 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. CONCLUSIONS VIPdb version 2 summarizes 407 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. VIPdb is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- Center for Computational Biology, University of California, Berkeley, CA, 94720, USA
| | - Arul S Menon
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA
| | - Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA
- Illumina, Foster City, CA, 94404, USA
| | - Steven E Brenner
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA.
- Center for Computational Biology, University of California, Berkeley, CA, 94720, USA.
- College of Computing, Data Science, and Society, University of California, Berkeley, CA, 94720, USA.
- Department of Plant and Microbial Biology, University of California, 111 Koshland Hall #3102, Berkeley, CA, 94720-3102, USA.
|
15
|
Li SS, Liu ZM, Li J, Ma YB, Dong ZY, Hou JW, Shen FJ, Wang WB, Li QM, Su JG. Prediction of mutation-induced protein stability changes based on the geometric representations learned by a self-supervised method. BMC Bioinformatics 2024; 25:282. [PMID: 39198740 PMCID: PMC11360314 DOI: 10.1186/s12859-024-05876-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 07/19/2024] [Indexed: 09/01/2024] Open
Abstract
BACKGROUND Thermostability is a fundamental property of proteins to maintain their biological functions. Predicting protein stability changes upon mutation is important for understanding protein structure-function relationships, and is also of great interest in protein engineering and pharmaceutical design. RESULTS Here we present mutDDG-SSM, a deep learning-based framework that uses the geometric representations encoded in protein structure to predict mutation-induced protein stability changes. mutDDG-SSM consists of two parts: a graph attention network-based protein structural feature extractor that is trained with a self-supervised learning scheme using large-scale high-resolution protein structures, and an eXtreme Gradient Boosting model-based stability change predictor, which has the advantage of alleviating the overfitting problem. The performance of mutDDG-SSM was tested on several widely used independent datasets. Then, myoglobin and p53 were used as case studies to illustrate the effectiveness of the model in predicting protein stability changes upon mutations. Our results show that mutDDG-SSM achieved high performance in estimating the effects of mutations on protein stability. In addition, mutDDG-SSM exhibited good unbiasedness: the prediction accuracy on the inverse mutations is as good as that on the direct mutations. CONCLUSION Meaningful features can be extracted from our pre-trained model to build downstream tasks, and our model may serve as a valuable tool for protein engineering and drug design.
Affiliation(s)
- Shan Shan Li
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Zhao Ming Liu
- National Engineering Center for New Vaccine Research, Beijing, China
- The Sixth Laboratory, National Vaccine and Serum Institute (NVSI), Beijing, China
| | - Jiao Li
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Yi Bo Ma
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Ze Yuan Dong
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Jun Wei Hou
- National Engineering Center for New Vaccine Research, Beijing, China
- The Sixth Laboratory, National Vaccine and Serum Institute (NVSI), Beijing, China
| | - Fu Jie Shen
- National Engineering Center for New Vaccine Research, Beijing, China
- The Sixth Laboratory, National Vaccine and Serum Institute (NVSI), Beijing, China
| | - Wei Bu Wang
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Qi Ming Li
- National Engineering Center for New Vaccine Research, Beijing, China.
- The Sixth Laboratory, National Vaccine and Serum Institute (NVSI), Beijing, China.
| | - Ji Guo Su
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China.
- National Engineering Center for New Vaccine Research, Beijing, China.
|
16
|
Dieckhaus H, Kuhlman B. Protein stability models fail to capture epistatic interactions of double point mutations. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.20.608844. [PMID: 39229177 PMCID: PMC11370451 DOI: 10.1101/2024.08.20.608844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
There is strong interest in accurate methods for predicting changes in protein stability resulting from amino acid mutations to the protein sequence. Recombinant proteins must often be stabilized to be used as therapeutics or reagents, and destabilizing mutations are implicated in a variety of diseases. Due to increased data availability and improved modeling techniques, recent studies have shown advancements in predicting changes in protein stability when a single point mutation is made. Less focus has been directed toward predicting changes in protein stability when there are two or more mutations, despite the significance of mutation clusters for disease pathways and protein design studies. Here, we analyze the largest available dataset of double point mutation stability and benchmark several widely used protein stability models on this and other datasets. We identify a blind spot in how predictors are typically evaluated on multiple mutations, finding that, contrary to assumptions in the field, current stability models are unable to consistently capture epistatic interactions between double mutations. We observe one notable deviation from this trend, which is that epistasis-aware models provide marginally better predictions on stabilizing double point mutations. We develop an extension of the ThermoMPNN framework for double mutant modeling as well as a novel data augmentation scheme which mitigates some of the limitations in available datasets. Collectively, our findings indicate that current protein stability models fail to capture the nuanced epistatic interactions between concurrent mutations due to several factors, including training dataset limitations and insufficient model sensitivity.
Affiliation(s)
- Henry Dieckhaus
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Division of Chemical Biology and Medicinal Chemistry, University of North Carolina Eshelman School of Pharmacy, Chapel Hill, North Carolina, USA
| | - Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
|
17
|
Li G, Zhang N, Dai X, Fan L. EnzyACT: A Novel Deep Learning Method to Predict the Impacts of Single and Multiple Mutations on Enzyme Activity. J Chem Inf Model 2024; 64:5912-5921. [PMID: 39038814 PMCID: PMC11323264 DOI: 10.1021/acs.jcim.4c00920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2024] [Revised: 07/01/2024] [Accepted: 07/09/2024] [Indexed: 07/24/2024]
Abstract
Enzyme engineering involves the customization of enzymes by introducing mutations to expand the application scope of natural enzymes. One limitation is the complex interaction between two key properties, activity and stability, where the enhancement of one often leads to the reduction of the other, also called the trade-off mechanism. Although dozens of methods that predict the change of protein stability upon mutations have been developed, the prediction of the effect on activity is still in its early stage. Therefore, developing a fast and accurate method to predict the impact of mutations on enzyme activity is helpful for enzyme design and understanding of the trade-off mechanism. Here, we introduce EnzyACT, a deep learning method that fuses graph-based techniques with protein language model embeddings to predict activity changes upon single or multiple mutations. Moreover, EnzyACT is trained on a newly curated data set including both single- and multiple-point mutations. When benchmarked on multiple independent data sets, it shows uniform performance on problems affected by mutations. This work also provides insights into the impact of distant mutations within activity design, which could also be useful for predicting catalytic residues and developing improved enzyme-engineering strategies.
Affiliation(s)
- Gen Li
- Production and R&D Center I of LSS, GenScript (Shanghai) Biotech Co., Ltd., Shanghai 200131, China
| | - Ning Zhang
- Production and R&D Center I of LSS, GenScript Biotech Corporation, Nanjing 211122, China
| | - Xiaowen Dai
- Production and R&D Center I of LSS, GenScript Biotech Corporation, Nanjing 211122, China
| | - Long Fan
- Production and R&D Center I of LSS, GenScript (Shanghai) Biotech Co., Ltd., Shanghai 200131, China
|
18
|
Diaz DJ, Gong C, Ouyang-Zhang J, Loy JM, Wells J, Yang D, Ellington AD, Dimakis AG, Klivans AR. Stability Oracle: a structure-based graph-transformer framework for identifying stabilizing mutations. Nat Commun 2024; 15:6170. [PMID: 39043654 PMCID: PMC11266546 DOI: 10.1038/s41467-024-49780-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Accepted: 06/14/2024] [Indexed: 07/25/2024] Open
Abstract
Engineering stabilized proteins is a fundamental challenge in the development of industrial and pharmaceutical biotechnologies. We present Stability Oracle: a structure-based graph-transformer framework that achieves state-of-the-art (SOTA) performance in identifying thermodynamically stabilizing mutations. Our framework introduces several innovations to overcome well-known challenges in data scarcity and bias, generalization, and computation time, such as Thermodynamic Permutations for data augmentation, structural amino acid embeddings to model a mutation with a single structure, and a protein structure-specific attention-bias mechanism that makes transformers a viable alternative to graph neural networks. We provide training/test splits that mitigate data leakage and ensure proper model evaluation. Furthermore, to examine our data engineering contributions, we fine-tune ESM2 representations (Prostata-IFML) and achieve SOTA for sequence-based models. Notably, Stability Oracle outperforms Prostata-IFML even though it was pretrained on 2000X fewer proteins and has 548X fewer parameters. Our framework establishes a path for fine-tuning structure-based transformers to virtually any phenotype, a necessary task for accelerating the development of protein-based biotechnologies.
Affiliation(s)
- Daniel J Diaz
- UT Austin, Department of Computer Science, Austin, TX, 78712, USA.
- Intelligent Proteins, LLC, Austin, TX, 78712, USA.
- UT Austin, Department of Chemistry, Austin, TX, 78712, USA.
| | - Chengyue Gong
- UT Austin, Department of Computer Science, Austin, TX, 78712, USA
| | | | - James M Loy
- Intelligent Proteins, LLC, Austin, TX, 78712, USA
- UT Austin, Department of Molecular Biosciences, Austin, TX, 78712, USA
| | - Jordan Wells
- UT Austin, McKetta Department of Chemical Engineering, Austin, TX, 78712, USA
| | - David Yang
- UT Austin, Department of Molecular Biosciences, Austin, TX, 78712, USA
| | | | - Alexandros G Dimakis
- UT Austin, Chandra Family Department of Electrical and Computer Engineering, Austin, TX, 78712, USA
| | - Adam R Klivans
- UT Austin, Department of Computer Science, Austin, TX, 78712, USA
|
19
|
Montemorano L, Shultz ZB, Farooque A, Hyun M, Chappell RJ, Hartenbach EM, Lang JD. TP53 mutations and the association with platinum resistance in high grade serous ovarian carcinoma. Gynecol Oncol 2024; 186:26-34. [PMID: 38555766 PMCID: PMC11216889 DOI: 10.1016/j.ygyno.2024.03.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Revised: 03/21/2024] [Accepted: 03/24/2024] [Indexed: 04/02/2024]
Abstract
OBJECTIVES Alterations in the tumor suppressor TP53 gene are the most common mutations in high grade serous ovarian carcinoma. The impact of TP53 mutations on clinical outcomes and platinum resistance is controversial. We sought to evaluate the genomic profile of high grade serous ovarian carcinoma and explore the association of TP53 mutations with platinum resistance. METHODS Next generation sequencing data were obtained from our institutional database for patients with high grade serous ovarian carcinoma undergoing primary treatment. Sequencing data and demographic and clinical information were reviewed. The primary outcome analyzed was time to recurrence or refractory diagnosis. Associations between the primary outcome and different classification schemes for TP53 mutations (structural, functional, hot spot, pathogenicity scores, immunohistochemical staining patterns) were assessed. RESULTS 209 patients met inclusion criteria. TP53 mutations were the most common mutation. There were no differences in platinum response with TP53 hotspot mutations or high pathogenicity scores. Presence of TP53 gain-of-function mutations or measure of TP53 gain-of-function activity were not associated with platinum resistance. Immunohistochemical staining patterns correlated with expected TP53 protein function and were not associated with platinum resistance. CONCLUSIONS TP53 hotspot mutations or high pathogenicity scores were not associated with platinum resistance or refractory disease. Contrary to prior studies, TP53 gain-of-function mutations were not associated with platinum resistance. Estimation of TP53 gain-of-function effect using missense mutation phenotype scores was not associated with platinum resistance. The polymorphic nature of TP53 mutations may be too complex to demonstrate effect using simple models, or response to platinum therapy may be independent of initiating TP53 mutation.
Affiliation(s)
- Lauren Montemorano
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, University of Wisconsin, Madison, WI, USA.
| | - Zoey B Shultz
- Department of Obstetrics and Gynecology, University of Minnesota, Minneapolis, MN, USA
| | - Alma Farooque
- Department of Obstetrics and Gynecology, University of Wisconsin, Madison, WI, USA
| | - Meredith Hyun
- Department of Biostatistics and Medical Informatics, School of Medicine and Public Health, University of Wisconsin, Madison, WI, USA
| | - Richard J Chappell
- Department of Biostatistics and Medical Informatics, School of Medicine and Public Health, University of Wisconsin, Madison, WI, USA
| | - Ellen M Hartenbach
- Division of Gynecologic Oncology, Department of Obstetrics and Gynecology, University of Wisconsin, Madison, WI, USA
| | - Jessica D Lang
- Center for Human Genomics & Precision Medicine, Department of Pathology & Laboratory Medicine, University of Wisconsin, Madison, WI, USA
|
20
|
Lin YJ, Menon AS, Hu Z, Brenner SE. Variant Impact Predictor database (VIPdb), version 2: Trends from 25 years of genetic variant impact predictors. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.06.25.600283. [PMID: 38979289 PMCID: PMC11230257 DOI: 10.1101/2024.06.25.600283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
BACKGROUND Variant interpretation is essential for identifying patients' disease-causing genetic variants amongst the millions detected in their genomes. Hundreds of Variant Impact Predictors (VIPs), also known as Variant Effect Predictors (VEPs), have been developed for this purpose, with a variety of methodologies and goals. To facilitate the exploration of available VIP options, we have created the Variant Impact Predictor database (VIPdb). RESULTS The Variant Impact Predictor database (VIPdb) version 2 presents a collection of VIPs developed over the past 25 years, summarizing their characteristics, ClinGen calibrated scores, CAGI assessment results, publication details, access information, and citation patterns. We previously summarized 217 VIPs and their features in VIPdb in 2019. Building upon this foundation, we identified and categorized an additional 186 VIPs, resulting in a total of 403 VIPs in VIPdb version 2. The majority of the VIPs have the capacity to predict the impacts of single nucleotide variants and nonsynonymous variants. More VIPs tailored to predict the impacts of insertions and deletions have been developed since the 2010s. In contrast, relatively few VIPs are dedicated to the prediction of splicing, structural, synonymous, and regulatory variants. The increasing rate of citations to VIPs reflects the ongoing growth in their use, and the evolving trends in citations reveal development in the field and individual methods. CONCLUSIONS VIPdb version 2 summarizes 403 VIPs and their features, potentially facilitating VIP exploration for various variant interpretation applications. AVAILABILITY VIPdb version 2 is available at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Yu-Jen Lin
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
| | - Arul S. Menon
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
| | - Zhiqiang Hu
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Currently at: Illumina, Foster City, California 94404, USA
| | - Steven E. Brenner
- Department of Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
- Center for Computational Biology, University of California, Berkeley, California 94720, USA
- College of Computing, Data Science, and Society, University of California, Berkeley, California 94720, USA
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
| |
|
21
|
Yu G, Zhao Q, Bi X, Wang J. DDAffinity: predicting the changes in binding affinity of multiple point mutations using protein 3D structure. Bioinformatics 2024; 40:i418-i427. [PMID: 38940145 PMCID: PMC11211828 DOI: 10.1093/bioinformatics/btae232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION: Mutations are the crucial driving force for biological evolution as they can disrupt protein stability and protein-protein interactions which have notable impacts on protein structure, function, and expression. However, existing computational methods for protein mutation effects prediction are generally limited to single point mutations with global dependencies, and do not systematically take into account the local and global synergistic epistasis inherent in multiple point mutations. RESULTS: To this end, we propose a novel spatial and sequential message passing neural network, named DDAffinity, to predict the changes in binding affinity caused by multiple point mutations based on protein 3D structures. Specifically, instead of being on the whole protein, we perform message passing on the k-nearest neighbor residue graphs to extract pocket features of the protein 3D structures. Furthermore, to learn global topological features, a two-step additive Gaussian noising strategy during training is applied to blur out local details of protein geometry. We evaluate DDAffinity on benchmark datasets and external validation datasets. Overall, the predictive performance of DDAffinity is significantly improved compared with state-of-the-art baselines on multiple point mutations, including end-to-end and pre-training based methods. The ablation studies indicate the reasonable design of all components of DDAffinity. In addition, applications in nonredundant blind testing, predicting mutation effects of SARS-CoV-2 RBD variants, and optimizing human antibody against SARS-CoV-2 illustrate the effectiveness of DDAffinity. AVAILABILITY AND IMPLEMENTATION: DDAffinity is available at https://github.com/ak422/DDAffinity.
Affiliation(s)
- Guanglei Yu
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
- Medical Engineering and Technology College, Xinjiang Medical University, Urumqi 830017, China
| | - Qichang Zhao
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
| | - Xuehua Bi
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
- Medical Engineering and Technology College, Xinjiang Medical University, Urumqi 830017, China
| | - Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha 410083, China
| |
|
22
|
Sun X, Yang S, Wu Z, Su J, Hu F, Chang F, Li C. PMSPcnn: Predicting protein stability changes upon single point mutations with convolutional neural network. Structure 2024; 32:838-848.e3. [PMID: 38508191 DOI: 10.1016/j.str.2024.02.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 12/19/2023] [Accepted: 02/22/2024] [Indexed: 03/22/2024]
Abstract
Protein missense mutations and resulting protein stability changes are important causes for many human genetic diseases. However, the accurate prediction of stability changes due to mutations remains a challenging problem. To address this problem, we have developed an unbiased effective model: PMSPcnn that is based on a convolutional neural network. We have included an anti-symmetry property to build a balanced training dataset, which improves the prediction, in particular for stabilizing mutations. Persistent homology, which is an effective approach for characterizing protein structures, is used to obtain topological features. Additionally, a regression stratification cross-validation scheme has been proposed to improve the prediction for mutations with extreme ΔΔG. For three test datasets: Ssym, p53, and myoglobin, PMSPcnn achieves a better performance than currently existing predictors. PMSPcnn also outperforms currently available methods for membrane proteins. Overall, PMSPcnn is a promising method for the prediction of protein stability changes caused by single point mutations.
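The anti-symmetry property mentioned in the abstract above says that if a mutation changes stability by ΔΔG, the reverse mutation should change it by -ΔΔG. A minimal sketch of how a balanced training set could be built from that property follows. The `Mutation` record and `add_reverse_mutations` helper are hypothetical names used only for illustration, not the PMSPcnn implementation, and a real pipeline would also swap in the mutant structure for each reverse entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """Hypothetical record for one single point mutation."""
    wild_type: str  # residue before the mutation, e.g. "A"
    position: int   # residue index in the sequence
    mutant: str     # residue after the mutation, e.g. "G"
    ddg: float      # measured stability change (kcal/mol)


def add_reverse_mutations(records):
    """Return the input records plus one reverse mutation per record.

    Each reverse entry swaps wild-type and mutant residues and negates
    the stability change, which is the anti-symmetry assumption.
    """
    augmented = list(records)
    for m in records:
        augmented.append(Mutation(m.mutant, m.position, m.wild_type, -m.ddg))
    return augmented


# Illustrative values only, not taken from any dataset.
forward = [Mutation("A", 10, "G", 1.2), Mutation("L", 45, "F", -0.4)]
balanced = add_reverse_mutations(forward)
```

After augmentation, every stability change in the set is matched by one of opposite sign, so neither stabilizing nor destabilizing labels dominate.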
Affiliation(s)
- Xiaohan Sun
- College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
| | - Shuang Yang
- College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
| | - Zhixiang Wu
- College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
| | - Jingjie Su
- College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
| | - Fangrui Hu
- College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
| | - Fubin Chang
- College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
| | - Chunhua Li
- College of Chemistry and Life Science, Beijing University of Technology, Beijing 100124, China
| |
|
23
|
Wang D, Huot M, Mohanty V, Shakhnovich EI. Biophysical principles predict fitness of SARS-CoV-2 variants. Proc Natl Acad Sci U S A 2024; 121:e2314518121. [PMID: 38820002 PMCID: PMC11161772 DOI: 10.1073/pnas.2314518121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Accepted: 04/19/2024] [Indexed: 06/02/2024] Open
Abstract
SARS-CoV-2 employs its spike protein's receptor binding domain (RBD) to enter host cells. The RBD is constantly subjected to immune responses, while requiring efficient binding to host cell receptors for successful infection. However, our understanding of how RBD's biophysical properties contribute to SARS-CoV-2's epidemiological fitness remains largely incomplete. Through a comprehensive approach, comprising large-scale sequence analysis of SARS-CoV-2 variants and the identification of a fitness function based on binding thermodynamics, we unravel the relationship between the biophysical properties of RBD variants and their contribution to viral fitness. We developed a biophysical model that uses statistical mechanics to map the molecular phenotype space, characterized by dissociation constants of RBD to ACE2, LY-CoV016, LY-CoV555, REGN10987, and S309, onto an epistatic fitness landscape. We validate our findings through experimentally measured and machine learning (ML) estimated binding affinities, coupled with infectivity data derived from population-level sequencing. Our analysis reveals that this model effectively predicts the fitness of novel RBD variants and can account for the epistatic interactions among mutations, including explaining the later reversal of Q493R. Our study sheds light on the impact of specific mutations on viral fitness and delivers a tool for predicting the future epidemiological trajectory of previously unseen or emerging low-frequency variants. These insights offer not only greater understanding of viral evolution but also potentially aid in guiding public health decisions in the battle against COVID-19 and future pandemics.
Affiliation(s)
- Dianzhuo Wang
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA 02138
| | - Marian Huot
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138
- École Polytechnique, Institut Polytechnique de Paris, Palaiseau 91128, France
| | - Vaibhav Mohanty
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138
- Harvard/MIT MD-PhD Program, Harvard Medical School, Boston, MA 02115
- Massachusetts Institute of Technology, Cambridge, MA 02139
| | - Eugene I. Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138
| |
|
24
|
Montanucci L, Brünger T, Bhattarai N, Boßelmann CM, Kim S, Allen JP, Zhang J, Klöckner C, Fariselli P, May P, Lemke JR, Myers SJ, Yuan H, Traynelis SF, Lal D. Distances from ligands as main predictive features for pathogenicity and functional effect of variants in NMDA receptors. medRxiv 2024:2024.05.06.24306939. [PMID: 38766179 PMCID: PMC11100844 DOI: 10.1101/2024.05.06.24306939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Genetic variants in the genes GRIN1, GRIN2A, GRIN2B, and GRIN2D, which encode subunits of the N-methyl-D-aspartate receptor (NMDAR), have been associated with severe and heterogeneous neurologic diseases. Missense variants in these genes can result in gain or loss of NMDAR function, requiring opposite therapeutic treatments. Computational methods that predict pathogenicity and molecular functional effects are therefore crucial for accurate diagnosis and therapeutic applications. We assembled missense variants: 201 from patients, 631 from the general population, and 159 characterized by electrophysiological readouts showing whether they enhance or reduce receptor function. This includes new functional data from 47 variants reported here for the first time. We found that pathogenic/benign variants and variants that increase/decrease channel function were distributed unevenly on the protein structure, with spatial proximity to ligands bound to the agonist and antagonist binding sites being key predictive features. Leveraging distances from ligands, we developed two independent machine learning-based predictors for NMDAR missense variants: a pathogenicity predictor which outperforms currently available predictors (AUC=0.945, MCC=0.726), and the first binary predictor of molecular function (increase or decrease) (AUC=0.809, MCC=0.523). Using these, we reclassified variants of uncertain significance in the ClinVar database and refined a previous genome-informed epidemiological model to estimate the birth incidence of molecular mechanism-defined GRIN disorders. Our findings demonstrate that distance from ligands is an important feature in NMDARs that can enhance variant pathogenicity prediction and enable functional prediction. Further studies with larger numbers of phenotypically and functionally characterized variants will enhance the potential clinical utility of this method.
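The abstract above reports predictor quality as MCC (Matthews correlation coefficient) alongside AUC. For readers unfamiliar with the metric, the following sketch computes MCC from the four cells of a binary confusion matrix using the standard formula. The function name `mcc` and the example counts are illustrative assumptions and are not taken from the study.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier.

    tp, tn, fp, fn are counts of true positives, true negatives,
    false positives, and false negatives. Returns a value in [-1, 1];
    0.0 is returned when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Illustrative counts only.
perfect = mcc(tp=50, tn=50, fp=0, fn=0)
chance = mcc(tp=25, tn=25, fp=25, fn=25)
inverted = mcc(tp=0, tn=0, fp=50, fn=50)
```

Unlike accuracy, MCC stays informative when the two classes are unbalanced, which matters for a set with 201 patient variants against 631 population variants.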
|
25
|
Orlando G, Serrano L, Schymkowitz J, Rousseau F. Integrating physics in deep learning algorithms: a force field as a PyTorch module. Bioinformatics 2024; 40:btae160. [PMID: 38514422 PMCID: PMC11007235 DOI: 10.1093/bioinformatics/btae160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 02/08/2024] [Accepted: 03/19/2024] [Indexed: 03/23/2024] Open
Abstract
MOTIVATION: Deep learning algorithms applied to structural biology often struggle to converge to meaningful solutions when limited data is available, since they are required to learn complex physical rules from examples. State-of-the-art force fields, however, cannot interface with deep learning algorithms due to their implementation. RESULTS: We present MadraX, a force field implemented as a differentiable PyTorch module, able to interact with deep learning algorithms in an end-to-end fashion. AVAILABILITY AND IMPLEMENTATION: MadraX documentation, together with tutorials and installation guide, is available at madrax.readthedocs.io.
Affiliation(s)
- Gabriele Orlando
- Switch Laboratory, VIB Center for Brain and Disease Research, VIB, Leuven 3000, Belgium
- Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven 3000, Belgium
- Switch Laboratory, VIB Center for AI & Computational Biology, VIB, Leuven 3000, Belgium
| | - Luis Serrano
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
| | - Joost Schymkowitz
- Switch Laboratory, VIB Center for Brain and Disease Research, VIB, Leuven 3000, Belgium
- Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven 3000, Belgium
- Switch Laboratory, VIB Center for AI & Computational Biology, VIB, Leuven 3000, Belgium
| | - Frederic Rousseau
- Switch Laboratory, VIB Center for Brain and Disease Research, VIB, Leuven 3000, Belgium
- Switch Laboratory, Department of Cellular and Molecular Medicine, KU Leuven, Leuven 3000, Belgium
- Switch Laboratory, VIB Center for AI & Computational Biology, VIB, Leuven 3000, Belgium
| |
|
26
|
Kulshreshtha A, Bhatnagar S. Structural effect of the H992D/H418D mutation of angiotensin-converting enzyme in the Indian population: implications for health and disease. J Biomol Struct Dyn 2024:1-18. [PMID: 38411559 DOI: 10.1080/07391102.2024.2321246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 02/14/2024] [Indexed: 02/28/2024]
Abstract
The nonsynonymous SNPs (nsSNPs) of the renin-angiotensin system (RAS) pathway that are unique to the Indian population were investigated in view of the pathway's importance as an endocrine system. nsSNPs of the RAS pathway genes were mined from the IndiGenome database. Damaging nsSNPs were predicted using SIFT, PredictSNP, SNPs&GO, SNAP2, and Protein Variation Effect Analyzer. Loss of function was predicted based on protein stability change using I-Mutant, PremPS, and ConSurf. The structural impact of the nsSNPs was predicted using HOPE and Missense3D, followed by modeling, refinement, and energy minimization. Molecular dynamics studies were carried out using GROMACS v2021.1. Twenty-three Indian nsSNPs of the RAS pathway genes were selected for structural analysis, and 8 were predicted to be damaging. Further sequence analysis showed that the change of the HEMGH zinc-binding motif to HEMGD in somatic ACE-C domain (sACE-C) H992D and testis ACE (tACE) H418D resulted in loss of zinc coordination, which is essential for enzymatic activity in this metalloprotease. There was a loss of internal interactions around the zinc coordination residues in the protein structural network. This was also confirmed by principal component analysis, free energy landscape, and residue contact maps. Both mutations lead to broadening of the AngI binding cavity. The H992D mutation in sACE-C is likely to be favorable for cardiovascular health, but may lead to renal abnormalities with secondary impact on the heart. H418D in tACE is potentially associated with male infertility. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Akanksha Kulshreshtha
- Computational and Structural Biology Laboratory, Department of Biological Sciences and Engineering, Netaji Subhas University of Technology, Dwarka, New Delhi, India
| | - Sonika Bhatnagar
- Computational and Structural Biology Laboratory, Department of Biological Sciences and Engineering, Netaji Subhas University of Technology, Dwarka, New Delhi, India
| |
|
27
|
Shan MA, Khan MU, Ishtiaq W, Rehman R, Khan S, Javed MA, Ali Q. In silico analysis of the Val66Met mutation in BDNF protein: implications for psychological stress. AMB Express 2024; 14:11. [PMID: 38252222 PMCID: PMC10803716 DOI: 10.1186/s13568-024-01664-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 01/08/2024] [Indexed: 01/23/2024] Open
Abstract
Brain-derived neurotrophic factor (BDNF) is involved in stress regulation and psychiatric disorders. The Val66Met polymorphism in the BDNF gene has been linked to altered protein function and susceptibility to stress-related conditions. This in silico analysis aimed to predict and analyze the consequences of the Val66Met mutation in the BDNF gene of stressed individuals. Computational techniques, including ab initio, comparative, and I-TASSER modeling, were used to evaluate the functional and stability effects of the Val66Met mutation in BDNF. The accuracy and reliability of the models were validated. Sequence alignment and secondary structure analysis compared amino acid residues and structural components. Phylogenetic analysis assessed the conservation of the mutation site. Functional and stability prediction analyses provided mixed results, suggesting potential effects on protein function and stability. Structural models revealed the importance of BDNF in key biological processes. Sequence alignment analysis showed the conservation of amino acid residues across species. Secondary structure analysis indicated minor differences between the wild-type and mutant forms. Phylogenetic analysis supported the evolutionary conservation of the mutation site. This computational study suggests that the Val66Met mutation in BDNF may have implications for protein stability, structural conformation, and function. Further experimental validation is needed to confirm these findings and elucidate the precise effects of this mutation on stress-related disorders.
Affiliation(s)
- Muhammad Adnan Shan
- Center for Applied Molecular Biology, University of the Punjab, Lahore, Pakistan
| | - Muhammad Umer Khan
- Institute of Molecular Biology and Biotechnology, The University of Lahore, Lahore, Pakistan
| | - Warda Ishtiaq
- Center for Applied Molecular Biology, University of the Punjab, Lahore, Pakistan
| | - Raima Rehman
- Center for Applied Molecular Biology, University of the Punjab, Lahore, Pakistan
| | - Samiullah Khan
- Institute of Molecular Biology and Biotechnology, The University of Lahore, Lahore, Pakistan
| | - Muhammad Arshad Javed
- Department of Plant Breeding and Genetics, Faculty of Agricultural Sciences, University of the Punjab Lahore, Lahore, Pakistan
| | - Qurban Ali
- Department of Plant Breeding and Genetics, Faculty of Agricultural Sciences, University of the Punjab Lahore, Lahore, Pakistan
| |
|
28
|
Wang D, Huot M, Mohanty V, Shakhnovich EI. Biophysical principles predict fitness of SARS-CoV-2 variants. bioRxiv 2024:2023.07.23.549087. [PMID: 37577536 PMCID: PMC10418099 DOI: 10.1101/2023.07.23.549087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
SARS-CoV-2 employs its spike protein's receptor binding domain (RBD) to enter host cells. The RBD is constantly subjected to immune responses, while requiring efficient binding to host cell receptors for successful infection. However, our understanding of how RBD's biophysical properties contribute to SARS-CoV-2's epidemiological fitness remains largely incomplete. Through a comprehensive approach, comprising large-scale sequence analysis of SARS-CoV-2 variants and the discovery of a fitness function based on binding thermodynamics, we unravel the relationship between the biophysical properties of RBD variants and their contribution to viral fitness. We developed a biophysical model that uses statistical mechanics to map the molecular phenotype space, characterized by binding constants of RBD to ACE2, LY-CoV016, LY-CoV555, REGN10987, and S309, onto an epistatic fitness landscape. We validate our findings through experimentally measured and machine learning (ML) estimated binding affinities, coupled with infectivity data derived from population-level sequencing. Our analysis reveals that this model effectively predicts the fitness of novel RBD variants and can account for the epistatic interactions among mutations, including explaining the later reversal of Q493R. Our study sheds light on the impact of specific mutations on viral fitness and delivers a tool for predicting the future epidemiological trajectory of previously unseen or emerging low-frequency variants. These insights offer not only greater understanding of viral evolution but also potentially aid in guiding public health decisions in the battle against COVID-19 and future pandemics.
Affiliation(s)
- Dianzhuo Wang
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA
| | - Marian Huot
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA
- Ecole Polytechnique, Institut Polytechnique de Paris
| | - Vaibhav Mohanty
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA
- Harvard-MIT MD-PhD Program and Program in Health Sciences and Technology, Harvard Medical School, Boston, MA and Massachusetts Institute of Technology, Cambridge, MA
| | | |
|
29
|
Yépez Y, Marcano-Ruiz M, Bortolini MC. Adaptive strategies of aquatic mammals: Exploring the role of the HIF pathway and hypoxia tolerance. Genet Mol Biol 2024; 46:e20230140. [PMID: 38252060 PMCID: PMC10802827 DOI: 10.1590/1678-4685-gmb-2023-0140] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Accepted: 12/06/2023] [Indexed: 01/23/2024] Open
Abstract
Aquatic mammals (marine and freshwater species) share significant and similar adaptations, enabling them to tolerate hypoxia during regular breath-hold diving. Despite the established importance of HIF1A, a master regulator in the molecular mechanism of hypoxia response, and other associated genes, their role in the evolutionary adaptation of aquatic mammals is not fully understood. In this study, we investigated this topic by employing a candidate gene approach to analyze 11 critical genes involved in the HIF1A signaling pathway in aquatic mammals. Our gene analyses included evaluating positive and negative selection, relaxation or constriction of selection, and molecular convergence compared to other terrestrial mammals, including subterranean mammals. Evidence of selection suggested a significant role of negative selection, as well as relaxation of the selective regime in cetaceans for most of these genes. We found that the glutamine 68 variant in the HIF3α protein is unique to cetaceans and initial evaluations indicated a destabilizing effect on protein structure. However, further analyses are necessary to evaluate its functional impact and adaptive relevance in this taxon.
Affiliation(s)
- Yuri Yépez
- Universidade Federal do Rio Grande do Sul, Departamento de Genética, Laboratório de Evolução Humana e Molecular, Porto Alegre, RS, Brazil
| | - Mariana Marcano-Ruiz
- Universidade Federal do Rio Grande do Sul, Departamento de Genética, Laboratório de Evolução Humana e Molecular, Porto Alegre, RS, Brazil
| | - Maria Cátira Bortolini
- Universidade Federal do Rio Grande do Sul, Departamento de Genética, Laboratório de Evolução Humana e Molecular, Porto Alegre, RS, Brazil
| |
|
30
|
Liu B, Jiang Y, Yang Y, Chen JX. OmeDDG: Improved Protein Mutation Stability Prediction Based on Predicted 3D Structures. J Phys Chem B 2024; 128:67-76. [PMID: 38130113 DOI: 10.1021/acs.jpcb.3c05601] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Determining changes in the protein's thermal stability following mutations is critical in protein engineering and understanding pathogenic missense mutations. Despite the development of various computational methods to predict the effects of single-point mutations, their accuracy remains limited. In this study, we propose a new computational method, OmeDDG, that more accurately predicts mutation-induced Gibbs free energy changes in protein folding (ΔΔG). OmeDDG takes the sequences of wild-type and mutant proteins as input, utilizes OmegaFold to obtain the 3D structure, employs a convolutional neural network to extract structural features, and combines them with protein mutation features and pretraining features to predict the stability of single-point mutations in proteins. We performed a comprehensive comparison between OmeDDG and other available prediction methods on four blind test datasets, confirming that OmeDDG can effectively enhance protein mutation prediction performance. Notably, on the antisymmetric dataset Ssym, OmeDDG achieves the best performance, demonstrating favorable antisymmetry with PCC = 0.79 and RMSE = 0.96 for forward mutations and PCC = 0.77 and RMSE = 0.97 for reverse mutant types.
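The abstract above summarizes performance with the Pearson correlation coefficient (PCC) and root-mean-square error (RMSE) between predicted and measured ΔΔG. The sketch below shows how those two metrics are computed with the Python standard library. The function names and sample numbers are illustrative assumptions and do not come from the OmeDDG code or its datasets.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rmse(predicted, measured):
    """Root-mean-square error between predictions and measurements."""
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative ddG values (kcal/mol), not real measurements.
measured = [1.0, 2.0, 3.0]
predicted = [2.0, 4.0, 6.0]
pcc_value = pearson(predicted, measured)
rmse_value = rmse([1.0, 2.0], [1.0, 4.0])
```

A high PCC with a large RMSE, as in the toy example, indicates predictions that rank mutations correctly but are off in scale, which is why benchmarks usually report both.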
Affiliation(s)
- Baoying Liu
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
| | - Yongquan Jiang
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
- Artificial Intelligence Research Institute, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
| | - Yan Yang
- School of Computing and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
- Artificial Intelligence Research Institute, Southwest Jiaotong University, Chengdu 611756, Sichuan, China
| | - Jim X Chen
- Department of Computer Science, George Mason University, Fairfax, Virginia 22030-4444, United States
| |
|
31
|
Wang T, Jin X, Lu X, Min X, Ge S, Li S. Empirical validation of ProteinMPNN's efficiency in enhancing protein fitness. Front Genet 2024; 14:1347667. [PMID: 38274106 PMCID: PMC10808456 DOI: 10.3389/fgene.2023.1347667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 12/20/2023] [Indexed: 01/27/2024] Open
Abstract
Introduction: Protein engineering, which aims to improve the properties and functions of proteins, holds great research significance and application value. However, current models that predict the effects of amino acid substitutions often perform poorly when evaluated for precision. Recent research has shown that ProteinMPNN, a large-scale pre-trained sequence design model based on protein structure, performs exceptionally well. It is capable of designing mutants with structures similar to the original protein. When applied to the field of protein engineering, the diverse designs for mutation positions generated by this model can be viewed as a more precise mutation range. Methods: We collected three biological experimental datasets and compared the design results of ProteinMPNN for wild-type proteins with the experimental datasets to verify the ability of ProteinMPNN to improve protein fitness. Results: The validation on biological experimental datasets shows that ProteinMPNN has the ability to design mutation types with higher fitness in single and multi-point mutations. We have verified the high accuracy of ProteinMPNN in protein engineering tasks from both positive and negative perspectives. Discussion: Our research indicates that using large-scale pre-trained models to design protein mutants provides a new approach for protein engineering, providing strong support for guiding biological experiments and applications in biotechnology.
Affiliation(s)
- Tianshu Wang
- School of Informatics, Institute of Artificial Intelligence, Xiamen University, Xiamen, China
- State Key Laboratory of Vaccines for Infectious Diseases, Xiamen University, Xiamen, China
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, Xiamen, China
- State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Xiamen University, Xiamen, China
| | - Xiaocheng Jin
- State Key Laboratory of Vaccines for Infectious Diseases, Xiamen University, Xiamen, China
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, Xiamen, China
- State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Xiamen University, Xiamen, China
- School of Public Health, Xiamen University, Xiamen, China
| | - Xiaoli Lu
- Information and Networking Center, Xiamen University, Xiamen, China
| | - Xiaoping Min
- School of Informatics, Institute of Artificial Intelligence, Xiamen University, Xiamen, China
- State Key Laboratory of Vaccines for Infectious Diseases, Xiamen University, Xiamen, China
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, Xiamen, China
- State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Xiamen University, Xiamen, China
| | - Shengxiang Ge
- State Key Laboratory of Vaccines for Infectious Diseases, Xiamen University, Xiamen, China
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, Xiamen, China
- State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Xiamen University, Xiamen, China
- School of Public Health, Xiamen University, Xiamen, China
| | - Shaowei Li
- State Key Laboratory of Vaccines for Infectious Diseases, Xiamen University, Xiamen, China
- National Institute of Diagnostics and Vaccine Development in Infectious Diseases, Xiamen University, Xiamen, China
- State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics, Xiamen University, Xiamen, China
- School of Public Health, Xiamen University, Xiamen, China
| |
|
32
|
Zheng F, Liu Y, Yang Y, Wen Y, Li M. Assessing computational tools for predicting protein stability changes upon missense mutations using a new dataset. Protein Sci 2024; 33:e4861. [PMID: 38084013 PMCID: PMC10751734 DOI: 10.1002/pro.4861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 11/14/2023] [Accepted: 12/06/2023] [Indexed: 12/28/2023]
Abstract
Insight into how mutations affect protein stability is crucial for protein engineering, understanding genetic diseases, and exploring protein evolution. Numerous computational methods have been developed to predict the impact of amino acid substitutions on protein stability. Nevertheless, comparing these methods poses challenges due to variations in their training data. Moreover, it is observed that they tend to perform better at predicting destabilizing mutations than stabilizing ones. Here, we meticulously compiled a new dataset from three recently published databases: ThermoMutDB, FireProtDB, and ProThermDB. This dataset, which does not overlap with the well-established S2648 dataset, consists of 4038 single-point mutations, including over 1000 stabilizing mutations. We assessed these mutations using 27 computational methods, including the latest ones utilizing mega-scale stability datasets and transfer learning. We excluded entries with overlap or similarity to training datasets to ensure fairness. Pearson correlation coefficients for the tested tools ranged from 0.20 to 0.53 on unseen data, and none of the methods could accurately predict stabilizing mutations, even those performing well in anti-symmetric property analysis. While most methods present consistent trends for predicting destabilizing mutations across various properties such as solvent exposure and secondary conformation, stabilizing mutations do not exhibit a clear pattern. Our study also suggests that solely addressing training dataset bias may not significantly enhance accuracy of predicting stabilizing mutations. These findings emphasize the importance of developing precise predictive methods for stabilizing mutations.
Affiliation(s)
- Feifan Zheng
- MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, China
| | - Yang Liu
- MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, China
| | - Yan Yang
- MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, China
| | - Yuhao Wen
- MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, China
| | - Minghui Li
- MOE Key Laboratory of Geriatric Diseases and Immunology, School of Biology and Basic Medical Sciences, Suzhou Medical College of Soochow University, Suzhou, China
| |
|
33
|
Wang S, Tang H, Shan P, Wu Z, Zuo L. ProS-GNN: Predicting effects of mutations on protein stability using graph neural networks. Comput Biol Chem 2023; 107:107952. [PMID: 37643501 DOI: 10.1016/j.compbiolchem.2023.107952] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/13/2022] [Revised: 08/18/2023] [Accepted: 08/25/2023] [Indexed: 08/31/2023]
Abstract
Predicting protein stability change upon variation through a computational approach is a valuable tool to unveil the mechanisms of mutation-induced drug failure and develop immunotherapy strategies. Some previous machine learning-based techniques exhibit anti-symmetric bias toward destabilizing situations, whereas others struggle with generalization to unseen examples. To address these issues, we propose a gated graph neural network-based approach to predict changes in protein stability upon mutation. The model uses message passing to encode the links between the molecular structure and property after eliminating the non-mutant structure and creating input feature vectors. While doing so, it also incorporates the coordinates of the raw atoms to provide spatial insights into the chemical systems. We test the model on the Ssym, Myoglobin, Broom, and p53 datasets to demonstrate the generalization performance. Compared to existing approaches, our proposed method achieves improved linearity with symmetry in less time. The code for this study is available at: https://github.com/HongzhouTang/Pros-GNN.
Affiliation(s)
- Shuyu Wang
- Department of Control Engineering, Northeastern University, Qinhuangdao Campus, Qinhuangdao 066001, China.
| | - Hongzhou Tang
- Department of Control Engineering, Northeastern University, Qinhuangdao Campus, Qinhuangdao 066001, China
| | - Peng Shan
- Department of Control Engineering, Northeastern University, Qinhuangdao Campus, Qinhuangdao 066001, China
| | - Zhaoxia Wu
- Department of Control Engineering, Northeastern University, Qinhuangdao Campus, Qinhuangdao 066001, China
| | - Lei Zuo
- Department of Marine Engineering, University of Michigan, Ann Arbor 48109, USA
| |
|
34
|
Stein D, Kars ME, Wu Y, Bayrak ÇS, Stenson PD, Cooper DN, Schlessinger A, Itan Y. Genome-wide prediction of pathogenic gain- and loss-of-function variants from ensemble learning of a diverse feature set. Genome Med 2023; 15:103. [PMID: 38037155 PMCID: PMC10688473 DOI: 10.1186/s13073-023-01261-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Accepted: 11/16/2023] [Indexed: 12/02/2023] Open
Abstract
Gain-of-function (GOF) variants give rise to increased/novel protein functions whereas loss-of-function (LOF) variants lead to diminished protein function. Experimental approaches for identifying GOF and LOF are generally slow and costly, whilst available computational methods have not been optimized to discriminate between GOF and LOF variants. We have developed LoGoFunc, a machine learning method for predicting pathogenic GOF, pathogenic LOF, and neutral genetic variants, trained on a broad range of gene-, protein-, and variant-level features describing diverse biological characteristics. LoGoFunc outperforms other tools trained solely to predict pathogenicity for identifying pathogenic GOF and LOF variants and is available at https://itanlab.shinyapps.io/goflof/ .
Affiliation(s)
- David Stein
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Meltem Ece Kars
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Yiming Wu
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- College of Life Science, China West Normal University, Nan Chong, Si Chuan, 637009, China
| | - Çiğdem Sevim Bayrak
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
| | - Peter D Stenson
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, CF14 4XN, UK
| | - David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, CF14 4XN, UK
| | - Avner Schlessinger
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- Department of Artificial Intelligence and Human Health, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
| | - Yuval Itan
- The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
| |
|
35
|
Musil M, Jezik A, Horackova J, Borko S, Kabourek P, Damborsky J, Bednar D. FireProt 2.0: web-based platform for the fully automated design of thermostable proteins. Brief Bioinform 2023; 25:bbad425. [PMID: 38018911 PMCID: PMC10685400 DOI: 10.1093/bib/bbad425] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/31/2023] [Revised: 10/25/2023] [Accepted: 11/01/2023] [Indexed: 11/30/2023] Open
Abstract
Thermostable proteins find their use in numerous biomedical and biotechnological applications. However, the computational design of stable proteins often results in single-point mutations with a limited effect on protein stability. Moreover, the construction of stable multiple-point mutants can prove difficult due to the possibility of antagonistic effects between individual mutations. The FireProt protocol enables the automated computational design of highly stable multiple-point mutants. FireProt 2.0 builds on top of the previously published FireProt web, retaining the original functionality and expanding it with several new stabilization strategies. FireProt 2.0 integrates the AlphaFold database and homology modeling for structure prediction, enabling calculations starting from a sequence. Multiple-point designs are constructed using the Bron-Kerbosch algorithm, minimizing the antagonistic effect between the individual mutations. Users can now limit the FireProt calculation to a set of user-defined mutations, run a saturation mutagenesis of the whole protein, or select rigidifying mutations based on B-factors. The evolution-based back-to-consensus strategy is complemented by ancestral sequence reconstruction. FireProt 2.0 is significantly faster, and a reworked graphical user interface broadens the tool's availability even to users with older hardware. FireProt 2.0 is freely available at http://loschmidt.chemi.muni.cz/fireprotweb.
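The abstract names the Bron-Kerbosch algorithm but does not describe how the underlying graph is built. The sketch below shows one plausible reading, offered as an assumption rather than as FireProt's actual implementation: mutations are nodes, an edge joins two mutations that are not flagged as antagonistic, and each maximal clique is a set of mutually compatible mutations. The mutation labels `M1` to `M4`, the helper `compatible_sets`, and the antagonistic pairs are hypothetical.

```python
def bron_kerbosch(r, p, x, adj, out):
    """Basic Bron-Kerbosch (no pivoting): append every maximal clique to `out`.

    r: current clique, p: candidate vertices, x: already-processed vertices.
    """
    if not p and not x:
        out.append(set(r))
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p = p - {v}
        x = x | {v}


def compatible_sets(mutations, antagonistic_pairs):
    """Return maximal sets of mutations with no antagonistic pair inside.

    Assumption: compatibility is pairwise and given as an explicit list.
    """
    bad = {frozenset(pair) for pair in antagonistic_pairs}
    adj = {m: set() for m in mutations}
    for i, a in enumerate(mutations):
        for b in mutations[i + 1:]:
            if frozenset((a, b)) not in bad:
                adj[a].add(b)
                adj[b].add(a)
    cliques = []
    bron_kerbosch(set(), set(mutations), set(), adj, cliques)
    return cliques


# Hypothetical example: M1 clashes with M2, and M3 clashes with M4.
designs = compatible_sets(["M1", "M2", "M3", "M4"], [("M1", "M2"), ("M3", "M4")])
print(sorted(sorted(d) for d in designs))
```

In a real design pipeline the candidate cliques would still need to be ranked, for example by predicted stability gain, which this sketch leaves out.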
Affiliation(s)
- Milos Musil
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Masaryk University, Brno, Czech Republic
- Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
- International Clinical Research Centre, St. Anne’s University Hospital Brno, Brno, Czech Republic
| | - Andrej Jezik
- Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
| | - Jana Horackova
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Masaryk University, Brno, Czech Republic
| | - Simeon Borko
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Masaryk University, Brno, Czech Republic
- Department of Information Systems, Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic
- International Clinical Research Centre, St. Anne’s University Hospital Brno, Brno, Czech Republic
| | - Petr Kabourek
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Masaryk University, Brno, Czech Republic
- International Clinical Research Centre, St. Anne’s University Hospital Brno, Brno, Czech Republic
| | - Jiri Damborsky
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Masaryk University, Brno, Czech Republic
- International Clinical Research Centre, St. Anne’s University Hospital Brno, Brno, Czech Republic
| | - David Bednar
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Masaryk University, Brno, Czech Republic
- International Clinical Research Centre, St. Anne’s University Hospital Brno, Brno, Czech Republic
| |
|
36
|
Kurniawan J, Ishida T. Comparing Supervised Learning and Rigorous Approach for Predicting Protein Stability upon Point Mutations in Difficult Targets. J Chem Inf Model 2023; 63:6778-6788. [PMID: 37897811 DOI: 10.1021/acs.jcim.3c00750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/30/2023]
Abstract
Accurate prediction of protein stability upon a point mutation has important applications in drug discovery and personalized medicine. It remains a challenging issue in computational biology. Existing computational prediction methods, which range from mechanistic to supervised learning approaches, have experienced limited progress over the last few decades. This stagnation is largely due to their heavy reliance on both the quantity and quality of the training data. This is evident in recent state-of-the-art methods that continue to yield substantial errors on two challenging blind test sets: frataxin and p53, with average root-mean-square errors exceeding 3 and 1.5 kcal/mol, respectively, which is still above the theoretical 1 kcal/mol prediction barrier. Rigorous approaches, on the other hand, offer greater potential for accuracy without relying on training data but are computationally demanding and require both wild-type and mutant structure information. Although they showed high accuracy for conserving mutations, their performance is still limited for charge-changing mutation cases. This might be due to the lack of an available mutant structure, often represented by a simplified capped peptide. The recent advances in protein structure prediction methods now make it possible to obtain structures comparable to experimental ones, including complete mutant structure information. In this work, we compare the performance of supervised learning-based methods and rigorous approaches for predicting protein stability on point mutations in difficult targets: frataxin and p53. The rigorous alchemical method significantly surpasses state-of-the-art techniques in terms of both the root-mean-squared error and Pearson correlation coefficient in these two challenging blind test sets. 
Additionally, we propose an improved alchemical method that employs the pmx double-system/single-box approach to accurately predict the folding free energy change upon both conserving and charge-changing mutations. The enhanced protocol can accurately predict both types of mutations, thereby outperforming existing state-of-the-art methods in overall performance.
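Alchemical approaches of this kind generally obtain the folding free energy change from a thermodynamic cycle: the wild-type to mutant transformation is computed once in the folded state and once in an unfolded-state model, and the two results are subtracted. The sketch below shows only that bookkeeping plus the root-mean-square error used for comparison. It does not reproduce the pmx double-system/single-box setup, and all names and numbers are illustrative assumptions.

```python
import math


def folding_ddg(dg_alch_folded, dg_alch_unfolded):
    """ddG of folding (kcal/mol) from the two alchemical legs of the cycle.

    ddG = dG(wt -> mut, folded) - dG(wt -> mut, unfolded)
    """
    return dg_alch_folded - dg_alch_unfolded


def rmse(predicted, experimental):
    """Root-mean-square error between predicted and experimental values."""
    n = len(predicted)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)


# Made-up alchemical free energies (folded leg, unfolded leg) for two mutations.
legs = [(5.0, 3.5), (-2.0, -1.2)]
predicted = [folding_ddg(f, u) for f, u in legs]
experimental = [1.0, -0.5]  # made-up reference values
print(predicted, round(rmse(predicted, experimental), 3))
```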
Affiliation(s)
- Jason Kurniawan
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo 152-8550, Japan
| | - Takashi Ishida
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo 152-8550, Japan
| |
|
37
|
Azmi MB, Sehgal SA, Asif U, Musani S, Abedin MFE, Suri A, Ahmed SDH, Qureshi SA. Genetic insights into obesity: in silico identification of pathogenic SNPs in MBOAT4 gene and their structural molecular dynamics consequences. J Biomol Struct Dyn 2023; 42:13074-13090. [PMID: 37921712 DOI: 10.1080/07391102.2023.2274970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Accepted: 10/18/2023] [Indexed: 11/04/2023]
Abstract
Membrane Bound O-Acyltransferase Domain-Containing 4 (MBOAT4) protein catalyzes ghrelin acylation, leading to prominent ghrelin activity, hence characterizing its role as an anti-obesity target. We extracted 625 exonic SNPs from the ENSEMBL database and one phenotype-based missense mutation associated with obesity (A46T) from the HGMD (Human Gene Mutation Database). These were differentiated on deleterious missense SNPs of the MBOAT4 gene through MAF (minor allele frequency: <0.01) cut-off criteria in relation to some bioinformatics-based supervised machine learning tools. We found 8 rare-coding and harmful missense SNPs. The consensus classifier (PredictSNP) tool predicted that the SNP (G57S, C: rs561065025) was the most pathogenic. Several trained in silico algorithms have predicted decreased protein stability [ΔΔG (kcal/mol)] function in the presence of these rare-coding pathogenic mutations in the MBOAT4 gene. Then, a stereochemical quality check (i.e. validation and assessment) of the 3D model was performed, followed by a blind cavity docking approach, used to search for druggable cavities and molecular interactions with citrus flavonoids of the Rutaceae family, ranked with energetic estimations. Significant interactions with Phloretin 3',5'-Di-C-Glucoside were also observed at R304, W306, N307, A311, L314 and H338 with (iGEMDOCK: -95.82 kcal/mol and AutoDock: -7.80 kcal/mol). The RMSD values and other variables of MD simulation analyses on this protein further validated its significant interactions with the above flavonoids. The MBOAT4 gene and its molecular interactions could serve as an interventional future anti-obesity target. The current study's findings will benefit future prospects for large population-based studies and drug development, particularly for generating personalized medicine. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Muhammad Bilal Azmi
- Department of Biochemistry, Dow Medical College, Dow University of Health Sciences, Karachi, Pakistan
| | - Sheikh Arslan Sehgal
- Department of Bioinformatics, Institute of Biochemistry, Biotechnology and Bioinformatics, The Islamia University of Bahawalpur, Bahawalpur, Pakistan
| | - Uzma Asif
- Department of Biochemistry, Medicine Program, Batterjee Medical College, Jeddah, Saudi Arabia
| | - Sarah Musani
- Dow Medical College, Dow University of Health Sciences, Karachi, Pakistan
| | | | - Azeema Suri
- Dow Medical College, Dow University of Health Sciences, Karachi, Pakistan
| | - Syed Danish Haseen Ahmed
- Department of Biochemistry, Dow Medical College, Dow University of Health Sciences, Karachi, Pakistan
| | | |
|
38
|
Gong J, Jiang L, Chen Y, Zhang Y, Li X, Ma Z, Fu Z, He F, Sun P, Ren Z, Tian M. THPLM: a sequence-based deep learning framework for protein stability changes prediction upon point variations using pretrained protein language model. Bioinformatics 2023; 39:btad646. [PMID: 37874953 PMCID: PMC10627365 DOI: 10.1093/bioinformatics/btad646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 09/25/2023] [Accepted: 10/22/2023] [Indexed: 10/26/2023] Open
Abstract
MOTIVATION: Quantitative determination of protein thermodynamic stability is a critical step in protein and drug design. Reliable prediction of protein stability changes caused by point variations contributes to the development of related fields. Over the past decades, dozens of structure-based and sequence-based methods have been proposed, showing good prediction performance. Despite the impressive progress, it is necessary to explore wild-type and variant protein representations to address the problem of how to represent the protein stability change in view of the global sequence. With the development of structure prediction using learning-based methods, protein language models (PLMs) have shown accurate and high-quality predictions of protein structure. Because a PLM captures atomic-level structural information, it can help to understand how single-point variations cause functional changes. RESULTS: Here, we proposed THPLM, a sequence-based deep learning model for stability change prediction using Meta's ESM-2. With ESM-2 and a simple convolutional neural network, THPLM achieved comparable or even better performance than most methods, including sequence-based and structure-based methods. Furthermore, the experimental results indicate that the PLM's ability to generate representations of sequence can effectively improve the ability of protein function prediction. AVAILABILITY AND IMPLEMENTATION: The source code of THPLM and the testing data are accessible at https://github.com/FPPGroup/THPLM.
Affiliation(s)
- Jianting Gong
- School of Information Science and Technology, Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
- Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun 130122, China
| | - Lili Jiang
- School of Information Science and Technology, Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
- Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun 130122, China
| | - Yongbing Chen
- School of Information Science and Technology, Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
- Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun 130122, China
| | - Yixiang Zhang
- School of Information Science and Technology, Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
- Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun 130122, China
| | - Xue Li
- Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun 130122, China
| | - Zhiqiang Ma
- School of Information Science and Technology, Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
- Department of Computer Science, College of Humanities and Sciences of Northeast Normal University, Changchun 130117, China
| | - Zhiguo Fu
- School of Information Science and Technology, Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
| | - Fei He
- School of Information Science and Technology, Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
| | - Pingping Sun
- School of Information Science and Technology, Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
| | - Zilin Ren
- School of Information Science and Technology, Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
- Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun 130122, China
| | - Mingyao Tian
- Changchun Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Changchun 130122, China
| |
|
39
|
Sharma R, Oak N, Chen W, Gogal R, Kirschner M, Beier F, Schnieders MJ, Spies M, Nichols KE, Wlodarski M. Germline landscape of RPA1, RPA2 and RPA3 variants in pediatric malignancies: identification of RPA1 as a novel cancer predisposition candidate gene. Front Oncol 2023; 13:1229507. [PMID: 37869077 PMCID: PMC10588448 DOI: 10.3389/fonc.2023.1229507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 09/18/2023] [Indexed: 10/24/2023] Open
Abstract
Replication Protein A (RPA) is a single-strand DNA-binding protein that plays a key role in the replication and repair of DNA. RPA is a heterotrimer made of 3 subunits: RPA1, RPA2, and RPA3. Germline pathogenic variants affecting RPA1 were recently described in patients with Telomere Biology Disorders (TBD), also known as dyskeratosis congenita or short telomere syndrome. Premature telomere shortening is a hallmark of TBD and results in bone marrow failure and predisposition to hematologic malignancies. Building on the finding that somatic mutations in RPA subunit genes occur in ~1% of cancers, we hypothesized that germline RPA alterations might be enriched in human cancers. Because germline RPA1 mutations are linked to early onset TBD with predisposition to myelodysplastic syndromes, we interrogated pediatric cancer cohorts to define the prevalence and spectrum of rare/novel and putative damaging germline RPA1, RPA2, and RPA3 variants. In this study of 5,993 children with cancer, 75 (1.25%) harbored heterozygous rare (non-cancer population allele frequency (AF) < 0.1%) variants in the RPA heterotrimer genes, of which 51 cases (0.85%) had ultra-rare (AF < 0.005%) or novel variants. Compared with Genome Aggregation Database (gnomAD) non-cancer controls, there was significant enrichment of ultra-rare and novel RPA1, but not RPA2 or RPA3, germline variants in our cohort (adjusted p-value < 0.05). Taken together, these findings suggest that germline putative damaging variants affecting RPA1 are found in excess in children with cancer, warranting further investigation into the functional role of these variants in oncogenesis.
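The prevalence figures and allele-frequency tiers in this abstract can be checked with a few lines of arithmetic. The sketch below uses the thresholds stated in the abstract (rare: AF < 0.1%, ultra-rare: AF < 0.005%) and assumes, for illustration, that a variant absent from the reference population (`None`) counts as novel. The function names are hypothetical, and the enrichment test against gnomAD controls is not reproduced because the control counts are not given here.

```python
def percent(cases, cohort_size):
    """Prevalence as a percentage, rounded to two decimals."""
    return round(100 * cases / cohort_size, 2)


def af_tier(af, rare_cutoff=0.001, ultra_rare_cutoff=0.00005):
    """Classify a variant by population allele frequency (as a fraction).

    Assumption: af is None when the variant is absent from the reference
    population, which is treated here as "novel".
    """
    if af is None:
        return "novel"
    if af < ultra_rare_cutoff:
        return "ultra-rare"
    if af < rare_cutoff:
        return "rare"
    return "common"


print(percent(75, 5993))  # children with rare RPA variants
print(percent(51, 5993))  # children with ultra-rare or novel variants
print([af_tier(af) for af in (None, 0.00001, 0.0005, 0.01)])
```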
Affiliation(s)
- Richa Sharma
- Department of Hematology, St. Jude Children’s Research Hospital, Memphis, TN, United States
| | - Ninad Oak
- Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN, United States
| | - Wenan Chen
- Center for Applied Bioinformatics, St. Jude Children’s Research Hospital, Memphis, TN, United States
| | - Rose Gogal
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, IA, United States
| | - Martin Kirschner
- Department of Hematology, Oncology, Hemostaseology and Stem Cell Transplantation, Medical Faculty, RWTH Aachen University, Aachen, Germany
- Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Bonn, Germany
| | - Fabian Beier
- Department of Hematology, Oncology, Hemostaseology and Stem Cell Transplantation, Medical Faculty, RWTH Aachen University, Aachen, Germany
- Center for Integrated Oncology Aachen Bonn Cologne Düsseldorf (CIO ABCD), Bonn, Germany
| | - Michael J. Schnieders
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, IA, United States
| | - Maria Spies
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, IA, United States
| | - Kim E. Nichols
- Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN, United States
| | - Marcin Wlodarski
- Department of Hematology, St. Jude Children’s Research Hospital, Memphis, TN, United States
| |
|
40
|
van Loggerenberg W, Sowlati-Hashjin S, Weile J, Hamilton R, Chawla A, Sheykhkarimli D, Gebbia M, Kishore N, Frésard L, Mustajoki S, Pischik E, Di Pierro E, Barbaro M, Floderus Y, Schmitt C, Gouya L, Colavin A, Nussbaum R, Friesema ECH, Kauppinen R, To-Figueras J, Aarsand AK, Desnick RJ, Garton M, Roth FP. Systematically testing human HMBS missense variants to reveal mechanism and pathogenic variation. Am J Hum Genet 2023; 110:1769-1786. [PMID: 37729906 PMCID: PMC10577081 DOI: 10.1016/j.ajhg.2023.08.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/07/2023] [Revised: 08/15/2023] [Accepted: 08/21/2023] [Indexed: 09/22/2023] Open
Abstract
Defects in hydroxymethylbilane synthase (HMBS) can cause acute intermittent porphyria (AIP), an acute neurological disease. Although sequencing-based diagnosis can be definitive, ∼⅓ of clinical HMBS variants are missense variants, and most clinically reported HMBS missense variants are designated as "variants of uncertain significance" (VUSs). Using saturation mutagenesis, en masse selection, and sequencing, we applied a multiplexed validated assay to both the erythroid-specific and ubiquitous isoforms of HMBS, obtaining confident functional impact scores for >84% of all possible amino acid substitutions. The resulting variant effect maps generally agreed with biochemical expectations and provide further evidence that HMBS can function as a monomer. Additionally, the maps implicated specific residues as having roles in active site dynamics, which was further supported by molecular dynamics simulations. Most importantly, these maps can help discriminate pathogenic from benign HMBS variants, proactively providing evidence even for yet-to-be-observed clinical missense variants.
Affiliation(s)
- Warren van Loggerenberg
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON M5G 1X5, Canada; Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada
| | | | - Jochen Weile
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON M5G 1X5, Canada; Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada
| | - Rayna Hamilton
- Advanced Academic Programs, Johns Hopkins University, Washington, DC 20036, USA
| | - Aditya Chawla
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON M5G 1X5, Canada
| | - Dayag Sheykhkarimli
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON M5G 1X5, Canada
| | - Marinella Gebbia
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON M5G 1X5, Canada
| | - Nishka Kishore
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON M5G 1X5, Canada
| | | | - Sami Mustajoki
- Research Program in Molecular Medicine, Biomedicum-Helsinki, University of Helsinki, 00290 Helsinki, Finland
| | - Elena Pischik
- Research Program in Molecular Medicine, Biomedicum-Helsinki, University of Helsinki, 00290 Helsinki, Finland
| | - Elena Di Pierro
- Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Unit of Medicine and Metabolic Diseases, 20122 Milano, Italy
| | - Michela Barbaro
- Porphyria Centre Sweden, Centre for Inherited Metabolic Diseases, Karolinska Institutet, Karolinska University Hospital, 17176 Stockholm, Sweden
| | - Ylva Floderus
- Porphyria Centre Sweden, Centre for Inherited Metabolic Diseases, Karolinska Institutet, Karolinska University Hospital, 17176 Stockholm, Sweden
| | - Caroline Schmitt
- Centre français des porphyries, hôpital Louis-Mourier, Assistance Publique-Hopitaux de Paris, 92701 Colombes, France; Centre de recherche sur l'inflammation, Université Paris Cité, UMR1149 INSERM, 75018 Paris, France
| | - Laurent Gouya
- Centre français des porphyries, hôpital Louis-Mourier, Assistance Publique-Hopitaux de Paris, 92701 Colombes, France; Centre de recherche sur l'inflammation, Université Paris Cité, UMR1149 INSERM, 75018 Paris, France
| | | | | | - Edith C H Friesema
- Porphyria Expertcenter Rotterdam, Center for Lysosomal and Metabolic Diseases, Department of Internal Medicine, Erasmus MC, 3015 Rotterdam, the Netherlands
| | - Raili Kauppinen
- Research Program in Molecular Medicine, Biomedicum-Helsinki, University of Helsinki, 00290 Helsinki, Finland
| | - Jordi To-Figueras
- Biochemistry and Molecular Genetics Department, Hospital Clínic, IDIBAPS, University of Barcelona, 08036 Barcelona, Spain
| | - Aasne K Aarsand
- Norwegian Porphyria Centre, Department of Medical Biochemistry and Pharmacology, Haukeland University Hospital, 5021 Bergen, Norway
| | - Robert J Desnick
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Michael Garton
- Institute Biomedical Engineering, University of Toronto, Toronto, ON M5S 3G9, Canada.
| | - Frederick P Roth
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON M5G 1X5, Canada; Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada.
| |
|
41
|
Gong H, Zhang Y, Dong C, Wang Y, Chen G, Liang B, Li H, Liu L, Xu J, Li G. Unbiased curriculum learning enhanced global-local graph neural network for protein thermodynamic stability prediction. Bioinformatics 2023; 39:btad589. [PMID: 37740312 PMCID: PMC10918760 DOI: 10.1093/bioinformatics/btad589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 08/04/2023] [Accepted: 09/21/2023] [Indexed: 09/24/2023] Open
Abstract
MOTIVATION: Proteins play crucial roles in biological processes, with their functions being closely tied to thermodynamic stability. However, measuring stability changes upon point mutations of amino acid residues using physical methods can be time-consuming. In recent years, several computational methods for protein thermodynamic stability prediction (PTSP) based on deep learning have emerged. Nevertheless, these approaches either overlook the natural topology of protein structures or neglect the inherent noisy samples resulting from theoretical calculation or experimental errors. RESULTS: We propose a novel Global-Local Graph Neural Network powered by Unbiased Curriculum Learning for the PTSP task. Our method first builds a Siamese graph neural network to extract protein features before and after mutation. Since the graph's topological changes stem from local node mutations, we design a local feature transformation module to make the model focus on the mutated site. To address model bias caused by noisy samples, which represent unavoidable errors from physical experiments, we introduce an unbiased curriculum learning method. This approach effectively identifies and re-weights noisy samples during the training process. Extensive experiments demonstrate that our proposed method outperforms advanced protein stability prediction methods, and surpasses state-of-the-art learning methods for regression prediction tasks. AVAILABILITY AND IMPLEMENTATION: All code and data are available at https://github.com/haifangong/UCL-GLGNN.
Affiliation(s)
- Haifan Gong
- Shanghai Artificial Intelligence Laboratory, Shanghai 200000, China
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- SRIBD, Chinese University of Hong Kong (Shenzhen), Shenzhen 518000, China
- Yumeng Zhang
- Shanghai Jiao Tong University, Shanghai 200000, China
- Chenhe Dong
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Yue Wang
- Qilu Hospital, Shandong University, Shandong 250000, China
- Guanqi Chen
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Bilin Liang
- Shanghai Artificial Intelligence Laboratory, Shanghai 200000, China
- Haofeng Li
- SRIBD, Chinese University of Hong Kong (Shenzhen), Shenzhen 518000, China
- Lanxuan Liu
- Shanghai Artificial Intelligence Laboratory, Shanghai 200000, China
- Jie Xu
- Shanghai Artificial Intelligence Laboratory, Shanghai 200000, China
- Guanbin Li
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
|
42
|
Ramakrishna Reddy P, Kulandaisamy A, Michael Gromiha M. TMH Stab-pred: Predicting the stability of α-helical membrane proteins using sequence and structural features. Methods 2023; 218:118-124. [PMID: 37572768 DOI: 10.1016/j.ymeth.2023.08.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/08/2023] [Revised: 08/02/2023] [Accepted: 08/04/2023] [Indexed: 08/14/2023] Open
Abstract
The folding and stability of transmembrane proteins (TMPs) are governed by the insertion of secondary structural elements into the cell membrane followed by their assembly. Understanding the features that dictate the stability of TMPs is important for elucidating their functions. In this work, we related sequence- and structure-based parameters with the free energy (ΔG°) of α-helical membrane proteins. Our results showed that the free energy transfer of hydrophobic peptides, relative contact order, total interaction energy, number of hydrogen bonds and lipid accessibility of transmembrane regions are important for stability. Further, we developed multiple-regression models to predict the stability of α-helical membrane proteins using these features; our method predicts the stability with a correlation of 0.89 and a mean absolute error (MAE) of 1.21 kcal/mol on a jack-knife test. The method was validated on a blind test set of three recently reported experimental ΔG° values, for which it predicted the stability with an average MAE of 0.51 kcal/mol. We also developed a web server for predicting the stability, which is freely available at https://web.iitm.ac.in/bioinfo2/TMHS/. The importance of the selected parameters and the limitations of the method are discussed.
Affiliation(s)
- P Ramakrishna Reddy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India
- A Kulandaisamy
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Basic and Translational Research Division, Department of Cardiology, Boston Children's Hospital, Boston, MA 02115, USA
- M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamil Nadu, India; Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan; Department of Computer Science, National University of Singapore, Singapore
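
The two figures quoted in the TMH Stab-pred abstract above, a correlation and a mean absolute error from a jack-knife test, can be illustrated with a minimal sketch. It assumes an ordinary least-squares multiple regression and reads "jack-knife" as leave-one-out prediction. The data are synthetic, and all function names are hypothetical; none of this is taken from the authors' web server or code.

```python
import numpy as np


def jackknife_predictions(X, y):
    """Leave-one-out predictions from an ordinary least-squares multiple regression."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # train on every sample except i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef  # predict the held-out sample
    return preds


def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D sequences."""
    return float(np.corrcoef(a, b)[0, 1])


def mean_absolute_error(a, b):
    """Mean absolute difference between two 1-D sequences."""
    return float(np.mean(np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float))))


# Synthetic stand-in data (assumption): 30 proteins, 3 features, stability in kcal/mol.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = 2.0 + X @ np.array([1.5, -0.8, 0.3]) + rng.normal(scale=0.1, size=30)

preds = jackknife_predictions(X, y)
r = pearson_r(y, preds)
mae = mean_absolute_error(y, preds)
print(f"jack-knife r = {r:.2f}, MAE = {mae:.2f} kcal/mol")
```

Each held-out prediction comes from a model that never saw that sample, so the two metrics estimate performance on unseen proteins rather than the quality of the fit.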
|
43
|
Niazi SK. The Coming of Age of AI/ML in Drug Discovery, Development, Clinical Testing, and Manufacturing: The FDA Perspectives. Drug Des Devel Ther 2023; 17:2691-2725. [PMID: 37701048 PMCID: PMC10493153 DOI: 10.2147/dddt.s424991] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 06/28/2023] [Accepted: 08/24/2023] [Indexed: 09/14/2023] Open
Abstract
Artificial intelligence (AI) and machine learning (ML) represent significant advancements in computing, building on technologies that humanity has developed over millions of years, from the abacus to quantum computers. These tools have reached a pivotal moment in their development. In 2021 alone, the U.S. Food and Drug Administration (FDA) received over 100 product registration submissions that heavily relied on AI/ML for applications such as monitoring and improving human performance in compiling dossiers. To ensure the safe and effective use of AI/ML in drug discovery and manufacturing, the FDA and numerous other U.S. federal agencies have issued continuously updated, stringent guidelines. Intriguingly, these guidelines are often generated or updated with the aid of AI/ML tools themselves. The overarching goal is to expedite drug discovery, enhance the safety profiles of existing drugs, introduce novel treatment modalities, and improve manufacturing compliance and robustness. Recent FDA publications offer an encouraging outlook on the potential of these tools, emphasizing the need for their careful deployment. This has expanded market opportunities for retraining personnel handling these technologies and enabled innovative applications in emerging therapies such as gene editing, CRISPR-Cas9, CAR-T cells, mRNA-based treatments, and personalized medicine. In summary, the maturation of AI/ML technologies is a testament to human ingenuity. Far from being autonomous entities, these are tools created by and for humans designed to solve complex problems now and in the future. This paper aims to present the status of these technologies, along with examples of their present and future applications.
|
44
|
Efraimidis E, Krokidis MG, Exarchos TP, Lazar T, Vlamos P. In Silico Structural Analysis Exploring Conformational Folding of Protein Variants in Alzheimer's Disease. Int J Mol Sci 2023; 24:13543. [PMID: 37686347 PMCID: PMC10487466 DOI: 10.3390/ijms241713543] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/25/2023] [Revised: 08/26/2023] [Accepted: 08/30/2023] [Indexed: 09/10/2023] Open
Abstract
Accurate protein structure prediction using computational methods remains a challenge in molecular biology. Recent advances in AI-powered algorithms provide a transformative effect in solving this problem. Even though AlphaFold's performance has improved since its release, there are still limitations that apply to its efficacy. In this study, a selection of proteins related to the pathology of Alzheimer's disease was modeled, with Presenilin-1 (PSN1) and its mutated variants in the foreground. Their structural predictions were evaluated using the ColabFold implementation of AlphaFold, which utilizes MMseqs2 for the creation of multiple sequence alignments (MSAs). A higher number of recycles than the one used in the AlphaFold DB was selected, and no templates were used. In addition, prediction by RoseTTAFold was also applied to address how structures from the two deep learning frameworks match reality. The resulting conformations were compared with the corresponding experimental structures, providing potential insights into the predictive ability of this approach in this particular group of proteins. Furthermore, a comprehensive examination was performed on features such as predicted regions of disorder and the potential effect of mutations on PSN1. Our findings consist of highly accurate superpositions with little or no deviation from experimentally determined domain-level models.
Affiliation(s)
- Evangelos Efraimidis
- Bioinformatics and Neuroinformatics MSc Program, Hellenic Open University, 26335 Patras, Greece
- Marios G. Krokidis
- Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, 49100 Corfu, Greece
- Themis P. Exarchos
- Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, 49100 Corfu, Greece
- Tamas Lazar
- VIB–VUB Center for Structural Biology, Vlaams Instituut voor Biotechnologie (VIB), B1050 Brussels, Belgium
- Structural Biology Brussels, Department of Bioengineering, Vrije Universiteit Brussel, B1050 Brussels, Belgium
- Panagiotis Vlamos
- Bioinformatics and Human Electrophysiology Laboratory, Department of Informatics, Ionian University, 49100 Corfu, Greece
|
45
|
Peka M, Balatsky V, Saienko A, Tsereniuk O. Bioinformatic analysis of the effect of SNPs in the pig TERT gene on the structural and functional characteristics of the enzyme to develop new genetic markers of productivity traits. BMC Genomics 2023; 24:487. [PMID: 37626279 PMCID: PMC10463782 DOI: 10.1186/s12864-023-09592-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Accepted: 08/16/2023] [Indexed: 08/27/2023] Open
Abstract
BACKGROUND: Telomerase reverse transcriptase (TERT) plays a crucial role in synthesizing telomeric repeats that safeguard chromosomes from damage and fusion, thereby maintaining genome stability. Mutations in the TERT gene can lead to a deviation in gene expression, impaired enzyme activity, and, as a result, abnormal telomere shortening. Genetic markers of productivity traits in livestock can be developed based on the TERT gene polymorphism for use in marker-associated selection (MAS). In this study, a bioinformatic-based approach is proposed to evaluate the effect of missense single-nucleotide polymorphisms (SNPs) in the pig TERT gene on enzyme function and structure, with the prospect of developing genetic markers. RESULTS: A comparative analysis of the coding and amino acid sequences of the pig TERT was performed with corresponding sequences of other species. The distribution of polymorphisms in the pig TERT gene, with respect to the enzyme's structural-functional domains, was established. A three-dimensional model of the pig TERT structure was obtained through homology modeling. The potential impact of each of the 23 missense SNPs in the pig TERT gene on telomerase function and stability was assessed using predictive bioinformatic tools utilizing data on the amino acid sequence and structure of pig TERT. CONCLUSIONS: According to bioinformatic analysis of 23 missense SNPs of the pig TERT gene, a predicted effect of rs789641834 (TEN domain), rs706045634 (TEN domain), rs325294961 (TRBD domain) and rs705602819 (RTD domain) on the structural and functional parameters of the enzyme was established. These SNPs hold the potential to serve as genetic markers of productivity traits. Therefore, the possibility of their application in MAS should be further evaluated in associative analysis studies.
Affiliation(s)
- Mykyta Peka
- Institute of Pig Breeding and Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, 1 Shvedska Mohyla St, Poltava, 36013 Ukraine
- V. N. Karazin Kharkiv National University, 4 Svobody Sq, Kharkiv, 61022 Ukraine
- Viktor Balatsky
- Institute of Pig Breeding and Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, 1 Shvedska Mohyla St, Poltava, 36013 Ukraine
- V. N. Karazin Kharkiv National University, 4 Svobody Sq, Kharkiv, 61022 Ukraine
- Artem Saienko
- Institute of Pig Breeding and Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, 1 Shvedska Mohyla St, Poltava, 36013 Ukraine
- Oleksandr Tsereniuk
- Institute of Pig Breeding and Agroindustrial Production, National Academy of Agrarian Sciences of Ukraine, 1 Shvedska Mohyla St, Poltava, 36013 Ukraine
|
46
|
Brenner EP, Sreevatsan S. Attenuated but immunostimulatory Mycobacterium tuberculosis variant bovis strain Ravenel shows variation in T cell epitopes. Sci Rep 2023; 13:12402. [PMID: 37524777 PMCID: PMC10390569 DOI: 10.1038/s41598-023-39578-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 07/27/2023] [Indexed: 08/02/2023] Open
Abstract
Tuberculosis, caused by Mycobacterium tuberculosis complex (MTBC) organisms, affects a range of humans and animals globally. Mycobacterial pathogenesis involves manipulation of the host immune system, partially through antigen presentation. Epitope sequences across the MTBC are evolutionarily hyperconserved, suggesting their recognition is advantageous for the bacterium. Mycobacterium tuberculosis var. bovis (MBO) strain Ravenel is an isolate known to provoke a robust immune response in cattle, but typically fails to produce lesions and persist. Unlike attenuated MBO BCG strains that lack the critical RD1 genomic region, Ravenel is classic-type MBO structurally, suggesting genetic variation is responsible for defective pathogenesis. This work explores variation in epitope sequences in MBO Ravenel by whole genome sequencing, and contrasts such variation against a fully virulent clinical isolate, MBO strain 10-7428. Validated MTBC epitopes (n = 4818) from the Immune Epitope Database were compared to their sequences in MBO Ravenel and MBO 10-7428. Ravenel yielded 3 modified T cell epitopes, in genes rpfB, argC, and rpoA. These modifications were predicted to have little effect on protein stability. In contrast, T cell epitopes in 10-7428 were all wild type (WT). Considering T cell epitope hyperconservation across MTBC variants, these altered MBO Ravenel epitopes support their potential contribution to overall strain attenuation. The affected genes may provide clues on basic pathogenesis, and if so, be feasible targets for reverse vaccinology.
Affiliation(s)
- Evan P Brenner
- Department of Pathobiology and Diagnostic Investigation, College of Veterinary Medicine, Michigan State University, 784 Wilson Road, East Lansing, MI, 48824, USA
- Srinand Sreevatsan
- Department of Pathobiology and Diagnostic Investigation, College of Veterinary Medicine, Michigan State University, 784 Wilson Road, East Lansing, MI, 48824, USA
|
47
|
Ikelle L, Makia M, Lewis T, Crane R, Kakakhel M, Conley SM, Birtley JR, Arshavsky VY, Al-Ubaidi MR, Naash MI. Comparative study of PRPH2 D2 loop mutants reveals divergent disease mechanism in rods and cones. Cell Mol Life Sci 2023; 80:214. [PMID: 37466729 PMCID: PMC10356684 DOI: 10.1007/s00018-023-04851-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/25/2023] [Revised: 05/10/2023] [Accepted: 06/28/2023] [Indexed: 07/20/2023]
Abstract
Mutations in the photoreceptor-specific tetraspanin gene peripherin-2 (PRPH2) lead to widely varying forms of retinal degeneration ranging from retinitis pigmentosa to macular dystrophy. Both inter- and intra-familial phenotypic heterogeneity has led to much interest in uncovering the complex pathogenic mechanisms of PRPH2-associated disease. The majority of disease-causing mutations in PRPH2 reside in the second intradiscal loop, wherein seven cysteines control protein folding and oligomerization. Here, we utilize knockin models to evaluate the role of three D2 loop cysteine mutants (Y141C, C213Y and C150S), alone or in combination. We elucidated how these mutations affect PRPH2 properties, including oligomerization and subcellular localization, and contribute to disease processes. Results from our structural, functional and molecular studies revealed that, in contrast to our understanding from prior investigations, rods are highly affected by PRPH2 mutations interfering with oligomerization and not merely by the haploinsufficiency associated with these mutations. On the other hand, cones are less affected by the toxicity of the mutant protein and significantly reduced protein levels, suggesting that knockdown therapeutic strategies may sustain cone functionality for a longer period. This observation provides useful data to guide and simplify the current development of effective therapeutic approaches for PRPH2-associated diseases that combine knockdown with high levels of gene supplementation needed to generate prolonged rod improvement.
Affiliation(s)
- Larissa Ikelle
- Department of Biomedical Engineering, University of Houston, 3517 Cullen Blvd. Room 2027, Houston, TX, 77204-5060, USA
- Mustafa Makia
- Department of Biomedical Engineering, University of Houston, 3517 Cullen Blvd. Room 2027, Houston, TX, 77204-5060, USA
- Tylor Lewis
- Department of Ophthalmology, Duke University Medical Center, Durham, NC, USA
- Ryan Crane
- Department of Biomedical Engineering, University of Houston, 3517 Cullen Blvd. Room 2027, Houston, TX, 77204-5060, USA
- Mashal Kakakhel
- Department of Biomedical Engineering, University of Houston, 3517 Cullen Blvd. Room 2027, Houston, TX, 77204-5060, USA
- Shannon M Conley
- Department of Cell Biology, University of Oklahoma Health Sciences Center, Oklahoma City, OK, 73104, USA
- Vadim Y Arshavsky
- Department of Ophthalmology, Duke University Medical Center, Durham, NC, USA
- Muayyad R Al-Ubaidi
- Department of Biomedical Engineering, University of Houston, 3517 Cullen Blvd. Room 2027, Houston, TX, 77204-5060, USA
- Muna I Naash
- Department of Biomedical Engineering, University of Houston, 3517 Cullen Blvd. Room 2027, Houston, TX, 77204-5060, USA
|
48
|
Tollefson MR, Gogal RA, Weaver AM, Schaefer AM, Marini RJ, Azaiez H, Kolbe DL, Wang D, Weaver AE, Casavant TL, Braun TA, Smith RJH, Schnieders MJ. Assessing variants of uncertain significance implicated in hearing loss using a comprehensive deafness proteome. Hum Genet 2023; 142:819-834. [PMID: 37086329 PMCID: PMC10182131 DOI: 10.1007/s00439-023-02559-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/23/2023] [Accepted: 04/11/2023] [Indexed: 04/23/2023]
Abstract
Hearing loss is the leading sensory deficit, affecting ~5% of the population. It exhibits remarkable heterogeneity across 223 genes with 6328 pathogenic missense variants, making deafness-specific expertise a prerequisite for ascribing phenotypic consequences to genetic variants. Deafness-implicated variants are curated in the Deafness Variation Database (DVD) after classification by a genetic hearing loss expert panel and a thorough informatics pipeline. However, seventy percent of the 128,167 missense variants in the DVD are "variants of uncertain significance" (VUS) due to insufficient evidence for classification. Here, we use the deep learning protein prediction algorithm, AlphaFold2, to curate structures for all DVD genes. We refine these structures with global optimization and the AMOEBA force field and use DDGun3D to predict folding free energy differences (∆∆GFold) for all DVD missense variants. We find that 5772 VUSs have a large, destabilizing ∆∆GFold that is consistent with pathogenic variants. When also filtered for CADD scores (>25.7), we determine 3456 VUSs are likely pathogenic at a probability of 99.0%. Of the 224 genes in the DVD, 166 genes (74%) exhibit one or more missense variants predicted to cause a pathogenic change in protein folding stability. The VUSs prioritized here affect 119 patients (~3% of cases) sequenced by the OtoSCOPE targeted panel. Approximately half of these patients previously received an inconclusive report, and reclassification of these VUSs as pathogenic provides a new genetic diagnosis for six patients.
Affiliation(s)
- Mallory R Tollefson
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Rose A Gogal
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
- A Monique Weaver
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Amanda M Schaefer
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Robert J Marini
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Hela Azaiez
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Diana L Kolbe
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Donghong Wang
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Amy E Weaver
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Thomas L Casavant
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
- Terry A Braun
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
- Richard J H Smith
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
- Michael J Schnieders
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
- Department of Biochemistry and Molecular Biology, University of Iowa, Iowa City, IA, 52242, USA
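
The two-stage filter described in the abstract above (a destabilizing folding free energy change, then a CADD score above 25.7) can be sketched as follows. Only the CADD cutoff comes from the abstract. The ∆∆G cutoff, the sign convention (positive means destabilizing), the `Variant` record, and the example variants are all assumptions made for illustration and are not the authors' pipeline.

```python
from dataclasses import dataclass

CADD_CUTOFF = 25.7  # stated in the abstract
DDG_CUTOFF = 1.0    # hypothetical value in kcal/mol; the abstract gives no number


@dataclass(frozen=True)
class Variant:
    name: str
    classification: str  # e.g. "VUS", "pathogenic", "benign"
    ddg_fold: float      # assumed convention: positive = destabilizing
    cadd: float


def prioritize_vus(variants, ddg_cutoff=DDG_CUTOFF, cadd_cutoff=CADD_CUTOFF):
    """Return VUSs that are both strongly destabilizing and above the CADD cutoff."""
    return [
        v
        for v in variants
        if v.classification == "VUS"
        and v.ddg_fold >= ddg_cutoff
        and v.cadd > cadd_cutoff
    ]


# Made-up example records (not real DVD entries).
variants = [
    Variant("var1", "VUS", ddg_fold=2.3, cadd=28.1),         # passes both filters
    Variant("var2", "VUS", ddg_fold=2.0, cadd=19.4),         # fails the CADD filter
    Variant("var3", "VUS", ddg_fold=0.2, cadd=30.0),         # fails the stability filter
    Variant("var4", "pathogenic", ddg_fold=3.1, cadd=31.0),  # already classified
]

prioritized = prioritize_vus(variants)
print([v.name for v in prioritized])
```

Requiring both criteria trades recall for confidence: a variant must look damaging to a structure-based predictor and to a sequence-based score before it is flagged.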
|
49
|
Finton KAK, Rupert PB, Friend DJ, Dinca A, Lovelace ES, Buerger M, Rusnac DV, Foote-McNabb U, Chour W, Heath JR, Campbell JS, Pierce RH, Strong RK. Effects of HLA single chain trimer design on peptide presentation and stability. Front Immunol 2023; 14:1170462. [PMID: 37207206 PMCID: PMC10189100 DOI: 10.3389/fimmu.2023.1170462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Accepted: 04/21/2023] [Indexed: 05/21/2023] Open
Abstract
MHC class I "single-chain trimer" molecules, coupling MHC heavy chain, β2-microglobulin, and a specific peptide into a single polypeptide chain, are widely used in research. To more fully understand caveats associated with this design that may affect its use for basic and translational studies, we evaluated a set of engineered single-chain trimers with combinations of stabilizing mutations across eight different classical and non-classical human class I alleles with 44 different peptides, including a novel human/murine chimeric design. While, overall, single-chain trimers accurately recapitulate native molecules, care was needed in selecting designs for studying peptides longer or shorter than 9-mers, as single-chain trimer design could affect peptide conformation. In the process, we observed that predictions of peptide binding were often discordant with experiment and that yields and stabilities varied widely with construct design. We also developed novel reagents to improve the crystallizability of these proteins and confirmed novel modes of peptide presentation.
Affiliation(s)
- Kathryn A. K. Finton
- Division of Basic Science, Fred Hutchinson Cancer Research Center (FHCC), Seattle, WA, United States
- Peter B. Rupert
- Division of Basic Science, Fred Hutchinson Cancer Research Center (FHCC), Seattle, WA, United States
- Della J. Friend
- Division of Basic Science, Fred Hutchinson Cancer Research Center (FHCC), Seattle, WA, United States
- Ana Dinca
- Clinical Research Division, Fred Hutchinson Cancer Center, Seattle, WA, United States
- Erica S. Lovelace
- Division of Basic Science, Fred Hutchinson Cancer Research Center (FHCC), Seattle, WA, United States
- Matthew Buerger
- Division of Basic Science, Fred Hutchinson Cancer Research Center (FHCC), Seattle, WA, United States
- Domnita V. Rusnac
- Division of Basic Science, Fred Hutchinson Cancer Research Center (FHCC), Seattle, WA, United States
- Ulysses Foote-McNabb
- Division of Basic Science, Fred Hutchinson Cancer Research Center (FHCC), Seattle, WA, United States
- William Chour
- Institute for Systems Biology, Seattle, WA, United States
- James R. Heath
- Institute for Systems Biology, Seattle, WA, United States
- Jean S. Campbell
- Clinical Research Division, Fred Hutchinson Cancer Center, Seattle, WA, United States
- Robert H. Pierce
- Clinical Research Division, Fred Hutchinson Cancer Center, Seattle, WA, United States
- Roland K. Strong
- Division of Basic Science, Fred Hutchinson Cancer Research Center (FHCC), Seattle, WA, United States
|
50
|
Li C, Hou I, Ma M, Wang G, Bai Y, Liu X. Orthogonal analysis of variants in APOE gene using in-silico approaches reveals novel disrupting variants. Front Bioinform 2023; 3:1122559. [PMID: 37091907 PMCID: PMC10117898 DOI: 10.3389/fbinf.2023.1122559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 03/31/2023] [Indexed: 04/08/2023] Open
Abstract
Introduction: Alzheimer's disease (AD) is one of the most prominent medical conditions in the world. Understanding the genetic component of the disease can greatly advance our knowledge regarding its progression, treatment and prognosis. Single amino-acid variants (SAVs) in the APOE gene have been widely investigated as a risk factor for AD. Studies, including genome-wide association studies, meta-analysis based studies, and in-vivo animal studies, were carried out to investigate the functional importance and pathogenesis potential of APOE SAVs. However, given the high cost of such large-scale or experimental studies, only a handful of reported variants have definite explanations. The recent development of in-silico analytical approaches, especially large-scale deep learning models, has opened new opportunities for us to probe the structural and functional importance of APOE variants extensively. Method: In this study, we take an ensemble approach that simultaneously uses large-scale protein sequence-based models, including Evolutionary Scale Model and AlphaFold, together with a few in-silico functional prediction web services to investigate the known and possibly disease-causing SAVs in APOE and evaluate their likelihood of being functionally and structurally disruptive. Results: Using an ensemble approach with little to no prior field-specific knowledge, we reported 5 SAVs in the APOE gene to be potentially disruptive, one of which (C112R) was classified by previous studies as a key risk factor for AD. Discussion: Our study provided a novel framework to analyze and prioritize the functional and structural importance of SAVs for future experimental and functional validation.
Affiliation(s)
- Chang Li
- USF Genomics and College of Public Health, University of South Florida, Tampa, FL, United States
- Ian Hou
- The John Cooper School, The Woodlands, TX, United States
- Mingjia Ma
- Novi High School, Novi, MI, United States
- Grace Wang
- Del Norte High School, San Diego, CA, United States
- Yongsheng Bai
- Next-Gen Intelligent Science Training, Ann Arbor, MI, United States
- Department of Biology, Eastern Michigan University, Ypsilanti, MI, United States
- Xiaoming Liu
- USF Genomics and College of Public Health, University of South Florida, Tampa, FL, United States
|