1
Ahmad MZ, Shah Z, Ullah A, Ahmed S, Ahmad B, Khan A. Genome wide and evolutionary analysis of heat shock protein 70 proteins in tomato and their role in response to heat and drought stress. Mol Biol Rep 2022; 49:11229-11241. [PMID: 35788950 DOI: 10.1007/s11033-022-07734-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Accepted: 06/22/2022] [Indexed: 11/24/2022]
Abstract
Heat shock protein 70 (HSP70) proteins play a crucial role in mitigating the detrimental effects of abiotic stresses in plants. In the present study, 21 full length non-redundant SlHSP70 genes were detected and characterized in tomato (Solanum lycopersicum L.). The SlHSP70 genes were classified into four groups based on phylogenetic analysis. Similarities were observed in gene features and motif structures of SlHSP70s belonging to the same group. SlHSP70 genes were unevenly and unequally mapped on 11 chromosomes. Segmental and tandem duplication are the main events that have contributed to the expansion of the SlHSP70 genes. A large number of groups and sub-groups were generated during comparative analysis of HSP70 genes in multiple plant species including tomato. These findings indicated a common ancestor which created diverse sub-groups prior to a mono-dicot split. The selection pressure on specific codons was identified through a maximum-likelihood approach and we found some important coding sites in the coding region of all groups. Diversifying positive selection was indirectly associated with evolutionary changes in SlHSP70 proteins and suggests that gene evolution modulated the tomato domestication event. In addition, expression analysis using RNA-seq revealed that 21 SlHSP70 genes were differentially expressed in response to drought and heat stress. SlHSP70-5 was down-regulated by heat treatment and up-regulated by drought stress. Furthermore, the expression of some of the duplicate genes was partially redundant, while others showed functional diversity. Our results indicate the diverse role of HSP70 gene family in S. lycopersicum under drought and heat stress conditions and open the gate for further investigation of HSP70 gene family functions, especially under drought and heat stress.
Affiliation(s)
- Muhammad Zulfiqar Ahmad
  - Department of Plant Breeding and Genetics, Faculty of Agriculture, University of Agriculture, D.I. Khan, Pakistan
- Zamarud Shah
  - Department of Biotechnology, University of Science and Technology, Bannu, Pakistan
- Arif Ullah
  - Department of Biotechnology, University of Science and Technology, Bannu, Pakistan
- Shakeel Ahmed
  - Instituto de Farmacia, Facultad de Ciencias, Universidad Austral de Chile, Campus Isla Teja, 5090000, Valdivia, Chile
- Bushra Ahmad
  - Department of Biochemistry, Shaheed Benazir Bhutto Women University, Peshawar, Pakistan
- Afrasyab Khan
  - Department of Biotechnology, University of Science and Technology, Bannu, Pakistan
2
Kachroo AH, Vandeloo M, Greco BM, Abdullah M. Humanized yeast to model human biology, disease and evolution. Dis Model Mech 2022; 15:275614. [PMID: 35661208 PMCID: PMC9194483 DOI: 10.1242/dmm.049309] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 12/25/2022] Open
Abstract
For decades, budding yeast, a single-cellular eukaryote, has provided remarkable insights into human biology. Yeast and humans share several thousand genes despite morphological and cellular differences and over a billion years of separate evolution. These genes encode critical cellular processes, the failure of which in humans results in disease. Although recent developments in genome engineering of mammalian cells permit genetic assays in human cell lines, there is still a need to develop biological reagents to study human disease variants in a high-throughput manner. Many protein-coding human genes can successfully substitute for their yeast equivalents and sustain yeast growth, thus opening up doors for developing direct assays of human gene function in a tractable system referred to as 'humanized yeast'. Humanized yeast permits the discovery of new human biology by measuring human protein activity in a simplified organismal context. This Review summarizes recent developments showing how humanized yeast can directly assay human gene function and explore variant effects at scale. Thus, by extending the 'awesome power of yeast genetics' to study human biology, humanizing yeast reinforces the high relevance of evolutionarily distant model organisms to explore human gene evolution, function and disease.
3
Hu Z, Yu C, Furutsuki M, Andreoletti G, Ly M, Hoskins R, Adhikari AN, Brenner SE. VIPdb, a genetic Variant Impact Predictor Database. Hum Mutat 2019; 40:1202-1214. [PMID: 31283070 PMCID: PMC7288905 DOI: 10.1002/humu.23858] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 06/17/2019] [Accepted: 06/27/2019] [Indexed: 12/30/2022]
Abstract
Genome sequencing identifies a vast number of genetic variants. Predicting these variants' molecular and clinical effects is one of the preeminent challenges in human genetics. Accurate prediction of the impact of genetic variants improves our understanding of how genetic information is conveyed to molecular and cellular functions, and is an essential step towards precision medicine. Over one hundred tools and resources have been developed specifically for this purpose. We summarize these tools, as well as their characteristics, in the genetic Variant Impact Predictor Database (VIPdb). This database will help researchers and clinicians explore appropriate tools, and inform the development of improved methods. VIPdb can be browsed and downloaded at https://genomeinterpretation.org/vipdb.
Affiliation(s)
- Zhiqiang Hu
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Changhua Yu
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Department of Bioengineering, University of California, Berkeley, California 94720, USA
- Mabel Furutsuki
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720, USA
- Gaia Andreoletti
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Melissa Ly
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
  - Division of Data Sciences, University of California, Berkeley, California 94720, USA
- Roger Hoskins
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Aashish N. Adhikari
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
- Steven E. Brenner
  - Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA
4
Naveed M, Tehreem S, Mehboob MZ. In-Silico analysis of missense SNPs in Human HPPD gene associated with Tyrosinemia type III and Hawkinsinuria. Comput Biol Chem 2019; 80:284-291. [DOI: 10.1016/j.compbiolchem.2019.04.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/24/2018] [Revised: 04/14/2019] [Accepted: 04/18/2019] [Indexed: 11/24/2022]
5
Ahmad MJ, Ahmad HI, Adeel MM, Liang A, Hua G, Murtaza S, Mirza RH, Elokil A, Ullah F, Yang L. Evolutionary Analysis of Makorin Ring Finger Protein 3 Reveals Positive Selection in Mammals. Evol Bioinform Online 2019; 15:1176934319834612. [PMID: 31024214 PMCID: PMC6472170 DOI: 10.1177/1176934319834612] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/29/2018] [Accepted: 01/17/2019] [Indexed: 01/12/2023] Open
Abstract
Makorin ring finger proteins (MKRNs) are part of the ubiquitin-proteasome system, a complex system important for cell functions. The fate of ubiquitin through proteolytic and non-proteolytic pathways varies, depending on the covalent linkage between ubiquitin and protein substrates. Makorin ring finger protein 3 is an integral part of the covalent linkage of ubiquitin to protein substrates. Similar to other imprinted genes, MKRN3 also evolves under positive selection; however, which codons are specifically selected in MKRN3 during evolution remains to be explored. Different maximum-likelihood (ML) codon-based methodologies were used to ascertain positive selection signatures in 22 mammalian sequences of MKRN3 and to probe individual codons for positive selection signatures. Evolutionary analysis based on two ML frameworks was conducted by applying the HyPhy software package implemented in the Datamonkey web server and CODEML implemented in PAML. The analysis was executed by comparing the PAML models M1a against M2a and M7 against M8, and the likelihood ratio test statistic (LRT, 2ΔlnL) was derived from the log-likelihoods. M1a contributed ω1 (dN/dS) with an LRT value (ΔlnL) of 12.01, and positive selection was found in M2a with ω3 = 2.23603. To further improve the selection test, M8 was compared to M7 with 2ΔlnL (LRT) = 30.17, and M8 showed positive selection with ω = 1.55759. The data fit M8 better than M7, which suggests that M8 was the most significant model of selection. M8 was judged suitable for this analysis and used to establish positive selection of MKRN3 proteins. We found Gly312 as a positively selected amino acid in a zinc finger motif/Really Interesting New Gene (RING) finger motif; the former region is involved in RNA binding and the latter in the ubiquitin ligase activity of the protein, vital for protein function. Selection analyses of MKRNs might advance the development of approaches that could lead to genetic progress through the selection of superior individuals with higher breeding values for certain traits as parents of the next generation.
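The model comparisons reported in this abstract rest on a likelihood ratio test, which can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes both reported values (12.01 and 30.17) are 2ΔlnL statistics, and that each comparison (M1a vs M2a, M7 vs M8) has 2 degrees of freedom, which is the usual convention for these PAML site models. For 2 degrees of freedom the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed. The function name is hypothetical.

```python
import math


def lrt_p_value_df2(two_delta_lnl: float) -> float:
    """Chi-square tail probability for a 2*delta(lnL) statistic with 2 d.f.

    Assumption: the nested models differ by two free parameters, so the
    statistic is compared against chi-square(2), whose survival function
    is exactly exp(-x / 2).
    """
    if two_delta_lnl < 0:
        raise ValueError("2*delta(lnL) must be non-negative for nested models")
    return math.exp(-two_delta_lnl / 2.0)


# Statistics quoted in the abstract (assumed here to be 2*delta(lnL) values).
for label, stat in (("M1a vs M2a", 12.01), ("M7 vs M8", 30.17)):
    print(f"{label}: 2*dlnL = {stat:.2f}, p = {lrt_p_value_df2(stat):.3g}")
```

Under these assumptions both comparisons reject the null model at the 0.01 level, which is consistent with the abstract's conclusion that the selection models fit better.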
Affiliation(s)
- Muhammad Jamil Ahmad
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Hafiz Ishfaq Ahmad
  - Guangdong Key Laboratory of Animal Conservation and Resource Utilization, Guangdong Public Laboratory of Wild Animal Conservation and Utilization, Guangdong Institute of Applied Biological Resources, Guangzhou, China
- Muhammad Muzammal Adeel
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
- Aixin Liang
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Guohua Hua
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Saeed Murtaza
  - Faculty of Veterinary Sciences, Bahauddin Zakariya University Multan, Multan, Pakistan
- Riaz Hussain Mirza
  - Faculty of Veterinary Sciences, Bahauddin Zakariya University Multan, Multan, Pakistan
- Abdelmotaleb Elokil
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
  - Animal Production Department, Faculty of Agriculture, Benha University, Moshtohor, Egypt
- Farman Ullah
  - Department of Animal Breeding and Genetics, Faculty of Veterinary and Animal Sciences, Lasbela University of Agriculture, Water and Marine Sciences, Uthal, Pakistan
- Liguo Yang
  - Key Laboratory of Animal Genetics, Breeding and Reproduction, Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
6
Kono TJY, Lei L, Shih CH, Hoffman PJ, Morrell PL, Fay JC. Comparative Genomics Approaches Accurately Predict Deleterious Variants in Plants. G3 (Bethesda, Md.) 2018; 8:3321-3329. [PMID: 30139765 PMCID: PMC6169392 DOI: 10.1534/g3.118.200563] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 07/08/2018] [Accepted: 08/10/2018] [Indexed: 12/11/2022]
Abstract
Recent advances in genome resequencing have led to increased interest in prediction of the functional consequences of genetic variants. Variants at phylogenetically conserved sites are of particular interest, because they are more likely than variants at phylogenetically variable sites to have deleterious effects on fitness and contribute to phenotypic variation. Numerous comparative genomic approaches have been developed to predict deleterious variants, but the approaches are nearly always assessed based on their ability to identify known disease-causing mutations in humans. Determining the accuracy of deleterious variant predictions in nonhuman species is important to understanding evolution, domestication, and potentially to improving crop quality and yield. To examine our ability to predict deleterious variants in plants we generated a curated database of 2,910 Arabidopsis thaliana mutants with known phenotypes. We evaluated seven approaches and found that while all performed well, their relative ranking differed from prior benchmarks in humans. We conclude that deleterious mutations can be reliably predicted in A. thaliana and likely other plant species, but that the relative performance of various approaches does not necessarily translate from one species to another.
Affiliation(s)
- Thomas J Y Kono
  - Department of Agronomy & Plant Genetics, University of Minnesota, St. Paul, MN 55108
- Li Lei
  - Department of Agronomy & Plant Genetics, University of Minnesota, St. Paul, MN 55108
- Ching-Hua Shih
  - Department of Genetics, Washington University, St. Louis, MO 63110
- Paul J Hoffman
  - Department of Agronomy & Plant Genetics, University of Minnesota, St. Paul, MN 55108
- Peter L Morrell
  - Department of Agronomy & Plant Genetics, University of Minnesota, St. Paul, MN 55108
- Justin C Fay
  - Department of Genetics, Washington University, St. Louis, MO 63110
7
Saini S, Jyoti-Thakur C, Kumar V, Suhag A, Jakhar N. In silico mutational analysis and identification of stability centers in human interleukin-4. Molecular Biology Research Communications 2018; 7:67-76. [PMID: 30046620 PMCID: PMC6054777 DOI: 10.22099/mbrc.2018.28855.1310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Interleukin-4 (IL-4) is a multifunctional cytokine that plays a critical role in apoptosis, differentiation and proliferation. The intensity of the IL-4 response depends upon binding to its receptor, IL-4R. The therapeutic efficiency of interleukins can be increased by generating structural mutants having greater stability. In the present work, attempts were made to increase the stability of human IL-4 using in silico site-directed mutagenesis. Different orthologous sequences of IL-4 from Pan troglodytes, Aotus nigriceps, Macaca mulatta, Papio hamadryas, Chlorocebus aethiops, Vicugna pacos, Sus scrofa and Homo sapiens were aligned using Clustal Omega, which revealed the conserved and non-conserved positions. For each non-conserved position, possible favorable and stabilizing mutations were found using CUPSAT with predicted ΔΔG (kcal/mol). For each non-conserved position, the mutation with the highest ΔΔG (kcal/mol) among all possible mutations was selected and introduced manually into the human IL-4 sequence, resulting in multiple mutants of IL-4. Mutant proteins were modeled with SWISS-MODEL using the structure of IL-4 (PDB ID: 2B8U) as a template. The mutants A49L and Q106T were identified as having stability centres using SCide. Molecular dynamics and docking analysis also confirmed the mutants' stability and binding, respectively. Mutants A49L and Q106T had energy values 7.580079 kcal/mol and 39.418124 kcal/mol lower, respectively, than wild-type IL-4. The results suggest that the stability of human IL-4 has been increased by mutation.
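The first step described above, separating conserved from non-conserved alignment positions, can be sketched in a few lines. This is only an illustration of the idea and not the authors' workflow: the function name is hypothetical, the input is assumed to be a list of equal-length aligned sequences (as an aligner such as Clustal Omega would produce), gapped columns are treated as non-conserved by assumption, and the toy sequences are invented and are not IL-4.

```python
def classify_columns(aligned):
    """Split 1-based alignment positions into conserved and non-conserved.

    A column counts as conserved only if every sequence carries the same
    residue and no sequence has a gap ('-') there (an assumption made for
    this sketch).
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    conserved, variable = [], []
    for pos, column in enumerate(zip(*aligned), start=1):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            conserved.append(pos)
        else:
            variable.append(pos)
    return conserved, variable


# Invented toy alignment, for illustration only.
toy_alignment = ["MGLTSQ", "MGLTPQ", "MGVTSQ"]
conserved, variable = classify_columns(toy_alignment)
print("conserved:", conserved)
print("non-conserved:", variable)
```

The non-conserved positions returned by such a step would be the candidates passed on to a stability predictor.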
Affiliation(s)
- Sandeep Saini
  - Department of Bioinformatics, G.G.D.S.D. College, Chandigarh, India
- Varinder Kumar
  - Department of Bioinformatics, G.G.D.S.D. College, Chandigarh, India
- Akshay Suhag
  - Department of Bioinformatics, G.G.D.S.D. College, Chandigarh, India
- Niharika Jakhar
  - Department of Bioinformatics, G.G.D.S.D. College, Chandigarh, India
8
Schöneberg T, Meister J, Knierim AB, Schulz A. The G protein-coupled receptor GPR34 - The past 20 years of a grownup. Pharmacol Ther 2018; 189:71-88. [PMID: 29684466 DOI: 10.1016/j.pharmthera.2018.04.008] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 12/17/2022]
Abstract
Research on GPR34, which was discovered in 1999 as an orphan G protein-coupled receptor of the rhodopsin-like class, disclosed its physiologic relevance only piece by piece. Being present in all recent vertebrate genomes analyzed so far it seems to improve the fitness of species although it is not essential for life and reproduction as GPR34-deficient mice demonstrate. However, closer inspection of macrophages and microglia, where it is mainly expressed, revealed its relevance in immune cell function. Recent data clearly demonstrate that GPR34 function is required to arrest microglia in the M0 homeostatic non-phagocytic phenotype. Herein, we summarize the current knowledge on its evolution, genomic and structural organization, physiology, pharmacology and relevance in human diseases including neurodegenerative diseases and cancer, which accumulated over the last 20 years.
Affiliation(s)
- Torsten Schöneberg
  - Rudolf Schönheimer Institute of Biochemistry, Molecular Biochemistry, Medical Faculty, University of Leipzig, 04103 Leipzig, Germany
- Jaroslawna Meister
  - Molecular Signaling Section, Laboratory of Bioorganic Chemistry, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892, United States
- Alexander Bernd Knierim
  - Rudolf Schönheimer Institute of Biochemistry, Molecular Biochemistry, Medical Faculty, University of Leipzig, 04103 Leipzig, Germany
  - Leipzig University Medical Center, IFB AdiposityDiseases, 04103 Leipzig, Germany
- Angela Schulz
  - Rudolf Schönheimer Institute of Biochemistry, Molecular Biochemistry, Medical Faculty, University of Leipzig, 04103 Leipzig, Germany
9
Ahmad HI, Ahmad MJ, Adeel MM, Asif AR, Du X. Positive selection drives the evolution of endocrine regulatory bone morphogenetic protein system in mammals. Oncotarget 2018; 9:18435-18445. [PMID: 29719616 PMCID: PMC5915083 DOI: 10.18632/oncotarget.24240] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 07/14/2017] [Accepted: 12/06/2017] [Indexed: 12/12/2022] Open
Abstract
The rapid evolution of reproductive proteins might be driven by positive Darwinian selection. The bone morphogenetic protein family is the largest within the transforming growth factor (TGF) superfamily. Little is known about the molecular evolution of bone morphogenetic proteins that have a potential role in mammalian reproduction. In this study we investigated mammalian bone morphogenetic proteins using maximum likelihood approaches of codon substitutions to identify positive Darwinian selection in various species. The proportion of positively selected sites was tested by different likelihood models for individual codons, and M8 was found to be the best model. The percentage of positively selected sites under M8 is 2.20% with ω = 1.089 for BMP2, 1.6% with ω = 1.61 for BMP4, 0.53% with ω = 1.56 for BMP15, and 0.78% with ω = 1.93 for GDF9. The percentage of estimated selection sites under M8 is strong statistical confirmation that divergence of bone morphogenetic proteins is driven by Darwinian selection. Model M8 was found significant for all proteins with ω > 1. To further test positive selection on particular amino acids, the evolutionary conservation of amino acids was measured based on the phylogenetic linkage among sequences. To explore the impact of somatic substitution mutations in the selected regions on human cancer, we identified one pathogenic mutation in human BMP4 and one in BMP15, possibly causing prostate cancer, and six neutral mutations in BMPs. The comprehensive map of selection results allows researchers to perform systematic approaches to detect the evolutionary footprints of selection on specific genes in specific species.
Affiliation(s)
- Hafiz Ishfaq Ahmad
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Muhammad Jamil Ahmad
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Muhammad Muzammal Adeel
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Akhtar Rasool Asif
  - University of Veterinary and Animal Sciences, Lahore, Sub Campus Jhang, Pakistan
- Xiaoyong Du
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan 430070, P.R. China
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
10
Ahmad HI, Liu G, Jiang X, Edallew SG, Wassie T, Tesema B, Yun Y, Pan L, Liu C, Chong Y, Yu ZJ, Jilong H. Maximum-likelihood approaches reveal signatures of positive selection in BMP15 and GDF9 genes modulating ovarian function in mammalian female fertility. Ecol Evol 2017; 7:8895-8902. [PMID: 29177034 PMCID: PMC5689494 DOI: 10.1002/ece3.3336] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 05/04/2017] [Revised: 07/11/2017] [Accepted: 07/18/2017] [Indexed: 02/06/2023] Open
Abstract
Bone morphogenetic proteins (BMPs) and growth differentiation factors (GDFs) play an important role in ovarian folliculogenesis and are essential regulators of processes in numerous granulosa cells. BMP15 gene variations are linked to various ovarian phenotypic consequences depending on the species, from infertility to improved prolificacy in sheep, primary ovarian insufficiency in women, and minor subfertility in mouse. To study the evolving role of BMP15 and GDF9, a phylogenetic analysis was performed. To find candidate genes associated with prolificacy in mammals, the nucleotide sequences of the BMP15 and GDF9 genes were examined and found to be under positive selection in various mammalian species. Maximum-likelihood approaches applied to the BMP15 and GDF9 genes showed robust divergence and accelerated evolution compared to other TGFβ family members. Furthermore, among 32 mammalian species, we identified positive selection signals in the Hominidae clade corresponding to the 132D, 147E, 163Y, 191W, and 236P codon sites of BMP15 and the 162F, 188K, 206R, 240A, 244L, 246H, 248S, 251D, 253L, 254F and other codon sites of GDF9. The positively selected amino acids, such as alanine, leucine, arginine, and lysine, are important for signaling. In conclusion, this study provides evidence that the GDF9 and BMP15 genes have evolved more rapidly than other TGFβ family members and were subjected to positive selection in the mammalian clade. Sites under positive selection are of remarkable significance for the particular functioning of the protein and consequently for female fertility.
Affiliation(s)
- Hafiz Ishfaq Ahmad
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Guiqiong Liu
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Xunping Jiang
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Shishay Girmay Edallew
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Teketay Wassie
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Birhanu Tesema
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Yu Yun
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Liu Pan
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Chenhui Liu
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Yuqing Chong
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Zhao Jia Yu
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
- Han Jilong
  - Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction of the Ministry of Education, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, China
11
Katsonis P, Lichtarge O. Objective assessment of the evolutionary action equation for the fitness effect of missense mutations across CAGI-blinded contests. Hum Mutat 2017; 38:1072-1084. [PMID: 28544059 DOI: 10.1002/humu.23266] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 08/26/2016] [Revised: 03/13/2017] [Accepted: 05/17/2017] [Indexed: 01/09/2023]
Abstract
A major challenge in genome interpretation is to estimate the fitness effect of coding variants of unknown significance (VUS). Labor, limited understanding of protein functions, and lack of assays generally limit direct experimental assessment of VUS, and make robust and accurate computational approaches a necessity. Often, however, algorithms that predict mutational effect disagree among themselves and with experimental data, slowing their adoption for clinical diagnostics. To objectively assess such methods, the Critical Assessment of Genome Interpretation (CAGI) community organizes contests to predict unpublished experimental data, available only to CAGI assessors. We review here the CAGI performance of evolutionary action (EA) predictions of mutational impact. EA models the fitness effect of coding mutations analytically, as a product of the gradient of the fitness landscape times the perturbation size. In practice, these terms are computed from phylogenetic considerations as the functional sensitivity of the mutated site and as the magnitude of amino acid substitution, respectively, and yield the percentage loss of wild-type activity. In five CAGI challenges, EA consistently performed on par or better than sophisticated machine learning approaches. This objective assessment suggests that a simple differential model of evolution can interpret the fitness effect of coding variations, opening diverse clinical applications.
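The verbal description above ("the gradient of the fitness landscape times the perturbation size") can be written as a first-order differential expression. The notation below is an assumption chosen for illustration and is not quoted from the abstract: $f$ is the fitness landscape as a function of sequence, $r_i$ the residue at position $i$, $\Delta r_{i,X \to Y}$ the size of the substitution from amino acid $X$ to $Y$ at that position, and $\Delta\varphi$ the resulting fitness effect.

```latex
% First-order sketch of the evolutionary action (EA) idea:
% fitness effect = sensitivity of the site x magnitude of the substitution.
\[
  \Delta\varphi \;\approx\; \frac{\partial f}{\partial r_i}\,\cdot\,\Delta r_{i,\,X \to Y}
\]
```

In this reading, the first factor corresponds to the functional sensitivity of the mutated site and the second to the magnitude of the amino acid substitution, both of which the abstract says are estimated from phylogenetic considerations.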
Affiliation(s)
- Panagiotis Katsonis
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
- Olivier Lichtarge
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas
  - Department of Biochemistry & Molecular Biology, Baylor College of Medicine, Houston, Texas
  - Department of Pharmacology, Baylor College of Medicine, Houston, Texas
  - Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas
12
Gupta PSS, Banerjee S, Islam RNU, Sur VP, Bandyopadhyay AK. Substitutional Analysis of Orthologous Protein Families Using BLOCKS. Bioinformation 2017; 13:1-7. [PMID: 28479743 PMCID: PMC5405086 DOI: 10.6026/97320630013001] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 12/14/2016] [Revised: 12/26/2016] [Accepted: 12/27/2016] [Indexed: 11/29/2022] Open
Abstract
Orthologous proteins, formed by divergence of a parental sequence, perform similar functions under different environmental and biological conditions. Amino acid changes at locus-specific positions form hetero-pairs whose role in BLOCK evolution is yet to be understood. We use eight protein BLOCKs of known divergence rate to gain insight into the role of hetero-pairs in evolution. Our procedure APBEST uses a BLOCK-FASTA file to extract BLOCK-specific evolutionary parameters such as the dominantly used hetero-pair (D), usage of hetero-pairs (E), non-conservative to conservative substitution ratio (R), maximally-diverse residue (MDR), and residue (RD) and class (CD) specific diversity. All these parameters show BLOCK-specific variation. The conservative nature of D points towards restoration of function of the BLOCK. While E sets the upper limit of usage of hetero-pairs, the strong correlation of R with divergence rate indicates that the latter is directly dependent on non-conservative substitutions. The observation that MDR, a measure of positional diversity, occupies very limited positions in a BLOCK indicates that accommodation of diversity is positionally restricted. Overall, the study extracts observed hetero-pair related quantitative and multi-parametric details of BLOCKs, which find application in evolutionary biology.
Affiliation(s)
- Parth Sarthi Sen Gupta
- Department of Biotechnology, The University of Burdwan, Golapbag, Burdwan, 713104, West Bengal, India
- Shyamashree Banerjee
- Department of Biotechnology, The University of Burdwan, Golapbag, Burdwan, 713104, West Bengal, India
- Rifat Nawaz Ul Islam
- Department of Biotechnology, The University of Burdwan, Golapbag, Burdwan, 713104, West Bengal, India
- Vishma Pratap Sur
- Indian Institute of Chemical Biology, Animal House (IICB), Kolkata, West Bengal, India
- Amal K Bandyopadhyay
- Department of Biotechnology, The University of Burdwan, Golapbag, Burdwan, 713104, West Bengal, India
|
13
|
Schulz WL, Tormey CA, Torres R. Computational Approach to Annotating Variants of Unknown Significance in Clinical Next Generation Sequencing. Lab Med 2016; 46:285-9. [PMID: 26489672 DOI: 10.1309/lmwzh57brwopr5rq] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 10/22/2022] Open
Abstract
Next generation sequencing (NGS) has become a common technology in the clinical laboratory, particularly for the analysis of malignant neoplasms. However, most mutations identified by NGS are variants of unknown clinical significance (VOUS). Although the approach to define these variants differs by institution, software algorithms that predict variant effect on protein function may be used. However, these algorithms commonly generate conflicting results, potentially adding uncertainty to interpretation. In this review, we examine several computational tools used to predict whether a variant has clinical significance. In addition to describing the role of these tools in clinical diagnostics, we assess their efficacy in analyzing known pathogenic and benign variants in hematologic malignancies.
Affiliation(s)
- Wade L Schulz
- Department of Laboratory Medicine, Yale University School of Medicine, West Haven, CT
- Christopher A Tormey
- Department of Laboratory Medicine, Yale University School of Medicine, West Haven, CT; Pathology and Laboratory Medicine Service, VA Connecticut Healthcare System, West Haven, CT
- Richard Torres
- Department of Laboratory Medicine, Yale University School of Medicine, West Haven, CT
|
14
|
Tang H, Thomas PD. Tools for Predicting the Functional Impact of Nonsynonymous Genetic Variation. Genetics 2016; 203:635-47. [PMID: 27270698 PMCID: PMC4896183 DOI: 10.1534/genetics.116.190033] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Received: 09/04/2015] [Accepted: 04/01/2016] [Indexed: 01/09/2023] Open
Abstract
As personal genome sequencing becomes a reality, understanding the effects of genetic variants on phenotype (particularly the impact of germline variants on disease risk and the impact of somatic variants on cancer development and treatment) continues to increase in importance. Because of their clear potential for affecting phenotype, nonsynonymous genetic variants (variants that cause a change in the amino acid sequence of a protein encoded by a gene) have long been the target of efforts to predict the effects of genetic variation. Whole-genome sequencing is identifying large numbers of nonsynonymous variants in each genome, intensifying the need for computational methods that accurately predict which of these are likely to impact disease phenotypes. This review focuses on nonsynonymous variant prediction with two aims in mind: (1) to review the prioritization methods that have been developed to date and the principles on which they are based and (2) to discuss the challenges to further improving these methods.
Affiliation(s)
- Haiming Tang
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, California 90033
- Paul D Thomas
- Division of Bioinformatics, Department of Preventive Medicine, University of Southern California, Los Angeles, California 90033
|
15
|
Tang H, Thomas PD. PANTHER-PSEP: predicting disease-causing genetic variants using position-specific evolutionary preservation. Bioinformatics 2016; 32:2230-2. [DOI: 10.1093/bioinformatics/btw222] [Citation(s) in RCA: 160] [Impact Index Per Article: 20.0] [Received: 10/07/2015] [Accepted: 04/18/2016] [Indexed: 12/31/2022] Open
|
16
|
Sun S, Yang F, Tan G, Costanzo M, Oughtred R, Hirschman J, Theesfeld CL, Bansal P, Sahni N, Yi S, Yu A, Tyagi T, Tie C, Hill DE, Vidal M, Andrews BJ, Boone C, Dolinski K, Roth FP. An extended set of yeast-based functional assays accurately identifies human disease mutations. Genome Res 2016; 26:670-80. [PMID: 26975778 PMCID: PMC4864455 DOI: 10.1101/gr.192526.115] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Received: 03/26/2015] [Accepted: 03/08/2016] [Indexed: 12/19/2022]
Abstract
We can now routinely identify coding variants within individual human genomes. A pressing challenge is to determine which variants disrupt the function of disease-associated genes. Both experimental and computational methods exist to predict pathogenicity of human genetic variation. However, a systematic performance comparison between them has been lacking. Therefore, we developed and exploited a panel of 26 yeast-based functional complementation assays to measure the impact of 179 variants (101 disease- and 78 non-disease-associated variants) from 22 human disease genes. Using the resulting reference standard, we show that experimental functional assays in a 1-billion-year diverged model organism can identify pathogenic alleles with significantly higher precision and specificity than current computational methods.
Affiliation(s)
- Song Sun
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Lunenfeld-Tanenbaum Research Institute, Mt. Sinai Hospital, Toronto, Ontario M5G 1X5, Canada; Department of Medical Biochemistry and Microbiology, Uppsala University, SE-75123 Uppsala, Sweden
- Fan Yang
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Lunenfeld-Tanenbaum Research Institute, Mt. Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Guihong Tan
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Michael Costanzo
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Rose Oughtred
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Jodi Hirschman
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Chandra L Theesfeld
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Pritpal Bansal
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Lunenfeld-Tanenbaum Research Institute, Mt. Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Nidhi Sahni
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA; Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Song Yi
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA; Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Analyn Yu
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Lunenfeld-Tanenbaum Research Institute, Mt. Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Tanya Tyagi
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Lunenfeld-Tanenbaum Research Institute, Mt. Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- Cathy Tie
- Lunenfeld-Tanenbaum Research Institute, Mt. Sinai Hospital, Toronto, Ontario M5G 1X5, Canada
- David E Hill
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA; Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Marc Vidal
- Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA; Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Brenda J Andrews
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Charles Boone
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada
- Kara Dolinski
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544, USA
- Frederick P Roth
- Donnelly Centre, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3E1, Canada; Lunenfeld-Tanenbaum Research Institute, Mt. Sinai Hospital, Toronto, Ontario M5G 1X5, Canada; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA; Canadian Institute for Advanced Research, Toronto, Ontario, M5G 1Z8, Canada
|
17
|
Abstract
Despite a billion years of divergent evolution, the baker’s yeast Saccharomyces cerevisiae has long proven to be an invaluable model organism for studying human biology. Given its tractability and ease of genetic manipulation, along with extensive genetic conservation with humans, it is perhaps no surprise that researchers have been able to expand its utility by expressing human proteins in yeast, or by humanizing specific yeast amino acids, proteins or even entire pathways. These methods are increasingly being scaled in throughput, further enabling the detailed investigation of human biology and disease-specific variations of human genes in a simplified model organism.
|
18
|
AlloRep: A Repository of Sequence, Structural and Mutagenesis Data for the LacI/GalR Transcription Regulators. J Mol Biol 2015; 428:671-678. [PMID: 26410588 DOI: 10.1016/j.jmb.2015.09.015] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/30/2015] [Revised: 09/04/2015] [Accepted: 09/17/2015] [Indexed: 11/20/2022]
Abstract
Protein families evolve functional variation by accumulating point mutations at functionally important amino acid positions. Homologs in the LacI/GalR family of transcription regulators have evolved to bind diverse DNA sequences and allosteric regulatory molecules. In addition to playing key roles in bacterial metabolism, these proteins have been widely used as a model family for benchmarking structural and functional prediction algorithms. We have collected manually curated sequence alignments for >3000 sequences, in vivo phenotypic and biochemical data for >5750 LacI/GalR mutational variants, and noncovalent residue contact networks for 65 LacI/GalR homolog structures. Using this rich data resource, we compared the noncovalent residue contact networks of the LacI/GalR subfamilies to design and experimentally validate an allosteric mutant of a synthetic LacI/GalR repressor for use in biotechnology. The AlloRep database (freely available at www.AlloRep.org) is a key resource for future evolutionary studies of LacI/GalR homologs and for benchmarking computational predictions of functional change.
|
19
|
Complementation of Yeast Genes with Human Genes as an Experimental Platform for Functional Testing of Human Genetic Variants. Genetics 2015; 201:1263-74. [PMID: 26354769 PMCID: PMC4649650 DOI: 10.1534/genetics.115.181099] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 07/26/2015] [Accepted: 09/01/2015] [Indexed: 12/11/2022] Open
Abstract
While the pace of discovery of human genetic variants in tumors, patients, and diverse populations has rapidly accelerated, deciphering their functional consequence has become rate-limiting. Using cross-species complementation, model organisms like the budding yeast, Saccharomyces cerevisiae, can be utilized to fill this gap and serve as a platform for testing human genetic variants. To this end, we performed two parallel screens, a one-to-one complementation screen for essential yeast genes implicated in chromosome instability and a pool-to-pool screen that queried all possible essential yeast genes for rescue of lethality by all possible human homologs. Our work identified 65 human cDNAs that can replace the null allele of essential yeast genes, including the nonorthologous pair yRFT1/hSEC61A1. We chose four human cDNAs (hLIG1, hSSRP1, hPPP1CA, and hPPP1CC) for which their yeast gene counterparts function in chromosome stability and assayed in yeast 35 tumor-specific missense mutations for growth defects and sensitivity to DNA-damaging agents. This resulted in a set of human–yeast gene complementation pairs that allow human genetic variants to be readily characterized in yeast, and a prioritized list of somatic mutations that could contribute to chromosome instability in human tumors. These data establish the utility of this cross-species experimental approach.
|
20
|
Rockah-Shmuel L, Tóth-Petróczy Á, Tawfik DS. Systematic Mapping of Protein Mutational Space by Prolonged Drift Reveals the Deleterious Effects of Seemingly Neutral Mutations. PLoS Comput Biol 2015; 11:e1004421. [PMID: 26274323 PMCID: PMC4537296 DOI: 10.1371/journal.pcbi.1004421] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 02/03/2015] [Accepted: 06/30/2015] [Indexed: 11/18/2022] Open
Abstract
Systematic mappings of the effects of protein mutations are becoming increasingly popular. Unexpectedly, these experiments often find that proteins are tolerant to most amino acid substitutions, including substitutions in positions that are highly conserved in nature. To obtain a more realistic distribution of the effects of protein mutations, we applied a laboratory drift comprising 17 rounds of random mutagenesis and selection of M.HaeIII, a DNA methyltransferase. During this drift, multiple mutations gradually accumulated. Deep sequencing of the drifted gene ensembles allowed determination of the relative effects of all possible single nucleotide mutations. Despite being averaged across many different genetic backgrounds, about 67% of all nonsynonymous, missense mutations were evidently deleterious, and an additional 16% were likely to be deleterious. In the early generations, the frequency of most deleterious mutations remained high. However, by the 17th generation, their frequency was consistently reduced, and those remaining were accepted alongside compensatory mutations. The tolerance to mutations measured in this laboratory drift correlated with sequence exchanges seen in M.HaeIII’s natural orthologs. The biophysical constraints dictating purging in nature and in this laboratory drift also seemed to overlap. Our experiment therefore provides an improved method for measuring the effects of protein mutations that more closely replicates the natural evolutionary forces, and thereby a more realistic view of the mutational space of proteins.
Understanding and predicting the effects of single nucleotide polymorphisms (SNPs) is of fundamental importance in many fields. Systematic experimental mappings of the effects of such mutations within a given gene/protein comprise an essential experimental tool for determining protein function and for refining models of protein evolution, as well as an important resource for improving prediction algorithms. Here, we present the results of a laboratory system that mimics the manner by which protein sequences diverge in nature: a prolonged process of gradually accumulating random mutations that retain the protein’s structure and function. The change in frequencies of mutations over generations, as obtained by deep sequencing, enabled us to assess the relative effects of all possible SNPs at the background of an accumulating number of mutations. Compared to previous reports, we found that > 80% of all possible amino acid exchanges have potential deleterious effects, with 67% being clearly deleterious. Tolerance vs. purging of mutations in our prolonged drift also showed better correlation with natural diversity. Overall, our experimental setup provides a better understanding of how protein sequences diverge in nature, plus a new basis for improving the prediction accuracy of the effects of protein mutations, and specifically of SNPs.
Affiliation(s)
- Liat Rockah-Shmuel
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Ágnes Tóth-Petróczy
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Dan S. Tawfik
- Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
|
21
|
Kachroo AH, Laurent JM, Yellman CM, Meyer AG, Wilke CO, Marcotte EM. Evolution. Systematic humanization of yeast genes reveals conserved functions and genetic modularity. Science 2015; 348:921-5. [PMID: 25999509 PMCID: PMC4718922 DOI: 10.1126/science.aaa0769] [Citation(s) in RCA: 263] [Impact Index Per Article: 29.2] [Indexed: 01/20/2023]
Abstract
To determine whether genes retain ancestral functions over a billion years of evolution and to identify principles of deep evolutionary divergence, we replaced 414 essential yeast genes with their human orthologs, assaying for complementation of lethal growth defects upon loss of the yeast genes. Nearly half (47%) of the yeast genes could be successfully humanized. Sequence similarity and expression only partly predicted replaceability. Instead, replaceability depended strongly on gene modules: Genes in the same process tended to be similarly replaceable (e.g., sterol biosynthesis) or not (e.g., DNA replication initiation). Simulations confirmed that selection for specific function can maintain replaceability despite extensive sequence divergence. Critical ancestral functions of many essential genes are thus retained in a pathway-specific manner, resilient to drift in sequences, splicing, and protein interfaces.
Affiliation(s)
- Aashiq H Kachroo
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
- Jon M Laurent
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
- Christopher M Yellman
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
- Austin G Meyer
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA. Center for Computational Biology and Bioinformatics, University of Texas at Austin, Austin, TX 78712, USA
- Claus O Wilke
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA. Center for Computational Biology and Bioinformatics, University of Texas at Austin, Austin, TX 78712, USA. Department of Integrative Biology, University of Texas at Austin, Austin, TX 78712, USA
- Edward M Marcotte
- Center for Systems and Synthetic Biology, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA. Center for Computational Biology and Bioinformatics, University of Texas at Austin, Austin, TX 78712, USA. Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA.
|
22
|
Melamed D, Young DL, Miller CR, Fields S. Combining natural sequence variation with high throughput mutational data to reveal protein interaction sites. PLoS Genet 2015; 11:e1004918. [PMID: 25671604 PMCID: PMC4335499 DOI: 10.1371/journal.pgen.1004918] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 06/04/2014] [Accepted: 11/24/2014] [Indexed: 12/29/2022] Open
Abstract
Many protein interactions are conserved among organisms despite changes in the amino acid sequences that comprise their contact sites, a property that has been used to infer the location of these sites from protein homology. In an inter-species complementation experiment, a sequence present in a homologue is substituted into a protein and tested for its ability to support function. Therefore, substitutions that inhibit function can identify interaction sites that changed over evolution. However, most of the sequence differences within a protein family remain unexplored because of the small-scale nature of these complementation approaches. Here we use existing high throughput mutational data on the in vivo function of the RRM2 domain of the Saccharomyces cerevisiae poly(A)-binding protein, Pab1, to analyze its sites of interaction. Of 197 single amino acid differences in 52 Pab1 homologues, 17 reduce the function of Pab1 when substituted into the yeast protein. The majority of these deleterious mutations interfere with the binding of the RRM2 domain to eIF4G1 and eIF4G2, isoforms of a translation initiation factor. A large-scale mutational analysis of the RRM2 domain in a two-hybrid assay for eIF4G1 binding supports these findings and identifies peripheral residues that make a smaller contribution to eIF4G1 binding. Three single amino acid substitutions in yeast Pab1 corresponding to residues from the human orthologue are deleterious and eliminate binding to the yeast eIF4G isoforms. We create a triple mutant that carries these substitutions and other humanizing substitutions that collectively support a switch in binding specificity of RRM2 from the yeast eIF4G1 to its human orthologue. Finally, we map other deleterious substitutions in Pab1 to inter-domain (RRM2–RRM1) or protein-RNA (RRM2–poly(A)) interaction sites. Thus, the combined approach of large-scale mutational data and evolutionary conservation can be used to characterize interaction sites at single amino acid resolution.
The interactions of proteins with each other are essential for almost all biological processes. Many of the sites of protein contact have evolved to maintain these interactions, but use different sets of amino acid residues. As a result, the residues at a contact site in a protein from one species might not allow a protein interaction when they are tested in a second species. This property underlies the idea of inter-species complementation assays, which test the effect of replacing protein segments from one species by their equivalents from another species. However, this approach has been highly limited in the number of changes that could be analyzed in a single study. Here, we present a novel approach that combines a high-throughput analysis of mutations in a single protein with the set of natural sequences corresponding to evolutionarily divergent variants of this protein. This integration step allows us to map at high resolution both sites of inter-protein interaction as well as intra-protein interaction. Our approach can be used with proteins that have limited functional and structural data, and it can be applied to improve the performance of computational tools that use sequence homology to predict function.
Affiliation(s)
- Daniel Melamed
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, United States of America
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- David L. Young
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Christina R. Miller
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, United States of America
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Stanley Fields
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, United States of America
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Department of Medicine, University of Washington, Seattle, Washington, United States of America
|
23
|
Dunham MJ, Fowler DM. Contemporary, yeast-based approaches to understanding human genetic variation. Curr Opin Genet Dev 2013; 23:658-64. [PMID: 24252429 DOI: 10.1016/j.gde.2013.10.001] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 09/14/2013] [Revised: 10/01/2013] [Accepted: 10/02/2013] [Indexed: 01/11/2023]
Abstract
Determining how genetic variation contributes to human health and disease is a critical challenge. As one of the most genetically tractable model organisms, yeast has played a central role in meeting this challenge. The advent of new technologies, including high-throughput DNA sequencing and synthesis, proteomics, and computational methods, has vastly increased the power of yeast-based approaches to determine the consequences of human genetic variation. Recent successes include systematic exploration of the effects of gene dosage, large-scale analysis of the effect of coding variation on gene function, and the use of humanized yeast to model disease. By virtue of its manipulability, small genome size, and genetic tractability, yeast is poised to help us understand human genetic variation.
Affiliation(s)
- Maitreya J Dunham
- Department of Genome Sciences, University of Washington, Foege Building, Box 355065, 3720 15th Avenue NE, Seattle, WA 98195-5065, USA.
|
24
|
Mono and dual cofactor dependence of human cystathionine β-synthase enzyme variants in vivo and in vitro. G3-GENES GENOMES GENETICS 2013; 3:1619-28. [PMID: 23934999 PMCID: PMC3789787 DOI: 10.1534/g3.113.006916] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/02/2022]
Abstract
Any two individuals differ from each other by an average of 3 million single-nucleotide polymorphisms. Some polymorphisms have a functional impact on cofactor-using enzymes and therefore represent points of possible therapeutic intervention through elevated-cofactor remediation. Because most known disease-causing mutations affect protein stability, we evaluated how the in vivo impact caused by single amino acid substitutions in a prototypical enzyme of this type compared with physical characteristics of the variant enzymes in vitro. We focused on cystathionine β-synthase (CBS) because of its clinical relevance in homocysteine metabolism and because some variants of the enzyme are clinically responsive to increased levels of its B6 cofactor. Single amino-acid substitutions throughout the CBS protein caused reduced function in vivo, and a subset of these altered sensitivity to limiting B6-cofactor. Some of these B6-sensitive substitutions also had altered sensitivity to limiting heme, another CBS cofactor. Limiting heme resulted in reduced incorporation of heme into these variants, and subsequently increased protease sensitivity of the enzyme in vitro. We hypothesize that these alleles caused a modest, yet significant, destabilization of the native state of the protein, and that the functional impact of the amino acid substitutions caused by these alleles can be influenced by cofactor(s) even when the affected amino acid is distant from the cofactor binding site.
|
25
|
Riera C, Lois S, de la Cruz X. Prediction of pathological mutations in proteins: the challenge of integrating sequence conservation and structure stability principles. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2013. [DOI: 10.1002/wcms.1170] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 12/13/2022]
Affiliation(s)
- Casandra Riera
- Laboratory of Translational Bioinformatics in Neuroscience; VHIR; Barcelona Spain
- Sergio Lois
- Laboratory of Translational Bioinformatics in Neuroscience; VHIR; Barcelona Spain
- Xavier de la Cruz
- Laboratory of Translational Bioinformatics in Neuroscience; VHIR; Barcelona Spain
- Institució Catalana per la Recerca i Estudis Avançats (ICREA); Barcelona Spain
|
26
|
Yakubu A, Salako AE, De Donato M, Takeet MI, Peters SO, Adefenwa MA, Okpeku M, Wheto M, Agaviezor BO, Sanni TM, Ajayi OO, Onasanya GO, Ekundayo OJ, Ilori BM, Amusan SA, Imumorin IG. Genetic Diversity in Exon 2 of the Major Histocompatibility Complex Class II DQB1 Locus in Nigerian Goats. Biochem Genet 2013; 51:954-66. [DOI: 10.1007/s10528-013-9620-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/13/2012] [Accepted: 02/13/2013] [Indexed: 10/26/2022]
|
27
|
Meinhardt S, Manley MW, Becker NA, Hessman JA, Maher LJ, Swint-Kruse L. Novel insights from hybrid LacI/GalR proteins: family-wide functional attributes and biologically significant variation in transcription repression. Nucleic Acids Res 2012; 40:11139-54. [PMID: 22965134 PMCID: PMC3505978 DOI: 10.1093/nar/gks806] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.3] [Indexed: 01/11/2023] Open
Abstract
LacI/GalR transcription regulators have extensive, non-conserved interfaces between their regulatory domains and the 18 amino acids that serve as ‘linkers’ to their DNA-binding domains. These non-conserved interfaces might contribute to functional differences between paralogs. Previously, two chimeras created by domain recombination displayed novel functional properties. Here, we present a synthetic protein family, which was created by joining the LacI DNA-binding domain/linker to seven additional regulatory domains. Despite ‘mismatched’ interfaces, chimeras maintained allosteric response to their cognate effectors. Therefore, allostery in many LacI/GalR proteins does not require interfaces with precisely matched interactions. Nevertheless, the chimeric interfaces were not silent to mutagenesis, and preliminary comparisons suggest that the chimeras provide an ideal context for systematically exploring functional contributions of non-conserved positions. DNA looping experiments revealed higher order (dimer–dimer) oligomerization in several chimeras, which might be possible for the natural paralogs. Finally, the biological significance of repression differences was determined by measuring bacterial growth rates on lactose minimal media. Unexpectedly, moderate and strong repressors showed an apparent induction phase, even though inducers were not provided; therefore, an unknown mechanism might contribute to regulation of the lac operon. Nevertheless, altered growth correlated with altered repression, which indicates that observed functional modifications are significant.
Affiliation(s)
- Sarah Meinhardt
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, KS 66160, USA
|
28
|
Needles in stacks of needles: finding disease-causal variants in a wealth of genomic data. Nat Rev Genet 2011; 12:628-40. [PMID: 21850043 DOI: 10.1038/nrg3046] [Citation(s) in RCA: 394] [Impact Index Per Article: 30.3] [Indexed: 02/07/2023]
Abstract
Genome and exome sequencing yield extensive catalogues of human genetic variation. However, pinpointing the few phenotypically causal variants among the many variants present in human genomes remains a major challenge, particularly for rare and complex traits wherein genetic information alone is often insufficient. Here, we review approaches to estimate the deleteriousness of single nucleotide variants (SNVs), which can be used to prioritize disease-causal variants. We describe recent advances in comparative and functional genomics that enable systematic annotation of both coding and non-coding variants. Application and optimization of these methods will be essential to find the genetic answers that sequencing promises to hide in plain sight.
|
29
|
Abstract
Model organisms have played a huge part in the history of studies of human genetic disease, both in identifying disease genes and characterizing their normal and abnormal functions. But is the importance of model organisms diminishing? The direct discovery of disease genes and variants in humans has been revolutionized, first by genome-wide association studies and now by whole-genome sequencing. Not only is it now much easier to directly identify potential disease genes in humans, but the genetic architecture that is being revealed in many cases is hard to replicate in model organisms. Furthermore, disease modelling can be done with increasing effectiveness using human cells. Where does this leave non-human models of disease?
|
30
|
Smith CA, Kortemme T. Predicting the tolerated sequences for proteins and protein interfaces using RosettaBackrub flexible backbone design. PLoS One 2011; 6:e20451. [PMID: 21789164 PMCID: PMC3138746 DOI: 10.1371/journal.pone.0020451] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.9] [Received: 02/15/2011] [Accepted: 04/20/2011] [Indexed: 11/18/2022]
Abstract
Predicting the set of sequences that are tolerated by a protein or protein interface, while maintaining a desired function, is useful for characterizing protein interaction specificity and for computationally designing sequence libraries to engineer proteins with new functions. Here we provide a general method, a detailed set of protocols, and several benchmarks and analyses for estimating tolerated sequences using flexible backbone protein design implemented in the Rosetta molecular modeling software suite. The input to the method is at least one experimentally determined three-dimensional protein structure or high-quality model. The starting structure(s) are expanded or refined into a conformational ensemble using Monte Carlo simulations consisting of backrub backbone and side chain moves in Rosetta. The method then uses a combination of simulated annealing and genetic algorithm optimization methods to enrich for low-energy sequences for the individual members of the ensemble. To emphasize certain functional requirements (e.g. forming a binding interface), interactions between and within parts of the structure (e.g. domains) can be reweighted in the scoring function. Results from each backbone structure are merged together to create a single estimate for the tolerated sequence space. We provide an extensive description of the protocol and its parameters, all source code, example analysis scripts and three tests applying this method to finding sequences predicted to stabilize proteins or protein interfaces. The generality of this method makes many other applications possible, for example stabilizing interactions with small molecules, DNA, or RNA. Through the use of within-domain reweighting and/or multistate design, it may also be possible to use this method to find sequences that stabilize particular protein conformations or binding interactions over others.
Affiliation(s)
- Colin A. Smith
- Graduate Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California, United States of America
- California Institute for Quantitative Biosciences, San Francisco, California, United States of America
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
- Tanja Kortemme
- Graduate Program in Biological and Medical Informatics, University of California San Francisco, San Francisco, California, United States of America
- California Institute for Quantitative Biosciences, San Francisco, California, United States of America
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, California, United States of America
|
31
|
|