1
Lei H, Li J, Zhao B, Kou SH, Xiao F, Chen T, Wang SM. Evolutionary origin of germline pathogenic variants in human DNA mismatch repair genes. Hum Genomics 2024; 18:5. [PMID: 38287404 PMCID: PMC10823654 DOI: 10.1186/s40246-024-00573-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Accepted: 01/17/2024] [Indexed: 01/31/2024] Open
Abstract
BACKGROUND: The mismatch repair (MMR) system is evolutionarily conserved for maintaining genome stability. Germline pathogenic variants (PVs) in MMR genes that lead to MMR functional deficiency are associated with high cancer risk. Knowing the evolutionary origin of germline PVs in human MMR genes will facilitate understanding of the biological basis of MMR deficiency in cancer. However, systematic knowledge to address this issue is lacking. In this study, we performed a comprehensive analysis to determine the evolutionary origin of human MMR PVs.
METHODS: We retrieved MMR gene variants from the ClinVar database. The genomes of 100 vertebrates were collected from the UCSC genome browser, and ancient human sequencing data were obtained through comprehensive data mining. Cross-species conservation analysis was performed based on the phylogenetic relationship among the 100 vertebrates. Rescaled ancient sequencing data were used to perform variant calling for the archeological analysis.
RESULTS: Using the phylogenetic approach, we traced the 3369 MMR PVs identified in modern humans in 99 non-human vertebrate genomes but found no evidence for cross-species conservation as the source of human MMR PVs. Using the archeological approach, we searched for the human MMR PVs in over 5000 ancient human genomes dated from 45,045 to 100 years before present and identified a group of MMR PVs shared between modern and ancient humans, mostly dated within the last 10,000 years, with similar quantitative patterns.
CONCLUSION: Our study reveals that MMR PVs in modern humans arose within recent human evolutionary history.
Affiliation(s)
- Huijun Lei
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
- Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310018, Zhejiang, China
- Department of Cancer Prevention, Zhejiang Cancer Hospital, Hangzhou, 310022, Zhejiang, China
- Jiaheng Li
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
- Bojin Zhao
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
- Si Hoi Kou
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
- Fengxia Xiao
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China
- Tianhui Chen
- Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou, 310018, Zhejiang, China.
- Department of Cancer Prevention, Zhejiang Cancer Hospital, Hangzhou, 310022, Zhejiang, China.
- San Ming Wang
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Taipa, Macau SAR, 999078, China.
2
Kou SH, Li J, Tam B, Lei H, Zhao B, Xiao F, Wang S. TP53 germline pathogenic variants in modern humans were likely originated during recent human history. NAR Cancer 2023; 5:zcad025. [PMID: 37304756 PMCID: PMC10251638 DOI: 10.1093/narcan/zcad025] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/01/2023] [Revised: 05/05/2023] [Accepted: 05/16/2023] [Indexed: 06/13/2023] Open
Abstract
TP53 is crucial for maintaining genome stability and preventing oncogenesis. Germline pathogenic variation in TP53 damages its function, causing genome instability and increased cancer risk. Despite extensive study of TP53, the evolutionary origin of human TP53 germline pathogenic variants remains largely unclear. In this study, we applied phylogenetic and archaeological approaches to identify the evolutionary origin of TP53 germline pathogenic variants in modern humans. In the phylogenetic analysis, we searched for 406 human TP53 germline pathogenic variants in 99 vertebrates distributed in eight clades (Primate, Euarchontoglires, Laurasiatheria, Afrotheria, Mammal, Aves, Sarcopterygii and Fish), but we observed no direct evidence for cross-species conservation as the origin. In the archaeological analysis, we searched for the variants in 5,031 ancient human genomes dated between 45,045 and 100 years before present, and identified 45 pathogenic variants in 62 ancient humans dated mostly within the last 8,000 years. We also identified 6 pathogenic variants in 3 Neanderthals dated 44,000 to 38,515 years before present and 1 Denisovan dated 158,550 years before present. Our study reveals that TP53 germline pathogenic variants in modern humans likely originated in recent human history and were partially inherited from the extinct Neanderthals and Denisovans.
Affiliation(s)
- Si Hoi Kou
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Department of Public Health and Medical Administration, Faculty of Health Sciences, University of Macau, Macao SAR, China
- Jiaheng Li
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Department of Public Health and Medical Administration, Faculty of Health Sciences, University of Macau, Macao SAR, China
- Benjamin Tam
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Department of Public Health and Medical Administration, Faculty of Health Sciences, University of Macau, Macao SAR, China
- Huijun Lei
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Department of Public Health and Medical Administration, Faculty of Health Sciences, University of Macau, Macao SAR, China
- Bojin Zhao
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Department of Public Health and Medical Administration, Faculty of Health Sciences, University of Macau, Macao SAR, China
- Fengxia Xiao
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Department of Public Health and Medical Administration, Faculty of Health Sciences, University of Macau, Macao SAR, China
- San Ming Wang
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Department of Public Health and Medical Administration, Faculty of Health Sciences, University of Macau, Macao SAR, China
3
Chian JS, Li J, Wang SM. Evolutionary Origin of Human PALB2 Germline Pathogenic Variants. Int J Mol Sci 2023; 24:11343. [PMID: 37511102 PMCID: PMC10379391 DOI: 10.3390/ijms241411343] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/19/2023] [Revised: 07/09/2023] [Accepted: 07/10/2023] [Indexed: 07/30/2023] Open
Abstract
PALB2 (Partner and localizer of BRCA2) is crucial for repairing DNA double-stranded breaks (DSBs) through homologous recombination (HR). Germline pathogenic variation in PALB2 disrupts DNA damage repair and increases the risk of Fanconi anemia, breast cancer, and ovarian cancer. Determining the evolutionary origin of human PALB2 variants will promote a deeper understanding of the biological basis of PALB2 germline variation and its roles in human diseases. We tested the evolutionary origin of 1444 human PALB2 germline variants, including 484 pathogenic and 960 benign variants. We performed a phylogenetic analysis by tracing the variants in 100 vertebrates. We found no evidence that cross-species conservation was the origin of PALB2 germline pathogenic variants, although it is a rich source of PALB2 germline benign variants. We performed a paleoanthropological analysis by tracing the variants in over 5000 ancient humans. We identified 50 pathogenic variants in 71 ancient humans dated from 32,895 to 689 years before present, of which 90.1% were dated within the last 10,000 years. PALB2 benign variants were also highly shared with ancient humans. Data from our study reveal that human PALB2 pathogenic variants mostly arose in recent human history.
Affiliation(s)
- Jia Sheng Chian
- MoE Frontiers Science Center for Precision Oncology, Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macao
- Jiaheng Li
- MoE Frontiers Science Center for Precision Oncology, Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macao
- San Ming Wang
- MoE Frontiers Science Center for Precision Oncology, Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macao
4
Evolutionary Origin of Germline Pathogenic MUTYH Variations in Modern Humans. Biomolecules 2023; 13:biom13030429. [PMID: 36979362 PMCID: PMC10046817 DOI: 10.3390/biom13030429] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/07/2022] [Revised: 12/30/2022] [Accepted: 01/04/2023] [Indexed: 03/02/2023] Open
Abstract
MUTYH plays an essential role in preventing oxidation-caused DNA damage. Pathogenic germline variations in MUTYH damage its function, causing intestinal polyposis and colorectal cancer. Determining the evolutionary origin of the variation is essential to understanding the etiological relationship between MUTYH variation and cancer development. In this study, we analyzed the origins of pathogenic germline variants in human MUTYH. Using a phylogenetic approach, we searched for the pathogenic MUTYH variants found in modern humans in the MUTYH genes of 99 vertebrates across eight clades. We did not find pathogenic variants shared between modern humans and the non-human vertebrates following the evolutionary tree, ruling out the possibility of cross-species conservation as the origin of human pathogenic variants in MUTYH. We then searched for the variants in the MUTYH genes of 5031 ancient humans and extinct Neanderthals and Denisovans. We identified 24 pathogenic variants in 42 ancient humans dated between 30,570 and 480 years before present (BP), and three pathogenic variants in Neanderthals dated between 65,000 and 38,310 years BP. Data from our study revealed that human pathogenic MUTYH variants mostly arose in recent human history and were partially inherited from Neanderthals.
5
Azbukina N, Zharikova A, Ramensky V. Intragenic compensation through the lens of deep mutational scanning. Biophys Rev 2022; 14:1161-1182. [PMID: 36345285 PMCID: PMC9636336 DOI: 10.1007/s12551-022-01005-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/14/2022] [Accepted: 09/26/2022] [Indexed: 12/20/2022] Open
Abstract
A significant fraction of mutations in proteins are deleterious and result in adverse consequences for protein function, stability, or interaction with other molecules. Intragenic compensation is a specific case of positive epistasis in which a neutral missense mutation cancels the effect of a deleterious mutation in the same protein. Permissive compensatory mutations facilitate protein evolution, since without them all sequences would be extremely conserved. Understanding compensatory mechanisms is an important scientific challenge at the intersection of protein biophysics and evolution. In human genetics, intragenic compensatory interactions are important since they may result in variable penetrance of pathogenic mutations or fixation of pathogenic human alleles in orthologous proteins from related species. The latter phenomenon complicates computational and clinical inference of an allele's pathogenicity. Deep mutational scanning (DMS) is a relatively new technique that enables experimental studies of the functional effects of thousands of mutations in proteins. We review the important aspects of the field and discuss existing limitations of current datasets. We reviewed ten published DMS datasets with quantified functional effects of single and double mutations and described rates and patterns of intragenic compensation in eight of them.
Supplementary Information: The online version contains supplementary material available at 10.1007/s12551-022-01005-w.
Affiliation(s)
- Nadezhda Azbukina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- Anastasia Zharikova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
- Vasily Ramensky
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
6
Li J, Zhao B, Huang T, Qin Z, Wang SM. Human BRCA pathogenic variants were originated during recent human history. Life Sci Alliance 2022; 5:e202101263. [PMID: 35165121 PMCID: PMC8860097 DOI: 10.26508/lsa.202101263] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/15/2021] [Revised: 01/25/2022] [Accepted: 01/26/2022] [Indexed: 01/05/2023] Open
Abstract
BRCA1 and BRCA2 (BRCA) play essential roles in maintaining genome stability. BRCA germline pathogenic variants increase cancer risk. However, the evolutionary origin of human BRCA pathogenic variants remains largely elusive. We tested the 2,972 human BRCA1 and 3,652 human BRCA2 pathogenic variants from the ClinVar database in 100 vertebrates across eight clades, but failed to find evidence of cross-species evolutionary conservation as the origin. We searched for the variants in genome data from 2,792 ancient humans, and identified 28 BRCA1 and 22 BRCA2 pathogenic variants in 44 cases dated from 45,000 to 300 yr ago. We analyzed the haplotype-dated human BRCA pathogenic founder variants, and observed that they mostly arose within the past 3,000 yr. We traced the ethnic distribution of human BRCA pathogenic variants, and found that the majority were present in a single or a few ethnic populations. Based on the data, we propose that human BRCA pathogenic variants very likely arose in recent human history after the latest out-of-Africa migration, and that the expansion of the modern human population could have largely increased the variation spectrum.
Affiliation(s)
- Jiaheng Li
- MoE Frontiers Science Center for Precision Oncology, Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Bojin Zhao
- MoE Frontiers Science Center for Precision Oncology, Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Teng Huang
- MoE Frontiers Science Center for Precision Oncology, Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- Zixin Qin
- MoE Frontiers Science Center for Precision Oncology, Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
- San Ming Wang
- MoE Frontiers Science Center for Precision Oncology, Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, China
7
Zhou X, Dou Q, Fan G, Zhang Q, Sanderford M, Kaya A, Johnson J, Karlsson EK, Tian X, Mikhalchenko A, Kumar S, Seluanov A, Zhang ZD, Gorbunova V, Liu X, Gladyshev VN. Beaver and Naked Mole Rat Genomes Reveal Common Paths to Longevity. Cell Rep 2021; 32:107949. [PMID: 32726638 DOI: 10.1016/j.celrep.2020.107949] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 08/08/2019] [Revised: 02/20/2020] [Accepted: 07/02/2020] [Indexed: 11/29/2022] Open
Abstract
Long-lived rodents have become an attractive model for the studies on aging. To understand evolutionary paths to long life, we prepare chromosome-level genome assemblies of the two longest-lived rodents, Canadian beaver (Castor canadensis) and naked mole rat (NMR, Heterocephalus glaber), which were scaffolded with in vitro proximity ligation and chromosome conformation capture data and complemented with long-read sequencing. Our comparative genomic analyses reveal that amino acid substitutions at "disease-causing" sites are widespread in the rodent genomes and that identical substitutions in long-lived rodents are associated with common adaptive phenotypes, e.g., enhanced resistance to DNA damage and cellular stress. By employing a newly developed substitution model and likelihood ratio test, we find that energy and fatty acid metabolism pathways are enriched for signals of positive selection in both long-lived rodents. Thus, the high-quality genome resource of long-lived rodents can assist in the discovery of genetic factors that control longevity and adaptive evolution.
Affiliation(s)
- Xuming Zhou
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and MIT, Cambridge, Massachusetts, MA 02142, USA
- Qianhui Dou
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Quanwei Zhang
- Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Maxwell Sanderford
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
- Alaattin Kaya
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and MIT, Cambridge, Massachusetts, MA 02142, USA; Department of Biology, Virginia Commonwealth University, Richmond, VA 23284 USA
- Jeremy Johnson
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, MA 02142, USA
- Elinor K Karlsson
- Broad Institute of Harvard and MIT, Cambridge, Massachusetts, MA 02142, USA; University of Massachusetts Medical School, Worcester, MA 01655, USA
- Xiao Tian
- Department of Biology, University of Rochester, Rochester, NY 14627, USA; Department of Medicine, University of Rochester, Rochester, NY 14627, USA
- Aleksei Mikhalchenko
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
- Sudhir Kumar
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA 19122, USA
- Andrei Seluanov
- Department of Biology, University of Rochester, Rochester, NY 14627, USA; Department of Medicine, University of Rochester, Rochester, NY 14627, USA
- Vera Gorbunova
- Department of Biology, University of Rochester, Rochester, NY 14627, USA; Department of Medicine, University of Rochester, Rochester, NY 14627, USA
- Xin Liu
- BGI-Shenzhen, Shenzhen 518083, China
- Vadim N Gladyshev
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA; Broad Institute of Harvard and MIT, Cambridge, Massachusetts, MA 02142, USA.
8
Ahrens JB, Teufel AI, Siltberg-Liberles J. A Phylogenetic Rate Parameter Indicates Different Sequence Divergence Patterns in Orthologs and Paralogs. J Mol Evol 2020; 88:720-730. [PMID: 33118098 DOI: 10.1007/s00239-020-09969-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2020] [Accepted: 10/15/2020] [Indexed: 10/23/2022]
Abstract
Heterotachy, the change in sequence evolutionary rate over time, is a common feature of protein molecular evolution. Decades of studies have shed light on the conditions under which heterotachy occurs, and there is evidence that site-specific evolutionary rate shifts are correlated with changes in protein function. Here, we present a large-scale, computational analysis using thousands of protein sequence alignments from animal and plant proteomes, representing genes related either by orthology (speciation events) or paralogy (gene duplication), to compare sequence divergence patterns in orthologous vs. paralogous sequence alignments. We use sequence-based phylogenetic analyses to infer overall sequence divergence (tree length/number of sequences) and to fit site-specific rates to a discrete gamma distribution with a shape parameter α. This inference method is applied to real protein sequence alignments, as well as alignments simulated under various models of protein sequence evolution. Our simulations indicate that sequence divergence and the α parameter are positively correlated when sequences evolve with heterotachy, meaning that inferred site rate distributions appear more uniform as sequences diverge. Divergence and α are also positively correlated in both orthologous and paralogous genes, but the average increase in α (as a function of divergence) is significantly higher in paralogous protein alignments than in orthologous alignments. This result is consistent with the widely held view that recently duplicated proteins initially evolve under relaxed selective pressure, promoting functional divergence by accumulation of amino acid replacements, and hence experience more evolutionary rate fluctuations than orthologous proteins. We discuss these findings in the context of the ortholog conjecture, a long-standing assumption in molecular evolution, which posits that protein sequences related by orthology tend to be more functionally conserved than paralogous proteins.
Affiliation(s)
- Joseph B Ahrens
- Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, FL, USA; Department of Biochemistry and Molecular Genetics, Computational Bioscience Program, University of Colorado Denver, Aurora, CO, USA
- Ashley I Teufel
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA; Santa Fe Institute, Santa Fe, NM, USA
- Jessica Siltberg-Liberles
- Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, Miami, FL, USA.
9
Rochman ND, Wolf YI, Koonin EV. Deep phylogeny of cancer drivers and compensatory mutations. Commun Biol 2020; 3:551. [PMID: 33009502 PMCID: PMC7532533 DOI: 10.1038/s42003-020-01276-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 04/20/2020] [Accepted: 09/03/2020] [Indexed: 12/14/2022] Open
Abstract
Driver mutations (DM) are the genetic impetus for most cancers. The DM are assumed to be deleterious in species evolution, being eliminated by purifying selection unless compensated by other mutations. We present deep phylogenies for 84 cancer driver genes and investigate the prevalence of 434 DM across gene-species trees. The DM are rare in species evolution, and 181 are completely absent, validating their negative fitness effect. The DM are more common in unicellular than in multicellular eukaryotes, suggesting a link between these mutations and cell proliferation control. 18 DM appear as the ancestral state in one or more major clades, including 3 among mammals. We identify within-gene, compensatory mutations for 98 DM and infer likely interactions between the DM and compensatory sites in protein structures. These findings elucidate the evolutionary status of DM and are expected to advance the understanding of the functions and evolution of oncogenes and tumor suppressors.
Rochman et al. present deep phylogenies for 84 cancer driver genes and examine the prevalence of driver mutations across gene-species trees. Their results show that driver mutations are rare in species evolution and give insight into the evolution of driver mutations and oncogenes.
Affiliation(s)
- Nash D Rochman
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
- Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, Bethesda, MD, 20894, USA.
10
Domingo J, Baeza-Centurion P, Lehner B. The Causes and Consequences of Genetic Interactions (Epistasis). Annu Rev Genomics Hum Genet 2019; 20:433-460. [PMID: 31082279 DOI: 10.1146/annurev-genom-083118-014857] [Citation(s) in RCA: 124] [Impact Index Per Article: 24.8] [Indexed: 11/09/2022]
Abstract
The same mutation can have different effects in different individuals. One important reason for this is that the outcome of a mutation can depend on the genetic context in which it occurs. This dependency is known as epistasis. In recent years, there has been a concerted effort to quantify the extent of pairwise and higher-order genetic interactions between mutations through deep mutagenesis of proteins and RNAs. This research has revealed two major components of epistasis: nonspecific genetic interactions caused by nonlinearities in genotype-to-phenotype maps, and specific interactions between particular mutations. Here, we provide an overview of our current understanding of the mechanisms causing epistasis at the molecular level, the consequences of genetic interactions for evolution and genetic prediction, and the applications of epistasis for understanding biology and determining macromolecular structures.
Affiliation(s)
- Júlia Domingo
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
- Pablo Baeza-Centurion
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain
- Ben Lehner
- Systems Biology Program, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, 08003 Barcelona, Spain; Universitat Pompeu Fabra, 08003 Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010 Barcelona, Spain
11
Zhu F, Nair RR, Fisher EMC, Cunningham TJ. Humanising the mouse genome piece by piece. Nat Commun 2019; 10:1845. [PMID: 31015419 PMCID: PMC6478830 DOI: 10.1038/s41467-019-09716-7] [Citation(s) in RCA: 66] [Impact Index Per Article: 13.2] [Received: 09/20/2018] [Accepted: 03/23/2019] [Indexed: 12/14/2022] Open
Abstract
To better understand human health and disease, researchers create a wide variety of mouse models that carry human DNA. With recent advances in genome engineering, the targeted replacement of mouse genomic regions with orthologous human sequences has become increasingly viable, ranging from finely tuned humanisation of individual nucleotides and amino acids to the incorporation of many megabases of human DNA. Here, we examine emerging technologies for targeted genomic humanisation, review the spectrum of existing genomically humanised mouse models and the insights such models have provided, and consider the lessons learned for designing such models in the future.
Generation of transgenic mice has become routine in studying gene function and disease mechanisms, but often this is not enough to fully understand human biology. Here, the authors review the current state of the art of targeted genomic humanisation strategies and their advantages over classic approaches.
Affiliation(s)
- Fei Zhu
- Department of Neuromuscular Diseases, Institute of Neurology, University College London, London, WC1N 3BG, UK
- Remya R Nair
- Mammalian Genetics Unit, MRC Harwell Institute, Oxfordshire, OX11 0RD, UK
- Elizabeth M C Fisher
- Department of Neuromuscular Diseases, Institute of Neurology, University College London, London, WC1N 3BG, UK.
12
Storz JF. Compensatory mutations and epistasis for protein function. Curr Opin Struct Biol 2018; 50:18-25. [PMID: 29100081 PMCID: PMC5936477 DOI: 10.1016/j.sbi.2017.10.009] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 09/06/2017] [Revised: 10/05/2017] [Accepted: 10/12/2017] [Indexed: 01/09/2023]
Abstract
Adaptive protein evolution may be facilitated by neutral amino acid mutations that confer no benefit when they first arise but which potentiate subsequent function-altering mutations via direct or indirect structural mechanisms. Theoretical and empirical results indicate that such compensatory interactions (intramolecular epistasis) can exert a strong influence on trajectories of protein evolution. For this reason, assessing the form and prevalence of intramolecular epistasis and characterizing biophysical mechanisms of compensatory interaction are important research goals at the nexus of structural biology and molecular evolution. Here I review recent insights derived from protein-engineering studies, and I describe an approach for identifying and characterizing mechanisms of epistasis that integrates experimental data on structure-function relationships with analyses of comparative sequence data.
Affiliation(s)
- Jay F Storz
- University of Nebraska, School of Biological Sciences, Lincoln, NE 68588-0114, United States.
13
Improving the in silico assessment of pathogenicity for compensated variants. Eur J Hum Genet 2016; 25:2-7. [PMID: 27703146 DOI: 10.1038/ejhg.2016.129] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 01/18/2016] [Revised: 08/24/2016] [Accepted: 08/25/2016] [Indexed: 11/08/2022] Open
Abstract
Understanding the functional sequelae of amino-acid replacements is of fundamental importance in medical genetics. Perhaps, the most intuitive way to assess the potential pathogenicity of a given human missense variant is by measuring the degree of evolutionary conservation of the substituted amino-acid residue, a feature that generally serves as a good proxy metric for the functional/structural importance of that residue. However, the presence of putatively compensated variants as the wild-type alleles in orthologous proteins of other mammalian species not only challenges this classical view of amino-acid essentiality but also precludes the accurate evaluation of the functional impact of this type of missense variant using currently available bioinformatic prediction tools. Compensated variants constitute at least 4% of all known missense variants causing human-inherited disease and hence represent an important potential source of error in that they are likely to be disproportionately misclassified as benign variants. The consequent under-reporting of compensated variants is exacerbated in the context of next-generation sequencing where their inappropriate exclusion constitutes an unfortunate natural consequence of the filtering and prioritization of the very large number of variants generated. Here we demonstrate the reduced performance of currently available pathogenicity prediction tools when applied to compensated variants and propose an alternative machine-learning approach to assess likely pathogenicity for this particular type of variant.
14
Abstract
To what extent is the convergent evolution of protein function attributable to convergent or parallel changes at the amino acid level? The mutations that contribute to adaptive protein evolution may represent a biased subset of all possible beneficial mutations owing to mutation bias and/or variation in the magnitude of deleterious pleiotropy. A key finding is that the fitness effects of amino acid mutations are often conditional on genetic background. This context dependence (epistasis) can reduce the probability of convergence and parallelism because it reduces the number of possible mutations that are unconditionally acceptable in divergent genetic backgrounds. Here, I review factors that influence the probability of replicated evolution at the molecular level.
Affiliation(s)
- Jay F Storz
- School of Biological Sciences, University of Nebraska, Lincoln, Nebraska 68588, USA
15
Identification of cis-suppression of human disease mutations by comparative genomics. Nature 2015; 524:225-9. [PMID: 26123021 DOI: 10.1038/nature14497] [Citation(s) in RCA: 96] [Impact Index Per Article: 10.7] [Received: 10/26/2014] [Accepted: 04/23/2015] [Indexed: 11/08/2022]
Abstract
Patterns of amino acid conservation have served as a tool for understanding protein evolution. The same principles have also found broad application in human genomics, driven by the need to interpret the pathogenic potential of variants in patients. Here we performed a systematic comparative genomics analysis of human disease-causing missense variants. We found that an appreciable fraction of disease-causing alleles are fixed in the genomes of other species, suggesting a role for genomic context. We developed a model of genetic interactions that predicts most of these to be simple pairwise compensations. Functional testing of this model on two known human disease genes revealed discrete cis amino acid residues that, although benign on their own, could rescue the human mutations in vivo. This approach was also applied to ab initio gene discovery to support the identification of a de novo disease driver in BTG2 that is subject to protective cis-modification in more than 50 species. Finally, on the basis of our data and models, we developed a computational tool to predict candidate residues subject to compensation. Taken together, our data highlight the importance of cis-genomic context as a contributor to protein evolution; they provide an insight into the complexity of allele effect on phenotype; and they are likely to assist methods for predicting allele pathogenicity.
16
Xu J, Zhang J. Why human disease-associated residues appear as the wild-type in other species: genome-scale structural evidence for the compensation hypothesis. Mol Biol Evol 2014; 31:1787-92. [PMID: 24723421 DOI: 10.1093/molbev/msu130] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022] Open
Abstract
Many human-disease associated amino acid residues (DARs) appear as the wild-type in other species. This phenomenon is commonly explained by the presence of compensatory residues in these other species that alleviate the deleterious effects of the DARs. The general validity of this hypothesis, however, is unclear, because few compensatory residues have been identified. Here we test the compensation hypothesis by assembling and analyzing 1,077 DARs located in 177 proteins of known crystal structures. Because destabilizing protein structures is a primary reason why DARs are deleterious, we focus on protein stability in this analysis. We discover that, in species where a DAR represents the wild-type, the destabilizing effect of the DAR is generally lessened by the observed amino acid substitutions in the spatial proximity of the DAR. This and other findings provide genome-scale evidence for the compensation hypothesis and have important implications for understanding epistasis in protein evolution and for using animal models of human diseases.
Affiliation(s)
- Jinrui Xu
- Department of Computational Medicine and Bioinformatics, University of Michigan
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan
17
Soylemez O, Kondrashov FA. Estimating the rate of irreversibility in protein evolution. Genome Biol Evol 2013; 4:1213-22. [PMID: 23132897 PMCID: PMC3542581 DOI: 10.1093/gbe/evs096] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 12/12/2022] Open
Abstract
Whether or not evolutionary change is inherently irreversible remains a controversial topic. Some examples of evolutionary irreversibility are known; however, this question has not been comprehensively addressed at the molecular level. Here, we use data from 221 human genes with known pathogenic mutations to estimate the rate of irreversibility in protein evolution. For these genes, we reconstruct ancestral amino acid sequences along the mammalian phylogeny and identify ancestral amino acid states that match known pathogenic mutations. Such cases represent inherent evolutionary irreversibility because, at the present moment, reversals to these ancestral amino acid states are impossible for the human lineage. We estimate that approximately 10% of all amino acid substitutions along the mammalian phylogeny are irreversible, such that a return to the ancestral amino acid state would lead to a pathogenic phenotype. For a subset of 51 genes with high rates of irreversibility, as much as 40% of all amino acid evolution was estimated to be irreversible. Because pathogenic phenotypes do not resemble ancestral phenotypes, the molecular nature of the high rate of irreversibility in proteins is best explained by evolution with a high prevalence of compensatory, epistatic interactions between amino acid sites. Under such mode of protein evolution, once an amino acid substitution is fixed, the probability of its reversal declines as the protein sequence accumulates changes that affect the phenotypic manifestation of the ancestral state. The prevalence of epistasis in evolution indicates that the observed high rate of irreversibility in protein evolution is an inherent property of protein structure and function.
Affiliation(s)
- Onuralp Soylemez
- Bioinformatics and Genomics Programme, Centre for Genomic Regulation (CRG), Barcelona, Spain
18
Abstract
Parkinson's disease (PD) is a complex genetic disorder that is associated with environmental risk factors and aging. Vertebrate genetic models, especially mice, have aided the study of autosomal-dominant and autosomal-recessive PD. Mice are capable of showing a broad range of phenotypes and, coupled with their conserved genetic and anatomical structures, provide unparalleled molecular and pathological tools to model human disease. These models used in combination with aging and PD-associated toxins have expanded our understanding of PD pathogenesis. Attempts to refine PD animal models using conditional approaches have yielded in vivo nigrostriatal degeneration that is instructive in ordering pathogenic signaling and in developing therapeutic strategies to cure or halt the disease. Here, we provide an overview of the generation and characterization of transgenic and knockout mice used to study PD followed by a review of the molecular insights that have been gleaned from current PD mouse models. Finally, potential approaches to refine and improve current models are discussed.
19
Morgan CC, Shakya K, Webb A, Walsh TA, Lynch M, Loscher CE, Ruskin HJ, O'Connell MJ. Colon cancer associated genes exhibit signatures of positive selection at functionally significant positions. BMC Evol Biol 2012; 12:114. [PMID: 22788692 PMCID: PMC3563467 DOI: 10.1186/1471-2148-12-114] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 02/01/2012] [Accepted: 06/22/2012] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Cancer, much like most human disease, is routinely studied by utilizing model organisms. Of these model organisms, mice are often dominant. However, our assumptions of functional equivalence fail to consider the opportunity for divergence conferred by ~180 Million Years (MY) of independent evolution between these species. For a given set of human disease related genes, it is therefore important to determine if functional equivalency has been retained between species. In this study we test the hypothesis that cancer associated genes have different patterns of substitution akin to adaptive evolution in different mammal lineages. RESULTS Our analysis of the current literature and colon cancer databases identified 22 genes exhibiting colon cancer associated germline mutations. We identified orthologs for these 22 genes across a set of high coverage (>6X) vertebrate genomes. Analysis of these orthologous datasets revealed significant levels of positive selection. Evidence of lineage-specific positive selection was identified in 14 genes in both ancestral and extant lineages. Lineage-specific positive selection was detected in the ancestral Euarchontoglires and Hominidae lineages for STK11, in the ancestral primate lineage for CDH1, in the ancestral Murinae lineage for both SDHC and MSH6 genes and the ancestral Muridae lineage for TSC1. CONCLUSION Identifying positive selection in the Primate, Hominidae, Muridae and Murinae lineages suggests an ancestral functional shift in these genes between the rodent and primate lineages. Analyses such as this, combining evolutionary theory and predictions - along with medically relevant data, can thus provide us with important clues for modeling human diseases.
Affiliation(s)
- Claire C Morgan
- Bioinformatics and Molecular Evolution Group, School of Biotechnology, Dublin City University, Glasnevin, Dublin 9, Ireland
20
Enard W. Functional primate genomics—leveraging the medical potential. J Mol Med (Berl) 2012; 90:471-80. [DOI: 10.1007/s00109-012-0901-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2012] [Revised: 04/04/2012] [Accepted: 04/05/2012] [Indexed: 10/28/2022]
21
Zhang G, Pei Z, Ball EV, Mort M, Kehrer-Sawatzki H, Cooper DN. Cross-comparison of the genome sequences from human, chimpanzee, Neanderthal and a Denisovan hominin identifies novel potentially compensated mutations. Hum Genomics 2012; 5:453-84. [PMID: 21807602 PMCID: PMC3525967 DOI: 10.1186/1479-7364-5-5-453] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022] Open
Abstract
The recent publication of the draft genome sequences of the Neanderthal and a ~50,000-year-old archaic hominin from Denisova Cave in southern Siberia has ushered in a new age in molecular archaeology. We previously cross-compared the human, chimpanzee and Neanderthal genome sequences with respect to a set of disease-causing/disease-associated missense and regulatory mutations (Human Gene Mutation Database) and succeeded in identifying genetic variants which, although apparently pathogenic in humans, may represent a 'compensated' wild-type state in at least one of the other two species. Here, in an attempt to identify further 'potentially compensated mutations' (PCMs) of interest, we have compared our dataset of disease-causing/disease-associated mutations with their corresponding nucleotide positions in the Denisovan hominin, Neanderthal and chimpanzee genomes. Of the 15 human putatively disease-causing mutations that were found to be compensated in chimpanzee, Denisovan or Neanderthal, only a solitary F5 variant (Val1736Met) was specific to the Denisovan. In humans, this missense mutation is associated with activated protein C resistance and an increased risk of thromboembolism and recurrent miscarriage. It is unclear at this juncture whether this variant was indeed a PCM in the Denisovan or whether it could instead have been associated with disease in this ancient hominin.
Affiliation(s)
- Guojie Zhang
- Bioinformatics Department, Beijing Genomics Institute at Shenzhen, China.
22
Experimental approaches to evaluate the contributions of candidate protein-coding mutations to phenotypic evolution. Methods Mol Biol 2012; 772:377-96. [PMID: 22065450 DOI: 10.1007/978-1-61779-228-1_22] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022]
Abstract
Identifying mechanisms of molecular adaptation can provide important insights into the process of phenotypic evolution, but it can be exceedingly difficult to quantify the phenotypic effects of specific mutational changes. To verify the adaptive significance of genetically based changes in protein function, it is necessary to document functional differences between the products of derived and wild-type alleles and to demonstrate that such differences impinge on higher-level physiological processes (and ultimately, fitness). In the case of metabolic enzymes, this requires documenting in vivo differences in reaction rate that give rise to differences in flux through the pathway in which the enzymes function. These measured differences in pathway flux should then give rise to differences in cellular or systemic physiology that affect fitness-related variation in whole-organism performance. Efforts to establish these causal connections between genotype, phenotype, and fitness require experiments that carefully control for environmental variation and background genetic variation. Here, we discuss experimental approaches to evaluate the contributions of amino-acid mutations to adaptive phenotypic change. We discuss conceptual and methodological issues associated with in vitro and in vivo studies of protein function, and the evolutionary insights that can be gleaned from such studies. We also discuss the importance of isolating the effects of individual mutations to distinguish between positively selected substitutions that directly contribute to improvements in protein function versus positively selected, compensatory substitutions that mitigate negative pleiotropic effects of antecedent changes.
23
Kumar S, Dudley JT, Filipski A, Liu L. Phylomedicine: an evolutionary telescope to explore and diagnose the universe of disease mutations. Trends Genet 2011; 27:377-86. [PMID: 21764165 PMCID: PMC3272884 DOI: 10.1016/j.tig.2011.06.004] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.1] [Received: 04/18/2011] [Revised: 06/10/2011] [Accepted: 06/13/2011] [Indexed: 12/30/2022]
Abstract
Modern technologies have made the sequencing of personal genomes routine. They have revealed thousands of nonsynonymous (amino acid altering) single nucleotide variants (nSNVs) of protein-coding DNA per genome. What do these variants foretell about an individual's predisposition to diseases? The experimental technologies required to carry out such evaluations at a genomic scale are not yet available. Fortunately, the process of natural selection has lent us an almost infinite set of tests in nature. During long-term evolution, new mutations and existing variations have been evaluated for their biological consequences in countless species, and outcomes are readily revealed by multispecies genome comparisons. We review studies that have investigated evolutionary characteristics and in silico functional diagnoses of nSNVs found in thousands of disease-associated genes. We conclude that the patterns of long-term evolutionary conservation and permissible sequence divergence are essential and instructive modalities for functional assessment of human genetic variations.
Affiliation(s)
- Sudhir Kumar
- School of Life Sciences, Arizona State University, Tempe, AZ 85287-4501, USA.
24
Pacheu-Grau D, Gómez-Durán A, López-Gallardo E, Pinós T, Andreu AL, López-Pérez MJ, Montoya J, Ruiz-Pesini E. 'Progress' renders detrimental an ancient mitochondrial DNA genetic variant. Hum Mol Genet 2011; 20:4224-31. [PMID: 21828074 DOI: 10.1093/hmg/ddr350] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
A human mitochondrial DNA (mtDNA) transition, m.1555A>G, in the 12S rRNA gene causes non-syndromic hearing loss. However, this pathological mutation is the wild-type allele in orangutan mtDNA. Here we rule out different genetic factors as the reason for its fixation in orangutans and show that aminoglycosides negatively affect the oxidative phosphorylation function by decreasing the synthesis of mtDNA-encoded proteins and the amount and activity of respiratory complex IV. These drugs also diminish the growth rate of orangutan cells. The m.1555G nucleotide is also the wild-type allele in other mammal species and they might be at risk of suffering a mitochondrial disorder if treated with aminoglycosides. Therefore, pharmacogenomic approaches should be used to confirm this possibility. These observations are important for human health. Because old age and high frequency are criteria widely used in mitochondrial medicine to rule out a genetic change as being a pathological mutation, our results caution against simplistic genetic approaches that do not consider the potential effect of environmental conditions. Hence, these results suggest that some ancient and highly frequent human population polymorphisms, such as those defining mtDNA haplogroups, in mitochondrial rRNA genes can be deleterious in association with new environmental conditions. Therefore, as the discovery of ribosomal antibiotics has made it possible to fight infectious diseases and this breakthrough can be considered an important scientific advance or 'progress', our results suggest that 'progress' can also have a negative counterpart and render detrimental many of these mtDNA genotypes.
Affiliation(s)
- David Pacheu-Grau
- Departamento de Bioquímica, Biología Molecular y Celular, Universidad de Zaragoza, Zaragoza, Spain
25
Abstract
Genes are generally assumed to be primary biological causes of biological phenotypes and their evolution. In just over a century, a research agenda that has built on Mendel's experiments and on Darwin's theory of natural selection as a law of nature has had unprecedented scientific success in isolating and characterizing many aspects of genetic causation. We revel in these successes, and yet the story is not quite so simple. The complex cooperative nature of genetic architecture and its evolution include teasingly tractable components, but much remains elusive. The proliferation of data generated in our "omics" age raises the question of whether we even have (or need) a unified theory or "law" of life, or even clear standards of inference by which to answer the question. If not, this not only has implications for the widely promulgated belief that we will soon be able to predict phenotypes like disease risk from genes, but also speaks to the limitations in the underlying science itself. Much of life seems to be characterized by ad hoc, ephemeral, contextual probabilism without proper underlying distributions. To the extent that this is true, causal effects are not asymptotically predictable, and new ways of understanding life may be required.
Affiliation(s)
- Kenneth M Weiss
- Department of Anthropology, Pennsylvania State University, University Park, Pennsylvania 16802, USA.
26
Abstract
Despite the common assumption that orthologs usually share the same function, there have been various reports of divergence between orthologs, even among species as close as mammals. The comparison of mouse and human is of special interest, because mouse is often used as a model organism to understand human biology. We review the literature on evidence for divergence between human and mouse orthologous genes, and discuss it in the context of biomedical research.
Affiliation(s)
- Walid H Gharib
- Department of Ecology and Evolution, Biophore, Swiss Institute of Bioinformatics, Lausanne University, CH-1015 Lausanne, Switzerland
27
28
Zhang G, Pei Z, Krawczak M, Ball EV, Mort M, Kehrer-Sawatzki H, Cooper DN. Triangulation of the human, chimpanzee, and Neanderthal genome sequences identifies potentially compensated mutations. Hum Mutat 2010; 31:1286-93. [DOI: 10.1002/humu.21389] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/11/2022]
29
Cooper DN, Chen JM, Ball EV, Howells K, Mort M, Phillips AD, Chuzhanova N, Krawczak M, Kehrer-Sawatzki H, Stenson PD. Genes, mutations, and human inherited disease at the dawn of the age of personalized genomics. Hum Mutat 2010; 31:631-55. [PMID: 20506564 DOI: 10.1002/humu.21260] [Citation(s) in RCA: 117] [Impact Index Per Article: 8.4] [Indexed: 12/12/2022]
Abstract
The number of reported germline mutations in human nuclear genes, either underlying or associated with inherited disease, has now exceeded 100,000 in more than 3,700 different genes. The availability of these data has both revolutionized the study of the morbid anatomy of the human genome and facilitated "personalized genomics." With approximately 300 new "inherited disease genes" (and approximately 10,000 new mutations) being identified annually, it is pertinent to ask how many "inherited disease genes" there are in the human genome, how many mutations reside within them, and where such lesions are likely to be located? To address these questions, it is necessary not only to reconsider how we define human genes but also to explore notions of gene "essentiality" and "dispensability." Answers to these questions are now emerging from recent novel insights into genome structure and function and through complete genome sequence information derived from multiple individual human genomes. However, a change in focus toward screening functional genomic elements as opposed to genes sensu stricto will be required if we are to capitalize fully on recent technical and conceptual advances and identify new types of disease-associated mutation within noncoding regions remote from the genes whose function they disrupt.
Affiliation(s)
- David N Cooper
- Institute of Medical Genetics, School of Medicine, Cardiff University, Heath Park, Cardiff CF14 4XN, United Kingdom.
30
31
Marini NJ, Thomas PD, Rine J. The use of orthologous sequences to predict the impact of amino acid substitutions on protein function. PLoS Genet 2010; 6:e1000968. [PMID: 20523748 PMCID: PMC2877731 DOI: 10.1371/journal.pgen.1000968] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 11/24/2009] [Accepted: 04/22/2010] [Indexed: 12/01/2022] Open
Abstract
Computational predictions of the functional impact of genetic variation play a critical role in human genetics research. For nonsynonymous coding variants, most prediction algorithms make use of patterns of amino acid substitutions observed among homologous proteins at a given site. In particular, substitutions observed in orthologous proteins from other species are often assumed to be tolerated in the human protein as well. We examined this assumption by evaluating a panel of nonsynonymous mutants of a prototypical human enzyme, methylenetetrahydrofolate reductase (MTHFR), in a yeast cell-based functional assay. As expected, substitutions in human MTHFR at sites that are well-conserved across distant orthologs result in an impaired enzyme, while substitutions present in recently diverged sequences (including a 9-site mutant that "resurrects" the human-macaque ancestor) result in a functional enzyme. We also interrogated 30 sites with varying degrees of conservation by creating substitutions in the human enzyme that are accepted in at least one ortholog of MTHFR. Quite surprisingly, most of these substitutions were deleterious to the human enzyme. The results suggest that selective constraints vary between phylogenetic lineages such that inclusion of distant orthologs to infer selective pressures on the human enzyme may be misleading. We propose that homologous proteins are best used to reconstruct ancestral sequences and infer amino acid conservation among only direct lineal ancestors of a particular protein. We show that such an "ancestral site preservation" measure outperforms other prediction methods, not only in our selected set for MTHFR, but also in an exhaustive set of E. coli LacI mutants.
Affiliation(s)
- Nicholas J. Marini
- California Institute for Quantitative Biosciences, Department of Molecular and Cellular Biology, University of California Berkeley, Berkeley, California, United States of America
- Paul D. Thomas
- Evolutionary Systems Biology Group, SRI International, Menlo Park, California, United States of America
- Jasper Rine
- California Institute for Quantitative Biosciences, Department of Molecular and Cellular Biology, University of California Berkeley, Berkeley, California, United States of America
32
Positional conservation and amino acids shape the correct diagnosis and population frequencies of benign and damaging personal amino acid mutations. Genome Res 2009; 19:1562-9. [PMID: 19546171 DOI: 10.1101/gr.091991.109] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Indexed: 11/25/2022]
Abstract
As the cost of DNA sequencing drops, we are moving beyond one genome per species to one genome per individual to improve prevention, diagnosis, and treatment of disease by using personal genotypes. Computational methods are frequently applied to predict impairment of gene function by nonsynonymous mutations in individual genomes and single nucleotide polymorphisms (nSNPs) in populations. These computational tools are, however, known to fail 15%-40% of the time. We find that accurate discrimination between benign and deleterious mutations is strongly influenced by the long-term (among species) history of positions that harbor those mutations. Successful prediction of known disease-associated mutations (DAMs) is much higher for evolutionarily conserved positions and for original-mutant amino acid pairs that are rarely seen among species. Prediction accuracies for nSNPs show opposite patterns, forecasting impediments to building diagnostic tools aiming to simultaneously reduce both false-positive and false-negative errors. The relative allele frequencies of mutations diagnosed as benign and damaging are predicted by positional evolutionary rates. These allele frequencies are modulated by the relative preponderance of the mutant allele in the set of amino acids found at homologous sites in other species (evolutionarily permissible alleles [EPAs]). The nSNPs found in EPAs are biochemically less severe than those missing from EPAs across all allele frequency categories. Therefore, it is important to consider position evolutionary rates and EPAs when interpreting the consequences and population frequencies of human mutations. The impending sequencing of thousands of human and many more vertebrate genomes will lead to more accurate classifiers needed in real-world applications.
33
Azevedo L, Carneiro J, van Asch B, Moleirinho A, Pereira F, Amorim A. Epistatic interactions modulate the evolution of mammalian mitochondrial respiratory complex components. BMC Genomics 2009; 10:266. [PMID: 19523237 PMCID: PMC2711975 DOI: 10.1186/1471-2164-10-266] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Received: 12/30/2008] [Accepted: 06/13/2009] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND The deleterious effect of a mutation can be reverted by a second-site interacting residue. This is an epistatic compensatory process explaining why mutations that are deleterious in some species are tolerated in phylogenetically related lineages, rendering evident that those mutations are, by all means, only deleterious in the species-specific context. Although an extensive and refined theoretical framework on compensatory evolution does exist, the supporting evidence remains limited, especially for protein models. In this current study, we focused on the molecular mechanism underlying the epistatic compensatory process in mammalian mitochondrial OXPHOS proteins using a combination of in-depth structural and sequence analyses. RESULTS Modeled human structures were used in this study to predict the structural impairment and recovery of deleterious mutations alone and combined with an interacting compensatory partner, respectively. In two cases, COI and COIII, intramolecular interactions between spatially linked residues restore the folding pattern impaired by the deleterious mutation. In a third case, intermolecular contact between mitochondrial CYB and nuclear CYT1 encoded components of the cytochrome bc1 complex are likely to restore protein binding. Moreover, we observed different modes of compensatory evolution that have resulted in either a quasi-simultaneous occurrence of a mutation and corresponding compensatory partner, or in independent occurrences of mutations in distinct lineages that were always preceded by the compensatory site. CONCLUSION Epistatic interactions between individual replacements involving deleterious mutations seems to follow a parsimonious model of evolution in which genomes hold pre-compensating states that subsequently tolerate deleterious mutations. This phenomenon is likely to have been constraining the variability at coevolving sites and shaping the interaction between the mitochondrial and the nuclear genome.
Affiliation(s)
- Luísa Azevedo
  - IPATIMUP-Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
- João Carneiro
  - IPATIMUP-Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
  - Faculty of Sciences of the University of Porto, Porto, Portugal
- Barbara van Asch
  - IPATIMUP-Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
  - Faculty of Sciences of the University of Porto, Porto, Portugal
- Ana Moleirinho
  - IPATIMUP-Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
- Filipe Pereira
  - IPATIMUP-Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
  - Faculty of Sciences of the University of Porto, Porto, Portugal
- António Amorim
  - IPATIMUP-Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
  - Faculty of Sciences of the University of Porto, Porto, Portugal
|
34
|
Why is the correlation between gene importance and gene evolutionary rate so weak? PLoS Genet 2009; 5:e1000329. [PMID: 19132081 PMCID: PMC2605560 DOI: 10.1371/journal.pgen.1000329] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.7] [Received: 08/08/2008] [Accepted: 12/03/2008] [Indexed: 01/01/2023] Open
Abstract
One of the few commonly believed principles of molecular evolution is that functionally more important genes (or DNA sequences) evolve more slowly than less important ones. This principle is widely used by molecular biologists in daily practice. However, recent genomic analysis of a diverse array of organisms found only weak, negative correlations between the evolutionary rate of a gene and its functional importance, typically measured under a single benign lab condition. A frequently suggested cause of the above finding is that gene importance determined in the lab differs from that in an organism's natural environment. Here, we test this hypothesis in yeast using gene importance values experimentally determined in 418 lab conditions or computationally predicted for 10,000 nutritional conditions. In no single condition or combination of conditions did we find a much stronger negative correlation, which is explainable by our subsequent finding that always-essential (enzyme) genes do not evolve significantly more slowly than sometimes-essential or always-nonessential ones. Furthermore, we verified that functional density, approximated by the fraction of amino acid sites within protein domains, is uncorrelated with gene importance. Thus, neither the lab-nature mismatch nor a potentially biased among-gene distribution of functional density explains the observed weakness of the correlation between gene importance and evolutionary rate. We conclude that the weakness is factual, rather than artifactual. In addition to being weakened by population genetic reasons, the correlation is likely to have been further weakened by the presence of multiple nontrivial rate determinants that are independent from gene importance. 
These findings notwithstanding, we show that the principle of slower evolution of more important genes does have some predictive power when genes with vastly different evolutionary rates are compared, explaining why the principle can be practically useful despite the weakness of the correlation. The fact that functionally more important genes or DNA sequences evolve more slowly than less important ones is commonly believed and frequently used by molecular biologists. However, previous genome-wide studies of a diverse array of organisms found only weak, negative correlations between the importance of a gene and its evolutionary rate. We show, here, that the weakness of the correlation is not because gene importance measured in lab conditions deviates from that in an organism's natural environments. Neither is it due to a potentially biased among-gene distribution of functional density. We suggest that the weakness of the correlation is factual, rather than artifactual. These findings notwithstanding, we show that the principle of slower evolution of more important genes does have some predictive power when genes with vastly different evolutionary rates are compared, explaining why the principle can be practically useful for tasks such as identifying functional non-coding sequences despite the weakness of the correlation.
|
35
|
Null mutations in human and mouse orthologs frequently result in different phenotypes. Proc Natl Acad Sci U S A 2008; 105:6987-92. [PMID: 18458337 DOI: 10.1073/pnas.0800387105] [Citation(s) in RCA: 179] [Impact Index Per Article: 11.2] [Indexed: 01/21/2023] Open
Abstract
One-to-one orthologous genes of relatively closely related species are widely assumed to have similar functions and cause similar phenotypes when deleted from the genome. Although this assumption is the foundation of comparative genomics and the basis for the use of model organisms to study human biology and disease, its validity is known only from anecdotes rather than from systematic examination. Comparing documented phenotypes of null mutations in humans and mice, we find that >20% of human essential genes have nonessential mouse orthologs. These changes of gene essentiality appear to be associated with adaptive evolution at the protein-sequence, but not gene-expression, level. Proteins localized to the vacuole, a cellular compartment for waste management, are highly enriched among essentiality-changing genes. It is probable that the evolution of the prolonged life history in humans required enhanced waste management for proper cellular function until the time of reproduction, which rendered these vacuole proteins essential and generated selective pressures for their improvement. If our gene sample represents the entire genome, our results would mean frequent changes of phenotypic effects of one-to-one orthologous genes even between relatively closely related species, a possibility that should be considered in comparative genomic studies and in making cross-species inferences of gene function and phenotypic effect.
|
36
|
Gibbs RA, Rogers J, Katze MG, Bumgarner R, Weinstock GM, Mardis ER, Remington KA, Strausberg RL, Venter JC, Wilson RK, Batzer MA, Bustamante CD, Eichler EE, Hahn MW, Hardison RC, Makova KD, Miller W, Milosavljevic A, Palermo RE, Siepel A, Sikela JM, Attaway T, Bell S, Bernard KE, Buhay CJ, Chandrabose MN, Dao M, Davis C, Delehaunty KD, Ding Y, Dinh HH, Dugan-Rocha S, Fulton LA, Gabisi RA, Garner TT, Godfrey J, Hawes AC, Hernandez J, Hines S, Holder M, Hume J, Jhangiani SN, Joshi V, Khan ZM, Kirkness EF, Cree A, Fowler RG, Lee S, Lewis LR, Li Z, Liu YS, Moore SM, Muzny D, Nazareth LV, Ngo DN, Okwuonu GO, Pai G, Parker D, Paul HA, Pfannkoch C, Pohl CS, Rogers YH, Ruiz SJ, Sabo A, Santibanez J, Schneider BW, Smith SM, Sodergren E, Svatek AF, Utterback TR, Vattathil S, Warren W, White CS, Chinwalla AT, Feng Y, Halpern AL, Hillier LW, Huang X, Minx P, Nelson JO, Pepin KH, Qin X, Sutton GG, Venter E, Walenz BP, Wallis JW, Worley KC, Yang SP, Jones SM, Marra MA, Rocchi M, Schein JE, Baertsch R, Clarke L, Csürös M, Glasscock J, Harris RA, Havlak P, Jackson AR, Jiang H, Liu Y, Messina DN, Shen Y, Song HXZ, Wylie T, Zhang L, Birney E, Han K, Konkel MK, Lee J, Smit AFA, Ullmer B, Wang H, Xing J, Burhans R, Cheng Z, Karro JE, Ma J, Raney B, She X, Cox MJ, Demuth JP, Dumas LJ, Han SG, Hopkins J, Karimpour-Fard A, Kim YH, Pollack JR, Vinar T, Addo-Quaye C, Degenhardt J, Denby A, Hubisz MJ, Indap A, Kosiol C, Lahn BT, Lawson HA, Marklein A, Nielsen R, Vallender EJ, Clark AG, Ferguson B, Hernandez RD, Hirani K, Kehrer-Sawatzki H, Kolb J, Patil S, Pu LL, Ren Y, Smith DG, Wheeler DA, Schenck I, Ball EV, Chen R, Cooper DN, Giardine B, Hsu F, Kent WJ, Lesk A, Nelson DL, O'brien WE, Prüfer K, Stenson PD, Wallace JC, Ke H, Liu XM, Wang P, Xiang AP, Yang F, Barber GP, Haussler D, Karolchik D, Kern AD, Kuhn RM, Smith KE, Zwieg AS. Evolutionary and biomedical insights from the rhesus macaque genome. Science 2007; 316:222-34. 
[PMID: 17431167 DOI: 10.1126/science.1139247] [Citation(s) in RCA: 1002] [Impact Index Per Article: 58.9] [Indexed: 12/12/2022]
Abstract
The rhesus macaque (Macaca mulatta) is an abundant primate species that diverged from the ancestors of Homo sapiens about 25 million years ago. Because they are genetically and physiologically similar to humans, rhesus monkeys are the most widely used nonhuman primate in basic and applied biomedical research. We determined the genome sequence of an Indian-origin Macaca mulatta female and compared the data with chimpanzees and humans to reveal the structure of ancestral primate genomes and to identify evidence for positive selection and lineage-specific expansions and contractions of gene families. A comparison of sequences from individual animals was used to investigate their underlying genetic diversity. The complete description of the macaque genome blueprint enhances the utility of this animal model for biomedical research and improves our understanding of the basic biology of the species.
|
37
|
Evolutionary anatomies of positions and types of disease-associated and neutral amino acid mutations in the human genome. BMC Genomics 2006; 7:306. [PMID: 17144929 PMCID: PMC1702542 DOI: 10.1186/1471-2164-7-306] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Received: 06/21/2006] [Accepted: 12/05/2006] [Indexed: 02/02/2023] Open
Abstract
Background Amino acid mutations in a large number of human proteins are known to be associated with heritable genetic disease. These disease-associated mutations (DAMs) are known to occur predominantly in positions essential to the structure and function of the proteins. Here, we examine how the relative perpetuation and conservation of amino acid positions modulate the genome-wide patterns of 8,627 human disease-associated mutations (DAMs) reported in 541 genes. We compare these patterns with 5,308 non-synonymous Single Nucleotide Polymorphisms (nSNPs) in 2,592 genes from primary SNP resources. Results The abundance of DAMs shows a negative relationship with the evolutionary rate of the amino acid positions harboring them. An opposite trend describes the distribution of nSNPs. DAMs are also preferentially found in the amino acid positions that are retained (or present) in multiple vertebrate species, whereas the nSNPs are over-abundant in the positions that have been lost (or absent) in the non-human vertebrates. These observations are consistent with the effect of purifying selection on natural variation, which also explains the existence of lower minor nSNP allele frequencies at highly-conserved amino acid positions. The biochemical severity of the inter-specific amino acid changes is also modulated by natural selection, with the fast-evolving positions containing more radical amino acid differences among species. Similarly, DAMs associated with early-onset diseases are more radical than those associated with the late-onset diseases. A small fraction of DAMs (10%) overlap with the amino acid differences between species within the same position, but are biochemically the most conservative group of amino acid differences in our datasets. 
Overlapping DAMs are found disproportionately in fast-evolving amino acid positions, which, along with the conservative nature of the amino acid changes, may have allowed some of them to escape natural selection until compensatory changes occur. Conclusion The consistency and predictability of genome-wide patterns of disease-associated and neutral amino acid variants reported here underscores the importance of the consideration of evolutionary rates of amino acid positions in clinical and population genetic analyses aimed at understanding the nature and fate of disease-associated and neutral population variation. Establishing such general patterns is an early step in efforts to diagnose the pathogenic potentials of novel amino acid mutations.
|
38
|
Ferrer-Costa C, Orozco M, de la Cruz X. Characterization of compensated mutations in terms of structural and physico-chemical properties. J Mol Biol 2006; 365:249-56. [PMID: 17059831 DOI: 10.1016/j.jmb.2006.09.053] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.4] [Received: 03/28/2006] [Revised: 09/14/2006] [Accepted: 09/21/2006] [Indexed: 11/29/2022]
Abstract
The study of the evolution of compensatory mechanisms among amino acids is paramount to our understanding of intramolecular epistatic interactions. It has been addressed from different points of view; for example, much effort has been devoted to establishing the number of compensatory mutations required per deleterious mutation. However, we still do not know how the nature of the compensated mutation determines the existence of compensatory mutations. Within this context, recent studies have produced several instances of an interesting phenomenon: human disease-associated residues may sometimes appear as wild-type residues in non-human proteins. This can be explained in terms of compensatory mutations, present in the non-human protein, which would neutralize the damage caused by the disease-associated residue. Therefore, comparison between these compensated mutations and non-compensated pathological mutations provides a simple approach to understand how the nature of the compensated deleterious mutation determines the existence of compensatory mutations. To address this issue, we have obtained a large set of compensated mutations and characterised them with a series of different properties. When comparing the resulting distributions with those from pathological mutations, we find that in general compensated mutations are milder than pathological mutations. More precisely, we find that the probability that a compensatory mutation will evolve is directly related (i) to the location in the protein structure and (ii) to changes in physico-chemical properties (e.g. amino acid volume or hydrophobicity) of the compensated mutation.
Affiliation(s)
- Carles Ferrer-Costa
  - Molecular Modeling and Bioinformatics Unit, Institut de Recerca Biomédica, Parc Científic de Barcelona, Josep Samitier 1-5, 08028 Barcelona, Spain
|
39
|
Hooft van Huijsduijnen R, Rommel C. Decompartmentalizing target validation—thinking outside the pipeline boxes. J Mol Med (Berl) 2006; 84:802-13. [PMID: 16924470 DOI: 10.1007/s00109-006-0080-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/11/2006] [Accepted: 05/30/2006] [Indexed: 10/24/2022]
Affiliation(s)
- Rob Hooft van Huijsduijnen
  - Serono Pharmaceutical Research Institute, Serono International S.A., 14, Chemin des Aulx, 1228, Plan-les-Ouates, Geneva, Switzerland
|
40
|
Ferrer-Costa C, Orozco M, de la Cruz X. Use of bioinformatics tools for the annotation of disease-associated mutations in animal models. Proteins 2006; 61:878-87. [PMID: 16208716 DOI: 10.1002/prot.20664] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Indexed: 11/10/2022]
Abstract
Single-point mutations are one of the most frequent causes of genetic variability in both humans and closely related species. The recent availability of different bioinformatics tools for annotating human single nucleotide polymorphisms (SNPs) has opened the possibility of using them to score SNPs from species with a biomedical interest, in particular from mice and other models of human disease. Also, this ability to predict pathogenicity of single point mutations in one species, based on data from another species, opens the possibility to predict the pathological character of single point mutations in humans using data from well-characterized model systems of human disease. This could provide a valuable alternative to the more traditional genetic population approaches. However, transferral of prediction tools may be limited by different factors, from a species bias in the training set, to a large sequence divergence between the proteomes of the training and the target species. Here we study the conditions under which prediction tools can be transferred among species, concentrating on the case of mice. We find that for the majority of the human-mouse homolog pairs, the sequence similarity is large enough to preserve the pathological character of mutations among species, in general. We then establish that prediction/annotation tools developed for one organism can be used to predict the neutral/pathological character of mutations/SNPs in the other organism.
Affiliation(s)
- Carles Ferrer-Costa
  - Molecular Modeling and Bioinformatics Unit, Institut de Recerca Biomédica, Parc Científic de Barcelona, Barcelona, Spain
|
41
|
Choi SS, Li W, Lahn BT. Robust signals of coevolution of interacting residues in mammalian proteomes identified by phylogeny-aided structural analysis. Nat Genet 2005; 37:1367-71. [PMID: 16282975 DOI: 10.1038/ng1685] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Received: 07/07/2005] [Accepted: 09/12/2005] [Indexed: 11/08/2022]
Abstract
The structure of a protein depends critically on the complex interactions among its amino acid residues. It has long been hypothesized that interacting residues might tend to coevolve, but it is not known whether such coevolution is a general phenomenon across the proteome. Here, we describe a novel methodology called phylogeny-aided structural analysis, which uncovers robust signals of interacting-residue coevolution in mammalian proteomes. Furthermore, this new method allows the magnitude of coevolution to be quantified. Finally, it facilitates a comprehensive evaluation of various factors that affect interacting-residue coevolution, such as the physicochemical properties of the interactions between residues, solvent accessibility of the residues and their secondary structure context.
Affiliation(s)
- Sun Shim Choi
  - Howard Hughes Medical Institute, Department of Human Genetics, University of Chicago, Chicago, Illinois 60637, USA
|
42
|
de Magalhães JP. Human Disease-Associated Mitochondrial Mutations Fixed in Nonhuman Primates. J Mol Evol 2005; 61:491-7. [PMID: 16132471 DOI: 10.1007/s00239-004-0258-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Received: 08/18/2004] [Accepted: 04/19/2005] [Indexed: 12/01/2022]
Abstract
A number of human disease-associated sequences have been reported in other species, such as rodents, but compensatory changes appear to prevent these deleterious mutations from being expressed. The aim of this work was to compare the mitochondrial DNA of multiple primates to ascertain whether mitochondrial disease-causing sequences in humans are fixed in nonhuman primates. Indeed, 46 sequences related to human pathology were identified in 1 or more of the 12 studied nonhuman primates, the majority of which were associated with late-onset diseases. Most of these sequences can be explained by the presence of secondary compensatory changes that render these mutations phenotypically inert. Nonetheless, and since humans not only are the longest-lived primate but feature the largest brain, one hypothesis is that a gradual optimization of the human mitochondrion occurred in the hominid lineage driven by the need to optimize the aerobic energy metabolism to delay neurodegeneration. Therefore, it is also proposed that some of these disease-associated sequences in nonhuman primates may be linked to the evolution of human longevity and intelligence, indicating a general pattern of selection on longevity in the course of evolution of the human mitochondrion.
|
43
|
Weinreich DM, Watson RA, Chao L. PERSPECTIVE: SIGN EPISTASIS AND GENETIC CONSTRAINT ON EVOLUTIONARY TRAJECTORIES. Evolution 2005. [DOI: 10.1111/j.0014-3820.2005.tb01768.x] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
|
44
|
Weinreich DM, Watson RA, Chao L. PERSPECTIVE: SIGN EPISTASIS AND GENETIC CONSTRAINT ON EVOLUTIONARY TRAJECTORIES. Evolution 2005. [DOI: 10.1554/04-272] [Citation(s) in RCA: 87] [Impact Index Per Article: 4.6] [Indexed: 11/16/2022]
|
45
|
Kern AD, Kondrashov FA. Mechanisms and convergence of compensatory evolution in mammalian mitochondrial tRNAs. Nat Genet 2004; 36:1207-12. [PMID: 15502829 DOI: 10.1038/ng1451] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.2] [Received: 06/14/2004] [Accepted: 09/20/2004] [Indexed: 11/09/2022]
Abstract
The function of protein and RNA molecules depends on complex epistatic interactions between sites. Therefore, the deleterious effect of a mutation can be suppressed by a compensatory second-site substitution. In relating a list of 86 pathogenic mutations in human tRNAs encoded by mitochondrial genes to the sequences of their mammalian orthologs, we noted that 52 pathogenic mutations were present in normal tRNAs of one or several nonhuman mammals. We found at least five mechanisms of compensation for 32 pathogenic mutations that destroyed a Watson-Crick pair in one of the four tRNA stems: restoration of the affected Watson-Crick interaction (25 cases), strengthening of another pair (4 cases), creation of a new pair (8 cases), changes of multiple interactions in the affected stem (11 cases) and changes involving the interaction between the loop and stem structures (3 cases). A pathogenic mutation and its compensating substitution are fixed in a lineage in rapid succession, and often a compensatory interaction evolves convergently in different clades. At least 10%, and perhaps as many as 50%, of all nucleotide substitutions in evolving mammalian tRNAs participate in such interactions, indicating that the evolution of tRNAs proceeds along highly epistatic fitness ridges.
Affiliation(s)
- Andrew D Kern
  - Center for Population Biology, Section of Evolution and Ecology, University of California at Davis, Davis, California 95616, USA
|
46
|
Pavlicek A, Noskov VN, Kouprina N, Barrett JC, Jurka J, Larionov V. Evolution of the tumor suppressor BRCA1 locus in primates: implications for cancer predisposition. Hum Mol Genet 2004; 13:2737-51. [PMID: 15385441 DOI: 10.1093/hmg/ddh301] [Citation(s) in RCA: 78] [Impact Index Per Article: 3.9] [Indexed: 01/11/2023] Open
Abstract
Germ-line mutations in the BRCA1 gene predispose affected individuals to breast and ovarian cancer syndromes. In an attempt to systematically analyze a broader spectrum of genetic changes ranging from frequent exon deletions and duplications to amino acid replacements and protein truncations, we isolated and characterized full size BRCA1 homologues from a representative group of non-human primates. Our analysis represents the first comprehensive sequence comparison of primate BRCA1 loci and corresponding proteins. The comparison revealed an unusually high proportion of indels in non-coding DNA. The major force driving evolutionary changes in non-coding BRCA1 sequences was Alu-mediated rearrangements, including Alu transpositions and Alu-associated deletions, indicating that structural instability of this locus may be intrinsic in anthropoids. Analysis of the non-synonymous/synonymous ratio in coding portions of the gene revealed the presence of both conserved and rapidly evolving regions in the BRCA1 protein. Previously, a rapidly evolving region with evidence of positive evolutionary selection in human and chimpanzee had been identified only in exon 11. Here, we show that most of the internal BRCA1 sequence is variable between primates and evolved under positive selection. In contrast, the terminal regions of BRCA1, which encode the RING finger and BRCT domains, experienced negative selection, which left them almost identical between the compared primates. Distribution of the reported missense mutations, but not frameshift and nonsense mutations, is positively correlated with BRCA1 protein conservation. Finally, on the basis of protein sequence conservation, we identified missense changes that are likely to compromise BRCA1 function.
Affiliation(s)
- Adam Pavlicek
  - Genetic Information Research Institute, Mountain View, CA, USA
|
47
|
Huang H, Winter EE, Wang H, Weinstock KG, Xing H, Goodstadt L, Stenson PD, Cooper DN, Smith D, Albà MM, Ponting CP, Fechtel K. Evolutionary conservation and selection of human disease gene orthologs in the rat and mouse genomes. Genome Biol 2004; 5:R47. [PMID: 15239832 PMCID: PMC463309 DOI: 10.1186/gb-2004-5-7-r47] [Citation(s) in RCA: 107] [Impact Index Per Article: 5.4] [Received: 03/16/2004] [Revised: 05/10/2004] [Accepted: 05/28/2004] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND Model organisms have contributed substantially to our understanding of the etiology of human disease as well as having assisted with the development of new treatment modalities. The availability of the human, mouse and, most recently, the rat genome sequences now permits the comprehensive investigation of the rodent orthologs of genes associated with human disease. Here, we investigate whether human disease genes differ significantly from their rodent orthologs with respect to their overall levels of conservation and their rates of evolutionary change. RESULTS Human disease genes are unevenly distributed among human chromosomes and are highly represented (99.5%) among human-rodent ortholog sets. Differences are revealed in evolutionary conservation and selection between different categories of human disease genes. Although selection appears not to have greatly discriminated between disease and non-disease genes, synonymous substitution rates are significantly higher for disease genes. In neurological and malformation syndrome disease systems, associated genes have evolved slowly whereas genes of the immune, hematological and pulmonary disease systems have changed more rapidly. Amino-acid substitutions associated with human inherited disease occur at sites that are more highly conserved than the average; nevertheless, 15 substituting amino acids associated with human disease were identified as wild-type amino acids in the rat. Rodent orthologs of human trinucleotide repeat-expansion disease genes were found to contain substantially fewer of such repeats. Six human genes that share the same characteristics as triplet repeat-expansion disease-associated genes were identified; although four of these genes are expressed in the brain, none is currently known to be associated with disease. CONCLUSIONS Most human disease genes have been retained in rodent genomes. 
Synonymous nucleotide substitutions occur at a higher rate in disease genes, a finding that may reflect increased mutation rates in the chromosomal regions in which disease genes are found. Rodent orthologs associated with neurological function exhibit the greatest evolutionary conservation; this suggests that rodent models of human neurological disease are likely to most faithfully represent human disease processes. However, with regard to neurological triplet repeat expansion-associated human disease genes, the contraction, relative to human, of rodent trinucleotide repeats suggests that rodent loci may not achieve a 'critical repeat threshold' necessary to undergo spontaneous pathological repeat expansions. The identification of six genes in this study that have multiple characteristics associated with repeat expansion-disease genes raises the possibility that not all human loci capable of facilitating neurological disease by repeat expansion have as yet been identified.
MESH Headings
- Animals
- Chromosome Mapping/methods
- Conserved Sequence/genetics
- Disease Models, Animal
- Evolution, Molecular
- Fishes/genetics
- Genes/genetics
- Genes/physiology
- Genes, Fungal/genetics
- Genes, Helminth/genetics
- Genes, Insect/genetics
- Genetic Diseases, Inborn/genetics
- Genetic Diseases, Inborn/physiopathology
- Genome
- Genome, Human
- Humans
- Mice
- Mutagenesis/genetics
- Nucleotides/genetics
- Point Mutation/genetics
- Rats
- Repetitive Sequences, Amino Acid/genetics
- Selection, Genetic
- Sequence Homology, Nucleic Acid
- Trinucleotide Repeat Expansion/genetics
Affiliation(s)
- Hui Huang
  - Department of Bioinformatics, Genome Therapeutics Corporation, Waltham, MA 02453, USA
- Eitan E Winter
  - MRC Functional Genetics Unit, Department of Human Anatomy and Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK
- Huajun Wang
  - Department of Bioinformatics, Genome Therapeutics Corporation, Waltham, MA 02453, USA
- Keith G Weinstock
  - Department of Bioinformatics, Genome Therapeutics Corporation, Waltham, MA 02453, USA
- Heming Xing
  - Department of Bioinformatics, Genome Therapeutics Corporation, Waltham, MA 02453, USA
- Leo Goodstadt
  - MRC Functional Genetics Unit, Department of Human Anatomy and Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK
- Peter D Stenson
  - Institute of Medical Genetics, University of Wales College of Medicine, Heath Park, Cardiff CF14 4XN, UK
- David N Cooper
  - Institute of Medical Genetics, University of Wales College of Medicine, Heath Park, Cardiff CF14 4XN, UK
- Douglas Smith
  - Genome Sequencing Center, Genome Therapeutics Corporation, Waltham, MA 02453, USA
  - Agencourt Bioscience Corporation, Beverly, MA 01915, USA
- M Mar Albà
  - Grup de Recerca en Informàtica Biomèdica, Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona 08003, Spain
- Chris P Ponting
  - MRC Functional Genetics Unit, Department of Human Anatomy and Genetics, University of Oxford, South Parks Road, Oxford OX1 3QX, UK
- Kim Fechtel
  - Department of Bioinformatics, Genome Therapeutics Corporation, Waltham, MA 02453, USA
|