Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Subramanian S, Kumar S. Neutral substitutions occur at a faster rate in exons than in noncoding DNA in primate genomes. Genome Res 2003;13:838-44. [PMID: 12727904 PMCID: PMC430942 DOI: 10.1101/gr.1152803] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.7] [Indexed: 11/24/2022]

Cited by Other Article(s)

Zhang Y, Ahsan MU, Wang K. Noncoding de novo mutations in SCN2A are associated with autism spectrum disorders. medRxiv 2024:2024.05.05.24306908. [PMID: 38766206 PMCID: PMC11100849 DOI: 10.1101/2024.05.05.24306908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]

Abstract

Coding de novo mutations (DNMs) contribute to the risk for autism spectrum disorders (ASD), but the contribution of noncoding DNMs remains relatively unexplored. Here we use whole genome sequencing (WGS) data of 12,411 individuals (including 3,508 probands and 2,218 unaffected siblings) from 3,357 families collected in Simons Foundation Powering Autism Research for Knowledge (SPARK) to detect DNMs associated with ASD, while examining the Simons Simplex Collection (SSC), with 6,383 individuals from 2,274 families, to replicate the results. For coding DNMs, SCN2A reached exome-wide significance (p = 2.06×10⁻¹¹) in SPARK. The 618 known dominant ASD genes as a group are more strongly enriched for coding DNMs in cases than in sibling controls (fold change = 1.51, p = 1.13×10⁻⁵ for SPARK; fold change = 1.86, p = 2.06×10⁻⁹ for SSC). For noncoding DNMs, we used two methods to assess statistical significance: a point-based test that analyzes sites with a Combined Annotation Dependent Depletion (CADD) score ≥15, and a segment-based test that analyzes 1 kb genomic segments with segment-specific background mutation rates (inferred from expected rare mutations in Gnocchi genome constraint scores). The point-based test identified SCN2A as marginally significant (p = 6.12×10⁻⁴) in SPARK, whereas the segment-based test identified CSMD1, RBFOX1 and CHD13 as exome-wide significant. We did not identify significant enrichment of noncoding DNMs (in all 1 kb segments or those with Gnocchi > 4) in the 618 known ASD genes as a group in cases compared with sibling controls. When combining evidence from both coding and noncoding DNMs, we found that SCN2A, with 11 coding and 5 noncoding DNMs, exhibited the strongest significance (p = 4.15×10⁻¹³). In summary, we identified both coding and noncoding DNMs in SCN2A associated with ASD, while nominating additional candidates for further examination in future studies.
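The segment-based test described in this abstract compares the DNMs observed in a 1 kb segment with what a segment-specific background mutation rate would predict. The abstract does not state which distribution or software the authors used, so the following is only a minimal sketch under the assumption that the DNM count in a segment is Poisson-distributed; the function name, the example rate, and the sample size are hypothetical and not taken from the study.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """P(X >= observed) for X ~ Poisson(expected).

    Assumption (not from the abstract): the DNM count in a segment is
    modelled as Poisson with mean equal to the expected background count.
    """
    if observed <= 0:
        return 1.0
    term = math.exp(-expected)  # P(X = 0)
    cdf = 0.0
    for i in range(observed):
        cdf += term                   # add P(X = i)
        term *= expected / (i + 1)    # advance to P(X = i + 1)
    return max(0.0, 1.0 - cdf)


# Hypothetical inputs: per-base rate, segment length, number of probands.
rate_per_base = 1.2e-8
segment_length = 1000
n_probands = 3508
# Factor 2 for the two haploid genomes inherited by each proband.
expected_count = rate_per_base * segment_length * 2 * n_probands
p_value = poisson_upper_tail(3, expected_count)
```

Because the tail is computed as one minus a cumulative sum, very small p-values lose precision in floating point; a production analysis would more likely rely on a statistics library's survival function.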

Vasilyeva TA, Marakhonov AV, Kutsev SI, Zinchenko RA. Relative Frequencies of PAX6 Mutational Events in a Russian Cohort of Aniridia Patients in Comparison with the World's Population and the Human Genome. Int J Mol Sci 2022;23:ijms23126690. [PMID: 35743132 PMCID: PMC9223373 DOI: 10.3390/ijms23126690] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/14/2022] [Revised: 06/06/2022] [Accepted: 06/13/2022] [Indexed: 12/10/2022]

Hanson HE, Wang C, Schrey AW, Liebl AL, Ravinet M, Jiang RH, Martin LB. Epigenetic Potential and DNA Methylation in an Ongoing House Sparrow (Passer domesticus) Range Expansion. Am Nat 2022;200:662-674. [DOI: 10.1086/720950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]

Han J, Munro JE, Kocoski A, Barry AE, Bahlo M. Population-level genome-wide STR discovery and validation for population structure and genetic diversity assessment of Plasmodium species. PLoS Genet 2022;18:e1009604. [PMID: 35007277 PMCID: PMC8782505 DOI: 10.1371/journal.pgen.1009604] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/20/2021] [Revised: 01/21/2022] [Accepted: 12/14/2021] [Indexed: 11/18/2022]

Hassan NE, Al-Janabi AA. Investigation of Interferon Gamma Activity Using Bioinformatics Methods. Arch Razi Inst 2021;76:1245-1253. [PMID: 35355749 PMCID: PMC8934094 DOI: 10.22092/ari.2021.356106.1780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2021] [Accepted: 10/02/2021] [Indexed: 05/25/2023]

Manawasinghe IS, Phillips AJL, Xu J, Balasuriya A, Hyde KD, Stępień Ł, Harischandra DL, Karunarathna A, Yan J, Weerasinghe J, Luo M, Dong Z, Cheewangkoon R. Defining a species in fungal plant pathology: beyond the species level. Fungal Divers 2021. [DOI: 10.1007/s13225-021-00481-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]

Khan AA, Ali MS, Babar F, Fatima A, Shafqat MA, Asghar B, Ilyas N, Fatima M, Liaqat A, Gondal MA. Lack of CpG islands in human unitary pseudogenes and its implication. Mamm Genome 2021;32:443-447. [PMID: 34272576 DOI: 10.1007/s00335-021-09893-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2021] [Accepted: 07/07/2021] [Indexed: 11/24/2022]

The Long-Term Evolutionary History of Gradual Reduction of CpG Dinucleotides in the SARS-CoV-2 Lineage. Biology 2021;10:biology10010052. [PMID: 33445785 PMCID: PMC7828247 DOI: 10.3390/biology10010052] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/30/2020] [Revised: 12/29/2020] [Accepted: 01/09/2021] [Indexed: 12/24/2022]

Abstract

Simple Summary

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused the coronavirus disease 2019 (COVID-19), a pandemic that infected over 81 million people worldwide. This has led the scientific community to characterize the genome of this virus, including its nucleotide composition. Investigation of the dinucleotide frequency revealed that the proportion of CG dinucleotides (CpG) is highly reduced in the viral genomes. Since CpG dinucleotides are the target site for the host antiviral zinc finger protein, it has been suggested that the reduction in the proportion of CpG is the viral response to escape from the host defense machinery. In the present study, we investigated the time of origin of the reduction in CpG content. Whole genome analyses based on all representative viral genomes of the group Betacoronavirus revealed that the CpG content in the lineage of SARS-CoV-2 has been progressively declining over the past 1213 years. The depletion of CpG was found to occur at neutral, as well as selectively constrained, positions of the viral genomes.

Abstract

Recent studies suggested that the fraction of CG dinucleotides (CpG) is severely reduced in the genome of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The CpG deficiency was predicted to be the adaptive response of the virus to evade degradation of the viral RNA by the antiviral zinc finger protein that specifically binds to CpG nucleotides. By comparing all representative genomes belonging to the genus Betacoronavirus, this study examined the potential time of origin of CpG depletion. The results of this investigation revealed a highly significant correlation between the proportions of CpG nucleotide (CpG content) of the betacoronavirus species and their times of divergence from SARS-CoV-2. Species that are distantly related to SARS-CoV-2 had much higher CpG contents than that of SARS-CoV-2. Conversely, closely related species had low CpG contents that are similar to or slightly higher than that of SARS-CoV-2. These results suggest a systematic and continuous reduction in the CpG content in the SARS-CoV-2 lineage that might have started since the Sarbecovirus + Hibecovirus clade separated from Nobecovirus, which was estimated to be 1213 years ago. This depletion was not found to be mediated by the GC contents of the genomes. Our results also showed that the depletion of CpG occurred at neutral positions of the genome as well as those under selection. The latter is evident from the progressive reduction in the proportion of arginine amino acid (coded by CpG dinucleotides) in the SARS-CoV-2 lineage over time. The results of this study suggest that shedding CpG nucleotides from their genome is a continuing process in this viral lineage, potentially to escape from their host defense mechanisms.
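The abstract compares the proportion of CpG dinucleotides across genomes and notes that the depletion is not explained by GC content. As a rough illustration of those two quantities, the sketch below computes the CpG fraction among all overlapping dinucleotides and the widely used observed/expected CpG ratio, which normalizes for the number of C and G bases. The exact definitions used in the paper are not given here, so both functions are assumptions, and their names are hypothetical.

```python
def cpg_fraction(seq: str) -> float:
    """Fraction of overlapping dinucleotide positions that are CG.

    Assumed definition of "CpG content"; the paper's may differ.
    """
    s = seq.upper()
    n_dinucs = len(s) - 1
    if n_dinucs < 1:
        return 0.0
    cpg = sum(1 for i in range(n_dinucs) if s[i:i + 2] == "CG")
    return cpg / n_dinucs


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G).

    Values well below 1 indicate fewer CpGs than the C and G counts
    alone would predict.
    """
    s = seq.upper()
    c, g = s.count("C"), s.count("G")
    if c == 0 or g == 0:
        return 0.0
    cpg = sum(1 for i in range(len(s) - 1) if s[i:i + 2] == "CG")
    return cpg * len(s) / (c * g)
```

For an RNA virus the genome would typically be supplied as its cDNA sequence (T in place of U), which does not affect the C and G counts used here.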

Rong S, Buerer L, Rhine CL, Wang J, Cygan KJ, Fairbrother WG. Mutational bias and the protein code shape the evolution of splicing enhancers. Nat Commun 2020;11:2845. [PMID: 32504065 PMCID: PMC7275064 DOI: 10.1038/s41467-020-16673-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/30/2019] [Accepted: 04/28/2020] [Indexed: 02/06/2023]

McDew-White M, Li X, Nkhoma SC, Nair S, Cheeseman I, Anderson TJC. Mode and Tempo of Microsatellite Length Change in a Malaria Parasite Mutation Accumulation Experiment. Genome Biol Evol 2020;11:1971-1985. [PMID: 31273388 PMCID: PMC6644851 DOI: 10.1093/gbe/evz140] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Accepted: 06/29/2019] [Indexed: 12/12/2022]

Bin Y, Wang X, Zhao L, Wen P, Xia J. An analysis of mutational signatures of synonymous mutations across 15 cancer types. BMC Med Genet 2019;20:190. [PMID: 31815613 PMCID: PMC6900878 DOI: 10.1186/s12881-019-0926-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/19/2022]

Abstract

Background

Synonymous mutations have been identified to play important roles in cancer development, although they do not modify the protein sequences. However, relatively little research has specifically delineated the functionality of synonymous mutations in cancer.

Results

We investigated the nucleotide-based and amino acid-based features of synonymous mutations across 15 cancer types from The Cancer Genome Atlas (TCGA), and revealed novel driver candidates by identifying hotspot mutations. First, synonymous mutations were compared between TCGA and the 1000 Genomes Project at the nucleotide and amino acid levels. We found that C:G → T:A transitions were the most frequent single-base substitutions, and that leucine underwent the largest number of synonymous mutations in TCGA due to the prevalent C → T transition, which induced the transformation between optimal and non-optimal codons. Next, 97 synonymous hotspot mutations in 86 genes were nominated as candidate drivers with potential cancer risk by considering the mutational rates across different sequence contexts. We observed that the non-CpG-island GC transition sequence context was positively selected across most cancer types, and that the different sequence contexts in which hotspot mutations occur could be significant for genetic differences and functional features. We also found that the hotspots were more conserved than neutral mutations of hotspot-mutation-containing genes and frequently occurred at leucine. In addition, we mapped hotspot, neutral and non-hotspot mutations of hotspot-mutation-containing genes to their respective protein domains and found that the ion transport domain was the most frequent one, which could mediate cell interaction and has relevant implications for tumor therapy. The signatures of synonymous hotspots were qualitatively similar to those of harmful missense variants.

Conclusions

We illustrated the preferences of cancer-associated synonymous mutations, especially hotspots, and laid the groundwork for understanding how synonymous mutations act as drivers in cancer.

Maze EA, Ham C, Kelly J, Ussher L, Almond N, Towers GJ, Berry N, Belshaw R. Variable Baseline Papio cynocephalus Endogenous Retrovirus (PcEV) Expression Is Upregulated in Acutely SIV-Infected Macaques and Correlated to STAT1 Expression in the Spleen. Front Immunol 2019;10:901. [PMID: 31156613 PMCID: PMC6529565 DOI: 10.3389/fimmu.2019.00901] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/28/2019] [Accepted: 04/08/2019] [Indexed: 01/12/2023]

Pranckėnienė L, Jakaitienė A, Ambrozaitytė L, Kavaliauskienė I, Kučinskas V. Insights Into de novo Mutation Variation in Lithuanian Exome. Front Genet 2018;9:315. [PMID: 30154829 PMCID: PMC6102505 DOI: 10.3389/fgene.2018.00315] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/03/2018] [Accepted: 07/24/2018] [Indexed: 01/23/2023]

Talukder SK, Azhaguvel P, Chekhovskiy K, Saha MC. Molecular discrimination of tall fescue morphotypes in association with Festuca relatives. PLoS One 2018;13:e0191343. [PMID: 29342197 PMCID: PMC5771633 DOI: 10.1371/journal.pone.0191343] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/17/2017] [Accepted: 01/03/2018] [Indexed: 11/18/2022]

Abstract

Tall fescue (Festuca arundinacea Schreb.) is an important cool-season perennial grass species used as forage and turf, and in conservation plantings. There are three morphotypes in hexaploid tall fescue: Continental, Mediterranean and Rhizomatous. This study was conducted to develop morphotype-specific molecular markers to distinguish Continental and Mediterranean tall fescues, and to establish their relationships with other species of the Festuca genus for genomic inference. Chloroplast sequence variation and simple sequence repeat (SSR) polymorphism were explored in 12 genotypes of three tall fescue morphotypes and four Festuca species. Hypervariable chloroplast regions were retrieved by using 33 specifically designed primers followed by sequencing the PCR products. SSR polymorphism was studied using 144 tall fescue SSR primers. Four chloroplast (NFTCHL17, NFTCHL43, NFTCHL45 and NFTCHL48) and three SSR (nffa090, nffa204 and nffa338) markers were identified which can distinctly differentiate Continental and Mediterranean morphotypes. A primer pair, NFTCHL45, which amplifies a 47 bp deletion between the two morphotypes, is being routinely used in the Noble Research Institute's core facility for morphotype discrimination. Both chloroplast sequence variation and SSR diversity showed a close association between Rhizomatous and Continental morphotypes, while the Mediterranean morphotype was in a distant clade. F. pratensis and F. arundinacea var. glaucescens, the P and G1G2 genome donors, respectively, were grouped with the Continental clade, and F. mairei (M1M2 genome) grouped with the Mediterranean clade in chloroplast sequence variation, while both F. pratensis and F. mairei formed independent clades in SSR analysis. Age estimation based on chloroplast sequence variation indicated that the Continental and Mediterranean clades might have been colonized independently about 0.65 ± 0.06 and 0.96 ± 0.1 million years ago (Mya), respectively. The findings of the study will enhance tall fescue breeding for persistence and productivity.

Hurst LD, Batada NN. Depletion of somatic mutations in splicing-associated sequences in cancer genomes. Genome Biol 2017;18:213. [PMID: 29115978 PMCID: PMC5678748 DOI: 10.1186/s13059-017-1337-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 02/08/2017] [Accepted: 10/12/2017] [Indexed: 01/01/2023]

Abstract

Background

An important goal of cancer genomics is to systematically identify cancer-causing mutations. A common approach is to identify sites with high ratios of non-synonymous to synonymous mutations; however, if synonymous mutations are under purifying selection, this methodology leads to identification of false-positive mutations. Here, using synonymous somatic mutations (SSMs) identified in over 4000 tumours across 15 different cancer types, we sought to test this assumption by focusing on coding regions required for splicing.

Results

Exon flanks, which are enriched for sequences required for splicing fidelity, have ~ 17% lower SSM density compared to exonic cores, even after excluding canonical splice sites. While it is impossible to eliminate a mutation bias of unknown cause, multiple lines of evidence support a purifying selection model above a mutational bias explanation. The flank/core difference is not explained by skewed nucleotide content, replication timing, nucleosome occupancy or deficiency in mismatch repair. The depletion is not seen in tumour suppressors, consistent with their role in positive tumour selection, but is otherwise observed in cancer-associated and non-cancer genes, both essential and non-essential. Consistent with a role in splicing modulation, exonic splice enhancers have a lower SSM density before and after controlling for nucleotide composition; moreover, flanks at the 5’ end of the exons have significantly lower SSM density than at the 3’ end.

Conclusions

These results suggest that the observable mutational spectrum of cancer genomes is not simply a product of various mutational processes and positive selection, but might also be shaped by negative selection.

Electronic supplementary material

The online version of this article (doi:10.1186/s13059-017-1337-5) contains supplementary material, which is available to authorized users.

Elurbe DM, Paranjpe SS, Georgiou G, van Kruijsbergen I, Bogdanovic O, Gibeaux R, Heald R, Lister R, Huynen MA, van Heeringen SJ, Veenstra GJC. Regulatory remodeling in the allo-tetraploid frog Xenopus laevis. Genome Biol 2017;18:198. [PMID: 29065907 PMCID: PMC5655803 DOI: 10.1186/s13059-017-1335-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 04/06/2017] [Accepted: 10/03/2017] [Indexed: 12/22/2022]

Abstract

BACKGROUND

Genome duplication has played a pivotal role in the evolution of many eukaryotic lineages, including the vertebrates. A relatively recent vertebrate genome duplication is that in Xenopus laevis, which resulted from the hybridization of two closely related species about 17 million years ago. However, little is known about the consequences of this duplication at the level of the genome, the epigenome, and gene expression.

RESULTS

The X. laevis genome consists of two subgenomes, referred to as L (long chromosomes) and S (short chromosomes), that originated from distinct diploid progenitors. Of the parental subgenomes, S chromosomes have degraded faster than L chromosomes from the point of genome duplication until the present day. Deletions appear to have the largest effect on pseudogene formation and loss of regulatory regions. Deleted regions are enriched for long DNA repeats and the flanking regions have high alignment scores, suggesting that non-allelic homologous recombination has played a significant role in the loss of DNA. To assess innovations in the X. laevis subgenomes we examined p300-bound enhancer peaks that are unique to one subgenome and absent from X. tropicalis. A large majority of new enhancers comprise transposable elements. Finally, to dissect early and late events following interspecific hybridization, we examined the epigenome and the enhancer landscape in X. tropicalis × X. laevis hybrid embryos. Strikingly, young X. tropicalis DNA transposons are derepressed and recruit p300 in hybrid embryos.

CONCLUSIONS

The results show that erosion of X. laevis genes and functional regulatory elements is associated with repeats and non-allelic homologous recombination and furthermore that young repeats have also contributed to the p300-bound regulatory landscape following hybridization and whole-genome duplication.

Affiliation(s)

Dei M Elurbe Radboud University Medical Center, Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, 6500 HB, Nijmegen, The Netherlands
Sarita S Paranjpe Radboud University, Faculty of Science, Department of Molecular Developmental Biology, Radboud Institute for Molecular Life Sciences, 6500 HB, Nijmegen, The Netherlands
Georgios Georgiou Radboud University, Faculty of Science, Department of Molecular Developmental Biology, Radboud Institute for Molecular Life Sciences, 6500 HB, Nijmegen, The Netherlands
Ila van Kruijsbergen Radboud University, Faculty of Science, Department of Molecular Developmental Biology, Radboud Institute for Molecular Life Sciences, 6500 HB, Nijmegen, The Netherlands
Ozren Bogdanovic Genomics and Epigenetics Division, Garvan Institute of Medical Research, Sydney, Australia; St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Sydney, Australia; ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, Australia
Romain Gibeaux Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
Rebecca Heald Department of Molecular and Cell Biology, University of California, Berkeley, CA, 94720, USA
Ryan Lister Harry Perkins Institute of Medical Research and ARC Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, WA, 6009, Australia
Martijn A Huynen Radboud University Medical Center, Center for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, 6500 HB, Nijmegen, The Netherlands.
Simon J van Heeringen Radboud University, Faculty of Science, Department of Molecular Developmental Biology, Radboud Institute for Molecular Life Sciences, 6500 HB, Nijmegen, The Netherlands.
Gert Jan C Veenstra Radboud University, Faculty of Science, Department of Molecular Developmental Biology, Radboud Institute for Molecular Life Sciences, 6500 HB, Nijmegen, The Netherlands.

Enge M, Arda HE, Mignardi M, Beausang J, Bottino R, Kim SK, Quake SR. Single-Cell Analysis of Human Pancreas Reveals Transcriptional Signatures of Aging and Somatic Mutation Patterns. Cell 2017;171:321-330.e14. [PMID: 28965763 DOI: 10.1016/j.cell.2017.09.004] [Citation(s) in RCA: 332] [Impact Index Per Article: 47.4] [Received: 03/23/2017] [Revised: 07/02/2017] [Accepted: 08/30/2017] [Indexed: 12/20/2022]

Ponting CP. Biological function in the twilight zone of sequence conservation. BMC Biol 2017;15:71. [PMID: 28814299 PMCID: PMC5558704 DOI: 10.1186/s12915-017-0411-5] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Indexed: 01/01/2023]

Sahakyan AB, Balasubramanian S. Single genome retrieval of context-dependent variability in mutation rates for human germline. BMC Genomics 2017;18:81. [PMID: 28086752 PMCID: PMC5237266 DOI: 10.1186/s12864-016-3440-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/09/2016] [Accepted: 12/19/2016] [Indexed: 01/08/2023]

Stephens ZD, Hudson ME, Mainzer LS, Taschuk M, Weber MR, Iyer RK. Simulating Next-Generation Sequencing Datasets from Empirical Mutation and Sequencing Models. PLoS One 2016;11:e0167047. [PMID: 27893777 PMCID: PMC5125660 DOI: 10.1371/journal.pone.0167047] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 05/15/2016] [Accepted: 11/08/2016] [Indexed: 12/31/2022]

Sahakyan AB, Balasubramanian S. Long genes and genes with multiple splice variants are enriched in pathways linked to cancer and other multigenic diseases. BMC Genomics 2016;17:225. [PMID: 26968808 PMCID: PMC4788956 DOI: 10.1186/s12864-016-2582-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 01/21/2016] [Accepted: 03/08/2016] [Indexed: 11/10/2022]

Price N, Graur D. Are Synonymous Sites in Primates and Rodents Functionally Constrained? J Mol Evol 2015;82:51-64. [PMID: 26563252 DOI: 10.1007/s00239-015-9719-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 01/07/2015] [Accepted: 11/04/2015] [Indexed: 11/28/2022]

Abstract

It has been claimed that synonymous sites in mammals are under selective constraint. Furthermore, in many studies the selective constraint at such sites in primates was claimed to be more stringent than that in rodents. Given the larger effective population sizes in rodents than in primates, the theoretical expectation is that selection in rodents would be more effective than that in primates. To resolve this contradiction between expectations and observations, we used processed pseudogenes as a model for strict neutral evolution, and estimated selective constraint on synonymous sites using the rate of substitution at pseudosynonymous and pseudononsynonymous sites in pseudogenes as the neutral expectation. After controlling for the effects of GC content, our results were similar to those from previous studies, i.e., synonymous sites in primates exhibited evidence for higher selective constraint than those in rodents. Specifically, our results indicated that in primates up to 24% of synonymous sites could be under purifying selection, while in rodents synonymous sites evolved neutrally. To further control for shifts in GC content, we estimated selective constraint at fourfold degenerate sites using a maximum parsimony approach. This allowed us to estimate selective constraint using mutational patterns that cause a shift in GC content (GT ↔ TG, CT ↔ TC, GA ↔ AG, and CA ↔ AC) and ones that do not (AT ↔ TA and CG ↔ GC). Using this approach, we found that synonymous sites evolve neutrally in both primates and rodents. Apparent deviations from neutrality were caused by a higher rate of C → A and C → T mutations in pseudogenes. Such differences are most likely caused by the shift in GC content experienced by pseudogenes. We conclude that previous estimates according to which 20-40% of synonymous sites in primates were under selective constraint were most likely artifacts of the biased pattern of mutation.

Koziol U, Radio S, Smircich P, Zarowiecki M, Fernández C, Brehm K. A Novel Terminal-Repeat Retrotransposon in Miniature (TRIM) Is Massively Expressed in Echinococcus multilocularis Stem Cells. Genome Biol Evol 2015;7:2136-53. [PMID: 26133390 PMCID: PMC4558846 DOI: 10.1093/gbe/evv126] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Accepted: 06/26/2015] [Indexed: 12/14/2022]

Francioli LC, Polak PP, Koren A, Menelaou A, Chun S, Renkens I, van Duijn CM, Swertz M, Wijmenga C, van Ommen G, Slagboom PE, Boomsma DI, Ye K, Guryev V, Arndt PF, Kloosterman WP, de Bakker PIW, Sunyaev SR. Genome-wide patterns and properties of de novo mutations in humans. Nat Genet 2015;47:822-826. [PMID: 25985141 PMCID: PMC4485564 DOI: 10.1038/ng.3292] [Citation(s) in RCA: 247] [Impact Index Per Article: 27.4] [Received: 10/06/2014] [Accepted: 04/07/2015] [Indexed: 12/12/2022]

Affiliation(s)

Laurent C Francioli Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
Paz P Polak Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Amnon Koren Department of Genetics, Harvard Medical School, Boston, MA, USA
Androniki Menelaou Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
Sung Chun Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Ivo Renkens Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands

Cornelia M van Duijn Department of Epidemiology, Erasmus Medical Center, Rotterdam, The Netherlands
Morris Swertz University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands; University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, The Netherlands
Cisca Wijmenga University of Groningen, University Medical Center Groningen, Department of Genetics, Groningen, The Netherlands; University of Groningen, University Medical Center Groningen, Genomics Coordination Center, Groningen, The Netherlands
Gertjan van Ommen Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
P Eline Slagboom Section of Molecular Epidemiology, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands
Dorret I Boomsma Department of Biological Psychology, VU University Amsterdam, Amsterdam, The Netherlands
Kai Ye Section of Molecular Epidemiology, Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, Leiden, The Netherlands; The Genome Institute, Washington University, St. Louis, MO, USA
Victor Guryev European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
Peter F Arndt Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
Wigard P Kloosterman Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands
Paul I W de Bakker Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, The Netherlands; Department of Epidemiology, Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, The Netherlands
Shamil R Sunyaev Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA

Magiorkinis G, Blanco-Melo D, Belshaw R. The decline of human endogenous retroviruses: extinction and survival. Retrovirology 2015;12:8. [PMID: 25640971 PMCID: PMC4335370 DOI: 10.1186/s12977-015-0136-x] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 10/01/2014] [Accepted: 01/06/2015] [Indexed: 12/21/2022]

Abstract

Background

Endogenous Retroviruses (ERVs) are retroviruses that over the course of evolution have integrated into germline cells and eventually become part of the host genome. They proliferate within the germline of their host, making up ~5% of the human and mouse genome sequences. Several lines of evidence have suggested a decline in the rate of ERV integration into the human genome in recent evolutionary history but this has not been investigated quantitatively or possible causes explored.

Results

By dating the integration of ERV loci in 40 mammal species, we show that the human genome and that of other hominoids (great apes and gibbons) have experienced an approximately four-fold decline in the ERV integration rate over the last 10 million years. A major cause is the recent extinction of one very large ERV lineage (HERV-H), which is responsible for most of the integrations over the last 30 million years. The decline however affects most other ERV lineages. Only about 10% of the decline might be attributed to an accompanying increase in body mass (a trait we have shown recently to be negatively correlated with ERV integration rate). Humans are unusual compared to related species – Old World monkeys, great apes and gibbons – in (a) having not acquired any new ERV lineages during the last 30 million years and (b) the possession of an old ERV lineage that has continued to replicate up until at least the last few hundred thousand years – the potentially medically significant HERVK(HML2).

Conclusions

The human genome shares with the genome of other great apes and gibbons a recent decline in ERV integration that is not typical of other primates and mammals. The human genome differs from that of related species both in maintaining up until at least recently a replicating old ERV lineage and in not having acquired any new lineages. We speculate that the decline in ERV integration in the human genome has been exacerbated by a relatively low burden of horizontally-transmitted retroviruses and subsequent reduced risk of endogenization.

Electronic supplementary material

The online version of this article (doi:10.1186/s12977-015-0136-x) contains supplementary material, which is available to authorized users.

Fares M. Modeling Evolution of Molecular Sequences. NATURAL SELECTION 2014:28-47. [DOI: 10.1201/b17795-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]

Mazzarella L, Riva L, Luzi L, Ronchini C, Pelicci PG. The Genomic and Epigenomic Landscapes of AML. Semin Hematol 2014;51:259-72. [DOI: 10.1053/j.seminhematol.2014.08.007] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022]

Heterogeneous polymerase fidelity and mismatch repair bias genome variation and composition. Genome Res 2014;24:1751-64. [PMID: 25217194 PMCID: PMC4216917 DOI: 10.1101/gr.178335.114] [Citation(s) in RCA: 117] [Impact Index Per Article: 11.7] [Indexed: 12/27/2022]

Unfixed endogenous retroviral insertions in the human population. J Virol 2014;88:9529-37. [PMID: 24920817 PMCID: PMC4136357 DOI: 10.1128/jvi.00919-14] [Citation(s) in RCA: 93] [Impact Index Per Article: 9.3] [Indexed: 11/22/2022]

Abstract

One lineage of human endogenous retroviruses (HERVs), HERV-K(HML2), is upregulated in many cancers, some autoimmune/inflammatory diseases, and HIV-infected cells. Despite 3 decades of research, it is not known if these viruses play a causal role in disease, and there has been recent interest in whether they can be used as immunotherapy targets. Resolution of both these questions will be helped by an ability to distinguish between the effects of different integrated copies of the virus (loci). Research so far has concentrated on the 20 or so recently integrated loci that, with one exception, are in the human reference genome sequence. However, this viral lineage has been copying in the human population within the last million years, so some loci will inevitably be present in the human population but absent from the reference sequence. We therefore performed the first detailed search for such loci by mining whole-genome sequences generated by next-generation sequencing. We found a total of 17 loci, and the frequency of their presence ranged from only 2 of the 358 individuals examined to over 95% of them. On average, each individual had six loci that are not in the human reference genome sequence. Comparing the number of loci that we found to an expectation derived from a neutral population genetic model suggests that the lineage was copying until at least ∼250,000 years ago.

IMPORTANCE About 5% of the human genome sequence is composed of the remains of retroviruses that over millions of years have integrated into the chromosomes of egg and/or sperm precursor cells. There are indications that protein expression of these viruses is higher in some diseases, and we need to know (i) whether these viruses have a role in causing disease and (ii) whether they can be used as immunotherapy targets in some of them. Answering both questions requires a better understanding of how individuals differ in the viruses that they carry. We carried out the first careful search for new viruses in some of the many human genome sequences that are now available thanks to advances in sequencing technology. We also compared the number that we found to a theoretical expectation to see if it is likely that these viruses are still replicating in the human population today.

Anvar SY, Khachatryan L, Vermaat M, van Galen M, Pulyakhina I, Ariyurek Y, Kraaijeveld K, den Dunnen JT, de Knijff P, ’t Hoen PAC, Laros JFJ. Determining the quality and complexity of next-generation sequencing data without a reference genome. Genome Biol 2014;15:555. [PMID: 25514851 PMCID: PMC4298064 DOI: 10.1186/s13059-014-0555-3] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 10/06/2014] [Accepted: 11/27/2014] [Indexed: 01/22/2023]

Zhu T, Xu PZ, Liu JP, Peng S, Mo XC, Gao LZ. Phylogenetic relationships and genome divergence among the AA- genome species of the genus Oryza as revealed by 53 nuclear genes and 16 intergenic regions. Mol Phylogenet Evol 2013;70:348-61. [PMID: 24148990 DOI: 10.1016/j.ympev.2013.10.008] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 10/14/2012] [Revised: 08/17/2013] [Accepted: 10/09/2013] [Indexed: 12/17/2022]

Rauch A, Wieczorek D, Graf E, Wieland T, Endele S, Schwarzmayr T, Albrecht B, Bartholdi D, Beygo J, Di Donato N, Dufke A, Cremer K, Hempel M, Horn D, Hoyer J, Joset P, Röpke A, Moog U, Riess A, Thiel CT, Tzschach A, Wiesener A, Wohlleber E, Zweier C, Ekici AB, Zink AM, Rump A, Meisinger C, Grallert H, Sticht H, Schenck A, Engels H, Rappold G, Schröck E, Wieacker P, Riess O, Meitinger T, Reis A, Strom TM. Range of genetic mutations associated with severe non-syndromic sporadic intellectual disability: an exome sequencing study. Lancet 2012;380:1674-82. [PMID: 23020937 DOI: 10.1016/s0140-6736(12)61480-9] [Citation(s) in RCA: 754] [Impact Index Per Article: 62.8] [Indexed: 12/16/2022]

Abstract

BACKGROUND

The genetic cause of intellectual disability in most patients is unclear because of the absence of morphological clues, information about the position of such genes, and suitable screening methods. Our aim was to identify de-novo variants in individuals with sporadic non-syndromic intellectual disability.

METHODS

In this study, we enrolled children with intellectual disability and their parents from ten centres in Germany and Switzerland. We compared exome sequences between patients and their parents to identify de-novo variants. 20 children and their parents from the KORA Augsburg Diabetes Family Study were investigated as controls.

FINDINGS

We enrolled 51 participants from the German Mental Retardation Network. 45 (88%) participants in the case group and 14 (70%) in the control group had de-novo variants. We identified 87 de-novo variants in the case group, with an exomic mutation rate of 1·71 per individual per generation. In the control group we identified 24 de-novo variants, which is 1·2 events per individual per generation. More participants in the case group had loss-of-function variants than in the control group (20/51 vs 2/20; p=0·022), suggesting their contribution to disease development. 16 patients carried de-novo variants in known intellectual disability genes with three recurrently mutated genes (STXBP1, SYNGAP1, and SCN2A). We deemed at least six loss-of-function mutations in six novel genes to be disease causing. We also identified several missense alterations with potential pathogenicity.

INTERPRETATION

After exclusion of copy-number variants, de-novo point mutations and small indels are associated with severe, sporadic non-syndromic intellectual disability, accounting for 45-55% of patients with high locus heterogeneity. Autosomal recessive inheritance seems to contribute little in the outbred population investigated. The large number of de-novo variants in known intellectual disability genes is only partially attributable to known non-specific phenotypes. Several patients did not meet the expected syndromic manifestation, suggesting a strong bias in present clinical syndrome descriptions.

FUNDING

German Ministry of Education and Research, European Commission 7th Framework Program, and Swiss National Science Foundation.
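
The per-individual rates and percentages quoted in the findings above follow directly from the reported counts. As a minimal sketch (the variable and function names are illustrative assumptions, not taken from the study), the arithmetic can be checked in Python:

```python
# Counts as reported in the FINDINGS paragraph of the abstract above.
case_variants, case_n = 87, 51        # de novo variants / participants, case group
control_variants, control_n = 24, 20  # de novo variants / participants, control group


def per_individual_rate(variants: int, individuals: int) -> float:
    """De novo variants per individual per generation (hypothetical helper)."""
    return variants / individuals


case_rate = per_individual_rate(case_variants, case_n)
control_rate = per_individual_rate(control_variants, control_n)

# Fraction of participants carrying at least one de novo variant.
case_fraction = 45 / case_n
control_fraction = 14 / control_n

print(f"case rate:    {case_rate:.2f}")
print(f"control rate: {control_rate:.2f}")
print(f"carriers:     {case_fraction:.0%} vs {control_fraction:.0%}")
```

This only reproduces the simple ratios; it does not attempt the significance test reported for the loss-of-function comparison.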

Longer time scale for human evolution. Proc Natl Acad Sci U S A 2012;109:15531-2. [PMID: 22984161 DOI: 10.1073/pnas.1212718109] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/16/2022]

Sen K, Ghosh TC. Evolutionary conservation and disease gene association of the human genes composing pseudogenes. Gene 2012;501:164-70. [PMID: 22521745 DOI: 10.1016/j.gene.2012.04.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/14/2011] [Revised: 02/09/2012] [Accepted: 04/05/2012] [Indexed: 01/16/2023]

Iengar P. An analysis of substitution, deletion and insertion mutations in cancer genes. Nucleic Acids Res 2012;40:6401-13. [PMID: 22492711 PMCID: PMC3413105 DOI: 10.1093/nar/gks290] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Indexed: 01/01/2023]

Uno Y, Osada N. CpG site degeneration triggered by the loss of functional constraint created a highly polymorphic macaque drug-metabolizing gene, CYP1A2. BMC Evol Biol 2011;11:283. [PMID: 21961956 PMCID: PMC3199271 DOI: 10.1186/1471-2148-11-283] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 08/02/2011] [Accepted: 10/01/2011] [Indexed: 11/29/2022]

Abstract

Background

Elucidating the pattern of evolutionary changes in drug-metabolizing genes is an important subject not only for evolutionary but also for biomedical research. We investigated the pattern of divergence and polymorphisms of macaque CYP1A1 and CYP1A2 genes, which are major drug-metabolizing genes in humans. In humans, CYP1A2 is specifically expressed in livers while CYP1A1 has a wider gene expression pattern in extrahepatic tissues. In contrast, macaque CYP1A2 is expressed at a much lower level than CYP1A1 in livers. Interestingly, a previous study has shown that Macaca fascicularis CYP1A2 harbored unusually high genetic diversity within species. Genomic regions showing high genetic diversity within species are occasionally interpreted as a result of balancing selection, where natural selection maintains highly diverged alleles with different functions. Nevertheless, many other forces could create such signatures.

Results

We found that the CYP1A1/2 gene copy number and orientation have been highly conserved among mammalian genomes. The signature of gene conversion between CYP1A1 and CYP1A2 was detected, but the last gene conversion event in the simian primate lineage occurred before the Catarrhini-Platyrrhini divergence. The high genetic diversity of macaque CYP1A2 therefore cannot be explained by gene conversion between CYP1A1 and CYP1A2. By surveying CYP1A2 polymorphisms in a total of 91 M. fascicularis and M. mulatta, we found several null alleles segregating in these species, indicating that functional constraint on CYP1A2 in macaques may have weakened after the divergence between humans and macaques. We propose that the high genetic diversity in macaque CYP1A2 is partly due to the degeneration of CpG sites, which had been maintained at a high level by purifying selection, and the rapid degeneration process was initiated by the loss of functional constraint on macaque CYP1A2.

Conclusions

Our findings show that the highly polymorphic CYP1A2 gene in macaques has not been created by balancing selection but by the burst of CpG site degeneration after loss of functional constraint. Because the functional importance of CYP1A1/2 genes is different between humans and macaques, we have to be cautious in extrapolating drug-testing data using substrates metabolized by CYP1A genes from macaques to humans, despite their somewhat overlapping substrate specificity.

Kumar S, Filipski AJ, Battistuzzi FU, Kosakovsky Pond SL, Tamura K. Statistics and truth in phylogenomics. Mol Biol Evol 2011;29:457-72. [PMID: 21873298 DOI: 10.1093/molbev/msr202] [Citation(s) in RCA: 164] [Impact Index Per Article: 12.6] [Indexed: 01/23/2023]

Arbiza L, Patricio M, Dopazo H, Posada D. Genome-wide heterogeneity of nucleotide substitution model fit. Genome Biol Evol 2011;3:896-908. [PMID: 21824869 PMCID: PMC3175760 DOI: 10.1093/gbe/evr080] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Indexed: 11/16/2022]

Abstract

At a genomic scale, the patterns that have shaped molecular evolution are believed to be largely heterogeneous. Consequently, comparative analyses should use appropriate probabilistic substitution models that capture the main features under which different genomic regions have evolved. While efforts have concentrated on the development and understanding of model selection techniques, no descriptions of overall relative substitution model fit at the genome level have been reported. Here, we provide a characterization of best-fit substitution models across three genomic data sets including coding regions from mammals, vertebrates, and Drosophila (24,000 alignments). According to the Akaike Information Criterion (AIC), 82 of 88 models considered were selected as best-fit models on at least one occasion, although with very different frequencies. Most parameter estimates also varied broadly among genes. Patterns found for vertebrates and Drosophila were quite similar and often more complex than those found in mammals. Phylogenetic trees derived from models in the 95% confidence interval set showed much less variance and were significantly closer to the tree estimated under the best-fit model than trees derived from models outside this interval. Although alternative criteria selected simpler models than the AIC, they suggested similar patterns. Altogether, our results show that at a genomic scale, different gene alignments for the same set of taxa are best explained by a large variety of different substitution models and that model choice has implications for different parameter estimates, including the inferred phylogenetic trees. After taking into account the differences related to sample size, our results suggest a noticeable diversity in the underlying evolutionary process. Altogether, we conclude that the use of model selection techniques is important to obtain consistent phylogenetic estimates from real data at a genomic scale.

Suzuki Y. Statistical methods for detecting natural selection from genomic data. Genes Genet Syst 2011;85:359-76. [PMID: 21415566 DOI: 10.1266/ggs.85.359] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/23/2022]

Brown CA, Scharner J, Felice K, Meriggioli MN, Tarnopolsky M, Bower M, Zammit PS, Mendell JR, Ellis JA. Novel and recurrent EMD mutations in patients with Emery–Dreifuss muscular dystrophy, identify exon 2 as a mutation hot spot. J Hum Genet 2011;56:589-94. [DOI: 10.1038/jhg.2011.65] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 11/09/2022]

Inference of mutation parameters and selective constraint in mammalian coding sequences by approximate Bayesian computation. Genetics 2011;187:1153-61. [PMID: 21288873 DOI: 10.1534/genetics.110.124073] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 01/20/2023]

Stover DA, Verrelli BC. Comparative Vertebrate Evolutionary Analyses of Type I Collagen: Potential of COL1a1 Gene Structure and Intron Variation for Common Bone-Related Diseases. Mol Biol Evol 2010;28:533-42. [DOI: 10.1093/molbev/msq221] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Indexed: 01/27/2023]

Sen K, Podder S, Ghosh TC. Insights into the genomic features and evolutionary impact of the genes configuring duplicated pseudogenes in human. FEBS Lett 2010;584:4015-8. [PMID: 20708614 DOI: 10.1016/j.febslet.2010.08.012] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 07/02/2010] [Revised: 08/05/2010] [Accepted: 08/06/2010] [Indexed: 10/19/2022]

Claw KG, Tito RY, Stone AC, Verrelli BC. Haplotype structure and divergence at human and chimpanzee serotonin transporter and receptor genes: implications for behavioral disorder association analyses. Mol Biol Evol 2010;27:1518-29. [PMID: 20118193 DOI: 10.1093/molbev/msq030] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]

Medvedeva YA, Fridman MV, Oparina NJ, Malko DB, Ermakova EO, Kulakovskiy IV, Heinzel A, Makeev VJ. Intergenic, gene terminal, and intragenic CpG islands in the human genome. BMC Genomics 2010;11:48. [PMID: 20085634 PMCID: PMC2817693 DOI: 10.1186/1471-2164-11-48] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Received: 02/26/2009] [Accepted: 01/19/2010] [Indexed: 11/10/2022]

Abstract

Background

Recently, it has been discovered that the human genome contains many transcription start sites for non-coding RNA. Regulatory regions related to transcription of these non-coding RNAs are poorly studied. Some of these regulatory regions may be associated with CpG islands located far from transcription start sites of any protein coding gene. The human genome contains many such CpG islands; however, until now their properties have not been systematically studied.

Results

We studied CpG islands located in different regions of the human genome using methods of bioinformatics and comparative genomics. We have observed that CpG islands have a preference to overlap with exons, including exons located far from transcription start site, but usually extend well into introns. Synonymous substitution rate of CpG-containing codons becomes substantially reduced in regions where CpG islands overlap with protein-coding exons, even if they are located far downstream from transcription start site. CAGE tag analysis displayed frequent transcription start sites in all CpG islands, including those found far from transcription start sites of protein coding genes. Computational prediction and analysis of published ChIP-chip data revealed that CpG islands contain an increased number of sites recognized by Sp1 protein. CpG islands containing more CAGE tags usually also contain more Sp1 binding sites. This is especially relevant for CpG islands located in 3' gene regions. Various examples of transcription, confirmed by mRNAs or ESTs, but with no evidence of protein coding genes, were found in CAGE-enriched CpG islands located far from transcription start site of any known protein coding gene.

Conclusions

CpG islands located far from transcription start sites of protein coding genes have transcription initiation activity and display Sp1 binding properties. In exons, overlapping with these islands, the synonymous substitution rate of CpG containing codons is decreased. This suggests that these CpG islands are involved in transcription initiation, possibly of some non-coding RNAs.

Suzuki Y, Gojobori T, Kumar S. Methods for incorporating the hypermutability of CpG dinucleotides in detecting natural selection operating at the amino acid sequence level. Mol Biol Evol 2009;26:2275-84. [PMID: 19581348 DOI: 10.1093/molbev/msp133] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 02/03/2023]

Li JB, Gao Y, Aach J, Zhang K, Kryukov GV, Xie B, Ahlford A, Yoon JK, Rosenbaum AM, Zaranek AW, LeProust E, Sunyaev SR, Church GM. Multiplex padlock targeted sequencing reveals human hypermutable CpG variations. Genome Res 2009;19:1606-15. [PMID: 19525355 DOI: 10.1101/gr.092213.109] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Indexed: 11/24/2022]

MacEachern S, McEwan J, McCulloch A, Mather A, Savin K, Goddard M. Molecular evolution of the Bovini tribe (Bovidae, Bovinae): is there evidence of rapid evolution or reduced selective constraint in Domestic cattle? BMC Genomics 2009;10:179. [PMID: 19393048 PMCID: PMC2681479 DOI: 10.1186/1471-2164-10-179] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Received: 01/29/2008] [Accepted: 04/24/2009] [Indexed: 11/10/2022]

Abstract

BACKGROUND

If mutation within the coding region of the genome is largely not adaptive, the ratio of nonsynonymous (dN) to synonymous substitutions (dS) per site (dN/dS) should be approximately equal among closely related species. Furthermore, dN/dS in divergence between species should be equivalent to dN/dS in polymorphisms. This hypothesis is of particular interest in closely related members of the Bovini tribe, because domestication has promoted rapid phenotypic divergence through strong artificial selection of some species while others remain undomesticated. We examined a number of genes that may be involved in milk production in Domestic cattle and a number of their wild relatives for evidence that domestication had affected molecular evolution. Elevated rates of dN/dS were further queried to determine if they were the result of positive selection, low effective population size (N(e)) or reduced selective constraint.

RESULTS

We have found that the domestication process has contributed to higher dN/dS ratios in cattle, especially in the lineages leading to the Domestic cow (Bos taurus) and Mithan (Bos frontalis) and within some breeds of Domestic cow. However, the high rates of dN/dS polymorphism within B. taurus when compared to species divergence suggest that positive selection has not elevated evolutionary rates in these genes. Likewise, the low rate of dN/dS in Bison, which has undergone a recent population bottleneck, indicates a reduction in population size alone is not responsible for these observations.

CONCLUSION

The effect of selection depends on effective population size and the selection coefficient (N(e)s). Typically under domestication both selection pressure for traits important in fitness in the wild and N(e) are reduced. Therefore, reduced selective constraint could be responsible for the observed elevated evolutionary ratios in domesticated species, especially in B. taurus and B. frontalis, which have the highest dN/dS in the Bovini. This may have important implications for tests of selection such as the McDonald-Kreitman test. Surprisingly we have also detected a significant difference in the supposed neutral substitution rate between synonymous and noncoding sites in the Bovine genome, with a 30% higher rate of substitution at synonymous sites. This is due, at least in part, to an excess of the highly mutable CpG dinucleotides at synonymous sites, which will have implications for time of divergence estimates from molecular data.

MacEachern S, Hayes B, McEwan J, Goddard M. An examination of positive selection and changing effective population size in Angus and Holstein cattle populations (Bos taurus) using a high density SNP genotyping platform and the contribution of ancient polymorphism to genomic diversity in Domestic cattle. BMC Genomics 2009;10:181. [PMID: 19393053 PMCID: PMC2681480 DOI: 10.1186/1471-2164-10-181] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Received: 01/29/2008] [Accepted: 04/24/2009] [Indexed: 12/03/2022]

Abstract

Background

Identifying recent positive selection signatures in domesticated animals could provide information on genome response to strong directional selection from domestication and artificial selection. With the completion of the cattle genome, private companies are now providing large numbers of polymorphic markers for probing variation in domestic cattle (Bos taurus). We analysed over 7,500 polymorphic single nucleotide polymorphisms (SNP) in beef (Angus) and dairy (Holstein) cattle and outgroup species Bison, Yak and Banteng in an indirect test of inbreeding and positive selection in Domestic cattle.

Results

Outgroup species: Bison, Yak and Banteng, were genotyped with high levels of success (90%) and used to determine ancestral and derived allele states in domestic cattle. Frequency spectrums of the derived alleles in Angus and Holstein were examined using Fay and Wu's H test. Significant divergences from the predicted frequency spectrums expected under neutrality were identified. This appeared to be the result of combined influences of positive selection, inbreeding and ascertainment bias for moderately frequent SNP. Approximately 10% of all polymorphisms identified as segregating in B. taurus were also segregating in Bison, Yak or Banteng; highlighting a large number of polymorphisms that are ancient in origin.

Conclusion

These results suggest that a large effective population size (N_e) of approximately 90,000 or more existed in B. taurus since they shared a common ancestor with Bison, Yak and Banteng ~1–2 million years ago (MYA). More recently N_e decreased sharply, probably associated with domestication. This may partially explain the paradox of high levels of polymorphism in Domestic cattle and the relatively small recent N_e in this species. The period of inbreeding caused Fay and Wu's H statistic to depart from its expectation under neutrality, mimicking the effect of selection. However, there was also evidence for selection, because high frequency derived alleles tended to cluster near each other on the genome.

MacEachern S, McEwan J, Goddard M. Phylogenetic reconstruction and the identification of ancient polymorphism in the Bovini tribe (Bovidae, Bovinae). BMC Genomics 2009;10:177. [PMID: 19393045 PMCID: PMC2694835 DOI: 10.1186/1471-2164-10-177] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Received: 01/29/2008] [Accepted: 04/24/2009] [Indexed: 11/22/2022]

Abstract

Background

The Bovinae subfamily incorporates an array of antelope, buffalo and cattle species. All of the members of this subfamily have diverged recently. Not surprisingly, a number of phylogenetic studies from molecular and morphological data have resulted in ambiguous trees and relationships amongst species, especially for Yak and Bison species. A partial phylogenetic reconstruction of 13 extant members of the Bovini tribe (Bovidae, Bovinae) from 15 complete or partially sequenced autosomal genes is presented.

Results

We identified 3 distinct lineages after the Bovini split from the Boselaphini and Tragelaphini tribes, which has led to the (1) Buffalo clade (Bubalus and Syncerus species) and a more recent divergence leading to the (2) Banteng, Gaur and Mithan and (3) Domestic cattle clades. A fourth lineage may also exist that leads to Bison and Yak. However, there was some ambiguity as to whether this was a divergence from the Banteng/Gaur/Mithan or the Domestic cattle clade. From an analysis of approximately 30,000 sites that were amplified in all species, 133 sites were identified with ambiguous inheritance, in that all trees implied more than one mutation at the same site. Closer examination of these sites has identified that they are the result of ancient polymorphisms that have subsequently undergone lineage sorting in the Bovini tribe, of which 53 have remained polymorphic since Bos and Bison species last shared a common ancestor with Bubalus between 5–8 million years ago (MYA).

Conclusion

Uncertainty arises in our phylogenetic reconstructions because many species in the Bovini diverged over a short period of time. It appears that a number of sites with ambiguous inheritance have been maintained in subsequent populations by chance (lineage sorting) and that they have contributed to an association between Yak and Domestic cattle and an unreliable phylogenetic reconstruction for the Bison/Yak clade. Interestingly, a number of these aberrant sites are in coding sections of the genome and their identification may have important implications for studying the neutral rate of mutation at nonsynonymous sites. The presence of these sites could help account for the apparent contradiction between levels of polymorphism and effective population size in domesticated cattle.