1
Morton BR. Context and Mutation in Gymnosperm Chloroplast DNA. Genes (Basel) 2023; 14:1492. [PMID: 37510396 PMCID: PMC10378972 DOI: 10.3390/genes14071492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Revised: 07/15/2023] [Accepted: 07/18/2023] [Indexed: 07/30/2023]
Abstract
Mutations and subsequent repair processes are known to be strongly context-dependent in the flowering-plant chloroplast genome. At least six flanking bases, three on each side, can influence the relative rates of different types of mutation at any given site. In this analysis, I examine context and substitution at noncoding and fourfold degenerate coding sites in gymnosperm DNA. The sequences are analyzed in sets of three, allowing the inference of the substitution direction and the generation of context-dependent rate matrices. The size of the dataset limits the analysis to the tetranucleotide context of the sites, but the evidence shows that there are significant contextual effects, with patterns that are similar to those observed in angiosperms. These effects most likely represent an influence on the underlying mutation/repair dynamics. The data extend the plastome lineages that feature very complex patterns of mutation, which can have significant effects on the evolutionary dynamics of the chloroplast genome.
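The idea of analyzing sequences in sets of three to infer substitution direction can be illustrated with a minimal parsimony sketch. It is not the paper's pipeline: the function name `directed_substitutions`, the requirement that flanking bases be identical in all three sequences, and the toy sequences are assumptions made for illustration. With `flank=2` the recorded context is the two bases on each side of the site, that is, a tetranucleotide context.

```python
from collections import Counter


def directed_substitutions(seq_a, seq_b, outgroup, flank=2):
    """Count directed substitutions by flanking context (hypothetical helper).

    Assumes three pre-aligned, gap-free sequences of equal length. A site is
    used only when the two ingroup sequences differ and the outgroup matches
    one of them; the outgroup state is then taken as ancestral.
    """
    if not (len(seq_a) == len(seq_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for i in range(flank, len(seq_a) - flank):
        a, b, o = seq_a[i], seq_b[i], outgroup[i]
        if a == b or o not in (a, b):
            continue  # no difference, or direction cannot be inferred
        left = seq_a[i - flank:i]
        right = seq_a[i + 1:i + 1 + flank]
        # Assumption: keep the site only if the context is unambiguous,
        # i.e. the flanks are identical in all three sequences.
        if any(s[i - flank:i] != left or s[i + 1:i + 1 + flank] != right
               for s in (seq_b, outgroup)):
            continue
        derived = b if a == o else a
        counts[(left, o, derived, right)] += 1
    return counts


# Toy alignment: one G -> A change in seq_b, flanked by AC on the left and TT on the right.
counts = directed_substitutions("AACGTTA", "AACATTA", "AACGTTA")
```

Tallies of this kind, divided by the number of sites observed in each context, are the raw material for a context-dependent rate matrix.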
Affiliation(s)
- Brian R Morton
- Department of Biology, Barnard College, Columbia University, 3009 Broadway, New York, NY 10027, USA
2
Do Noncoding and Coding Sites in Angiosperm Chloroplast DNA Have Different Mutation Processes? Genes (Basel) 2023; 14:148. [PMID: 36672890 PMCID: PMC9858945 DOI: 10.3390/genes14010148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 12/30/2022] [Accepted: 01/03/2023] [Indexed: 01/09/2023]
Abstract
Fourfold degenerate sites within coding regions and intergenic sites have both been used as estimates of neutral evolution. In chloroplast DNA, the pattern of substitution at intergenic sites is strongly dependent on the composition of the surrounding hexanucleotide composed of the three base pairs on each side, which suggests that the mutation process is highly context-dependent in this genome. This study examines the context-dependency of substitutions at fourfold degenerate sites in protein-coding regions and compares the pattern to what has been observed at intergenic sites. Overall, there is strong similarity between the two types of sites, but there are some intriguing differences. One of these is that substitutions of G and C are significantly higher at fourfold degenerate sites across a range of contexts. In fact, A → T and T → A substitutions are the only substitution types that occur at a lower rate at fourfold degenerate sites. The data are not consistent with selective constraints being responsible for the difference in substitution patterns between intergenic and fourfold degenerate sites. Rather, it is suggested that the difference may be a result of different epigenetic modifications that result in slightly different mutation patterns in coding and intergenic DNA.
3
Morton BR. Substitution rate heterogeneity across hexanucleotide contexts in noncoding chloroplast DNA. G3 GENES|GENOMES|GENETICS 2022; 12:6608088. [PMID: 35699494 PMCID: PMC9339276 DOI: 10.1093/g3journal/jkac150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Accepted: 06/07/2022] [Indexed: 11/13/2022]
Abstract
Substitutions between closely related noncoding chloroplast DNA sequences are studied with respect to the composition of the 3 bases on each side of the substitution, that is, the hexanucleotide context. There is about 100-fold variation in rate among the contexts, particularly for substitutions of A and T. Rate heterogeneity of transitions differs from that of transversions, resulting in a more than 200-fold variation in the transition:transversion bias. The data are consistent with a CpG effect, and it is shown that both the A + T content and the arrangement of purines/pyrimidines along the same DNA strand are correlated with rate variation. Expected equilibrium A + T content ranges from 36.4% to 82.8% across contexts, while G–C skew ranges from −77.4 to 72.2 and A–T skew ranges from −63.9 to 68.2. The predicted equilibria are associated with specific features of the content of the hexanucleotide context, and also show close agreement with the observed context-dependent compositions. Finally, by controlling for the content of nucleotides closer to the substitution site, it is shown that both the third and fourth nucleotide removed on each side of the substitution directly influence substitution dynamics at that site. Overall, the results demonstrate that noncoding sites in different contexts are evolving along very different evolutionary trajectories and that substitution dynamics are far more complex than typically assumed. This has important implications for a number of types of sequence analysis, particularly analyses of natural selection, and the context-dependent substitution matrices developed here can be applied in future analyses.
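An expected equilibrium composition of the kind reported above can be derived from a substitution matrix as its stationary distribution. The sketch below is an illustration only, not the paper's method: the helper names, the toy matrix, and the skew definitions, 100 × (G − C)/(G + C) and 100 × (A − T)/(A + T), are assumptions. It uses plain power iteration on a row-stochastic 4 × 4 matrix with bases ordered A, C, G, T.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Power iteration for pi such that pi P = pi (rows of P sum to 1)."""
    pi = [0.25] * 4
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(4)) for j in range(4)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def composition_summary(pi):
    """Equilibrium A+T content and skews, in percent (assumed definitions)."""
    a, c, g, t = pi
    return {
        "AT_content": 100 * (a + t),
        "GC_skew": 100 * (g - c) / (g + c),
        "AT_skew": 100 * (a - t) / (a + t),
    }


def toy_matrix(to_rate):
    """Hypothetical matrix in which the chance of changing to base j depends only on j."""
    P = []
    for i in range(4):
        row = [to_rate[j] if j != i else 0.0 for j in range(4)]
        row[i] = 1.0 - sum(row)  # probability of no change
        P.append(row)
    return P


# Changes toward A or T are twice as likely as changes toward C or G.
summary = composition_summary(stationary_distribution(toy_matrix([0.02, 0.01, 0.01, 0.02])))
```

For this toy matrix the stationary frequencies are proportional to the per-base rates, so the equilibrium A + T content is two thirds and both skews are zero. A matrix estimated separately for each context would yield a separate equilibrium for each.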
Affiliation(s)
- Brian R Morton
- Department of Biology, Barnard College, Columbia University, New York, NY 10027, USA
4
Context-Dependent Substitution Dynamics in Plastid DNA Across a Wide Range of Taxonomic Groups. J Mol Evol 2022; 90:44-55. [PMID: 35037071 DOI: 10.1007/s00239-021-10040-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Accepted: 12/01/2021] [Indexed: 10/19/2022]
Abstract
The influence of neighboring base composition, or context, on substitution bias at fourfold degenerate coding sites and in intergenic regions in plastid DNA is compared across the angiosperms, gymnosperms, ferns, liverworts, chlorophytes, stramenopiles and rhodophytes. An influence of flanking base G + C content on the relative rates of transitions and transversions is observed in all lineages and extends up to four nucleotides from the site of substitution in some. Despite finding context effects in all lineages, significant differences were observed between lineages. Overall, the data suggest that context is a general factor affecting mutation bias in plastid DNA but that the dynamics of the influence have evolved over time. It is also shown that, although there are similar effects of context on substitution bias at fourfold degenerate coding sites and at sites within intergenic regions, there are also small but significant differences, suggesting that there could be some selection on some of these sites and that there could be some difference in the mutation and/or repair process between coding and noncoding DNA.
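The comparison of transition and transversion counts across flanking G + C content can be sketched as a simple tally. The input format (tuples of left flank, ancestral base, derived base, right flank) and the function names are assumptions for illustration, not taken from the study.

```python
from collections import defaultdict

PURINES = {"A", "G"}


def is_transition(x, y):
    """True for purine<->purine or pyrimidine<->pyrimidine changes (assumes x != y)."""
    return (x in PURINES) == (y in PURINES)


def ts_tv_by_flank_gc(substitutions):
    """Tally transitions (ts) and transversions (tv) by the number of G/C bases in the flanks."""
    tally = defaultdict(lambda: {"ts": 0, "tv": 0})
    for left, ancestral, derived, right in substitutions:
        gc = sum(base in "GC" for base in left + right)
        tally[gc]["ts" if is_transition(ancestral, derived) else "tv"] += 1
    return dict(tally)


# Hypothetical observations with one flanking base on each side.
tally = ts_tv_by_flank_gc([
    ("A", "C", "T", "T"),  # C -> T transition, no G/C in flanks
    ("G", "A", "T", "C"),  # A -> T transversion, two G/C in flanks
    ("G", "G", "A", "C"),  # G -> A transition, two G/C in flanks
])
```

Comparing the ts:tv ratio across the G + C classes, with longer flanks where data allow, is one simple way to look for the kind of context effect described above.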
5
Guerrero-Bosagna C. From epigenotype to new genotypes: Relevance of epigenetic mechanisms in the emergence of genomic evolutionary novelty. Semin Cell Dev Biol 2020; 97:86-92. [DOI: 10.1016/j.semcdb.2019.07.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 01/28/2019] [Revised: 07/08/2019] [Accepted: 07/08/2019] [Indexed: 11/24/2022]
6
Laurin-Lemay S, Rodrigue N, Lartillot N, Philippe H. Conditional Approximate Bayesian Computation: A New Approach for Across-Site Dependency in High-Dimensional Mutation-Selection Models. Mol Biol Evol 2019; 35:2819-2834. [PMID: 30203003 DOI: 10.1093/molbev/msy173] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/11/2022]
Abstract
A key question in molecular evolutionary biology concerns the relative roles of mutation and selection in shaping genomic data. Moreover, features of mutation and selection are heterogeneous along the genome and over time. Mechanistic codon substitution models based on the mutation-selection framework are promising approaches to separating these effects. In practice, however, several complications arise, since accounting for such heterogeneities often implies handling models of high dimensionality (e.g., amino acid preferences), or leads to across-site dependence (e.g., CpG hypermutability), making the likelihood function intractable. Approximate Bayesian Computation (ABC) could address this latter issue. Here, we propose a new approach, named Conditional ABC (CABC), which combines the sampling efficiency of MCMC and the flexibility of ABC. To illustrate the potential of the CABC approach, we apply it to the study of mammalian CpG hypermutability based on a new mutation-level parameter implying dependence across adjacent sites, combined with site-specific purifying selection on amino-acids captured by a Dirichlet process. Our proof-of-concept of the CABC methodology opens new modeling perspectives. Our application of the method reveals a high level of heterogeneity of CpG hypermutability across loci and mild heterogeneity across taxonomic groups; and finally, we show that CpG hypermutability is an important evolutionary factor in rendering relative synonymous codon usage. All source code is available as a GitHub repository (https://github.com/Simonll/LikelihoodFreePhylogenetics.git).
Affiliation(s)
- Simon Laurin-Lemay
- Robert-Cedergren Center for Bioinformatics and Genomics, Department of Biochemistry and Molecular Medicine, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada
- Nicolas Rodrigue
- Department of Biology, Institute of Biochemistry, and School of Mathematics and Statistics, Carleton University, Ottawa, ON, Canada
- Nicolas Lartillot
- Laboratoire de Biométrie et Biologie Évolutive, UMR CNRS 5558, Université Lyon 1, Lyon, France
- Hervé Philippe
- Robert-Cedergren Center for Bioinformatics and Genomics, Department of Biochemistry and Molecular Medicine, Faculty of Medicine, Université de Montréal, Montréal, QC, Canada; Centre de Théorisation et de Modélisation de la Biodiversité, Station d'Écologie Théorique et Expérimentale, UMR CNRS 5321, Moulis, France
7
Pértille F, Da Silva VH, Johansson AM, Lindström T, Wright D, Coutinho LL, Jensen P, Guerrero-Bosagna C. Mutation dynamics of CpG dinucleotides during a recent event of vertebrate diversification. Epigenetics 2019; 14:685-707. [PMID: 31070073 PMCID: PMC6557589 DOI: 10.1080/15592294.2019.1609868] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/04/2022]
Abstract
DNA methylation in CpG dinucleotides is associated with high mutability and the disappearance of CpG sites during evolution. Although the high mutability of CpGs is thought to be relevant for vertebrate evolution, very little is known about the role of CpG-related mutations in the genomic diversification of vertebrates. Our study analysed genetic differences in chickens, between Red Junglefowl (RJF; the closest living relative of the ancestor of domesticated chickens) and domesticated breeds, to identify genomic dynamics that have occurred during the process of their domestication, focusing particularly on CpG-related mutations. Single nucleotide polymorphisms (SNPs) and copy number variations (CNVs) between RJF and these domesticated breeds were assessed in a reduced fraction of their genome. Additionally, DNA methylation in the same fraction of the genome was measured in the sperm of RJF individuals to identify possible correlations with the mutations found between RJF and the domesticated breeds. Our study shows that although the vast majority of CpG-related mutations found relate to CNVs, CpGs disproportionally associate with SNPs in comparison to CNVs, where they are indeed substantially under-represented. Moreover, CpGs seem to be hotspots of mutations related to speciation. We suggest that, on the one hand, CpG-related mutations in CNV regions would promote genomic ‘flexibility’ in evolution, i.e., the ability of the genome to expand its functional possibilities; on the other hand, CpG-related mutations in SNPs would relate to genomic ‘specificity’ in evolution, thus representing mutations that would associate with phenotypic traits relevant for speciation.
Affiliation(s)
- Fábio Pértille
- Avian Behavioral Genomics and Physiology Group, IFM Biology, Linköping University, Linköping, Sweden; Animal Biotechnology Laboratory, Animal Science Department, University of São Paulo (USP)/Luiz de Queiroz College of Agriculture (ESALQ), Piracicaba, São Paulo, Brazil
- Vinicius H Da Silva
- Animal Breeding and Genomics Centre, Wageningen University & Research, Wageningen, The Netherlands; Department of Animal Ecology (AnE), Netherlands Institute of Ecology (NIOO-KNAW), Wageningen, The Netherlands; Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Anna M Johansson
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Tom Lindström
- Division of Theoretical Biology, IFM, Linköping University, Linköping, Sweden
- Dominic Wright
- Avian Behavioral Genomics and Physiology Group, IFM Biology, Linköping University, Linköping, Sweden
- Luiz L Coutinho
- Animal Biotechnology Laboratory, Animal Science Department, University of São Paulo (USP)/Luiz de Queiroz College of Agriculture (ESALQ), Piracicaba, São Paulo, Brazil
- Per Jensen
- Avian Behavioral Genomics and Physiology Group, IFM Biology, Linköping University, Linköping, Sweden
- Carlos Guerrero-Bosagna
- Avian Behavioral Genomics and Physiology Group, IFM Biology, Linköping University, Linköping, Sweden
8
Danchin E, Pocheville A, Rey O, Pujol B, Blanchet S. Epigenetically facilitated mutational assimilation: epigenetics as a hub within the inclusive evolutionary synthesis. Biol Rev Camb Philos Soc 2018. [PMCID: PMC6378602 DOI: 10.1111/brv.12453] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Indexed: 12/17/2022]
Abstract
After decades of debate about the existence of non‐genetic inheritance, the focus is now slowly shifting towards dissecting its underlying mechanisms. Here, we propose a new mechanism that, by integrating non‐genetic and genetic inheritance, may help build the long‐sought inclusive vision of evolution. After briefly reviewing the wealth of evidence documenting the existence and ubiquity of non‐genetic inheritance in a table, we review the categories of mechanisms of parent–offspring resemblance that underlie inheritance. We then review several lines of argument for the existence of interactions between non‐genetic and genetic components of inheritance, leading to a discussion of the contrasting timescales of action of non‐genetic and genetic inheritance. This raises the question of how the fidelity of the inheritance system can match the rate of environmental variation. This question is central to understanding the role of different inheritance systems in evolution. We then review and interpret evidence indicating the existence of shifts from inheritance systems with low to higher transmission fidelity. Based on results from different research fields we propose a conceptual hypothesis linking genetic and non‐genetic inheritance systems. According to this hypothesis, over the course of generations, shifts among information systems allow gradual matching between the rate of environmental change and the inheritance fidelity of the corresponding response. A striking conclusion from our review is that documented shifts between types of inherited non‐genetic information converge towards epigenetics (i.e. inclusively heritable molecular variation in gene expression without change in DNA sequence). We then interpret the well‐documented mutagenicity of epigenetic marks as potentially generating a final shift from epigenetic to genetic encoding. 
This sequence of shifts suggests the existence of a relay in inheritance systems from relatively labile ones to gradually more persistent modes of inheritance, a relay that could constitute a new mechanistic basis for the long‐proposed, but still poorly documented, hypothesis of genetic assimilation. A profound difference between the genocentric and the inclusive vision of heredity revealed by the genetic assimilation relay proposed here lies in the fact that a given form of inheritance can affect the rate of change of other inheritance systems. To explore the consequences of such inter‐connection among inheritance systems, we briefly review published theoretical models to build a model of genetic assimilation focusing on the shift in the engraving of environmentally induced phenotypic variation into the DNA sequence. According to this hypothesis, when environmental change remains stable over a sufficient number of generations, the relay among inheritance systems has the potential to generate a form of genetic assimilation. In this hypothesis, epigenetics appears as a hub by which non‐genetically inherited environmentally induced variation in traits can become genetically encoded over generations, in a form of epigenetically facilitated mutational assimilation. Finally, we illustrate some of the major implications of our hypothetical framework, concerning mutation randomness, the central dogma of molecular biology, concepts of inheritance and the curing of inherited disorders, as well as for the emergence of the inclusive evolutionary synthesis.
Affiliation(s)
- Etienne Danchin
- Laboratoire Évolution & Diversité Biologique (EDB UMR 5174); Université de Toulouse Midi-Pyrénées, CNRS, IRD, UPS. 118 route de Narbonne, Bat 4R1; 31062 Toulouse Cedex 9 France
- Arnaud Pocheville
- Laboratoire Évolution & Diversité Biologique (EDB UMR 5174); Université de Toulouse Midi-Pyrénées, CNRS, IRD, UPS. 118 route de Narbonne, Bat 4R1; 31062 Toulouse Cedex 9 France
- Department of Philosophy and Charles Perkins Centre; University of Sydney; Sydney NSW 2006 Australia
- Olivier Rey
- CNRS, Station d'Ecologie Théorique et Expérimentale (SETE), UMR5321; 09200 Moulis France
- Université de Perpignan Via Domitia, IHPE UMR 5244, CNRS, IFREMER, Université de Montpellier; F-66860 Perpignan France
- Benoit Pujol
- Laboratoire Évolution & Diversité Biologique (EDB UMR 5174); Université de Toulouse Midi-Pyrénées, CNRS, IRD, UPS. 118 route de Narbonne, Bat 4R1; 31062 Toulouse Cedex 9 France
- Simon Blanchet
- Laboratoire Évolution & Diversité Biologique (EDB UMR 5174); Université de Toulouse Midi-Pyrénées, CNRS, IRD, UPS. 118 route de Narbonne, Bat 4R1; 31062 Toulouse Cedex 9 France
- CNRS, Station d'Ecologie Théorique et Expérimentale (SETE), UMR5321; 09200 Moulis France
9
Guerrero-Bosagna C. Evolution with No Reason: A Neutral View on Epigenetic Changes, Genomic Variability, and Evolutionary Novelty. Bioscience 2017. [DOI: 10.1093/biosci/bix021] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 12/16/2022]
10
Statistical Methods for Identifying Sequence Motifs Affecting Point Mutations. Genetics 2016; 205:843-856. [PMID: 27974498 DOI: 10.1534/genetics.116.195677] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 09/14/2016] [Accepted: 12/01/2016] [Indexed: 11/18/2022]
Abstract
Mutation processes differ between types of point mutation, genomic locations, cells, and biological species. For some point mutations, specific neighboring bases are known to be mechanistically influential. Beyond these cases, numerous questions remain unresolved, including: what are the sequence motifs that affect point mutations? How large are the motifs? Are they strand symmetric? And, do they vary between samples? We present new log-linear models that allow explicit examination of these questions, along with sequence logo style visualization to enable identifying specific motifs. We demonstrate the performance of these methods by analyzing mutation processes in human germline and malignant melanoma. We recapitulate the known CpG effect, and identify novel motifs, including a highly significant motif associated with A→G mutations. We show that major effects of neighbors on germline mutation lie within [Formula: see text] of the mutating base. Models are also presented for contrasting the entire mutation spectra (the distribution of the different point mutations). We show the spectra vary significantly between autosomes and X-chromosome, with a difference in T→C transition dominating. Analyses of malignant melanoma confirmed reported characteristic features of this cancer, including statistically significant strand asymmetry, and markedly different neighboring influences. The methods we present are made freely available as a Python library https://bitbucket.org/pycogent3/mutationmotif.
11
Shang X, Su J, Wan Q, Su J. CpA/CpG methylation of CiMDA5 possesses tight association with the resistance against GCRV and negatively regulates mRNA expression in grass carp, Ctenopharyngodon idella. DEVELOPMENTAL AND COMPARATIVE IMMUNOLOGY 2015; 48:86-94. [PMID: 25260715 DOI: 10.1016/j.dci.2014.09.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 08/18/2014] [Revised: 09/19/2014] [Accepted: 09/19/2014] [Indexed: 06/03/2023]
Abstract
Melanoma differentiation-associated gene 5 (MDA5) plays a crucial role in recognizing intracellular viral infection, activating the interferon regulatory factor pathways, and inducing the antiviral response. However, the antiviral regulatory mechanism of MDA5 remains unclear. In the present study, the role of CiMDA5 (Ctenopharyngodon idella MDA5) against grass carp reovirus (GCRV) was examined from the perspective of DNA methylation, a pivotal epigenetic modification. Two CpG islands (CGIs) were predicted in the first exon of CiMDA5: the first was 427 bp in length and contained 29 candidate CpG loci and 34 CpA loci, and the second was 130 bp in length and contained 7 CpG loci and 10 CpA loci. Using bisulfite sequencing PCR (BSP), methylation status was determined in the spleens of 70 individuals divided into resistant and susceptible groups after a challenge experiment, and the resistance-association analysis was performed with a Chi-square test. Quantitative real-time RT-PCR (qRT-PCR) was carried out to explore the relationship between DNA methylation and gene expression in CiMDA5. Results indicated that the methylation levels of the CpA/CpG sites at +200, +202, +204, and +207 nt, which together form a putative densely methylated element (DME), were significantly higher in the susceptible group than in the resistant group. Meanwhile, the average transcription of CiMDA5 was down-regulated in the susceptible individuals compared with the resistant individuals. DNA methylation may therefore be a negative modulator of CiMDA5 antiviral expression. Collectively, the methylation levels of CiMDA5 showed a tight association with resistance against GCRV and a negative regulatory role in mRNA expression. This study is the first to identify a resistance-associated gene modulated by DNA methylation in teleosts, preliminarily reveals the underlying regulatory mechanism of CiMDA5 transcription against GCRV, and lays a theoretical foundation for the molecular nosogenesis of hemorrhagic diseases in C. idella.
Affiliation(s)
- Xueying Shang
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Jianguo Su
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Quanyuan Wan
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
- Juanjuan Su
- College of Animal Science and Technology, Northwest A&F University, Yangling 712100, China
12
Skinner MK, Guerrero-Bosagna C, Haque MM, Nilsson EE, Koop JAH, Knutie SA, Clayton DH. Epigenetics and the evolution of Darwin's Finches. Genome Biol Evol 2014; 6:1972-89. [PMID: 25062919 PMCID: PMC4159007 DOI: 10.1093/gbe/evu158] [Citation(s) in RCA: 86] [Impact Index Per Article: 8.6] [Indexed: 12/12/2022]
Abstract
The prevailing theory for the molecular basis of evolution involves genetic mutations that ultimately generate the heritable phenotypic variation on which natural selection acts. However, epigenetic transgenerational inheritance of phenotypic variation may also play an important role in evolutionary change. A growing number of studies have demonstrated the presence of epigenetic inheritance in a variety of different organisms that can persist for hundreds of generations. The possibility that epigenetic changes can accumulate over longer periods of evolutionary time has seldom been tested empirically. This study was designed to compare epigenetic changes among several closely related species of Darwin's finches, a well-known example of adaptive radiation. Erythrocyte DNA was obtained from five species of sympatric Darwin's finches that vary in phylogenetic relatedness. Genome-wide alterations in genetic mutations using copy number variation (CNV) were compared with epigenetic alterations associated with differential DNA methylation regions (epimutations). Epimutations were more common than genetic CNV mutations among the five species; furthermore, the number of epimutations increased monotonically with phylogenetic distance. Interestingly, the number of genetic CNV mutations did not consistently increase with phylogenetic distance. The number, chromosomal locations, regional clustering, and lack of overlap of epimutations and genetic mutations suggest that epigenetic changes are distinct and that they correlate with the evolutionary history of Darwin's finches. The potential functional significance of the epimutations was explored by comparing their locations on the genome to the location of evolutionarily important genes and cellular pathways in birds. Specific epimutations were associated with genes related to the bone morphogenic protein, toll receptor, and melanogenesis signaling pathways. 
Species-specific epimutations were significantly overrepresented in these pathways. As environmental factors are known to result in heritable changes in the epigenome, it is possible that epigenetic changes contribute to the molecular basis of the evolution of Darwin's finches.
Affiliation(s)
- Michael K Skinner
- Center for Reproductive Biology, School of Biological Sciences, Washington State University
- Carlos Guerrero-Bosagna
- Center for Reproductive Biology, School of Biological Sciences, Washington State University. Present address: Department of Physics, Biology and Chemistry (IFM), Linköping University, Sweden
- M Muksitul Haque
- Center for Reproductive Biology, School of Biological Sciences, Washington State University
- Eric E Nilsson
- Center for Reproductive Biology, School of Biological Sciences, Washington State University
- Jennifer A H Koop
- Department of Biology, University of Utah. Present address: Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ
13
Liu X, Liu H, Guo W, Yu K. Codon substitution models based on residue similarity and their applications. Gene 2012; 509:136-41. [PMID: 22902303 DOI: 10.1016/j.gene.2012.07.075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2012] [Accepted: 07/31/2012] [Indexed: 10/28/2022]
Abstract
Codon models are now widely used to draw evolutionary inferences from alignments of homologous sequence data. Incorporating physicochemical properties of amino acids into codon models, two novel codon substitution models describing the evolution of protein-coding DNA sequences are presented based on the similarity scores of amino acids. To describe substitutions between codons, a continuous-time Markov process is used. Transition/transversion rate bias and nonsynonymous codon usage bias are allowed in the models. In our implementation, the parameters are estimated by the maximum-likelihood (ML) method, as in previous studies. Furthermore, instantaneous mutations involving more than one nucleotide position of a codon are considered in the second model. The two suggested models are then applied to five real data sets. The analytic results indicate that the new codon models considering physicochemical properties of amino acids can provide a better fit to the data compared with existing codon models, and thus produce more reliable estimates of certain biologically important measures than existing methods.
Affiliation(s)
- Xinsheng Liu
- Institute of Nano Science, State Key Laboratory of Mechanics and Control of Mechanical Structures, and College of Science, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China.
14
Finalism in Darwinian and Lamarckian Evolution: Lessons from Epigenetics and Developmental Biology. Evol Biol 2012. [DOI: 10.1007/s11692-012-9163-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/05/2023]
15
Misawa K. A codon substitution model that incorporates the effect of the GC contents, the gene density and the density of CpG islands of human chromosomes. BMC Genomics 2011; 12:397. [PMID: 21819607 PMCID: PMC3169530 DOI: 10.1186/1471-2164-12-397] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 01/31/2011] [Accepted: 08/06/2011] [Indexed: 11/16/2022]
Abstract
Background: Developing a model for codon substitutions is essential for the analyses of protein sequences. Recent studies on the mutation rates in the non-coding regions have shown that CpG mutation rates in the human genome are negatively correlated to the local GC content and to the densities of functional elements. This study aimed at understanding the effect of genomic features, namely, GC content, gene density, and frequency of CpG islands, on the rates of codon substitution in human chromosomes. Results: Codon substitution rates of CpG to TpG mutations, TpG to CpG mutations, and non-CpG transitions and transversions in humans were estimated by comparing the coding regions of thousands of human and chimpanzee genes and inferring their ancestral sequences by using macaque genes as the outgroup. Since the genomic features depend on each other, partial regression coefficients of these features were obtained. Conclusion: The substitution rates of codons depend on gene densities of the chromosomes. Transcription-associated mutation is one such pressure. On the basis of these results, a model of codon substitutions that incorporates the effect of genomic features on codon substitution in human chromosomes was developed.
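Counting CpG to TpG changes against an inferred ancestral sequence can be sketched as follows. This is an illustration under stated assumptions, not the study's estimator: the function name and toy sequences are hypothetical, the ancestral sequence is taken as already inferred (for example from an outgroup), and the sequences are assumed aligned and gap-free. CpG to CpA changes are counted as well, because a C to T change on the complementary strand appears as CpG to CpA on the strand being read.

```python
def cpg_transition_counts(ancestral, descendant):
    """Count ancestral CpG dinucleotides and their TpG / CpA outcomes in the descendant."""
    if len(ancestral) != len(descendant):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"CpG": 0, "CpG_to_TpG": 0, "CpG_to_CpA": 0}
    for i in range(len(ancestral) - 1):
        if ancestral[i:i + 2] != "CG":
            continue
        counts["CpG"] += 1
        pair = descendant[i:i + 2]
        if pair == "TG":
            counts["CpG_to_TpG"] += 1
        elif pair == "CA":
            counts["CpG_to_CpA"] += 1
    return counts


# Toy example: three ancestral CpG sites; one becomes TpG, one is unchanged, one becomes CpA.
counts = cpg_transition_counts("ACGTCGACG", "ATGTCGACA")
```

Dividing such counts by the number of ancestral CpG sites gives a crude per-site proportion; a rate estimate would additionally need branch lengths and a correction for multiple hits, which this sketch omits.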
Affiliation(s)
- Kazuharu Misawa
- Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, Fusion of Data and Analysis Research and Development Team, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, Japan.

16
Ying H, Huttley G. Exploiting CpG hypermutability to identify phenotypically significant variation within human protein-coding genes. Genome Biol Evol 2011; 3:938-49. [PMID: 21398426 PMCID: PMC3184784 DOI: 10.1093/gbe/evr021] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 12/11/2022] Open
Abstract
The CpG dinucleotide is disproportionately represented in human genetic variation due to the hypermutability of 5-methyl-cytosine (5mC). We exploit this hypermutability and a novel codon substitution model to identify candidate functionally important exonic nucleotides. Population genetic theory suggests that codon positions with high cross-species CpG frequency will derive from stronger purifying selection. Using the phylogeny-based maximum likelihood inference framework, we applied codon substitution models with context-dependent parameters to measure the mutagenic and selective processes affecting CpG dinucleotides within exonic sequence. The suitability of these models was validated on >2,000 protein coding genes from a naturally occurring biological control, four yeast species that do not methylate their DNA. As expected, our analyses of yeast revealed no evidence for an elevated CpG transition rate or for substitution suppression affecting CpG-containing codons. Our analyses of >12,000 protein-coding genes from four primate lineages confirm the systemic influence of 5mC hypermutability on the divergence of these genes. After adjusting for confounding influences of mutation and the properties of the encoded amino acids, we confirmed that CpG-containing codons are under greater purifying selection in primates. Genes with significant evidence of enhanced suppression of nonsynonymous CpG changes were also shown to be significantly enriched in Online Mendelian Inheritance in Man. We developed a method for ranking candidate phenotypically influential CpG positions in human genes. Application of this method indicates that of the ∼1 million exonic CpG dinucleotides within humans, ∼20% are strong candidates for both hypermutability and disease association.
Affiliation(s)
- Hua Ying
- Department of Genome Biology, John Curtin School of Medical Research, The Australian National University, Canberra, ACT 0200, Australia

17
Misawa K, Kikuno RF. Relationship between amino acid composition and gene expression in the mouse genome. BMC Res Notes 2011; 4:20. [PMID: 21272306 PMCID: PMC3038927 DOI: 10.1186/1756-0500-4-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 06/25/2010] [Accepted: 01/27/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Codon bias is a phenomenon that refers to the differences in the frequencies of synonymous codons among different genes. In many organisms, natural selection is considered to be a cause of codon bias because codon usage in highly expressed genes is biased toward optimal codons. Methods have previously been developed to predict the expression level of genes from their nucleotide sequences; these are based on the observation that synonymous codon usage shows an overall bias toward a few codons called major codons. However, the relationship between codon bias and gene expression level, as proposed by the translation-selection model, is less evident in mammals. FINDINGS: We investigated the correlations between the expression levels of 1,182 mouse genes and amino acid composition, as well as between gene expression and codon preference. We found that a weak but significant correlation exists between gene expression levels and amino acid composition in mouse. In total, less than 10% of variation of expression levels is explained by amino acid components. We found the effect of codon preference on gene expression was weaker than the effect of amino acid composition, because no significant correlations were observed with respect to codon preference. CONCLUSION: These results suggest that it is difficult to predict expression level from amino acid components or from codon bias in mouse.
Affiliation(s)
- Kazuharu Misawa
- Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, Fusion of Data and Analysis Research and Development Team, RIKEN, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan.

18
Skinner MK, Manikkam M, Guerrero-Bosagna C. Epigenetic transgenerational actions of environmental factors in disease etiology. Trends Endocrinol Metab 2010; 21:214-22. [PMID: 20074974 PMCID: PMC2848884 DOI: 10.1016/j.tem.2009.12.007] [Citation(s) in RCA: 464] [Impact Index Per Article: 33.1] [Received: 08/13/2009] [Revised: 12/09/2009] [Accepted: 12/14/2009] [Indexed: 12/26/2022]
Abstract
The ability of environmental factors to promote a phenotype or disease state not only in the individual exposed but also in subsequent progeny for successive generations is termed transgenerational inheritance. The majority of environmental factors such as nutrition or toxicants such as endocrine disruptors do not promote genetic mutations or alterations in DNA sequence. However, these factors do have the capacity to alter the epigenome. Epimutations in the germline that become permanently programmed can allow transmission of epigenetic transgenerational phenotypes. This review provides an overview of the epigenetics and biology of how environmental factors can promote transgenerational phenotypes and disease.
Affiliation(s)
- Michael K Skinner
- Center for Reproductive Biology, School of Molecular Biosciences, Washington State University, Pullman, WA 99164-4236, USA.

19
Huttley G. Do genomic datasets resolve the correct relationship among the placental, marsupial and monotreme lineages? Aust J Zool 2009. [DOI: 10.1071/zo09049] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
Abstract
Did the mammal radiation arise through initial divergence of prototherians from a common ancestor of metatherians and eutherians, the Theria hypothesis, or of eutherians from a common ancestor of metatherians and prototherians, the Marsupionta hypothesis? Molecular phylogenetic analyses of point substitutions applied to this problem have been contradictory – mtDNA-encoded sequences supported Marsupionta, nuclear-encoded sequences and RY (purine–pyrimidine)-recoded mtDNA supported Theria. The consistency property of maximum likelihood guarantees convergence on the true tree only with longer alignments. Results from analyses of genome datasets should therefore be impervious to choice of outgroup. We assessed whether important hypotheses concerning mammal evolution, including Theria/Marsupionta and the branching order of rodents, carnivorans and primates, are resolved by phylogenetic analyses using ~2.3 megabases of protein-coding sequence from genome projects. In each case, only two tree topologies were being compared and thus inconsistency in resolved topologies can only derive from flawed models of sequence divergence. The results from all substitution models strongly supported Theria. For the eutherian lineages, all models were sensitive to the outgroup. We argue that phylogenetic inference from point substitutions will remain unreliable until substitution models that better match biological mechanisms of sequence divergence have been developed.

20
Lindsay H, Yap VB, Ying H, Huttley GA. Pitfalls of the most commonly used models of context dependent substitution. Biol Direct 2008; 3:52. [PMID: 19087239 PMCID: PMC2628887 DOI: 10.1186/1745-6150-3-52] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 12/09/2008] [Accepted: 12/16/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Neighboring nucleotides exert a striking influence on mutation, with the hypermutability of CpG dinucleotides in many genomes being an exemplar. Among the approaches employed to measure the relative importance of sequence neighbors on molecular evolution have been continuous-time Markov process models for substitutions that treat sequences as a series of independent tuples. The most widely used examples are the codon substitution models. We evaluated the suitability of derivatives of the nucleotide frequency weighted (hereafter NF) and tuple frequency weighted (hereafter TF) models for measuring sequence context dependent substitution. Critical properties we address are their relationships to an independent nucleotide process and the robustness of parameter estimation to changes in sequence composition. We then consider the impact on inference concerning dinucleotide substitution processes from application of these two forms to intron sequence alignments from primates. RESULTS: We prove that the NF form always nests the independent nucleotide process and that this is not true for the TF form. As a consequence, using TF to study context effects can be misleading, which is shown by both theoretical calculations and simulations. We describe a simple example where a context parameter estimated under TF is confounded with composition terms unless all sequence states are equi-frequent. We illustrate this for the dinucleotide case by simulation under a nucleotide model, showing that the TF form identifies a CpG effect when none exists. Our analysis of primate introns revealed that the effect of nucleotide neighbors is over-estimated under TF compared with NF. Parameter estimates for a number of contexts are also strikingly discordant between the two model forms. CONCLUSION: Our results establish that the NF form should be used for analysis of independent-tuple context dependent processes. Although neighboring effects in general are still important, prominent influences such as the elevated CpG transversion rate previously identified using the TF form are an artifact. Our results further suggest as few as 5 parameters may account for approximately 85% of neighboring nucleotide influence.
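The difference between the NF and TF forms named in this abstract can be written compactly. The block below is a sketch in notation assumed here, not taken from the abstract: $i$ and $j$ are tuples (for example dinucleotides) that differ at exactly one position, $b$ is the nucleotide that $j$ carries at that position, $\pi_b$ is a nucleotide frequency, $\pi_j$ is a tuple frequency, and $r_{ij}$ is a hypothetical rate parameter capturing context effects.

```latex
% Assumed notation: i, j are tuples differing at exactly one position;
% b is the nucleotide found in j at the position that changed.
q^{\mathrm{NF}}_{ij} = r_{ij}\,\pi_{b}, \qquad
q^{\mathrm{TF}}_{ij} = r_{ij}\,\pi_{j}, \qquad
q_{ij} = 0 \ \text{ if } i \text{ and } j \text{ differ at more than one position.}
```

With every $r_{ij} = 1$, the NF rates depend only on the frequency of the incoming nucleotide, so each position evolves as an independent nucleotide process; this is the nesting property the abstract refers to. Under the TF form the same setting does not in general reduce to an independent-site process, which is consistent with the confounding between context parameters and composition that the abstract reports.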
Affiliation(s)
- Helen Lindsay
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra, Australia.

21
Oscamou M, McDonald D, Yap VB, Huttley GA, Lladser ME, Knight R. Comparison of methods for estimating the nucleotide substitution matrix. BMC Bioinformatics 2008; 9:511. [PMID: 19046431 PMCID: PMC2655096 DOI: 10.1186/1471-2105-9-511] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 05/01/2008] [Accepted: 12/01/2008] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND: The nucleotide substitution rate matrix is a key parameter of molecular evolution. Several methods for inferring this parameter have been proposed, with different mathematical bases. These methods include counting sequence differences and taking the log of the resulting probability matrices, methods based on Markov triples, and maximum likelihood methods that infer the substitution probabilities that lead to the most likely model of evolution. However, the speed and accuracy of these methods have not been compared. RESULTS: Different methods differ in performance by orders of magnitude (ranging from 1 ms to 10 s per matrix), but differences in accuracy of rate matrix reconstruction appear to be relatively small. Encouragingly, relatively simple and fast methods can provide results at least as accurate as far more complex and computationally intensive methods, especially when the sequences to be compared are relatively short. CONCLUSION: Based on the conditions tested, we recommend the use of the method of Gojobori et al. (1982) for long sequences (> 600 nucleotides), and the method of Goldman et al. (1996) for shorter sequences (< 600 nucleotides). The method of Barry and Hartigan (1987) can provide somewhat more accuracy, measured as the Euclidean distance between the true and inferred matrices, on long sequences (> 2000 nucleotides) at the expense of substantially longer computation time. The availability of methods that are both fast and accurate will allow us to gain a global picture of change in the nucleotide substitution rate matrix on a genomewide scale across the tree of life.
Affiliation(s)
- Maribeth Oscamou
- Department of Applied Mathematics, University of Colorado, Boulder, CO, USA
- Daniel McDonald
- Department of Computer Science, University of Colorado, Boulder, CO, USA
- Von Bing Yap
- Department of Statistics and Applied Probability, National University of Singapore, 21 Lower Kent Ridge Road 119077, Singapore
- Gavin A Huttley
- John Curtin School of Medical Research, Australian National University, Canberra, Australia
- Manuel E Lladser
- Department of Applied Mathematics, University of Colorado, Boulder, CO, USA
- Rob Knight
- Department of Chemistry & Biochemistry, University of Colorado, Boulder, CO, USA

22
Misawa K, Kikuno RF. Evaluation of the effect of CpG hypermutability on human codon substitution. Gene 2008; 431:18-22. [PMID: 19059467 DOI: 10.1016/j.gene.2008.11.006] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 04/30/2008] [Revised: 10/10/2008] [Accepted: 11/06/2008] [Indexed: 10/21/2022]
Abstract
Understanding the cause underlying the changes in amino acid composition of proteins is essential for understanding protein evolution and function. Accurate models of DNA and protein evolution are essential for studying molecular evolution. Although many models have been developed, most models assume that each site evolves independently and that substitutions are time reversible. In mammals and other organisms, CpG hypermutability is one of the major causes of nucleotide mutations because CpG dinucleotides are often methylated at C, and the methyl-C mutation spontaneously deaminates to yield T about 3 times more rapidly than other types of point mutations. In this study, we evaluate the effect of CpG hypermutability on codon substitution by comparing thousands of coding regions in the human and chimpanzee genomes and by inferring ancestral sequences by using mouse as the outgroup. We found that 14% of synonymous and nonsynonymous substitutions on human genes were caused by CpG hypermutability. Based on these results, we developed a model that incorporates CpG hypermutability as well as the transition/transversion ratio and changes in the chemical properties of amino acids.
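To make the idea of attributing substitutions to CpG hypermutability concrete, here is a minimal Python sketch. It is not the authors' method: it assumes the ancestral sequence has already been inferred and aligned to the descendant without gaps, and the function name `cpg_transition_fraction` is hypothetical. It simply counts what share of observed substitutions are C to T changes at the C of an ancestral CpG, or the equivalent G to A change seen from the other strand.

```python
def cpg_transition_fraction(ancestor: str, descendant: str) -> float:
    """Return the fraction of substitutions that are CpG->TpG or CpG->CpA.

    Illustrative only: assumes equal-length, gap-free aligned sequences
    and an already inferred ancestral sequence.
    """
    if len(ancestor) != len(descendant):
        raise ValueError("sequences must be aligned to equal length")
    anc, des = ancestor.upper(), descendant.upper()
    total = 0  # all substitutions between unambiguous bases
    cpg = 0    # substitutions consistent with CpG hypermutability
    for i, (a, d) in enumerate(zip(anc, des)):
        if a == d or a not in "ACGT" or d not in "ACGT":
            continue
        total += 1
        # C of an ancestral CpG changing to T (CpG -> TpG)
        c_to_t = a == "C" and d == "T" and anc[i + 1 : i + 2] == "G"
        # G of an ancestral CpG changing to A (CpG -> CpA)
        g_to_a = a == "G" and d == "A" and i > 0 and anc[i - 1] == "C"
        if c_to_t or g_to_a:
            cpg += 1
    return cpg / total if total else 0.0


if __name__ == "__main__":
    # One CpG -> TpG change and one unrelated A -> T change.
    print(cpg_transition_fraction("ACGTAAGA", "ATGTATGA"))
```

A full analysis along the lines of the abstract would also need outgroup-based ancestral inference and a codon-level model; this sketch covers only the counting step.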
Affiliation(s)
- Kazuharu Misawa
- Chiba Industry Advancement Center, 2-6 Nakase, Mihama-ku, Chiba 261-7126, Japan.

23
Abstract
Probabilistic models of sequence evolution are in widespread use in phylogenetics and molecular sequence evolution. These models have become increasingly sophisticated and combined with statistical model comparison techniques have helped to shed light on how genes and proteins evolve. Models of codon evolution have been particularly useful, because, in addition to providing a significant improvement in model realism for protein-coding sequences, codon models can also be designed to test hypotheses about the selective pressures that shape the evolution of the sequences. Such models typically assume a phylogeny and can be used to identify sites or lineages that have evolved adaptively. Recently some of the key assumptions that underlie phylogenetic tests of selection have been questioned, such as the assumption that the rate of synonymous changes is constant across sites or that a single phylogenetic tree can be assumed at all sites for recombining sequences. While some of these issues have been addressed through the development of novel methods, others remain as caveats that need to be considered on a case-by-case basis. Here, we outline the theory of codon models and their application to the detection of positive selection. We review some of the more recent developments that have improved their power and utility, laying a foundation for further advances in the modeling of coding sequence evolution.
Affiliation(s)
- Wayne Delport
- University of Cape Town, Observatory, 7925, Cape Town, South Africa

24
Anisimova M, Kosiol C. Investigating protein-coding sequence evolution with probabilistic codon substitution models. Mol Biol Evol 2008; 26:255-71. [PMID: 18922761 DOI: 10.1093/molbev/msn232] [Citation(s) in RCA: 127] [Impact Index Per Article: 7.9] [Indexed: 11/12/2022] Open
Abstract
This review is motivated by the true explosion in the number of recent studies both developing and ameliorating probabilistic models of codon evolution. Traditionally parametric, the first codon models focused on estimating the effects of selective pressure on the protein via an explicit parameter in the maximum likelihood framework. Likelihood ratio tests of nested codon models armed the biologists with powerful tools, which provided unambiguous evidence for positive selection in real data. This, in turn, triggered a new wave of methodological developments. The new generation of models views the codon evolution process in a more sophisticated way, relaxing several mathematical assumptions. These models make a greater use of physicochemical amino acid properties, genetic code machinery, and the large amounts of data from the public domain. The overview of the most recent advances on modeling codon evolution is presented here, and a wide range of their applications to real data is discussed. On the downside, availability of a large variety of models, each accounting for various biological factors, increases the margin for misinterpretation; the biological meaning of certain parameters may vary among models, and model selection procedures also deserve greater attention. Solid understanding of the modeling assumptions and their applicability is essential for successful statistical data analysis.
Affiliation(s)
- Maria Anisimova
- Institute of Computational Science, Swiss Federal Institute of Technology, Zurich, Switzerland.

25
The universal trend of amino acid gain-loss is caused by CpG hypermutability. J Mol Evol 2008; 67:334-42. [PMID: 18810523 DOI: 10.1007/s00239-008-9141-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 03/23/2008] [Revised: 05/31/2008] [Accepted: 06/23/2008] [Indexed: 10/21/2022]
Abstract
Understanding the cause of the changes in the amino acid composition of proteins is essential for understanding the evolution of protein functions. Since the early 1970s, it has been known that the frequency of some amino acids in protein sequences is increasing and that of others is decreasing. Recently, it was found that the trends of amino acid changes were similar in 15 taxa representing Bacteria, Archaea, and Eukaryota. However, the cause of this similarity in the trend of the gains and losses of amino acids continued to be debated. Here, we show that this trend of the gain and loss of amino acids can be simply explained by CpG hypermutability. We found that the frequency of amino acids coded by codons with TpG dinucleotides and those with CpA dinucleotides is increasing, while that of amino acids coded by codons with CpG dinucleotides is decreasing. We also found that organisms that lack DNA methyltransferase show different trends of the gain and loss of amino acids. DNA methyltransferase methylates CpG dinucleotides and induces CpG hypermutability. The incorporation of CpG hypermutability into models of protein evolution will improve studies on protein evolution in different organisms.

26
Knight R, Maxwell P, Birmingham A, Carnes J, Caporaso JG, Easton BC, Eaton M, Hamady M, Lindsay H, Liu Z, Lozupone C, McDonald D, Robeson M, Sammut R, Smit S, Wakefield MJ, Widmann J, Wikman S, Wilson S, Ying H, Huttley GA. PyCogent: a toolkit for making sense from sequence. Genome Biol 2008; 8:R171. [PMID: 17708774 PMCID: PMC2375001 DOI: 10.1186/gb-2007-8-8-r171] [Citation(s) in RCA: 147] [Impact Index Per Article: 9.2] [Received: 05/30/2007] [Revised: 08/13/2007] [Accepted: 08/21/2007] [Indexed: 12/30/2022] Open
Abstract
We have implemented in Python the COmparative GENomic Toolkit, a fully integrated and thoroughly tested framework for novel probabilistic analyses of biological sequences, devising workflows, and generating publication quality graphics. PyCogent includes connectors to remote databases, built-in generalized probabilistic techniques for working with biological sequences, and controllers for third-party applications. The toolkit takes advantage of parallel architectures and runs on a range of hardware and operating systems, and is available under the general public license from .
Affiliation(s)
- Rob Knight
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado, USA
- Peter Maxwell
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
- Jason Carnes
- Seattle Biomedical Research Institute, Seattle, Washington, USA
- J Gregory Caporaso
- Department of Biochemistry and Molecular Genetics, University of Colorado Health Sciences Center, Aurora, Colorado, USA
- Brett C Easton
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
- Michael Eaton
- Science Applications International Corporation, Englewood, Colorado, USA
- Micah Hamady
- Department of Computer Science, University of Colorado, Boulder, Colorado, USA
- Helen Lindsay
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
- Zongzhi Liu
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado, USA
- Catherine Lozupone
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado, USA
- Daniel McDonald
- Department of Computer Science, University of Colorado, Boulder, Colorado, USA
- Michael Robeson
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, USA
- Raymond Sammut
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
- Sandra Smit
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado, USA
- Matthew J Wakefield
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
- Walter and Eliza Hall Institute, Melbourne, Victoria, Australia
- Jeremy Widmann
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado, USA
- Shandy Wikman
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, Colorado, USA
- Stephanie Wilson
- Department of Computer Science, University of Colorado, Boulder, Colorado, USA
- Hua Ying
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia
- Gavin A Huttley
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra, Australian Capital Territory, Australia

27
Wakefield MJ, Maxwell P, Huttley GA. Vestige: maximum likelihood phylogenetic footprinting. BMC Bioinformatics 2005; 6:130. [PMID: 15921531 PMCID: PMC1156870 DOI: 10.1186/1471-2105-6-130] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 01/10/2005] [Accepted: 05/29/2005] [Indexed: 11/24/2022] Open
Abstract
Background: Phylogenetic footprinting is the identification of functional regions of DNA by their evolutionary conservation. This is achieved by comparing orthologous regions from multiple species and identifying the DNA regions that have diverged less than neutral DNA. Vestige is a phylogenetic footprinting package built on the PyEvolve toolkit that uses probabilistic molecular evolutionary modelling to represent aspects of sequence evolution, including the conventional divergence measure employed by other footprinting approaches. In addition to measuring the divergence, Vestige allows the expansion of the definition of a phylogenetic footprint to include variation in the distribution of any molecular evolutionary processes. This is achieved by displaying the distribution of model parameters that represent partitions of molecular evolutionary substitutions. Examination of the spatial incidence of these effects across regions of the genome can identify DNA segments that differ in the nature of the evolutionary process. Results: Vestige was applied to a reference dataset of the SCL locus from four species and provided clear identification of the known conserved regions in this dataset. To demonstrate the flexibility to use diverse models of molecular evolution and dissect the nature of the evolutionary process, Vestige was used to footprint the Ka/Ks ratio in primate BRCA1 with a codon model of evolution. Two regions of putative adaptive evolution were identified illustrating the ability of Vestige to represent the spatial distribution of distinct molecular evolutionary processes. Conclusion: Vestige provides a flexible, open platform for phylogenetic footprinting. Underpinned by the PyEvolve toolkit, Vestige provides a framework for visualising the signatures of evolutionary processes across the genome of numerous organisms simultaneously. By exploiting the maximum-likelihood statistical framework, the complex interplay between mutational processes, DNA repair and selection can be evaluated both spatially (along a sequence alignment) and temporally (for each branch of the tree) providing visual indicators to the attributes and functions of DNA sequences.
Affiliation(s)
- Matthew J Wakefield
- Predictive Medicine Group, John Curtin School of Medical Research, The Australian National University, Canberra 0200 ACT Australia
- ARC Centre for Kangaroo Genomics, John Curtin School of Medical Research, The Australian National University, Canberra 0200 ACT Australia
- Peter Maxwell
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra 0200 ACT Australia
- Centre for BioInformation Science, John Curtin School of Medical Research, The Australian National University, Canberra 0200 ACT Australia
- Gavin A Huttley
- Computational Genomics Laboratory, John Curtin School of Medical Research, The Australian National University, Canberra 0200 ACT Australia
- Centre for BioInformation Science, John Curtin School of Medical Research, The Australian National University, Canberra 0200 ACT Australia

28
Beier UH, Görögh T. Implications of galactocerebrosidase and galactosylcerebroside metabolism in cancer cells. Int J Cancer 2005; 115:6-10. [PMID: 15657896 DOI: 10.1002/ijc.20851] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 01/28/2023]
Abstract
Galactosylcerebroside is known to be overexpressed upon the cellular surface of a variety of cancers. In squamous cell carcinomas of the head and neck, one explanation for galactosylcerebroside accumulation has been identified as a transcriptional repression of the galactocerebrosidase gene. Galactocerebrosidase is the enzyme responsible for degrading galactosylcerebroside to ceramide. Ceramide is an important apoptosis activator, whereas galactosylcerebroside functions as an inhibitor. A shift of the ceramide metabolism balance in favor of glycosylated forms has been identified as a mechanism of drug resistance for several antineoplastic agents. Our review elaborates on possible explanations for galactocerebrosidase suppression and on other explanations for increased glycosphingolipid concentration within cancer cell membranes. Furthermore, conjecturable influences of a repressed galactocerebrosidase expression on tumor biology are to be explained. The inhibiting transcription factors YY1 and AP2 have been identified as potential galactocerebrosidase gene suppressors. The resulting accumulation of galactosylcerebroside promotes a reduction of cellular adhesion and inhibits apoptosis, leading to increased cellular growth, migration and prolonged cell survival contributing to carcinogenesis.
Affiliation(s)
- Ulf Henning Beier
- Division of Molecular Oncology, Department of Otorhinolaryngology, Head and Neck Surgery, Christian-Albrechts-University of Kiel, Kiel, Germany