1
|
Wei PJ, Guo Z, Gao Z, Ding Z, Cao RF, Su Y, Zheng CH. Inference of gene regulatory networks based on directed graph convolutional networks. Brief Bioinform 2024; 25:bbae309. [PMID: 38935070 PMCID: PMC11209731 DOI: 10.1093/bib/bbae309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 05/17/2024] [Indexed: 06/28/2024]
Abstract
Inferring gene regulatory networks (GRNs) is one of the important challenges in systems biology, and many outstanding computational methods have been proposed; however, challenges remain, especially on real datasets. In this study, we propose a Directed Graph Convolutional neural network-based method for GRN inference (DGCGRN). To better understand and process the directed graph structure of GRNs, a directed graph convolutional neural network is constructed that retains the structural information of the directed graph while also making full use of neighbor node features. A local augmentation strategy is adopted in the graph neural network to address the poor prediction accuracy caused by the large number of low-degree nodes in GRNs. In addition, for real data such as E. coli, sequence features are obtained by extracting hidden features with a Bi-GRU and calculating the statistical physicochemical characteristics of the gene sequences. At the training stage, a dynamic update strategy converts the obtained edge prediction scores into edge weights to guide the subsequent training of the model. The results on synthetic benchmark datasets and real datasets show that the prediction performance of DGCGRN is significantly better than that of existing models. Furthermore, case studies on bladder uroepithelial carcinoma and lung cancer cells also illustrate the performance of the proposed model.
Affiliation(s)
- Pi-Jing Wei
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Ziqiang Guo
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Zhen Gao
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Zheng Ding
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Rui-Fen Cao
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Yansen Su
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Anhui, China
- Chun-Hou Zheng
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, 111 Jiulong Road, 230601, Anhui, China
|
2
|
Love M, Samora L, Barker D, Zukosky P, Kummet N, Ahmad A, Bernhardt D, Tripathi M, Klotz S, Ahmad N. Genetic Analysis of HIV-1 vpr Sequences from HIV-Infected Older Patients on Long-Term Antiretroviral Therapy. Curr HIV Res 2022; 20:309-320. [PMID: 35792120 DOI: 10.2174/1570162x20666220705124341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Revised: 04/20/2022] [Accepted: 04/28/2022] [Indexed: 01/27/2023]
Abstract
BACKGROUND Many HIV-infected individuals have achieved undetectable viral load and increased CD4 T cell counts due to the success of Antiretroviral Therapy (ART). However, HIV persists in resting T cells, monocytes/macrophages and other quiescent cells. Furthermore, the HIV-1 vpr accessory gene may play an important role in the persistence of HIV in these infected patients.
OBJECTIVES Therefore, we characterized the HIV-1 vpr gene from PBMC DNA of 14 HIV-infected older patients on long-term ART with mostly undetectable viral load and increased CD4 T cell counts.
METHODS Peripheral Blood Mononuclear Cells (PBMC) were isolated from 14 HIV-infected individuals, followed by extraction of genomic DNA, amplification of HIV-1 vpr gene by polymerase chain reaction (PCR), cloning of vpr gene in TOPO vector and characterization of correct size recombinant inserts containing vpr genes. An average of 13 clones were sequenced from each patient, followed by sequence analysis by bioinformatic tools.
RESULTS Phylogenetic analysis of 182 vpr sequences demonstrated that the vpr sequences of each patient were well separated and discriminated from other patients' sequences and formed distinct clusters. The vpr sequences showed a low degree of viral heterogeneity, lower estimates of genetic diversity and about half of the patients' sequences were under positive selection pressure. While the majority of the vpr deduced amino acid sequences from most patients contained intact open reading frames, several sequences, mostly from two patients, had stop codons. Numerous patient-specific and common amino acid motifs were found in deduced vpr sequences. The functional domains required for vpr activity, including virion incorporation, nuclear import of pre-integration complex and cell cycle arrest, were generally conserved in most vpr sequences. Several of the known Cytotoxic T-lymphocytes (CTL) epitopes in vpr showed variation in our patients' sequences.
CONCLUSION In summary, a low degree of genetic variability, conservation of functional domains and variations in CTL epitopes were the features of vpr sequences from the 14 HIV-infected older patients with controlled viremia on long-term ART.
Affiliation(s)
- Maria Love
  - Department of Immunobiology, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
- Luiza Samora
  - Department of Immunobiology, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
- Danae Barker
  - Department of Immunobiology, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
- Priya Zukosky
  - Department of Immunobiology, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
- Nathan Kummet
  - Department of Immunobiology, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
- Aasim Ahmad
  - Department of Immunobiology, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
- Dana Bernhardt
  - Department of Immunobiology, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
- Meghna Tripathi
  - Department of Immunobiology, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
- Stephen Klotz
  - Department of Medicine, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
- Nafees Ahmad
  - Department of Immunobiology, College of Medicine, University of Arizona, Tucson, AZ 85721, Arizona, USA
|
3
|
Consistent Clustering Pattern of Prokaryotic Genes Based on Base Frequency at the Second Codon Position and its Association with Functional Category Preference. Interdiscip Sci 2022; 14:349-357. [PMID: 34817803 PMCID: PMC9124167 DOI: 10.1007/s12539-021-00493-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2021] [Revised: 11/02/2021] [Accepted: 11/07/2021] [Indexed: 10/26/2022]
Abstract
In 2002, our research group observed a gene clustering pattern based on the base frequency of A versus T at the second codon position in the genome of Vibrio cholerae and found that the functional category distribution of genes in the two clusters was different. With the availability of a large number of sequenced genomes, we performed a systematic investigation of A2–T2 distribution and found that 2694 out of 2764 prokaryotic genomes have an optimal clustering number of two, indicating a consistent pattern. Analysis of the functional categories of the coding genes in each cluster in 1483 prokaryotic genomes indicated that 99.33% of the genomes exhibited a significant difference (p < 0.01) in function distribution between the two clusters. Specifically, functional category P was overrepresented in the small cluster of 98.65% of genomes, whereas categories J, K, and L were overrepresented in the larger cluster of over 98.52% of genomes. Lineage analysis uncovered that these preferences appear consistently across all phyla. Overall, our work revealed an almost universal clustering pattern based on the relative frequency of A2 versus T2 and its role in functional category preference. These findings will promote the understanding of the rationality of theoretical prediction of functional classes of genes from their nucleotide sequences and how protein function is determined by DNA sequence.
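The A2 and T2 quantities in this abstract are per-gene base frequencies at the second codon position. A minimal sketch of how they could be computed for one coding sequence is given below; the function name `second_position_freqs` and the toy sequence are illustrative assumptions, and the clustering step described in the abstract is not reproduced.

```python
from collections import Counter


def second_position_freqs(cds: str) -> dict:
    """Relative frequency of A, C, G, T at the second codon position.

    Hypothetical helper for illustration; assumes `cds` is an in-frame
    coding sequence. Trailing bases that do not fill a codon are ignored,
    as are ambiguous bases such as N.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3          # drop an incomplete last codon
    second_bases = [seq[i + 1] for i in range(0, usable, 3)]
    counts = Counter(b for b in second_bases if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no complete codons with unambiguous second bases")
    return {base: counts.get(base, 0) / total for base in "ACGT"}


# Toy sequence (not from the paper): codons ATG AAA TTT GAT TAA
freqs = second_position_freqs("ATGAAATTTGATTAA")
a2, t2 = freqs["A"], freqs["T"]
print(f"A2 = {a2:.2f}, T2 = {t2:.2f}")
```

Each gene would then contribute one (A2, T2) point, and the points for a whole genome are what the study clusters.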
|
4
|
Ma Y, Yin J, Li G, Gao W, Lin W. Simultaneous sensing of nucleic acid and associated cellular components with organic fluorescent chemsensors. Coord Chem Rev 2020. [DOI: 10.1016/j.ccr.2019.213144] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 12/24/2022]
|
5
|
Cao Y, Jiang L, Wang L, Cai Y. Evolutionary Rate Heterogeneity and Functional Divergence of Orthologous Genes in Pyrus. Biomolecules 2019; 9:biom9090490. [PMID: 31527450 PMCID: PMC6770726 DOI: 10.3390/biom9090490] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/23/2019] [Revised: 09/09/2019] [Accepted: 09/12/2019] [Indexed: 11/21/2022]
Abstract
Negatively selected genes (NSGs) and positively selected genes (PSGs) are the two categories into which most nuclear protein-coding genes fall. However, the evolutionary rates and characteristics of the different gene types are poorly understood. In the present study, we investigate the rates of synonymous substitution (Ks) and non-synonymous substitution (Ka) by comparing the orthologous genes of two sequenced Pyrus species, Pyrus bretschneideri and Pyrus communis. Subsequently, we compared the evolutionary rates, gene structures, and expression profiles during different stages of fruit development between PSGs and NSGs. Compared with the NSGs, the PSGs have fewer exons, shorter gene length, lower synonymous substitution rates, and higher evolutionary rates. Remarkably, gene expression patterns in the fruit of the two Pyrus species indicated functional divergence for most of the orthologous genes derived from a common ancestor, and subfunctionalization for some of them. Overall, the present study shows that PSGs differ from NSGs not only in environmental selective pressure (Ka/Ks), but also in their structural, functional, and evolutionary properties. Additionally, our data provide important insights into the evolution of orthologous genes in the two Pyrus species and highlight their diversification.
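The PSG/NSG split rests on the Ka/Ks ratio of each ortholog pair. The sketch below assumes the conventional cutoff (Ka/Ks > 1 for positive selection, < 1 for negative selection); the abstract does not state the exact threshold or filtering used, and the gene names and values are made up for illustration.

```python
def classify_by_ka_ks(ka: float, ks: float) -> str:
    """Label an ortholog pair from its Ka and Ks estimates.

    Assumption: the conventional Ka/Ks = 1 boundary. Pairs with Ks <= 0
    give an undefined ratio and are reported as such rather than guessed.
    """
    if ks <= 0:
        return "undefined"
    ratio = ka / ks
    if ratio > 1:
        return "PSG"       # positively selected gene
    if ratio < 1:
        return "NSG"       # negatively selected gene
    return "neutral"


# Hypothetical (Ka, Ks) estimates for three ortholog pairs
toy_pairs = {
    "geneA": (0.02, 0.10),
    "geneB": (0.30, 0.15),
    "geneC": (0.05, 0.0),
}
labels = {gene: classify_by_ka_ks(ka, ks) for gene, (ka, ks) in toy_pairs.items()}
print(labels)
```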
Affiliation(s)
- Yunpeng Cao
  - Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees, Ministry of Education, Central South University of Forestry and Technology, Changsha 410004, China
  - School of Life Sciences, Anhui Agricultural University, Hefei 230036, China
- Lan Jiang
  - Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees, Ministry of Education, Central South University of Forestry and Technology, Changsha 410004, China
- Lihu Wang
  - College of Landscape and Ecological Engineering, Hebei University of Engineering, Handan 056038, China
- Yongping Cai
  - School of Life Sciences, Anhui Agricultural University, Hefei 230036, China
|
6
|
A novel mitochondria-targetable probe for imaging endogenous deoxyribonucleic acid in biological systems. J Photochem Photobiol A Chem 2019. [DOI: 10.1016/j.jphotochem.2019.04.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
|
7
|
Brandt A, Schaefer I, Glanz J, Schwander T, Maraun M, Scheu S, Bast J. Effective purifying selection in ancient asexual oribatid mites. Nat Commun 2017; 8:873. [PMID: 29026136 PMCID: PMC5638860 DOI: 10.1038/s41467-017-01002-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 03/07/2017] [Accepted: 08/08/2017] [Indexed: 11/29/2022]
Abstract
Sex is beneficial in the long term because it can prevent mutational meltdown through increased effectiveness of selection. This idea is supported by empirical evidence of deleterious mutation accumulation in species with a recent transition to asexuality. Here, we study the effectiveness of purifying selection in oribatid mites which have lost sex millions of years ago and diversified into different families and species while reproducing asexually. We compare the accumulation of deleterious nonsynonymous and synonymous mutations between three asexual and three sexual lineages using transcriptome data. Contrasting studies of young asexual lineages, we find evidence for strong purifying selection that is more effective in asexual as compared to sexual oribatid mite lineages. Our results suggest that large populations likely sustain effective purifying selection and facilitate the escape of mutational meltdown in the absence of sex. Thus, sex per se is not a prerequisite for the long-term persistence of animal lineages. Asexual reproduction is thought to be an evolutionary dead end in eukaryotes because deleterious mutations will not be purged effectively. Here, Brandt and colleagues show that anciently asexual oribatid mites in fact have reduced accumulation of deleterious mutations compared to their sexual relatives.
Affiliation(s)
- Alexander Brandt
  - Johann-Friedrich-Blumenbach Institute of Zoology and Anthropology, Georg-August-University Goettingen, Untere Karspuele 2, DE-37073, Goettingen, Germany
- Ina Schaefer
  - Johann-Friedrich-Blumenbach Institute of Zoology and Anthropology, Georg-August-University Goettingen, Untere Karspuele 2, DE-37073, Goettingen, Germany
- Julien Glanz
  - Johann-Friedrich-Blumenbach Institute of Zoology and Anthropology, Georg-August-University Goettingen, Untere Karspuele 2, DE-37073, Goettingen, Germany
- Tanja Schwander
  - Department of Ecology and Evolution, University of Lausanne, UNIL Sorge, Le Biophore, CH-1015, Lausanne, Switzerland
- Mark Maraun
  - Johann-Friedrich-Blumenbach Institute of Zoology and Anthropology, Georg-August-University Goettingen, Untere Karspuele 2, DE-37073, Goettingen, Germany
- Stefan Scheu
  - Johann-Friedrich-Blumenbach Institute of Zoology and Anthropology, Georg-August-University Goettingen, Untere Karspuele 2, DE-37073, Goettingen, Germany
  - Center of Biodiversity and Sustainable Land Use, Georg-August-University Goettingen, Untere Karspuele 2, DE-37073, Goettingen, Germany
- Jens Bast
  - Department of Ecology and Evolution, University of Lausanne, UNIL Sorge, Le Biophore, CH-1015, Lausanne, Switzerland
|
8
|
Guo Y, Liu J, Zhang J, Liu S, Du J. Selective modes determine evolutionary rates, gene compactness and expression patterns in Brassica. The Plant Journal 2017; 91:34-44. [PMID: 28332757 DOI: 10.1111/tpj.13541] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 09/28/2016] [Revised: 02/28/2017] [Accepted: 03/15/2017] [Indexed: 05/18/2023]
Abstract
It has been well documented that most nuclear protein-coding genes in organisms can be classified into two categories: positively selected genes (PSGs) and negatively selected genes (NSGs). The characteristics and evolutionary fates of different types of genes, however, have been poorly understood. In this study, the rates of nonsynonymous substitution (Ka) and the rates of synonymous substitution (Ks) were investigated by comparing the orthologs between the two sequenced Brassica species, Brassica rapa and Brassica oleracea, and the evolutionary rates, gene structures, expression patterns, and codon bias were compared between PSGs and NSGs. The resulting data show that PSGs have higher protein evolutionary rates, lower synonymous substitution rates, shorter gene length, fewer exons, higher functional specificity, lower expression level, higher tissue-specific expression and stronger codon bias than NSGs. Although the quantities and values are different, the relative features of PSGs and NSGs have been largely verified in the model species Arabidopsis. These data suggest that PSGs and NSGs differ not only under selective pressure (Ka/Ks), but also in their evolutionary, structural and functional properties, indicating that selective modes may serve as a determinant factor for measuring evolutionary rates, gene compactness and expression patterns in Brassica.
Affiliation(s)
- Yue Guo
  - Provincial Key Laboratory of Agrobiology, Institute of Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
- Jing Liu
  - Provincial Key Laboratory of Agrobiology, Institute of Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
- Jiefu Zhang
  - Key Laboratory of Cotton and Rapeseed, Ministry of Agriculture of People's Republic of China, Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
- Shengyi Liu
  - Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture of People's Republic of China, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, 430062, China
- Jianchang Du
  - Provincial Key Laboratory of Agrobiology, Institute of Biotechnology, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
  - Key Laboratory of Cotton and Rapeseed, Ministry of Agriculture of People's Republic of China, Institute of Industrial Crops, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014, China
  - Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture of People's Republic of China, Oil Crops Research Institute, Chinese Academy of Agricultural Sciences, Wuhan, 430062, China
|
9
|
Bush SJ, Kover PX, Urrutia AO. Lineage-specific sequence evolution and exon edge conservation partially explain the relationship between evolutionary rate and expression level in A. thaliana. Mol Ecol 2015; 24:3093-106. [PMID: 25930165 PMCID: PMC4480654 DOI: 10.1111/mec.13221] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 08/04/2014] [Revised: 04/21/2015] [Accepted: 04/28/2015] [Indexed: 02/06/2023]
Abstract
Rapidly evolving proteins can aid the identification of genes underlying phenotypic adaptation across taxa, but functional and structural elements of genes can also affect evolutionary rates. In plants, the ‘edges’ of exons, flanking intron junctions, are known to contain splice enhancers and to have a higher degree of conservation compared to the remainder of the coding region. However, the extent to which these regions may be masking indicators of positive selection or account for the relationship between dN/dS and other genomic parameters is unclear. We investigate the effects of exon edge conservation on the relationship of dN/dS to various sequence characteristics and gene expression parameters in the model plant Arabidopsis thaliana. We also obtain lineage-specific dN/dS estimates, making use of the recently sequenced genome of Thellungiella parvula, the second closest sequenced relative after the sister species Arabidopsis lyrata. Overall, we find that the effect of exon edge conservation, as well as the use of lineage-specific substitution estimates, upon dN/dS ratios partly explains the relationship between the rates of protein evolution and expression level. Furthermore, the removal of exon edges shifts dN/dS estimates upwards, increasing the proportion of genes potentially under adaptive selection. We conclude that lineage-specific substitutions and exon edge conservation have an important effect on dN/dS ratios and should be considered when assessing their relationship with other genomic parameters.
Affiliation(s)
- Stephen J Bush
  - Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Paula X Kover
  - Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Araxi O Urrutia
  - Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
|
10
|
Egan AN, Doyle J. A comparison of global, gene-specific, and relaxed clock methods in a comparative genomics framework: dating the polyploid history of soybean (Glycine max). Syst Biol 2010; 59:534-47. [PMID: 20705909 DOI: 10.1093/sysbio/syq041] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022]
Abstract
It is widely recognized that many genes and lineages do not adhere to a molecular clock, yet molecular clocks are commonly used to date divergences in comparative genomic studies. We test the application of a molecular clock across genes and lineages in a phylogenetic framework utilizing 12 genes linked in a 1-Mb region on chromosome 13 of soybean (Glycine max); homoeologous copies of these genes formed by polyploidy in Glycine; and orthologous copies in G. tomentella, Phaseolus vulgaris, and Medicago truncatula. We compare divergence dates estimated by two methods each in three frameworks: a global molecular clock with a single rate across genes and lineages using full and approximate likelihood methods based on synonymous substitutions, a gene-specific clock assuming rate constancy over lineages but allowing a different rate for each gene, and a relaxed molecular clock where rates may vary across genes and lineages estimated under penalized likelihood and Bayesian inference. We use the cumulative variance across genes as a means of quantifying precision. Our results suggest that divergence dating methods produce results that are correlated, but that older nodes are more variable and more difficult to estimate with precision and accuracy. We also find that models incorporating less rate heterogeneity estimate older dates of divergence than more complex models, as node age increases. A mixed model nested analysis of variance testing the effects of framework, method, and gene found that framework had a significant effect on the divergence date estimates but that most variation among dates is due to variation among genes, suggesting a need to further characterize and understand the evolutionary phenomena underlying rate variation within genomes, among genes, and across lineages.
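The difference between the global and gene-specific frameworks can be illustrated with the textbook strict-clock relation T = Ks / (2r), where Ks is the synonymous divergence between two copies and r is the substitution rate per site per year. This is a minimal sketch of that relation only, not the likelihood, penalized-likelihood, or Bayesian procedures compared in the paper; all rates and Ks values below are hypothetical.

```python
def divergence_time(ks: float, rate: float) -> float:
    """Years since divergence under a strict clock: T = Ks / (2 * rate).

    The factor 2 accounts for substitutions accumulating along both
    lineages since the split. `rate` is per site per year.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return ks / (2.0 * rate)


# Global clock: a single assumed rate applied to every gene
global_estimate = divergence_time(0.12, 6.0e-9)

# Gene-specific clock: each gene gets its own assumed rate
gene_ks = {"gene1": 0.10, "gene2": 0.16}
gene_rates = {"gene1": 5.0e-9, "gene2": 8.0e-9}
gene_estimates = {g: divergence_time(gene_ks[g], gene_rates[g]) for g in gene_ks}

print(f"global: {global_estimate:.3g} years")
print({g: f"{t:.3g} years" for g, t in gene_estimates.items()})
```

A relaxed clock additionally lets the rate vary along lineages, which cannot be reduced to a single closed-form expression like the one above.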
Affiliation(s)
- Ashley N Egan
  - Department of Plant Biology, L.H. Bailey Hortorium, Cornell University, 412 Mann Library Building, Ithaca, NY 14853, USA
|
11
|
Codon Usage Patterns in Corynebacterium glutamicum: Mutational Bias, Natural Selection and Amino Acid Conservation. Comp Funct Genomics 2010; 2010:343569. [PMID: 20445740 PMCID: PMC2860111 DOI: 10.1155/2010/343569] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 10/12/2009] [Revised: 01/29/2010] [Accepted: 02/04/2010] [Indexed: 11/17/2022]
Abstract
The alternative synonymous codons in Corynebacterium glutamicum, a well-known bacterium used in industry for the production of amino acids, have been investigated by multivariate analysis. As C. glutamicum is a GC-rich organism, G and C are expected to predominate at the third position of codons. Indeed, overall codon usage analyses have indicated that C and/or G ending codons are predominant in this organism. Through multivariate statistical analysis, apart from mutational selection, we identified three other trends of codon usage variation among the genes. Firstly, the majority of highly expressed genes are scattered towards the positive end of the first axis, whereas the majority of lowly expressed genes are clustered towards the other end of the first axis. Furthermore, the distinct difference between the two sets of genes was that the C ending codons predominate in putatively highly expressed genes, suggesting that the C ending codons are translationally optimal in this organism. Secondly, the majority of the putatively highly expressed genes tend to be located on the leading strand, which indicates that replicational and transcriptional selection might be invoked. Thirdly, highly expressed genes are more conserved than lowly expressed genes, as measured by synonymous and nonsynonymous substitutions among orthologous genes from the genomes of C. glutamicum and C. diphtheriae. We also analyzed other factors such as the length of genes and hydrophobicity that might influence codon usage and found their contributions to be weak.
|
12
|
Biro JC. Does codon bias have an evolutionary origin? Theor Biol Med Model 2008; 5:16. [PMID: 18667081 PMCID: PMC2519059 DOI: 10.1186/1742-4682-5-16] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 07/07/2008] [Accepted: 07/30/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND There is a 3-fold redundancy in the Genetic Code; most amino acids are encoded by more than one codon. These synonymous codons are not used equally; there is a Codon Usage Bias (CUB). This article will provide novel information about the origin and evolution of this bias.
RESULTS Codon Usage Bias (CUB, defined here as deviation from equal usage of synonymous codons) was studied in 113 species. The average CUB was 29.3 +/- 1.1% (S.E.M, n = 113) of the theoretical maximum and declined progressively with evolution and increasing genome complexity. A Pan-Genomic Codon Usage Frequency (CUF) Table was constructed to describe genome-wide relationships among codons. Significant correlations were found between the number of synonymous codons and (i) the frequency of the respective amino acids (ii) the size of CUB. Numerous, statistically highly significant, internal correlations were found among codons and the nucleic acids they comprise. These strong correlations made it possible to predict missing synonymous codons (wobble bases) reliably from the remaining codons or codon residues.
CONCLUSION The results put the concept of "codon bias" into a novel perspective. The internal connectivity of codons indicates that all synonymous codons might be integrated parts of the Genetic Code with equal importance in maintaining its functional integrity.
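The abstract defines CUB as deviation from equal usage of synonymous codons, expressed as a fraction of the theoretical maximum, but does not give the formula. One plausible way to quantify that idea for a single synonymous family is sketched below; the summed-absolute-deviation measure and the name `family_bias` are assumptions for illustration and may differ from the article's actual calculation.

```python
def family_bias(counts: list) -> float:
    """Deviation from equal synonymous codon usage, scaled to [0, 1].

    Assumed measure (not taken from the article): sum of absolute
    differences between observed codon fractions and the uniform
    fraction 1/n, divided by the largest value that sum can take,
    2 * (1 - 1/n), reached when a single codon is used exclusively.
    """
    n = len(counts)
    total = sum(counts)
    if n < 2 or total == 0:
        return 0.0                      # no synonymous choice, or no data
    expected = 1.0 / n
    deviation = sum(abs(c / total - expected) for c in counts)
    max_deviation = 2.0 * (1.0 - expected)
    return deviation / max_deviation


# Hypothetical codon counts for synonymous families
uniform = family_bias([10, 10, 10, 10])   # equal usage
extreme = family_bias([40, 0, 0, 0])      # one codon used exclusively
skewed = family_bias([30, 10])            # two-codon family, 3:1 usage
print(uniform, extreme, skewed)
```

A genome-level figure could then be obtained by averaging over families, again under the same assumed definition.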
Affiliation(s)
- Jan C Biro
  - Homulus Foundation, 612 S Flower St, Los Angeles, CA 90017, USA
|
13
|
Krishnan NM, Seligmann H, Rao BJ. Relationship between mRNA secondary structure and sequence variability in Chloroplast genes: possible life history implications. BMC Genomics 2008; 9:48. [PMID: 18226235 PMCID: PMC2276208 DOI: 10.1186/1471-2164-9-48] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 08/16/2007] [Accepted: 01/28/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND Synonymous sites are freer to vary because of redundancy in the genetic code. Messenger RNA secondary structure restricts this freedom, as revealed by previous findings in mitochondrial genes that mutations at third codon position nucleotides in helices are more selected against than those in loops. This motivated us to explore the constraints imposed by mRNA secondary structure on evolutionary variability at all codon positions in general, in chloroplast systems.
RESULTS We found that the evolutionary variability and intrinsic secondary structure stability of these sequences share an inverse relationship. Simulations of most likely single nucleotide evolution in Psilotum nudum and Nephroselmis olivacea mRNAs indicate that helix-forming propensities of mutated mRNAs are greater than those of the natural mRNAs for short sequences and vice versa for long sequences. Moreover, helix-forming propensity estimated by the percentage of total mRNA in helices increases gradually with mRNA length, saturating beyond 1000 nucleotides. Protection levels of functionally important sites vary across plants and proteins: r-strategists minimize mutation costs in large genes; K-strategists do the opposite.
CONCLUSION mRNA length presumably predisposes shorter mRNAs to evolve under different constraints than longer mRNAs. The positive correlation between secondary structure protection and functional importance of sites suggests that some sites might be conserved due to packing-protection constraints at the nucleic acid level in addition to protein level constraints. Consequently, nucleic acid secondary structure a priori biases mutations. The converse (exposure of conserved sites) apparently occurs in a smaller number of cases, indicating a different evolutionary adaptive strategy in these plants. The differences between the protection levels of functionally important sites for r- and K-strategists reflect their respective molecular adaptive strategies. These converge with increasing domestication levels of K-strategists, perhaps because domestication increases reproductive output.
Affiliation(s)
- Neeraja M Krishnan
  - Department of Biological Sciences, Tata Institute of Fundamental Research, 1 Homi Bhabha road, Colaba, Mumbai 400005, India
|
14
|
Wong WKP, Morse JH, Knowles JA. Evolutionary conservation and mutational spectrum of BMPR2 gene. Gene 2006; 368:84-93. [PMID: 16361068 DOI: 10.1016/j.gene.2005.10.025] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 06/22/2005] [Revised: 09/27/2005] [Accepted: 10/14/2005] [Indexed: 10/25/2022]
Abstract
A variety of mutations in the bone morphogenetic protein receptor type 2 (BMPR2) have been identified in patients with pulmonary arterial hypertension. In this study, using our BMPR2 mutation database and BMPR-II protein sequences from eight distantly related species, we defined the relationship among evolutionary conservation, mutation frequency and mutation distribution. As a whole, BMPR2 is evolving slower than the average for mammalian protein-encoding genes. As expected, the kinase domain is evolving more slowly than the extracellular ligand-binding and C-terminal domains. A detailed map of evolutionary conservation shows that there are repeating peaks and valleys within the C-terminal domain, representing higher and lower evolutionary conservation. We observed a strong correlation between evolutionary conservation and the distribution of mutations along the gene. All except two of the nineteen missense mutations occur in absolutely conserved amino acids among the vertebrate homologs. In addition, we identified six mutational hotspots (P<0.05) by comparing the observed distribution of mutations to the pattern expected from a random multinomial distribution. Furthermore, analysis of the sequence environment surrounding the mutations revealed a specific pattern of mutagenesis. Over 22% of all single base-pair substitutions and 30% of all deletions and insertions are situated within tandem or non-tandem direct repeats of at least 5 bp and may be explained by the slipped-mispairing model of mutagenesis. Also, over 59% of single base-pair substitutions versus 20% of deletions and insertions are located in perfect palindromic sequences that could produce "hairpin-loop" secondary structures with relatively high thermodynamic stability under physiological conditions. In addition, 3.7% of single base-pair substitutions versus 30% of deletions and insertions are located either within or in close proximity to the Krawczak and Cooper consensus sequence (TG A/G A/G G/T A/C). Further study of the mechanism of mutagenesis in BMPR2 may help identify other potentially mutable sites and differentiate between deleterious mutations and harmless polymorphic variants.
Affiliation(s)
- Wai K P Wong
- Department of Medicine, Columbia University/New York State Psychiatric Institute, 1051 Riverside Drive, Unit 28, Room 5917, New York, NY 10032, USA.
15
Hawk JD, Stefanovic L, Boyer JC, Petes TD, Farber RA. Variation in efficiency of DNA mismatch repair at different sites in the yeast genome. Proc Natl Acad Sci U S A 2005; 102:8639-43. [PMID: 15932942 PMCID: PMC1150857 DOI: 10.1073/pnas.0503415102] [Citation(s) in RCA: 72] [Impact Index Per Article: 3.6] [Indexed: 11/18/2022] Open
Abstract
Evolutionary studies have suggested that mutation rates vary significantly at different positions in the eukaryotic genome. The mechanism that is responsible for this context-dependence of mutation rates is not understood. We demonstrate experimentally that frameshift mutation rates in yeast microsatellites depend on the genomic context and that this variation primarily reflects the context-dependence of the efficiency of DNA mismatch repair. We measured the stability of a 16.5-repeat polyGT tract by using a reporter gene (URA3-GT) in which the microsatellite was inserted in-frame into the yeast URA3 gene. We constructed 10 isogenic yeast strains with the reporter gene at different locations in the genome. Rates of frameshift mutations that abolished the correct reading frame of this gene were determined by fluctuation analysis. A 16-fold difference was found among these strains. We made mismatch-repair-deficient (msh2) derivatives of six of the strains. Mutation rates were elevated for all of these strains, but the differences in rates among the strains were substantially reduced. The simplest interpretation of this result is that the efficiency of DNA mismatch repair varies in different regions of the genome, perhaps reflecting some aspect of chromosome structure.
Affiliation(s)
- Joshua D Hawk
- Department of Pathology and Laboratory Medicine, University of North Carolina, Chapel Hill, NC 27599, USA
16
Nikolaou C, Almirantis Y. Mutually symmetric and complementary triplets: differences in their use distinguish systematically between coding and non-coding genomic sequences. J Theor Biol 2003; 223:477-87. [PMID: 12875825 DOI: 10.1016/s0022-5193(03)00123-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
Abstract
The general property of asymmetry in word use in meaningful texts written in a variety of languages, motivates a quantification of the differences in the use of mutually symmetric triplets in genomic sequences. When this is done in the three reading frames, high values found for one of them are used as indication that the sequence is coding for a protein. Moreover, a similar quantification of the differences in the use of complementary triplets is introduced, again with predictive power of the coding character of a sequence. This method reflects the non-equivalence between sense and anti-sense strand of a coding segment. In both approaches, "linguistic asymmetry" in coding sequences is related to the form of the genetic code and to the bias in codon usage and amino acid use skews.
Affiliation(s)
- Christoforos Nikolaou
- National Research Center for Physical Sciences Demokritos, Institute of Biology, 15310 Athens, Greece
17
Hardison RC, Roskin KM, Yang S, Diekhans M, Kent WJ, Weber R, Elnitski L, Li J, O'Connor M, Kolbe D, Schwartz S, Furey TS, Whelan S, Goldman N, Smit A, Miller W, Chiaromonte F, Haussler D. Covariation in frequencies of substitution, deletion, transposition, and recombination during eutherian evolution. Genome Res 2003; 13:13-26. [PMID: 12529302 PMCID: PMC430971 DOI: 10.1101/gr.844103] [Citation(s) in RCA: 225] [Impact Index Per Article: 10.2] [Received: 09/26/2002] [Accepted: 11/14/2002] [Indexed: 11/24/2022]
Abstract
Six measures of evolutionary change in the human genome were studied, three derived from the aligned human and mouse genomes in conjunction with the Mouse Genome Sequencing Consortium, consisting of (1) nucleotide substitution per fourfold degenerate site in coding regions, (2) nucleotide substitution per site in relics of transposable elements active only before the human-mouse speciation, and (3) the nonaligning fraction of human DNA that is nonrepetitive or in ancestral repeats; and three derived from human genome data alone, consisting of (4) SNP density, (5) frequency of insertion of transposable elements, and (6) rate of recombination. Features 1 and 2 are measures of nucleotide substitutions at two classes of "neutral" sites, whereas 4 is a measure of recent mutations. Feature 3 is a measure dominated by deletions in mouse, whereas 5 represents insertions in human. It was found that all six vary significantly in megabase-sized regions genome-wide, and many vary together. This indicates that some regions of a genome change slowly by all processes that alter DNA, and others change faster. Regional variation in all processes is correlated with, but not completely accounted for, by GC content in human and the difference between GC content in human and mouse.
Affiliation(s)
- Ross C Hardison
- Department of Biochemistry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
18
Zhang L, Vision TJ, Gaut BS. Patterns of nucleotide substitution among simultaneously duplicated gene pairs in Arabidopsis thaliana. Mol Biol Evol 2002; 19:1464-73. [PMID: 12200474 DOI: 10.1093/oxfordjournals.molbev.a004209] [Citation(s) in RCA: 101] [Impact Index Per Article: 4.4] [Indexed: 11/12/2022] Open
Abstract
We characterized rates and patterns of synonymous and nonsynonymous substitution in 242 duplicated gene pairs on chromosomes 2 and 4 of Arabidopsis thaliana. Based on their collinear order along the two chromosomes, the gene pairs were likely duplicated contemporaneously, and therefore comparison of genetic distances among gene pairs provides insights into the distribution of nucleotide substitution rates among plant nuclear genes. Rates of synonymous substitution varied 13.8-fold among the duplicated gene pairs, but 90% of gene pairs differed by less than 2.6-fold. Average nonsynonymous rates were approximately fivefold lower than average synonymous rates; this rate difference is lower than that of previously studied nonplant lineages. The coefficient of variation of rates among genes was 0.65 for nonsynonymous rates and 0.44 for synonymous rates, indicating that synonymous and nonsynonymous rates vary among genes to roughly the same extent. The causes underlying rate variation were explored. Our analyses tentatively suggest an effect of physical location on synonymous substitution rates but no similar effect on nonsynonymous rates. Nonsynonymous substitution rates were negatively correlated with GC content at synonymous third codon positions, and synonymous substitution rates were negatively correlated with codon bias, as observed in other systems. Finally, the 242 gene pairs permitted investigation of the processes underlying divergence between paralogs. We found no evidence of positive selection, little evidence that paralogs evolve at different rates, and no evidence of differential codon usage or third position GC content.
Affiliation(s)
- Liqing Zhang
- Department of Ecology and Evolutionary Biology, University of California, Irvine, 92697-2525, USA
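The coefficient of variation quoted in the abstract above (0.65 for nonsynonymous and 0.44 for synonymous rates) is the standard deviation of the per-gene rates divided by their mean. The following is a minimal sketch of that calculation; the function name and the rate values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, stdev

def coefficient_of_variation(rates):
    """Return sample standard deviation divided by the mean of the rates."""
    return stdev(rates) / mean(rates)

# Hypothetical per-gene substitution rates (illustrative values only).
example_rates = [1.0, 2.0, 3.0]
print(coefficient_of_variation(example_rates))  # 0.5
```

Because the statistic is scaled by the mean, it allows the spread of two rate classes with very different average magnitudes to be compared directly.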
19
Larizza A, Makalowski W, Pesole G, Saccone C. Evolutionary dynamics of mammalian mRNA untranslated regions by comparative analysis of orthologous human, artiodactyl and rodent gene pairs. Computers & Chemistry 2002; 26:479-90. [PMID: 12144177 DOI: 10.1016/s0097-8485(02)00009-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Indexed: 10/27/2022]
Abstract
Most evolutionary studies based on molecular data refer to the portion of genomes encoding proteins. Today, however, more and more attention is paid to the so-called 'non-coding' regions, which constitute a notable portion of the metazoan nuclear genome. Among them, the untranslated regions of messenger RNAs (mRNA UTRs) are particularly important, as they are involved in the regulation of gene expression, controlling translation efficiency as well as mRNA localization and stability. Up to now, only a few studies have focused on the analysis of the compositional and structural features of UTRs, or have investigated their evolutionary dynamics quantitatively. For this reason we have carried out an inter-order study on the evolutionary rate of 5' and 3' UTRs with respect to the corresponding coding region in 93 triplets of orthologous genes (selected through a phylogenetic approach, for a total of 645 625 nt) belonging to Primates (Homo sapiens), Artiodactyla (Bos taurus) and Rodentia (Mus spp.). Our study, which considered only likely orthologous genes, has revealed interesting features of the evolution of these regions concerning nucleotide substitution rate, indels and repetitive element distribution. UTRs from different genes showed a remarkable heterogeneity in their evolutionary dynamics, with some homologs so highly divergent as to prevent their alignment, and others rather conserved, at least in some regions; the most divergent sequence pairs were excluded from our analysis. The comparison between the nucleotide substitution rates calculated for 5' and 3' UTRs and those calculated for synonymous coding positions allowed us to verify and measure the existence of functional constraints acting upon the UTRs of different genes, which have shown, in many cases, evolutionary dynamics driven by positive selection.
20
Kusumi J, Tsumura Y, Yoshimaru H, Tachida H. Molecular evolution of nuclear genes in Cupressacea, a group of conifer trees. Mol Biol Evol 2002; 19:736-47. [PMID: 11961107 DOI: 10.1093/oxfordjournals.molbev.a004132] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] Open
Abstract
We surveyed the molecular evolutionary characteristics of 11 nuclear genes from 10 conifer trees belonging to the Taxodioideae, the Cupressoideae, and the Sequoioideae. Comparisons of substitution rates among the lineages indicated that the synonymous substitution rates of the Cupressoideae lineage were higher than those of the Taxodioideae. This result parallels the pattern previously found in plastid genes. Likelihood-ratio tests showed that the nonsynonymous-synonymous rate ratio did not change significantly among lineages. In addition, after adjustments for lineage effects, the dispersion indices of synonymous and nonsynonymous substitutions were considerably reduced, and the latter was close to 1. These results indicated that the acceleration of evolutionary rates in the Cupressoideae lineage occurred in both the nuclear and plastid genomes, and that generally, this lineage effect affected synonymous and nonsynonymous substitutions similarly. We also investigated the relationship of synonymous substitution rates with the nonsynonymous substitution rate, base composition, and codon bias in each lineage. Synonymous substitution rates were positively correlated with nonsynonymous substitution rates and GC content at third codon positions, but synonymous substitution rates were not correlated with codon bias. Finally, we tested the possibility of positive selection at the protein level, using maximum likelihood models, assuming heterogeneous nonsynonymous-synonymous rate ratios among codon (amino acid) sites. Although we did not detect strong evidence of positively selected codon sites, the analysis suggested that significant variation in nonsynonymous-synonymous rate ratio exists among the sites. The most likely sites for action of positive selection were found in the ferredoxin gene, which is an important component of the apparatus for photosynthesis.
Affiliation(s)
- Junko Kusumi
- Department of Biology, Faculty of Sciences, Kyushu University, Ropponmatsu, Fukuoka 810-8560, Japan
21
Hancock JM, Worthey EA, Santibáñez-Koref MF. A Role for Selection in Regulating the Evolutionary Emergence of Disease-Causing and Other Coding CAG Repeats in Humans and Mice. Mol Biol Evol 2001; 18:1014-23. [PMID: 11371590 DOI: 10.1093/oxfordjournals.molbev.a003873] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.1] [Indexed: 11/14/2022] Open
Abstract
The evolutionary expansion of CAG repeats in human triplet expansion disease genes is intriguing because of their deleterious phenotype. In the past, this expansion has been suggested to reflect a broad genomewide expansion of repeats, which would imply that mutational and evolutionary processes acting on repeats differ between species. Here, we tested this hypothesis by analyzing repeat- and flanking-sequence evolution in 28 repeat-containing genes that had been sequenced in humans and mice and by considering overall lengths and distributions of CAG repeats in the two species. We found no evidence that these repeats were longer in humans than in mice. We also found no evidence for preferential accumulation of CAG repeats in the human genome relative to mice from an analysis of the lengths of repeats identified in sequence databases. We then investigated whether sequence properties, such as base and amino acid composition and base substitution rates, showed any relationship to repeat evolution. We found that repeat-containing genes were enriched in certain amino acids, presumably as the result of selection, but that this did not reflect underlying biases in base composition. We also found that regions near repeats showed higher nonsynonymous substitution rates than the remainder of the gene and lower nonsynonymous rates in genes that contained a repeat in both the human and the mouse. Higher rates of nonsynonymous mutation in the neighborhood of repeats presumably reflect weaker purifying selection acting in these regions of the proteins, while the very low rate of nonsynonymous mutation in proteins containing a CAG repeat in both species presumably reflects a high level of purifying selection. Based on these observations, we propose that the mutational processes giving rise to polyglutamine repeats in human and murine proteins do not differ. 
Instead, we propose that the evolution of polyglutamine repeats in proteins results from an interplay between mutational processes and selection.
Affiliation(s)
- J M Hancock
- MRC Clinical Sciences Centre, Imperial College School of Medicine, Hammersmith Hospital, London, England.
22
Jordan IK, Kondrashov FA, Rogozin IB, Tatusov RL, Wolf YI, Koonin EV. Constant relative rate of protein evolution and detection of functional diversification among bacterial, archaeal and eukaryotic proteins. Genome Biol 2001; 2:RESEARCH0053. [PMID: 11790256 PMCID: PMC64838 DOI: 10.1186/gb-2001-2-12-research0053] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.4] [Received: 08/06/2001] [Revised: 09/11/2001] [Accepted: 10/05/2001] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Detection of changes in a protein's evolutionary rate may reveal cases of change in that protein's function. We developed and implemented a simple relative rates test in an attempt to assess the rate constancy of protein evolution and to detect cases of functional diversification between orthologous proteins. The test was performed on clusters of orthologous protein sequences from complete bacterial genomes (Chlamydia trachomatis, C. muridarum and Chlamydophila pneumoniae), complete archaeal genomes (Pyrococcus horikoshii, P. abyssi and P. furiosus) and partially sequenced mammalian genomes (human, mouse and rat). RESULTS Amino-acid sequence evolution rates are significantly correlated on different branches of phylogenetic trees representing the great majority of analyzed orthologous protein sets from all three domains of life. However, approximately 1% of the proteins from each group of species deviates from this pattern and instead shows variation that is consistent with an acceleration of the rate of amino-acid substitution, which may be due to functional diversification. Most of the putative functionally diversified proteins from all three species groups are predicted to function at the periphery of the cells and mediate their interaction with the environment. CONCLUSIONS Relative rates of protein evolution are remarkably constant for the three species groups analyzed here. Deviations from this rate constancy are probably due to changes in selective constraints associated with diversification between orthologs. Functional diversification between orthologs is thought to be a relatively rare event. However, the resolution afforded by the test designed specifically for genomic-scale datasets allowed us to identify numerous cases of possible functional diversification between orthologous proteins.
Affiliation(s)
- I K Jordan
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
23
Hurst LD, Williams EJ. Covariation of GC content and the silent site substitution rate in rodents: implications for methodology and for the evolution of isochores. Gene 2000; 261:107-14. [PMID: 11164042 DOI: 10.1016/s0378-1119(00)00489-3] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.6] [Indexed: 11/29/2022]
Abstract
Many attempts to test selectionist and neutralist models employ estimates of synonymous (Ks) and non-synonymous (Ka) substitution rates of orthologous genes. For example, a stronger Ka-Ks correlation than expected under neutrality has been argued to indicate a role for selection and the absence of a Ks-GC4 correlation has been argued to be inconsistent with neutral models for isochore evolution. However, both of these results, we have shown previously, are sensitive to the method by which Ka and Ks are estimated. Using a maximum likelihood (ML) estimator (GY94) we found a positive correlation between Ks and GC4 and only a weak correlation between Ka and Ks, lower than expected under neutral expectations. This ML method is computationally slow. Recently, a new ad hoc approximation of this ML method has been provided (YN00). This is effectively an extension of Li's protocol but that also allows for codon usage bias. This method is computationally near-instantaneous and therefore potentially of great utility for analysis of large datasets. Here we ask whether this method might have such applicability. To this end we ask whether it too recovers the two unusual results. We report that when the ML and earlier ad hoc methods disagree, YN00 recovers the results described by the ML methods, i.e. a positive correlation between GC4 and Ks and only a weak correlation between Ks and Ka. If the ML method can be trusted, then YN00 can also be considered an adequately reliable method for analysis of large datasets. Assuming this to be so we also analyze further the patterns. We show, for example, that the positive correlation between GC4 and Ks is probably in part a mutational bias, there being more methyl induced CpG-->TpG mutations in GC rich regions. As regards the evolution of isochores, it seems inappropriate to use the claimed lack of a correlation between GC and Ks as definitive evidence either against or for any model. 
If the positive correlation is real then, we argue, this is hard to reconcile with the biased gene conversion model for isochore formation as this predicts a negative correlation.
Affiliation(s)
- L D Hurst
- Department of Biology and Biochemistry, University of Bath, Claverton Down, Bath BA2 7AY, UK.
24
Bielawski JP, Dunn KA, Yang Z. Rates of nucleotide substitution and mammalian nuclear gene evolution. Approximate and maximum-likelihood methods lead to different conclusions. Genetics 2000; 156:1299-308. [PMID: 11063703 PMCID: PMC1461304 DOI: 10.1093/genetics/156.3.1299] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.2] [Indexed: 11/13/2022] Open
Abstract
Rates and patterns of synonymous and nonsynonymous substitutions have important implications for the origin and maintenance of mammalian isochores and the effectiveness of selection at synonymous sites. Previous studies of mammalian nuclear genes largely employed approximate methods to estimate rates of nonsynonymous and synonymous substitutions. Because these methods did not account for major features of DNA sequence evolution such as transition/transversion rate bias and unequal codon usage, they might not have produced reliable results. To evaluate the impact of the estimation method, we analyzed a sample of 82 nuclear genes from the mammalian orders Artiodactyla, Primates, and Rodentia using both approximate and maximum-likelihood methods. Maximum-likelihood analysis indicated that synonymous substitution rates were positively correlated with GC content at the third codon positions, but independent of nonsynonymous substitution rates. Approximate methods, however, indicated that synonymous substitution rates were independent of GC content at the third codon positions, but were positively correlated with nonsynonymous rates. Failure to properly account for transition/transversion rate bias and unequal codon usage appears to have caused substantial biases in approximate estimates of substitution rates.
Affiliation(s)
- J P Bielawski
- Department of Biology, University College London, London NW1 2HE, United Kingdom.
25
Hellberg ME, Moy GW, Vacquier VD. Positive selection and propeptide repeats promote rapid interspecific divergence of a gastropod sperm protein. Mol Biol Evol 2000; 17:458-66. [PMID: 10723746 DOI: 10.1093/oxfordjournals.molbev.a026325] [Citation(s) in RCA: 80] [Impact Index Per Article: 3.2] [Indexed: 11/14/2022] Open
Abstract
Male-specific proteins have increasingly been reported as targets of positive selection and are of special interest because of the role they may play in the evolution of reproductive isolation. We report the rapid interspecific divergence of cDNA encoding a major acrosomal protein of unknown function (TMAP) of sperm from five species of teguline gastropods. A mitochondrial DNA clock (calibrated by congeneric species divided by the Isthmus of Panama) estimates that these five species diverged 2-10 MYA. Inferred amino acid sequences reveal a propeptide that has diverged rapidly between species. The mature protein has diverged faster still due to high nonsynonymous substitution rates (> 25 nonsynonymous substitutions per site per 10^9 years). cDNA encoding the mature protein (89-100 residues) shows evidence of positive selection (Dn/Ds > 1) for 4 of 10 pairwise species comparisons. cDNA and predicted secondary-structure comparisons suggest that TMAP is neither orthologous nor paralogous to abalone lysin, and thus marks a second, phylogenetically independent, protein subject to strong positive selection in free-spawning marine gastropods. In addition, an internal repeat in one species (Tegula aureotincta) produces a duplicated cleavage site which results in two alternatively processed mature proteins differing by nine amino acid residues. Such alternative processing may provide a mechanism for introducing novel amino acid sequence variation at the amino-termini of proteins. Highly divergent TMAP N-termini from two other tegulines (Tegula regina and Norrisia norrisii) may have originated by such a mechanism.
Affiliation(s)
- M E Hellberg
- Department of Biological Sciences, Louisiana State University at Baton Rouge 70803, USA.
26
Abstract
BACKGROUND Nucleotide substitution rates and G + C content vary considerably among mammalian genes. It has been proposed that the mammalian genome comprises a mosaic of regions - termed isochores - with differing G + C content. The regional variation in gene G + C content might therefore be a reflection of the isochore structure of chromosomes, but the factors influencing the variation of nucleotide substitution rate are still open to question. RESULTS To examine whether nucleotide substitution rates and gene G + C content are influenced by the chromosomal location of genes, we compared human and murid (mouse or rat) orthologues known to belong to one of the chromosomal (autosomal) segments conserved between these species. Multiple members of gene families were excluded from the dataset. Sets of neighbouring genes were defined as those lying within 1 centiMorgan (cM) of each other on the mouse genetic map. For both synonymous substitution rates and G + C content at silent sites, neighbouring genes were found to be significantly more similar to each other than sets of genes randomly drawn from the dataset. Moreover, we demonstrated that the regional similarities in G + C content (isochores) and synonymous substitution rate were independent of each other. CONCLUSIONS Our results provide the first substantial statistical evidence for the existence of a regional variation in the synonymous substitution rate within the mammalian genome, indicating that different chromosomal regions evolve at different rates. This regional phenomenon which shapes gene evolution could reflect the existence of 'evolutionary rate units' along the chromosome.
Affiliation(s)
- G Matassi
- Institute of Genetics, University of Nottingham, Queens Medical Centre, Nottingham, NG7 2UH, UK.
27
Abstract
Transferrin is an iron-binding protein that plays an important role in iron metabolism and resistance to bacterial infection in a variety of organisms. A comparison of transferrin coding sequences from four salmonid species shows that the rate of evolution at nonsynonymous sites is significantly higher than the rate at synonymous sites, suggesting that positive natural selection for new alleles has played an important role in the evolution of transferrin in some salmon species. We hypothesize that the selective agent driving rapid divergence is interactions between host transferrin and the iron-scavenging proteins of pathogenic bacteria.
Affiliation(s)
- M J Ford
- National Marine Fisheries Service, Northwest Fisheries Science Center, Seattle, WA 98112, USA.
28
Walker DR, Bond JP, Tarone RE, Harris CC, Makalowski W, Boguski MS, Greenblatt MS. Evolutionary conservation and somatic mutation hotspot maps of p53: correlation with p53 protein structural and functional features. Oncogene 1999; 18:211-8. [PMID: 9926936 DOI: 10.1038/sj.onc.1202298] [Citation(s) in RCA: 147] [Impact Index Per Article: 5.7] [Indexed: 11/09/2022]
Abstract
Missense mutations in p53 frequently occur at 'hotspot' amino acids which are highly conserved and represent regions of structural or functional importance. Using the p53 mutation database and the p53 DNA sequences for 11 species, we more precisely defined the relationships among conservation, mutation frequency and protein structure. We aligned the p53 sequences codon-by-codon and determined the degree of substitution among them. As a whole, p53 is evolving at an average rate for a mammalian protein-coding gene. As expected, the DNA binding domain is evolving more slowly than the carboxy and amino termini. A detailed map of evolutionary conservation shows that within the DNA binding domain there are repeating peaks and valleys of higher and lower evolutionary constraint. Mutation hotspots were identified by comparing the observed distribution of mutations to the pattern expected from a random multinomial distribution. Seventy-three hotspots were identified; these 19% of codons account for 88% of all reported p53 mutations. Both high evolutionary constraint and mutation hotspots are noted at amino acids close to the protein-DNA interface and at others more distant from DNA, often buried within the core of the folded protein but sometimes on its surface. The results indicate that targeting highly conserved regions for mutational and functional analysis may be efficient strategies for the study of cancer-related genes.
Affiliation(s)
- D R Walker
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, Maryland 20894, USA
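The hotspot criterion in the abstract above compares observed per-codon mutation counts with a random multinomial expectation. One way to sketch that idea is shown below, assuming a uniform null in which every mutation is equally likely to fall on any codon, so that each codon's count is marginally binomial. The function names, the example counts and the 0.05 threshold applied per codon are illustrative assumptions, not the published procedure, and no multiple-testing correction is applied.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def find_hotspots(counts, alpha=0.05):
    """Return indices of codons whose count is unexpectedly high.

    Null model (assumed): the total number of mutations is spread over
    the codons by a uniform multinomial, so the count at any one codon
    follows Binomial(total, 1 / number_of_codons).
    """
    total = sum(counts)
    p = 1.0 / len(counts)
    return [
        i for i, k in enumerate(counts)
        if k > 0 and binomial_upper_tail(k, total, p) < alpha
    ]

# Hypothetical mutation counts for ten codons (illustrative only).
example_counts = [0, 1, 0, 9, 1, 0, 2, 0, 1, 0]
print(find_hotspots(example_counts))  # [3]
```

With real data, a non-uniform null (for example, one weighted by codon mutability) and a correction for testing many codons would probably be needed.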
29
Domachowske JB, Bonville CA, Dyer KD, Rosenberg HF. Evolution of antiviral activity in the ribonuclease A gene superfamily: evidence for a specific interaction between eosinophil-derived neurotoxin (EDN/RNase 2) and respiratory syncytial virus. Nucleic Acids Res 1998; 26:5327-32. [PMID: 9826755 PMCID: PMC147995 DOI: 10.1093/nar/26.23.5327] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022] Open
Abstract
We have demonstrated that the human eosinophil-derived neurotoxin (EDN, RNase 2), a rapidly evolving secretory protein derived from eosinophilic leukocytes, mediates the ribonucleolytic destruction of extracellular virions of the single-stranded RNA virus respiratory syncytial virus (RSV). While RNase activity is crucial to antiviral activity, it is clearly not sufficient, as our results suggest that EDN has unique structural features apart from RNase activity that are necessary to promote antiviral activity. We demonstrate here that the interaction between EDN and extracellular virions of RSV is both saturatable and specific. Increasing concentrations of the antivirally inactivated, ribonucleolytically inactivated point mutant form of recombinant human EDN, rhEDNdK38, inhibits rhEDN's antiviral activity, while increasing concentrations of the related RNase, recombinant human RNase k6, have no effect whatsoever. Interestingly, acquisition of antiviral activity parallels the evolutionary development of the primate EDN lineage, having emerged some time after the divergence of the Old World from the New World monkeys. Using this information, we created ribonucleolytically active chimeras of human and New World monkey orthologs of EDN and, by evaluating their antiviral activity, we have identified an N-terminal segment of human EDN that contains one or more of the sequence elements that mediate its specific interaction with RSV.
Affiliation(s)
- J B Domachowske
- Department of Pediatrics, Division of Infectious Diseases, State University of New York Health Science Center at Syracuse, Syracuse, NY 13210, USA
|
30
|
Comeron JM, Kreitman M. The correlation between synonymous and nonsynonymous substitutions in Drosophila: mutation, selection or relaxed constraints? Genetics 1998; 150:767-75. [PMID: 9755207 PMCID: PMC1460343 DOI: 10.1093/genetics/150.2.767] [Citation(s) in RCA: 43] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022] Open
Abstract
Codon usage bias, the preferential use of particular codons within each codon family, is characteristic of synonymous base composition in many species, including Drosophila, yeast, and many bacteria. Preferential usage of particular codons in these species is maintained by natural selection acting largely at the level of translation. In Drosophila, as in bacteria, the rate of synonymous substitution per site is negatively correlated with the degree of codon usage bias, indicating stronger selection on codon usage in genes with high codon bias than in genes with low codon bias. Surprisingly, in these organisms, as well as in mammals, the rate of synonymous substitution is also positively correlated with the rate of nonsynonymous substitution. To investigate this correlation, we carried out a phylogenetic analysis of substitutions in 22 genes between two species of Drosophila, Drosophila pseudoobscura and D. subobscura, in codons that differ by one replacement and one synonymous change. We provide evidence for a relative excess of double substitutions in the same species lineage that cannot be explained by the simultaneous mutation of two adjacent bases. The synonymous changes in these codons also cannot be explained by a shift to a more preferred codon following a replacement substitution. We, therefore, interpret the excess of double codon substitutions within a lineage as being the result of relaxed constraints on both kinds of substitutions in particular codons.
Affiliation(s)
- J M Comeron
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA.
|
31
|
Hughes AL, Verra F. Ancient polymorphism and the hypothesis of a recent bottleneck in the malaria parasite Plasmodium falciparum. Genetics 1998; 150:511-3. [PMID: 9841224 PMCID: PMC1460300 DOI: 10.1093/genetics/150.1.511] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022] Open
|
32
|
Rich SM, Ayala FJ. The recent origin of allelic variation in antigenic determinants of Plasmodium falciparum. Genetics 1998; 150:515-7. [PMID: 9841225 PMCID: PMC1460303 DOI: 10.1093/genetics/150.1.515] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022] Open
|
33
|
Deming MS, Dyer KD, Bankier AT, Piper MB, Dear PH, Rosenberg HF. Ribonuclease k6: chromosomal mapping and divergent rates of evolution within the RNase A gene superfamily. Genome Res 1998; 8:599-607. [PMID: 9647635 DOI: 10.1101/gr.8.6.599] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.4] [Indexed: 11/24/2022]
Abstract
We have localized the gene encoding human RNase k6 to within approximately 120 kb on the long (q) arm of chromosome 14 by HAPPY mapping. With this information, the relative positions of the six human RNase A ribonucleases that have been mapped to this locus can be inferred. To further our understanding of the individual lineages comprising the RNase A superfamily, we have isolated and characterized 10 novel genes orthologous to that encoding human RNase k6 from Great Ape, Old World, and New World monkey genomes. Each gene encodes a complete ORF with no less than 86% amino acid sequence identity to human RNase k6 with the eight cysteines and catalytic histidines (H15 and H123) and lysine (K38) typically observed among members of the RNase A superfamily. Interesting trends include an unusually low number of synonymous substitutions (Ks) observed among the New World monkey RNase k6 genes. When considering nonsilent mutations, RNase k6 is a relatively stable lineage, with a nonsynonymous substitution rate of 0.40 × 10⁻⁹ nonsynonymous substitutions/nonsynonymous site/year (ns/ns/yr). These results stand in contrast to those determined for the primate orthologs of the two closely related ribonucleases, the eosinophil-derived neurotoxin (EDN) and eosinophil cationic protein (ECP), which have incorporated nonsilent mutations at very rapid rates (1.9 × 10⁻⁹ and 2.0 × 10⁻⁹ ns/ns/yr, respectively). The uneventful trends observed for RNase k6 serve to spotlight the unique nature of EDN and ECP and the unusual evolutionary constraints to which these two ribonuclease genes must be responding. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AF037081-AF037090.]
Affiliation(s)
- M S Deming
- Laboratory of Host Defenses, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
|
34
|
Abstract
A highly variable family of related DNA sequences was examined in order to determine the effect of local sequence environment on substitution mutation; 29 sequences from the Brassica self-incompatibility gene family, which possess a high level of nonsynonymous mutations, were aligned and grouped according to their similarity and function. The level and distribution of substitution mutations were calculated. A nonrandom distribution of sequence variation was observed along the sequences. The effects of neighbor biases and structural and thermodynamic measures were then compared in the absence of strong codon conservation. Biases were observed in the rates of substitution of the same base pair in different local sequence environments. The effect of the 5' neighbor was such that nucleotide A or C was associated with more mutations than G or T. There were significant interactions of certain dinucleotides with the frequency of mutation. Sequence-dependent measures of helical stability, intrinsic curvature, components of curvature, and stacking interactions were calculated for each sequence. Decreased helical stability was found to be associated with increased mutation. The compound measure of curvature, calculated according to the "wedge" model, showed little association with mutation. However, the components of increased wedge angle and decreased twist both showed an association with increased mutation. A small effect of A-type DNA stacking was found to be associated with mutated bases.
Affiliation(s)
- G J King
- Breeding and Genetics Department, Horticulture Research International, Wellesbourne, Warwick, United Kingdom
|
35
|
Swanson WJ, Vacquier VD. Extraordinary divergence and positive Darwinian selection in a fusagenic protein coating the acrosomal process of abalone spermatozoa. Proc Natl Acad Sci U S A 1995; 92:4957-61. [PMID: 7761431 PMCID: PMC41826 DOI: 10.1073/pnas.92.11.4957] [Citation(s) in RCA: 102] [Impact Index Per Article: 3.4] [Indexed: 01/27/2023] Open
Abstract
During fertilization in marine invertebrates, fusion between sperm and egg cell membranes occurs at the tip of the sperm acrosomal process. In abalone sperm the acrosomal process is coated with an 18-kDa protein. In situ, this protein has no effect on the egg vitelline envelope, but in vitro it is a potent fusagen of liposomes. Thus, the 18-kDa protein may mediate membrane fusion between the gametes, a step in gamete recognition known to restrict heterospecific fertilization in other species. The cDNA and deduced amino acid sequences of the 18-kDa protein were determined for five species of California abalone. The deduced amino acid sequences exhibit extraordinary divergence; the percent identity varies from 27% to 87%. Analysis of nucleotide substitution shows extremely high frequencies of amino acid-altering substitution compared to silent substitution, demonstrating that positive Darwinian selection promotes the divergence of this protein. However, amino acid replacement is conservative with respect to size and polarity of residue. The data support the developing idea that in free-spawning marine invertebrates, the proteins mediating fertilization may be subjected to intense, and as yet unknown, selective forces. The extraordinary divergence of fertilization proteins may be related to the establishment of barriers to heterospecific fertilization.
Affiliation(s)
- W J Swanson
- Marine Biology Research Division, Scripps Institution of Oceanography, University of California, San Diego, La Jolla 92093-0202, USA
|
36
|
Mouchiroud D, Gautier C, Bernardi G. Frequencies of synonymous substitutions in mammals are gene-specific and correlated with frequencies of nonsynonymous substitutions. J Mol Evol 1995; 40:107-13. [PMID: 7714909 DOI: 10.1007/bf00166602] [Citation(s) in RCA: 66] [Impact Index Per Article: 2.2] [Indexed: 01/26/2023]
Abstract
The frequencies of synonymous substitutions of mammalian genes cover a much wider range than previously thought. We report here that the different frequencies found in homologous genes from a given mammalian pair are correlated with those in the same homologous genes from a different mammalian pair. This indicates that the frequencies of synonymous substitutions are gene-specific (as are the frequencies of nonsynonymous substitutions), or, in other words, that "fast" and "slow" genes in one mammal are fast and slow, respectively, in any other one. Moreover, the frequencies of synonymous substitutions are correlated with the frequencies of nonsynonymous substitution in the same genes.
Affiliation(s)
- D Mouchiroud
- Laboratoire de Biométrie, Génétique et Biologie des Populations, U.R.A. 243, Université Claude Bernard, Villeurbanne, France
|
37
|
Abstract
Chimpanzee, tamarin, and marmoset interleukin-3 (IL-3) genes were cloned, sequenced, and expressed. Western blot analysis demonstrated that functional genes were isolated. IL-3 sequences were compared with those of mouse, rat, rhesus monkey, gibbon, and man. Multiple alignment of the IL-3 coding regions showed that only a few regions had been conserved during mammalian evolution, which are likely associated with functional domains of the IL-3 protein. Substitution rates for the various lineages were calculated and the numbers of synonymous and nonsynonymous substitutions were estimated separately. Distance matrices of the IL-3 coding regions were used to construct phylogenetic trees which revealed large differences in IL-3 evolution rate as well as a more rapid substitution rate for rodents and a rate slowdown during hominoid evolution. Extremes were rhesus monkey IL-3, which accumulated few synonymous substitutions, and gibbon IL-3, which had almost exclusively synonymous substitutions. In rhesus monkey IL-3, nonsynonymous substitutions outnumbered synonymous substitutions, which could not be readily explained by a random process of substitutions. We assume that during evolution of IL-3, the majority of the amino acid replacements and the impaired interspecies functional cross-reactivity originate from selection mechanisms with the most likely selective force being the structure of the heterodimeric IL-3 cell-surface receptor. Insight into IL-3 architecture and structural analysis of the IL-3 receptor are needed to analyze the unusually fast evolution of IL-3 in more detail.
Affiliation(s)
- H Burger
- Department of Medical Oncology, Dr. Daniel den Hoed Cancer Center/Dijkzigt, University Hospital Rotterdam, The Netherlands
|
38
|
Wolfe KH, Sharp PM. Mammalian gene evolution: nucleotide sequence divergence between mouse and rat. J Mol Evol 1993; 37:441-56. [PMID: 8308912 DOI: 10.1007/bf00178874] [Citation(s) in RCA: 147] [Impact Index Per Article: 4.6] [Indexed: 01/29/2023]
Abstract
As a paradigm of mammalian gene evolution, the nature and extent of DNA sequence divergence between homologous protein-coding genes from mouse and rat have been investigated. The data set examined includes 363 genes totalling 411 kilobases, making this by far the largest comparison conducted between a single pair of species. Mouse and rat genes are on average 93.4% identical in nucleotide sequence and 93.9% identical in amino acid sequence. Individual genes vary substantially in the extent of nonsynonymous nucleotide substitution, as expected from protein evolution studies; here the variation is characterized. The extent of synonymous (or silent) substitution also varies considerably among genes, though the coefficient of variation is about four times smaller than for nonsynonymous substitutions. A small number of genes mapped to the X-chromosome have a slower rate of molecular evolution than average, as predicted if molecular evolution is "male-driven." Base composition at silent sites varies from 33% to 95% G+C in different genes; mouse and rat homologues differ on average by only 1.7% in silent-site G+C, but it is shown that this is not necessarily due to any selective constraint on their base composition. Synonymous substitution rates and silent site base composition appear to be related (genes at intermediate G+C have on average higher rates), but the relationship is not as strong as in our earlier analyses. Rates of synonymous and nonsynonymous substitution are correlated, apparently because of an excess of substitutions involving adjacent pairs of nucleotides. Several factors suggest that synonymous codon usage in rodent genes is not subject to selection.
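The percent-identity figures quoted in this abstract (for example, 93.4% nucleotide identity) can be illustrated with a minimal sketch. It assumes two already-aligned sequences of equal length and skips any column containing a gap character; the function name `percent_identity` and the gap-handling rule are illustrative assumptions, not the procedure used by Wolfe and Sharp.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of ungapped aligned columns where the two sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap ('-').
    columns = [
        (x, y)
        for x, y in zip(seq_a.upper(), seq_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Toy alignment: 7 of 8 ungapped columns match.
print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Other conventions exist, such as counting gap columns as mismatches, and they would give different percentages on gapped alignments.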
Affiliation(s)
- K H Wolfe
- Department of Genetics, University of Dublin, Trinity College, Ireland
|
39
|
Eyre-Walker A. The role of DNA replication and isochores in generating mutation and silent substitution rate variance in mammals. Genet Res (Camb) 1992; 60:61-7. [PMID: 1452015 DOI: 10.1017/s0016672300030676] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022] Open
Abstract
It has been suggested that isochores are maintained by mutation biases, and that this leads to variation in the rate of mutation across the genome. A model of DNA replication is presented in which the probabilities of misincorporation and proofreading are affected by the composition and concentration of the free nucleotide pools. The relationship between sequence G+C content and the mutation rate is investigated. It is found that there is very little variation in the mutation rate between sequences of different G+C contents if the total concentration of the free nucleotides remains constant. However, variation in the mutation rate can be arbitrarily large if some mismatches are proofread and the total concentration of free nucleotides varies. Hence the model suggests that the maintenance of isochores by the replication of DNA in free nucleotide pools of biased composition does not lead per se to mutation rate variance. However, it is possible that changes in composition could be accompanied by changes in concentration, thus generating mutation rate variance. Furthermore, there is the possibility that germ-line selection could lead to alterations in the overall free nucleotide concentration through the cell cycle. These findings are discussed with reference to the variance in mammalian silent substitution rates.
Affiliation(s)
- A Eyre-Walker
- Institute of Cell Animal and Population Biology, University of Edinburgh, Great Britain
|
40
|
Abstract
Some evolutionary consequences of different rates and trends in DNA damage and repair are explained. Different types of DNA damaging agents cause nonrandom lesions along the DNA. The type of DNA sequence motifs to be preferentially attacked depends upon the chemical or physical nature of the assaulting agent and the DNA base composition. Higher-order chromatin structure, the nonrandom nucleosome positioning along the DNA, the absence of nucleosomes from the promoter regions of active genes, curved DNA, the presence of sequence-specific binding proteins, and the torsional strain on the DNA induced by an increased transcriptional activity all are expected to affect rates of damage of individual genes. Furthermore, potential Z-DNA, H-DNA, slippage, and cruciform structures in the regulatory region of some genes or in other genomic loci induced by torsional strain on the DNA are more prone to modification by genotoxic agents. A specific actively transcribed gene may be preferentially damaged over nontranscribed genes only in specific cell types that maintain this gene in active chromatin fractions because of (1) its decondensed chromatin structure, (2) torsional strain in its DNA, (3) absence of nucleosomes from its regulatory region, and (4) altered nucleosome structure in its coding sequence due to the presence of modified histones and HMG proteins. The situation in this regard of germ cell lineages is, of course, the only one to intervene in evolution. Most lesions in DNA such as those caused by UV or DNA alkylating agents tend to diminish the GC content of genomes. Thus, DNA sequences not bound by selective constraints, such as pseudogenes, will show an increase in their AT content during evolution as evidenced by experimental observations. On the other hand, transcriptionally active parts may be repaired at rates higher than inactive parts of the genome, and proliferating cells may display higher repair activities than quiescent cells. 
This might arise from a tight coupling of the repair process with both transcription and replication, all these processes taking place on the nuclear matrix. Repair activities differ greatly among species, and there is a good correlation between life span and repair among mammals. It is predicted that genes that are transcriptionally active in germ-cell lineages have a lower mutation rate than bulk DNA, a circumstance that is expected to be reflected in evolution. Exceptions to this rule might be genes containing potential Z-DNA, H-DNA, or cruciform structures in their coding or regulatory regions that appear to be refractory to repair. (ABSTRACT TRUNCATED AT 400 WORDS)
Affiliation(s)
- T Boulikas
- Linus Pauling Institute of Science and Medicine, Palo Alto, CA
|
41
|
Morden CW, Golden SS. Sequence analysis and phylogenetic reconstruction of the genes encoding the large and small subunits of ribulose-1,5-bisphosphate carboxylase/oxygenase from the chlorophyll b-containing prokaryote Prochlorothrix hollandica. J Mol Evol 1991; 32:379-95. [PMID: 1904095 DOI: 10.1007/bf02101278] [Citation(s) in RCA: 63] [Impact Index Per Article: 1.9] [Indexed: 12/29/2022]
Abstract
Prochlorophytes similar to Prochloron sp. and Prochlorothrix hollandica have been suggested as possible progenitors of the plastids of green algae and land plants because they are prokaryotic organisms that possess chlorophyll b (chl b). We have sequenced the Prochlorothrix genes encoding the large and small subunits of ribulose-1,5-bisphosphate carboxylase/oxygenase (rubisco), rbcL and rbcS, for comparison with those of other taxa to assess the phylogenetic relationship of this species. Length differences in the large subunit polypeptide among all sequences compared occur primarily at the amino terminus, where numerous short gaps are present, and at the carboxy terminus, where sequences of Alcaligenes eutrophus and non-chlorophyll b algae are several amino acids longer. Some domains in the small subunit polypeptide are conserved among all sequences analyzed, yet in other domains the sequences of different phylogenetic groups exhibit specific structural characteristics. Phylogenetic analyses of rbcL and rbcS using Wagner parsimony analysis of deduced amino acid sequences indicate that Prochlorothrix is more closely related to cyanobacteria than to the green plastid lineage. The molecular phylogenies suggest that plastids originated by at least three separate primary endosymbiotic events, i.e., once each leading to green algae and land plants, to red algae, and to Cyanophora paradoxa. The Prochlorothrix rubisco genes show a strong GC bias, with 68% of the third codon positions being G or C. Factors that may affect the GC content of different genomes are discussed.
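The third-codon-position GC statistic mentioned at the end of this abstract (68% G or C) is a simple count, sketched below. The sketch assumes an in-frame coding sequence that starts at the first codon position and ignores any trailing partial codon; the function name `gc3_fraction` and the toy sequence are hypothetical and not taken from the paper.

```python
def gc3_fraction(coding_seq: str) -> float:
    """Fraction of third codon positions occupied by G or C."""
    seq = coding_seq.upper()
    # Drop a trailing partial codon, if any.
    usable_length = len(seq) - len(seq) % 3
    # Third positions sit at indices 2, 5, 8, ...
    third_positions = seq[2:usable_length:3]
    if not third_positions:
        raise ValueError("sequence is shorter than one codon")
    gc_count = sum(base in "GC" for base in third_positions)
    return gc_count / len(third_positions)


# Toy sequence with codons ATG GCC AAA TTC: third positions are G, C, A, C.
print(gc3_fraction("ATGGCCAAATTC"))
```

Multiplying the result by 100 gives a percentage comparable to the figure reported in the abstract.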
Affiliation(s)
- C W Morden
- Department of Biology, Texas A&M University, College Station 77843
|
42
|
Abstract
Experimental studies have shown that the fidelity of DNA replication can be affected by the concentrations of free deoxyribonucleotides present in the cell. Replication of mammalian chromosomes is achieved using pools of newly-synthesized deoxyribonucleotides which fluctuate during the cell cycle. Since regions of mammalian chromosomes are replicated sequentially, there is the potential for differences among mammalian loci in both the relative and absolute frequencies of the various transitional and transversional mutations which may occur. Where these mutations are effectively neutral, at silent sites in genes and in non-coding sequences, this may result in different rates of evolution and in different base compositions, as have been observed in data from mammalian genes. A simple model of the DNA replication process is developed to describe how the mutation rate could be affected by the G + C contents of the deoxyribonucleotide pools and of the replicating DNA. Mutation rates are predicted to vary from locus to locus; only in the particular case of identical G + C contents in the DNA locus and the deoxyribonucleotide pools, and no proofreading, will the mutation rate be uniform over all loci.
Affiliation(s)
- K H Wolfe
- Department of Genetics, University of Dublin, Trinity College, Republic of Ireland
|
43
|
Martínez-Cruzado JC. Evolution of the autosomal chorion cluster in Drosophila. IV. The Hawaiian Drosophila: rapid protein evolution and constancy in the rate of DNA divergence. J Mol Evol 1990; 31:402-23. [PMID: 2124630 DOI: 10.1007/bf02106055] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.3] [Indexed: 12/30/2022]
Abstract
Autosomal chorion genes s18, s15, and s19 are shown to diverge at extremely rapid rates in closely related taxa of Hawaiian Drosophila. Their nucleotide divergence rates are at least as fast as those of intergenic regions that are known to evolve more extensively between distantly related species. Their amino acid divergence rates are the fastest known to date. There are two nucleotide replacement substitutions for every synonymous one. The molecular basis for observed length and substitution mutations is analyzed. Length mutations are strongly associated with direct repeats in general, and with tandem repeats in particular, whereas the rate for an average transition is twice that for an average transversion. The DNA sequence of the cluster was used to construct a phylogenetic tree for five taxa of the Hawaiian picture-winged species group of Drosophila. Assignment of observed base substitutions occurring in various branches of the tree reveals an excess of would-be homoplasies in a centrally localized 1.8-kb segment containing the s15 gene. This observation may be a reflection of ancestral excess polymorphisms in the segment. The chorion cluster appears to evolve at a constant rate regardless of whether the central 1.8-kb segment is included or not in the analysis. Assuming that the time of divergence of Drosophila grimshawi and the planitibia subgroup coincides with the emergence of the island of Kauai, the overall rate of base substitution in the cluster is estimated to be 0.8% per million years, whereas synonymous sites are substituted at a rate of 1.2% per million years.
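The transition/transversion comparison in this abstract rests on classifying each observed base difference. A minimal sketch of that classification follows, assuming two aligned sequences of equal length and skipping columns with gaps or ambiguous bases; the function name `count_ts_tv` and the toy sequences are illustrative assumptions rather than the author's method.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical columns, gaps, and ambiguous symbols.
        if x == y or x not in "ACGT" or y not in "ACGT":
            continue
        pair = {x, y}
        # A transition stays within purines (A<->G) or pyrimidines (C<->T).
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy alignment: A>G and C>T are transitions, G>C and T>A are transversions.
print(count_ts_tv("AAGCT", "GACTA"))
```

Each base has one possible transition and two possible transversions, so a per-type comparison such as the "average transition" versus "average transversion" rates in the abstract would need these raw counts normalized accordingly; that step is left out here.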
Affiliation(s)
- J C Martínez-Cruzado
- Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts 02138
|
44
|
Mouchiroud D, Gautier C. Codon usage changes and sequence dissimilarity between human and rat. J Mol Evol 1990; 31:81-91. [PMID: 2120453 DOI: 10.1007/bf02109477] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.5] [Indexed: 12/30/2022]
Abstract
This paper reports on the relationship between the number of silent differences and the codon usage changes in the lineages leading to human and rat. Examination of 102 pairs of homologous genes gives rise to four main conclusions: (1) We have previously demonstrated the existence of a codon usage change (called the minor shift) between human and rat; this was confirmed here with a larger sample. For genes with extreme C & G frequencies, the C & G level in the third codon position is less extreme in rat than in human. (2) Protein similarity and percentage of positive differences are the two main factors that discriminate homologous genes when characterized by differences between rat and human. By definition, positive differences result from silent changes between A or T and C or G with a direction implying a C & G content variation in the same direction as the overall gene variation. (3) For genes showing both codon usage change and low protein similarity, a majority of amino acid replacements contributes to C & G level variation in positions I and II in the same direction as the variation in position III. This is thus a new example of protein evolution due to constraints acting at the DNA level. (4) In heavy isochores (high C & G content) no direct correlation exists between codon usage change (measured by the dissymmetry of differences) and silent dissimilarity. In light isochores the opposite situation is observed: modification of codon usage is associated with a high synonymous dissimilarity. This result shows that, in some cases, modification of constraints acting at the DNA level could accelerate divergence between genomes.
Affiliation(s)
- D Mouchiroud
- Laboratoire de Biométrie, Génétique et Biologie des Populations (CNRS U.R.A. 243), Université Claude Bernard, Villeurbanne, France
|
45
|
Evolution of DNA Sequence Contributions of Mutational Bias and Selection to the Origin of Chromosomal Compartments. 1990. [DOI: 10.1007/978-3-642-75599-6_1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.3]
|
46
|
Graur D, Shuali Y, Li WH. Deletions in processed pseudogenes accumulate faster in rodents than in humans. J Mol Evol 1989; 28:279-85. [PMID: 2499684 DOI: 10.1007/bf02103423] [Citation(s) in RCA: 105] [Impact Index Per Article: 2.9] [Indexed: 01/01/2023]
Abstract
The relative rates of point nucleotide substitution and accumulation of gap events (deletions and insertions) were calculated for 22 human and 30 rodent processed pseudogenes. Deletion events not only outnumbered insertions (the ratio being 7:1 and 3:1 for human and rodent pseudogenes, respectively), but also the total length of deletions was greater than that of insertions. Compared with their functional homologs, human processed pseudogenes were found to be shorter by about 1.2%, and rodent pseudogenes by about 2.3%. DNA loss from processed pseudogenes through deletion is estimated to be at least seven times faster in rodents than in humans. In comparison with the rate of point substitutions, the abridgment of pseudogenes during evolutionary times is a slow process that probably does not retard the rate of growth of the genome due to the proliferation of processed pseudogenes.
Affiliation(s)
- D Graur
- Department of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
|