1
|
Cornejo-Páramo P, Petrova V, Zhang X, Young RS, Wong ES. Emergence of enhancers at late DNA replicating regions. Nat Commun 2024; 15:3451. [PMID: 38658544 PMCID: PMC11043393 DOI: 10.1038/s41467-024-47391-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 03/26/2024] [Indexed: 04/26/2024] Open
Abstract
Enhancers are fast-evolving genomic sequences that control spatiotemporal gene expression patterns. By examining enhancer turnover across mammalian species and in multiple tissue types, we uncover a relationship between the emergence of enhancers and genome organization as a function of germline DNA replication time. While enhancers are most abundant in euchromatic regions, enhancers emerge almost twice as often in late compared to early germline replicating regions, independent of transposable elements. Using a deep learning sequence model, we demonstrate that new enhancers are enriched for mutations that alter transcription factor (TF) binding. Recently evolved enhancers appear to be mostly neutrally evolving and enriched in eQTLs. They also show more tissue specificity than conserved enhancers, and the TFs that bind to these elements, as inferred by binding sequences, also show increased tissue-specific gene expression. We find a similar relationship with DNA replication time in cancer, suggesting that these observations may be time-invariant principles of genome evolution. Our work underscores that genome organization has a profound impact in shaping mammalian gene regulation.
Affiliation(s)
- Paola Cornejo-Páramo
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, Sydney, NSW, Australia
- Veronika Petrova
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, Sydney, NSW, Australia
- Xuan Zhang
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
- Robert S Young
  - Usher Institute, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, United Kingdom
  - Zhejiang University - University of Edinburgh Institute, Zhejiang University, 718 East Haizhou Road, 314400, Haining, PR China
- Emily S Wong
  - Victor Chang Cardiac Research Institute, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, Sydney, NSW, Australia
|
2
|
Lee Y, Cho CH, Noh C, Yang JH, Park SI, Lee YM, West JA, Bhattacharya D, Jo K, Yoon HS. Origin of minicircular mitochondrial genomes in red algae. Nat Commun 2023; 14:3363. [PMID: 37291154 PMCID: PMC10250338 DOI: 10.1038/s41467-023-39084-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 05/30/2023] [Indexed: 06/10/2023] Open
Abstract
Eukaryotic organelle genomes are generally of conserved size and gene content within phylogenetic groups. However, significant variation in genome structure may occur. Here, we report that the Stylonematophyceae red algae contain multipartite circular mitochondrial genomes (i.e., minicircles) which encode one or two genes bounded by a specific cassette and a conserved constant region. These minicircles are visualized using fluorescence microscopy and scanning electron microscopy, confirming their circularity. Mitochondrial gene sets are reduced in these highly divergent mitogenomes. A newly generated chromosome-level nuclear genome assembly of Rhodosorus marinus reveals that most mitochondrial ribosomal subunit genes have been transferred to the nuclear genome. Hetero-concatemers that resulted from recombination between minicircles, together with a unique inventory of genes responsible for mitochondrial genome stability, may explain how the transition from a typical mitochondrial genome to minicircles occurs. Our results offer insight into minicircular organelle genome formation and highlight an extreme case of mitochondrial gene inventory reduction.
Affiliation(s)
- Yongsung Lee
  - Department of Biological Sciences, Sungkyunkwan University, Suwon, 16419, Korea
- Chung Hyun Cho
  - Department of Biological Sciences, Sungkyunkwan University, Suwon, 16419, Korea
- Chanyoung Noh
  - Department of Chemistry, Sogang University, Seoul, 04107, Korea
- Ji Hyun Yang
  - Department of Biological Sciences, Sungkyunkwan University, Suwon, 16419, Korea
- Seung In Park
  - Department of Biological Sciences, Sungkyunkwan University, Suwon, 16419, Korea
- Yu Min Lee
  - Department of Biological Sciences, Sungkyunkwan University, Suwon, 16419, Korea
- John A West
  - School of Biosciences 2, University of Melbourne, Parkville, Victoria, 3010, Australia
- Debashish Bhattacharya
  - Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, 08901, USA
- Kyubong Jo
  - Department of Chemistry, Sogang University, Seoul, 04107, Korea
- Hwan Su Yoon
  - Department of Biological Sciences, Sungkyunkwan University, Suwon, 16419, Korea
|
3
|
Benisty H, Hernandez-Alias X, Weber M, Anglada-Girotto M, Mantica F, Radusky L, Senger G, Calvet F, Weghorn D, Irimia M, Schaefer MH, Serrano L. Genes enriched in A/T-ending codons are co-regulated and conserved across mammals. Cell Syst 2023; 14:312-323.e3. [PMID: 36889307 DOI: 10.1016/j.cels.2023.02.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/25/2022] [Revised: 07/11/2022] [Accepted: 02/09/2023] [Indexed: 03/09/2023]
Abstract
Codon usage influences gene expression distinctly depending on the cell context. Yet, the importance of codon bias in the simultaneous turnover of specific groups of protein-coding genes remains to be investigated. Here, we find that genes enriched in A/T-ending codons are expressed more coordinately in general and across tissues and development than those enriched in G/C-ending codons. tRNA abundance measurements indicate that this coordination is linked to the expression changes of tRNA isoacceptors reading A/T-ending codons. Genes with similar codon composition are more likely to be part of the same protein complex, especially for genes with A/T-ending codons. The codon preferences of genes with A/T-ending codons are conserved among mammals and other vertebrates. We suggest that this orchestration contributes to tissue-specific and ontogenetic-specific expression, which can facilitate, for instance, timely protein complex formation.
Affiliation(s)
- Hannah Benisty
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Xavier Hernandez-Alias
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Marc Weber
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Miquel Anglada-Girotto
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Federica Mantica
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Leandro Radusky
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Gökçe Senger
  - Department of Experimental Oncology, European Institute of Oncology (IEO) IRCCS, Via Adamello 16, Milan 20139, Italy
- Ferriol Calvet
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Donate Weghorn
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Manuel Irimia
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain; ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
- Martin H Schaefer
  - Department of Experimental Oncology, European Institute of Oncology (IEO) IRCCS, Via Adamello 16, Milan 20139, Italy
- Luis Serrano
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain; Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain; ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
|
4
|
Casella AM, Colantuoni C, Ament SA. Identifying enhancer properties associated with genetic risk for complex traits using regulome-wide association studies. PLoS Comput Biol 2022; 18:e1010430. [PMID: 36070311 PMCID: PMC9484640 DOI: 10.1371/journal.pcbi.1010430] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/09/2021] [Revised: 09/19/2022] [Accepted: 07/23/2022] [Indexed: 11/17/2022] Open
Abstract
Genetic risk for complex traits is strongly enriched in non-coding genomic regions involved in gene regulation, especially enhancers. However, we lack adequate tools to connect the characteristics of these disruptions to genetic risk. Here, we propose RWAS (Regulome Wide Association Study), a new application of the MAGMA software package to identify the characteristics of enhancers that contribute to genetic risk for disease. RWAS involves three steps: (i) assign genotyped SNPs to cell type- or tissue-specific regulatory features (e.g., enhancers); (ii) test associations of each regulatory feature with a trait of interest for which genome-wide association study (GWAS) summary statistics are available; (iii) perform enhancer-set enrichment analyses to identify quantitative or categorical features of regulatory elements that are associated with the trait. These steps are implemented as a novel application of MAGMA, a tool originally developed for gene-based GWAS analyses. Applying RWAS to interrogate genetic risk for schizophrenia, we discovered a class of risk-associated AT-rich enhancers that are active in the developing brain and harbor binding sites for multiple transcription factors with neurodevelopmental functions. RWAS utilizes open-source software, and we provide a comprehensive collection of annotations for tissue-specific enhancer locations and features, including their evolutionary conservation, AT content, and co-localization with binding sites for hundreds of TFs. RWAS will enable researchers to characterize properties of regulatory elements associated with any trait of interest for which GWAS summary statistics are available. Enhancers are regulatory regions that influence gene expression via the binding of transcription factors. Risk for many heritable diseases is enriched in regulatory regions, including enhancers. 
In this study, we introduce a novel application of the MAGMA software tool that enables testing for associations between enhancer attributes and risk, and we use this method to determine the enhancer characteristics that are associated with risk for schizophrenia. We found that enhancers associated with schizophrenia risk are both evolutionarily conserved and in physical contact with mutation-intolerant genes, many of which have neurodevelopmental functions. Risk-associated enhancers are also AT-rich and contain binding sites for neurodevelopmental transcription factors.
Affiliation(s)
- Alex M. Casella
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
  - Medical Scientist Training Program, UMSOM, Baltimore, Maryland, United States of America
- Carlo Colantuoni
  - Departments of Neurology and Neuroscience, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Seth A. Ament
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
  - Department of Psychiatry, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
|
5
|
Comparative Genomic Analysis of Statistically Significant Genomic Islands of Helicobacter pylori strains for better understanding the disease prognosis. Biosci Rep 2022; 42:230988. [PMID: 35258077 PMCID: PMC8935386 DOI: 10.1042/bsr20212084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Revised: 02/25/2022] [Accepted: 03/07/2022] [Indexed: 11/17/2022] Open
Abstract
Bacterial virulence factors are often located in genomic islands (GIs). Helicobacter pylori, a highly diverse organism, is reported to be associated with several gastrointestinal diseases, such as gastritis, gastric cancer, peptic ulcer, and duodenal ulcer. A novel similarity score-based comparative analysis of the GIs of fifty H. pylori strains gave a clear picture of the various factors that promote disease progression. Two putative pathogenic GIs were identified in some of the H. pylori strains. One GI, carrying a putative labile enterotoxin and other dynamin-like proteins (DLPs), is predicted to increase toxin release through membrane vesicle formation. Another island contains virulence-associated protein D (vapD), a component of a type II toxin-antitoxin (TA) system, which leads to enhanced severity of H. pylori infection. Besides the well-known virulence factors CagA and VacA, several GIs were identified that appear to have a direct or indirect impact on H. pylori clinical outcomes. One such GI, containing lipopolysaccharide (LPS) biosynthesis genes, was found to be directly connected with disease development through inhibition of the immune response. Another GI, containing a collagenase, worsens ulcers by slowing the healing process. A GI consisting of the fliD operon was found to be connected to flagellar assembly and biofilm production. By residing in biofilms, bacteria can evade antibiotic therapy, resulting in chronic infection. Along with the well-studied CagA and VacA virulence genes, it is equally important to study these newly identified virulence factors to better understand H. pylori-induced disease prognosis.
|
6
|
Mordstein C, Savisaar R, Young RS, Bazile J, Talmane L, Luft J, Liss M, Taylor MS, Hurst LD, Kudla G. Codon Usage and Splicing Jointly Influence mRNA Localization. Cell Syst 2020; 10:351-362.e8. [PMID: 32275854 PMCID: PMC7181179 DOI: 10.1016/j.cels.2020.03.001] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 02/04/2019] [Revised: 12/19/2019] [Accepted: 03/05/2020] [Indexed: 12/11/2022]
Abstract
In the human genome, most genes undergo splicing, and patterns of codon usage are splicing dependent: guanine and cytosine (GC) content is the highest within single-exon genes and within first exons of multi-exon genes. However, the effects of codon usage on gene expression are typically characterized in unspliced model genes. Here, we measured the effects of splicing on expression in a panel of synonymous reporter genes that varied in nucleotide composition. We found that high GC content increased protein yield, mRNA yield, cytoplasmic mRNA localization, and translation of unspliced reporters. Splicing did not affect the expression of GC-rich variants. However, splicing promoted the expression of AT-rich variants by increasing their steady-state protein and mRNA levels, in part through promoting cytoplasmic localization of mRNA. We propose that splicing promotes the nuclear export of AU-rich mRNAs and that codon- and splicing-dependent effects on expression are under evolutionary pressure in the human genome.
Affiliation(s)
- Christine Mordstein
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK; Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Rosina Savisaar
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK; Instituto de Medicina Molecular, João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, Lisboa, Portugal
- Robert S Young
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK; Centre for Global Health Research, Usher Institute, The University of Edinburgh, Edinburgh, UK
- Jeanne Bazile
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Lana Talmane
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Juliet Luft
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Michael Liss
  - Thermo Fisher Scientific, GENEART GmbH, Regensburg, Germany
- Martin S Taylor
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
- Laurence D Hurst
  - Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, UK
- Grzegorz Kudla
  - MRC Human Genetics Unit, Institute for Genetics and Molecular Medicine, The University of Edinburgh, Edinburgh, UK
|
7
|
Arhondakis S, Milanesi M, Castrignanò T, Gioiosa S, Valentini A, Chillemi G. Evidence of distinct gene functional patterns in GC-poor and GC-rich isochores in Bos taurus. Anim Genet 2020; 51:358-368. [PMID: 32069522 DOI: 10.1111/age.12917] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Accepted: 01/20/2020] [Indexed: 01/10/2023]
Abstract
Vertebrate genomes are mosaics of megabase-size DNA segments with a fairly homogeneous base composition, called isochores. They are divided into five families characterized by different guanine-cytosine (GC) levels and linked to several functional and structural properties. The increased availability of fully sequenced genomes allows the investigation of isochores in several species, assessing their level of conservation across vertebrate genomes. In this work, we characterized the isochores in Bos taurus using the ARS-UCD1.2 genome version. The comparison of our results with the well-studied human isochores and those of other mammals revealed a large conservation in isochore families, in number, average GC levels and gene density. Exceptions to the established increase in gene density with the increase in isochores (GC%) were observed for the following gene biotypes: tRNA, small nuclear RNA, small nucleolar RNA and pseudogenes that have their maximum number in H2 and H1 isochores. Subsequently, we assessed the ontology of all gene biotypes looking for functional classes that are statistically over- or under-represented in each isochore. Receptor activity and sensory perception pathways were significantly over-represented in L1 and L2 (GC-poor) isochores. This was also validated for the horse genome. Our analysis of housekeeping genes confirmed a preferential localization in GC-rich isochores, as reported in other species. Finally, we assessed the SNP distribution of a bovine high-density SNP chip across the isochores, finding a higher density in the GC-rich families, reflecting a potential bias in the chip, widely used for genetic selection and biodiversity studies.
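As a rough illustration of how a genomic segment might be assigned to one of the five isochore families by GC level, the sketch below uses the GC% boundaries conventionally quoted for human isochores (37, 41, 46 and 53%). Those boundaries, and the helper names `gc_percent` and `isochore_family`, are assumptions for illustration only; the abstract does not state the thresholds or the segmentation procedure the authors used.

```python
# Assumed family boundaries (upper GC% limit, family name); not taken from the abstract.
FAMILY_UPPER_BOUNDS = [(37.0, "L1"), (41.0, "L2"), (46.0, "H1"), (53.0, "H2")]


def gc_percent(sequence: str) -> float:
    """Return the GC percentage of a DNA sequence, ignoring non-ACGT symbols such as N."""
    seq = sequence.upper()
    gc = sum(seq.count(base) for base in "GC")
    acgt = gc + sum(seq.count(base) for base in "AT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / acgt


def isochore_family(gc: float) -> str:
    """Map a GC percentage to an isochore family label under the assumed boundaries."""
    for upper, name in FAMILY_UPPER_BOUNDS:
        if gc < upper:
            return name
    return "H3"  # anything at or above the last boundary


if __name__ == "__main__":
    window = "GGCCAATTNN"  # toy window; real isochores span hundreds of kilobases or more
    level = gc_percent(window)
    print(level, isochore_family(level))
```

In practice the GC level would be computed over fixed-size genomic windows before any family assignment; the toy string here only demonstrates the mapping.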
Affiliation(s)
- S Arhondakis
  - Bioinformatics and Computational Science (BioCoS), Boniali 11-19, Chania, 73134, Crete, Greece
- M Milanesi
  - Department of Support, Production and Animal Health, School of Veterinary Medicine, São Paulo State University, 16050-680 R. Clóvis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil; International Atomic Energy Agency Collaborating Centre on Animal Genomics and Bioinformatics, 16050-680 R. Clóvis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
- T Castrignanò
  - SCAI - Super Computing Applications and Innovation Department, CINECA, Rome, Italy
- S Gioiosa
  - SCAI - Super Computing Applications and Innovation Department, CINECA, Rome, Italy
- A Valentini
  - Department for Innovation in Biological, Agro-food and Forest Systems, DIBAF, University of Tuscia, via S. Camillo de Lellis s.n.c, 01100, Viterbo, Italy
- G Chillemi
  - Department for Innovation in Biological, Agro-food and Forest Systems, DIBAF, University of Tuscia, via S. Camillo de Lellis s.n.c, 01100, Viterbo, Italy; Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, IBIOM, CNR, Bari, Italy
|
8
|
Cardiello JF, Sanchez GJ, Allen MA, Dowell RD. Lessons from eRNAs: understanding transcriptional regulation through the lens of nascent RNAs. Transcription 2019; 11:3-18. [PMID: 31856658 DOI: 10.1080/21541264.2019.1704128] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/21/2022] Open
Abstract
Nascent transcription assays, such as global run-on sequencing (GRO-seq) and precision run-on sequencing (PRO-seq), have uncovered a myriad of unstable RNAs being actively produced from numerous sites genome-wide. These transcripts provide a more complete and immediate picture of the impact of regulatory events. Transcription factors recruit RNA polymerase II, effectively initiating the process of transcription; repressors inhibit polymerase recruitment. Efficiency of recruitment is dictated by sequence elements in and around the RNA polymerase loading zone. A combination of sequence elements and RNA binding proteins subsequently influence the ultimate stability of the resulting transcript. Some of these transcripts are capable of providing feedback on the process, influencing subsequent transcription. By monitoring RNA polymerase activity, nascent assays provide insights into every step of the regulated process of transcription.
Affiliation(s)
- Gilson J Sanchez
  - BioFrontiers Institute, University of Colorado, Boulder, CO, USA
- Mary A Allen
  - BioFrontiers Institute, University of Colorado, Boulder, CO, USA
- Robin D Dowell
  - BioFrontiers Institute, University of Colorado, Boulder, CO, USA; Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
|
9
|
Quan H, Yang Y, Liu S, Tian H, Xue Y, Gao YQ. Chromatin structure changes during various processes from a DNA sequence view. Curr Opin Struct Biol 2019; 62:1-8. [PMID: 31765966 DOI: 10.1016/j.sbi.2019.10.010] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/29/2019] [Revised: 10/14/2019] [Accepted: 10/28/2019] [Indexed: 12/19/2022]
Abstract
Chromatin mainly consists of protein and DNA, and the sequence information of DNA contributes to controlling the spatial structure of chromatin. High-precision genome-wide chromosome contact patterns uncover fine structural properties and are conducive to exploring the mechanisms underlying the establishment of chromatin structure and the realization of its function. In this short review, we describe changes of chromatin structure during various biological processes from a DNA sequence view, with an increase of the overall domain segregation from birth to senescence and the establishment of cross-domain contacts related to cell identity. Segregation patterns vary with cell stage and genomic distance. Possible effects of the cell cycle, temperature, nuclear lamina and nucleolus on chromatin structure are also discussed. Finally, the important roles of transcription factors and other proteins in proper chromatin organization are discussed.
Affiliation(s)
- Hui Quan
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Ying Yang
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Sirui Liu
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Hao Tian
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yue Xue
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yi Qin Gao
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China; Biomedical Pioneering Innovation Center (BIOPIC), School of Life Sciences, Peking University, Beijing 100871, China
|
10
|
Payne BL, Alvarez-Ponce D. Codon Usage Differences among Genes Expressed in Different Tissues of Drosophila melanogaster. Genome Biol Evol 2019; 11:1054-1065. [PMID: 30859203 PMCID: PMC6456009 DOI: 10.1093/gbe/evz051] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Accepted: 03/08/2019] [Indexed: 12/22/2022] Open
Abstract
Codon usage patterns are affected by both mutational biases and translational selection. The frequency at which each codon is used in the genome is directly linked to the cellular concentrations of their corresponding tRNAs. Transfer RNA abundances—as well as the abundances of other potentially relevant factors, such as RNA-binding proteins—may vary across different tissues, making it possible that genes expressed in different tissues are subject to different translational selection regimes, and thus differ in their patterns of codon usage. These differences, however, are poorly understood, having been studied only in Arabidopsis, rice and human, with controversial results in human. Drosophila melanogaster is a suitable model organism to study tissue-specific codon adaptation given its large effective population size. Here, we compare 2,046 genes, each expressed specifically in one tissue of D. melanogaster. We show that genes expressed in different tissues exhibit significant differences in their patterns of codon usage, and that these differences are only partially due to differences in GC content, expression levels, or protein lengths. Remarkably, these differences are stronger when analyses are restricted to highly expressed genes. Our results strongly suggest that genes expressed in different tissues are subject to different regimes of translational selection.
|
11
|
Uddin A, Paul N, Chakraborty S. The codon usage pattern of genes involved in ovarian cancer. Ann N Y Acad Sci 2019; 1440:67-78. [PMID: 30843242 DOI: 10.1111/nyas.14019] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 09/10/2018] [Revised: 01/04/2019] [Accepted: 01/14/2019] [Indexed: 12/20/2022]
Abstract
In this study, we analyzed the compositional dynamics and codon usage pattern of genes involved in ovarian cancer (OC) using a computational method. Mutations in specific genes are associated with OC, and some genes are risk factors for progression of OC, but no work has been reported yet on the codon usage pattern of genes involved in OC. Nucleotide composition analysis of OC-related genes suggested that the overall GC content was higher than AT content; that is, the genes were GC rich. The improved effective number of codons indicated that the overall extent of codon usage bias of genes involved in OC was low. The codons AGC, CTG, ATC, ACC, GTG, and GCC were overrepresented, while the codons TCG, TTA, CTA, CCG, CAA, CGT, ATA, ACG, GTA, GTT, GCG, and GGT were underrepresented in the genes. Correspondence analysis suggested that the codon usage pattern was different in different genes. A highly significant correlation was observed between GC12 and GC3 (r = 0.587, P < 0.01) of genes, suggesting that directional mutation affected the three codon positions. Our report on the codon usage pattern of genes involved in OC includes a new perspective for elucidating the mechanisms of biased usage of synonymous codons, as well as providing useful clues for molecular genetic engineering.
Affiliation(s)
- Arif Uddin
  - Department of Zoology, Moinul Hoque Choudhury Memorial Science College, Assam, India
- Nirmal Paul
  - Department of Biotechnology, Assam University, Assam, India
|
12
|
Wei K, Zhang T, Ma L. Divergent and convergent evolution of housekeeping genes in human-pig lineage. PeerJ 2018; 6:e4840. [PMID: 29844985 PMCID: PMC5971102 DOI: 10.7717/peerj.4840] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/04/2017] [Accepted: 05/03/2018] [Indexed: 11/27/2022] Open
Abstract
Housekeeping genes are ubiquitously expressed and maintain basic cellular functions across tissue/cell type conditions. The present study aimed to develop a set of pig housekeeping genes and compare the structure, evolution and function of housekeeping genes in the human–pig lineage. By using RNA sequencing data, we identified 3,136 pig housekeeping genes. Compared with human housekeeping genes, we found that pig housekeeping genes were longer and subjected to slightly weaker purifying selection pressure and faster neutral evolution. Common housekeeping genes, shared by the two species, achieve stronger purifying selection than species-specific genes. However, pig- and human-specific housekeeping genes have similar functions. Some species-specific housekeeping genes have evolved independently to form similar protein active sites or structure, such as the classical catalytic serine–histidine–aspartate triad, implying that they have converged for maintaining the basic cellular function, which allows them to adapt to the environment. Human and pig housekeeping genes have varied structures and gene lists, but they have converged to maintain basic cellular functions essential for the existence of a cell, regardless of its specific role in the species. The results of our study shed light on the evolutionary dynamics of housekeeping genes.
Affiliation(s)
- Kai Wei
  - College of Life Science, Shihezi University, Shihezi, Xinjiang, China
- Tingting Zhang
  - College of Life Science, Shihezi University, Shihezi, Xinjiang, China
- Lei Ma
  - College of Life Science, Shihezi University, Shihezi, Xinjiang, China
|
13
|
Abstract
The prevalence of purifying selection in nature suggests that larger organisms bear a higher number of slightly deleterious mutations because of smaller populations and therefore weaker selection. In this work, a genome-wide analysis found that purifying selection is redistributed in favor of information genes, pathways and processes in primates compared with treeshrew and rodents. The genes that are more favored in primates belong mainly to regulation of gene expression and development; those more favored in treeshrew and rodents belong to metabolism, transport, energetics, reproduction and olfaction. The former occur predominantly in the nucleus, the latter in the cytoplasm and membranes. Thus, although purifying selection is on average weaker in primates, it is more strongly concentrated on the "information technology" of life (regulation of gene expression and development). Increased accuracy of information processes probably allows primates to escape "error catastrophes" in spite of more complex organization, larger body size and greater longevity.
|
14
|
Kryuchkova-Mostacci N, Robinson-Rechavi M. A benchmark of gene expression tissue-specificity metrics. Brief Bioinform 2017; 18:205-214. [PMID: 26891983 PMCID: PMC5444245 DOI: 10.1093/bib/bbw008] [Citation(s) in RCA: 161] [Impact Index Per Article: 23.0] [Received: 12/11/2015] [Indexed: 01/06/2023] Open
Abstract
One of the major properties of genes is their expression pattern. Notably, genes are often classified as tissue specific or housekeeping. This property is of interest to molecular evolution as an explanatory factor of, e.g. evolutionary rate, as well as a functional feature which may in itself evolve. While many different methods of measuring tissue specificity have been proposed and used for such studies, there has been no comparison or benchmarking of these methods to our knowledge, and little justification of their use. In this study, we compare nine measures of tissue specificity. Most methods were established for ESTs and microarrays, and several were later adapted to RNA-seq. We analyse their capacity to distinguish gene categories, their robustness to the choice and number of tissues used and their capture of evolutionary conservation signal.
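As an illustration of what a tissue-specificity measure looks like in practice, the following is a minimal sketch of the tau index, a commonly used measure of this kind. The abstract does not name the nine measures compared, so the choice of tau here, the function name `tau_index` and the toy expression values are assumptions made for illustration, not a reproduction of the authors' pipeline.

```python
from typing import Sequence


def tau_index(expression: Sequence[float]) -> float:
    """Tau tissue-specificity index for one gene.

    expression: non-negative expression values, one per tissue.
    Returns 0.0 for perfectly uniform expression and 1.0 when the gene
    is expressed in a single tissue only.
    """
    n = len(expression)
    if n < 2:
        raise ValueError("tau needs at least two tissues")
    if any(x < 0 for x in expression):
        raise ValueError("expression values must be non-negative")
    peak = max(expression)
    if peak == 0:
        # Gene not expressed anywhere: tau is undefined; this sketch
        # returns 0.0 as an arbitrary convention.
        return 0.0
    # Sum of (1 - normalised expression) over tissues, scaled by n - 1.
    return sum(1.0 - x / peak for x in expression) / (n - 1)


# Toy values, invented for illustration only.
housekeeping_like = [10.0, 10.0, 10.0, 10.0]
tissue_specific_like = [0.0, 0.0, 12.0, 0.0]
print(tau_index(housekeeping_like))     # 0.0
print(tau_index(tissue_specific_like))  # 1.0
```

Values near 0 point to broad, housekeeping-like expression and values near 1 to expression restricted to one tissue. In real use the input would typically be normalised (and often log-transformed) expression, which is a preprocessing choice left out of this sketch.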
Affiliation(s)
- Nadezda Kryuchkova-Mostacci
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Marc Robinson-Rechavi
  - Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
15
|
DNA helix: the importance of being AT-rich. Mamm Genome 2017; 28:455-464. [PMID: 28836096 DOI: 10.1007/s00335-017-9713-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 05/25/2017] [Accepted: 08/12/2017] [Indexed: 01/02/2023]
Abstract
AT-rich DNA is mostly associated with condensed chromatin, whereas GC-rich sequence is preferentially located in dispersed chromatin. AT-rich genes tend to be tissue-specific (silenced in most tissues), while GC-rich genes tend to be housekeeping (expressed in many tissues). This paper reports another important property of DNA base composition, which can affect the repertoire of genes with high AT content. GC-rich sequence is more liable to mutation. We found that the Spearman correlation between human gene GC content and mutation probability is above 0.9. A change of base composition even in synonymous sites affects the mutation probability of nonsynonymous sites and thus of the encoded proteins. There is a unique type of housekeeping genes that are especially unsafe when prone to mutation. Natural selection, which usually removes deleterious mutations, in the case of these genes only increases the hazard because it can descend to the suborganismal (cellular) level. These are cell cycle-related genes. In accordance with the proposed concept, they have a low GC content at synonymous sites (despite being housekeeping). The gene-centred protein interaction enrichment analysis (PIEA) showed the core clusters of genes whose interactants are modularly enriched in genes with AT-rich synonymous codons. This interconnected network is involved in double-strand break repair, DNA integrity checkpoints and chromosome pairing at mitosis. Damage to these genes results in genome and chromosome instability leading to cancer and other 'error catastrophes'. By reducing nonsynonymous mutations, the usage of AT-rich synonymous codons can decrease the probability of cancer by more than 20-fold.
|
16
|
Pouyet F, Mouchiroud D, Duret L, Sémon M. Recombination, meiotic expression and human codon usage. eLife 2017; 6:27344. [PMID: 28826480 PMCID: PMC5576983 DOI: 10.7554/elife.27344] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 03/31/2017] [Accepted: 08/14/2017] [Indexed: 12/17/2022] Open
Abstract
Synonymous codon usage (SCU) varies widely among human genes. In particular, genes involved in different functional categories display a distinct codon usage, which was interpreted as evidence that SCU is adaptively constrained to optimize translation efficiency in distinct cellular states. We demonstrate here that SCU is not driven by constraints on tRNA abundance, but by large-scale variation in GC-content, caused by meiotic recombination, via the non-adaptive process of GC-biased gene conversion (gBGC). Expression in meiotic cells is associated with a strong decrease in recombination within genes. Differences in SCU among functional categories reflect differences in levels of meiotic transcription, which is linked to variation in recombination and therefore in gBGC. Overall, the gBGC model explains 70% of the variance in SCU among genes. We argue that the strong heterogeneity of SCU induced by gBGC in mammalian genomes precludes any optimization of the tRNA pool to the demand in codon usage.
Affiliation(s)
- Fanny Pouyet
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Claude Bernard, Villeurbanne, France
- Dominique Mouchiroud
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Claude Bernard, Villeurbanne, France
- Laurent Duret
  - Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Claude Bernard, Villeurbanne, France
- Marie Sémon
  - Laboratory of Biology and Modelling of the Cell, UnivLyon, ENS de Lyon, Univ Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratoire de Biologie et Modélisation de la Cellule, Lyon, France
|
17
|
Tarallo A, Gambi MC, D'Onofrio G. Lifestyle and DNA base composition in polychaetes. Physiol Genomics 2016; 48:883-888. [PMID: 27764763 DOI: 10.1152/physiolgenomics.00018.2016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/17/2016] [Accepted: 09/27/2016] [Indexed: 11/22/2022] Open
Abstract
A comparative analysis of polychaete species, classified as motile and low-motile forms, highlighted that the former were characterized not only by a higher metabolic rate (MR), but also by a higher genomic GC content. The fluctuation of both variables was not affected by the phylogenetic relationship of the species. Thus, the present results further support that a very active lifestyle affects MR and GC at the same time, showing an unexpected similarity between invertebrates and vertebrates. In teleosts, indeed, a similar pattern has also been observed in comparisons of migratory and nonmigratory species. A cause-effect link between MR and GC has not yet been proved, but the fact that the two variables are significantly linked in all the organisms analyzed so far is, most probably, of relevant biological and evolutionary meaning. The present results fit very well within the frame of the metabolic rate hypothesis proposed to explain the DNA base composition variability among organisms. In contrast, the thermostability hypothesis was not supported. At present, no data on the recombination rate in polychaetes are available to test the biased gene conversion (BGC) hypothesis.
Affiliation(s)
- Andrea Tarallo
  - Stazione Zoologica Anton Dohrn, Department of Biology and Evolution of Marine Organisms, Naples, Italy
- Maria Cristina Gambi
  - Stazione Zoologica Anton Dohrn, Department of Integrative Marine Ecology (Villa Dohrn-Benthic Ecology Center), Ischia, Naples, Italy
- Giuseppe D'Onofrio
  - Stazione Zoologica Anton Dohrn, Department of Biology and Evolution of Marine Organisms, Naples, Italy
|
18
|
Sizova TV, Karpova OI. The length of chromatin loops in meiotic prophase I of warm-blooded vertebrates depends on the DNA compositional organization. Russ J Genet 2016. [DOI: 10.1134/s1022795416110144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
19
|
Katsumura T, Fukuyo Y, Kawamura S, Oota H. A comparative study on the regulatory region of the PERIOD1 gene among diurnal/nocturnal primates. J Physiol Anthropol 2016; 35:21. [PMID: 27680326 PMCID: PMC5039903 DOI: 10.1186/s40101-016-0111-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2016] [Accepted: 09/14/2016] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The circadian clock is set up around a 24-h period in humans, who are awake in the daytime and sleep in the nighttime, and is accompanied by physiological and metabolic rhythms. Most haplorhine primates, including humans, are diurnal, while most "primitive" strepsirrhine primates are nocturnal, suggesting that primates have evolved from nocturnal to diurnal habits. The mechanisms of the physiological changes causing these habits, and of the genetic changes causing the physiological changes, are, however, unknown. To reveal these mechanisms, we focus on the nucleotide sequences of the regulatory region of the PERIOD1 (PER1) gene, which is known as one of the key elements of the circadian clock in mammals. METHODS We determined the nucleotide sequences of the regulatory region of PER1 involved in gene expression for six primates and compared them with those of eight primates from the international DNA database. Based on the sequence data, we constructed a phylogenetic tree including both the diurnal and nocturnal species and investigated the guanine and cytosine (GC) content in the regulatory region. RESULTS The motif sequences regulating gene expression were evolutionarily conserved in the primates examined. The phylogenetic tree simply showed the phylogenetic relationship among the species and no branching pattern that distinguished the diurnal from the nocturnal groups. We found two cores in the regulatory region of PER1 whose GC contents showed a statistically significant difference between the diurnal and the nocturnal habits. CONCLUSION Our results suggest the possibility that the two cores in the upstream region of PER1 are related to the regulation of gene expression leading to behavioral differences between diurnal and nocturnal primates.
Affiliation(s)
- Takafumi Katsumura
  - Department of Anatomy, Kitasato University School of Medicine, 1-15-1 Kitasato, Minami-ku, Sagamihara, Kanagawa, 252-0374, Japan
- Yukiko Fukuyo
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8562, Japan
- Shoji Kawamura
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8562, Japan
- Hiroki Oota
  - Department of Anatomy, Kitasato University School of Medicine, 1-15-1 Kitasato, Minami-ku, Sagamihara, Kanagawa, 252-0374, Japan
  - Department of Integrated Biosciences, Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba, 277-8562, Japan
|
20
|
Barton C, Iliopoulos CS, Pissis SP, Arhondakis S. Transcriptome activity of isochores during preimplantation process in human and mouse. FEBS Lett 2016; 590:2297-306. [PMID: 27279593 DOI: 10.1002/1873-3468.12245] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/29/2016] [Revised: 05/27/2016] [Accepted: 06/03/2016] [Indexed: 12/17/2022]
Abstract
This work investigates the role of isochores during the preimplantation process. Using RNA-seq data from human and mouse preimplantation stages, we created the spatio-temporal transcriptional profiles of the isochores during preimplantation. We found that from early to late stages, GC-rich isochores increase their expression while GC-poor ones decrease it. Network analysis revealed that modules with few coexpressed isochores are GC-poorer than medium-large ones, and that the two show opposite expression trends as preimplantation advances, decreasing and increasing respectively. Our results reveal a functional contribution of the isochores, supporting the presence of structural-functional interactions during maturation and early-embryonic development.
Affiliation(s)
- Carl Barton
  - The Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, UK
- Stilianos Arhondakis
  - Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology - Hellas (FORTH), Heraklion, Crete, Greece
|
21
|
Tarallo A, Angelini C, Sanges R, Yagi M, Agnisola C, D'Onofrio G. On the genome base composition of teleosts: the effect of environment and lifestyle. BMC Genomics 2016; 17:173. [PMID: 26935583 PMCID: PMC4776435 DOI: 10.1186/s12864-016-2537-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 09/22/2015] [Accepted: 02/25/2016] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND The DNA base composition is well known to be highly variable among organisms. Biophysical studies on the effect of GC increments on DNA structure have shown that GC-richer DNA sequences are more bendable. This result was the keystone of the hypothesis proposing the metabolic rate as the major force driving GC content variability, since an increased resistance to torsion stress is mainly required during the transcription process to avoid DNA breakage. Hence, the aim of the present work is to test whether salinity and migration, both suggested to affect the metabolic rate of teleostean fishes, affect the average genomic GC content as well. Moreover, since the gill surface has been reported to be a major morphological expression of metabolic rate, this parameter was also analyzed in the light of the above hypothesis. RESULTS Teleosts living in different environments (freshwater and seawater) and with different lifestyles (migratory and non-migratory) were analyzed by studying three variables: routine metabolic rate, gill area and genomic GC content, none of which showed a phylogenetic signal among fish species. Routine metabolic rate, specific gill area and average genomic GC were higher in seawater than in freshwater species. The same trend was observed when comparing migratory with non-migratory species. Crossing salinity and lifestyle, the active migratory species living in seawater show at once the highest routine metabolic rate, the highest specific gill area and the highest average genomic GC content. CONCLUSIONS The results clearly highlight that environmental factors (salinity) and lifestyle (migration) affect not only the physiology (i.e. the routine metabolic rate) and the morphology (i.e. gill area) of teleosts, but also a basic genome feature (i.e. the GC content), thus pointing to an interesting link among the three variables in the light of the metabolic rate hypothesis.
Affiliation(s)
- Andrea Tarallo
  - Genome Evolution and Organization - Department BEOM, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121, Naples, Italy
- Claudia Angelini
  - Istituto per le Applicazioni del Calcolo "Mauro Picone" - CNR, Via Pietro Castellino, 111, 80131, Naples, Italy
- Remo Sanges
  - Genome Evolution and Organization - Department BEOM, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121, Naples, Italy
- Mitsuharu Yagi
  - Faculty of Fisheries, Nagasaki University, 1-14 Bunkyo, Nagasaki, 852-8521, Japan
- Claudio Agnisola
  - Department of Biology, Complesso Universitario di Monte Sant'Angelo, University of Naples Federico II, Edificio 7, Via Cinthia, 80126, Naples, Italy
- Giuseppe D'Onofrio
  - Genome Evolution and Organization - Department BEOM, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121, Naples, Italy
|
22
|
Vlasschaert C, Xia X, Coulombe J, Gray DA. Evolution of the highly networked deubiquitinating enzymes USP4, USP15, and USP11. BMC Evol Biol 2015; 15:230. [PMID: 26503449 PMCID: PMC4624187 DOI: 10.1186/s12862-015-0511-1] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Received: 06/05/2015] [Accepted: 10/17/2015] [Indexed: 12/19/2022] Open
Abstract
Background USP4, USP15 and USP11 are paralogous deubiquitinating enzymes as evidenced by structural organization and sequence similarity. Based on known interactions and substrates it would appear that they have partially redundant roles in pathways vital to cell proliferation, development and innate immunity, and elevated expression of all three has been reported in various human malignancies. The nature and order of duplication events that gave rise to these extant genes has not been determined, nor has their functional redundancy been established experimentally at the organismal level. Methods We have employed phylogenetic and syntenic reconstruction methods to determine the chronology of the duplication events that generated the three paralogs and have performed genetic crosses to evaluate redundancy in mice. Results Our analyses indicate that USP4 and USP15 arose from whole genome duplication prior to the emergence of jawed vertebrates. Despite having lower sequence identity USP11 was generated later in vertebrate evolution by small-scale duplication of the USP4-encoding region. While USP11 was subsequently lost in many vertebrate species, all available genomes retain a functional copy of either USP4 or USP15, and through genetic crosses of mice with inactivating mutations we have confirmed that viability is contingent on a functional copy of USP4 or USP15. Loss of ubiquitin-exchange regulation, constitutive skipping of the seventh exon and neural-specific expression patterns are derived states of USP11. Post-translational modification sites differ between USP4, USP15 and USP11 throughout evolution. Conclusions In isolation sequence alignments can generate erroneous USP gene phylogenies. Through a combination of methodologies the gene duplication events that gave rise to USP4, USP15, and USP11 have been established. 
Although it operates in the same molecular pathways as the other USPs, the rapid divergence of the more recently generated USP11 enzyme precludes its functional interchangeability with USP4 and USP15. Given their multiplicity of substrates the emergence (and in some cases subsequent loss) of these USP paralogs would be expected to alter the dynamics of the networks in which they are embedded.
Affiliation(s)
- Caitlyn Vlasschaert
  - Department of Biology, University of Ottawa, Ottawa, Canada
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Canada
  - The Ottawa Hospital Research Institute, Ottawa, Canada
- Xuhua Xia
  - Department of Biology, University of Ottawa, Ottawa, Canada
  - Ottawa Institute of Systems Biology, Ottawa, Canada
- Douglas A Gray
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Canada
  - The Ottawa Hospital Research Institute, Ottawa, Canada
  - Centre for Cancer Therapeutics, Ottawa Hospital Research Institute, 501 Smyth Road, Ottawa, ON, K1H 8L6, Canada
|
23
|
Vinogradov AE. Consolidation of slow or fast but not moderately evolving genes at the level of pathways and processes. Gene 2015; 561:30-4. [PMID: 25707747 DOI: 10.1016/j.gene.2015.01.066] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/02/2014] [Revised: 01/04/2015] [Accepted: 01/09/2015] [Indexed: 11/15/2022]
Abstract
Conservatism versus innovation is probably the most important dichotomy of all evolving systems. In molecular evolution the distinction between conservative (negative) selection, innovative (positive) selection and unconstrained evolution (drift) is usually ambiguous at the gene level. Only rare cases with the ratio of nonsynonymous to synonymous nucleotide substitutions above unity (dN/dS>1) are thought to be due to positive selection, whereas the lower dN/dS ratio may indicate negative selection in combination with drift. The density of the dN/dS ratio for orthologous genes forms a unimodal distribution where no particular regions can be discerned. Here it is shown that at the level of overrepresented pathways and processes the picture is strikingly different. The distribution is strongly polarized with a wide completely depressed middle part. This three-phase distribution is very robust. It is observed with various substitution models and remains at very low significance of overrepresentation (up to p<0.99). This fact suggests consolidation of either negative or positive selection but not of unconstrained evolution at the level of pathways/processes. The effect is demonstrated for different phylogenetic distances: from human to other primates, mammals and vertebrates. This approach suggests estimating the boundaries for conservative and innovative selection using the pathway/process level. Emphasizing the role of a critical mass of negatively or positively selected genes in a pathway/process, it can elucidate how the bridge between 'tinkering' at the gene level and 'design' at the higher levels is forming.
|
24
|
Scala G, Affinito O, Miele G, Monticelli A, Cocozza S. Evidence for evolutionary and nonevolutionary forces shaping the distribution of human genetic variants near transcription start sites. PLoS One 2014; 9:e114432. [PMID: 25474578 PMCID: PMC4256220 DOI: 10.1371/journal.pone.0114432] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 08/07/2014] [Accepted: 11/09/2014] [Indexed: 11/19/2022] Open
Abstract
The regions surrounding transcription start sites (TSSs) of genes play a critical role in the regulation of gene expression. At the same time, current evidence indicates that these regions are particularly stressed by transcription-related mutagenic phenomena. In this work we performed a genome-wide analysis of the distribution of single nucleotide polymorphisms (SNPs) inside the 10 kb region flanking human TSSs by dividing SNPs into four classes according to their frequency (rare, two intermediate classes, and common). We found that, in this 10 kb region, the distribution of variants depends on their frequency and on their localization relative to the TSS. We found that the distribution of variants is generally different for TSSs located inside or outside of CpG islands. We found a significant relationship between the distribution of rare variants and nucleosome occupancy scores. Furthermore, our analysis suggests that evolutionary (purifying selection) and nonevolutionary (biased gene conversion) forces both play a role in determining the relative SNP frequency around TSSs. Finally, we analyzed the potential pathogenicity of each class of variant using the Combined Annotation Dependent Depletion score. In conclusion, this study provides a novel and detailed view of the distribution of genomic variants around TSSs, providing insight into the forces that instigate and maintain variability in such critical regions.
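The binning step described in this abstract (variants grouped by frequency class and by position relative to the TSS) can be sketched as follows. This is a minimal sketch under stated assumptions: the abstract names four frequency classes but gives no cut-offs, so the thresholds in `MAF_CLASSES` are placeholders, the 10 kb region is taken to mean 5 kb on either side of the TSS, and the function names and toy variants are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed upper bounds on minor allele frequency (MAF) for each class.
# The abstract lists the classes but not the thresholds.
MAF_CLASSES = [
    ("rare", 0.01),
    ("intermediate_low", 0.05),
    ("intermediate_high", 0.20),
    ("common", 0.50),
]


def maf_class(maf: float) -> str:
    """Name of the first class whose upper bound is >= maf."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("minor allele frequency must lie in [0, 0.5]")
    for name, upper in MAF_CLASSES:
        if maf <= upper:
            return name
    raise AssertionError("unreachable: 0.5 is the last upper bound")


def tss_profile(
    snps: Iterable[Tuple[int, float]],
    tss: int,
    strand: str = "+",
    flank: int = 5000,
    bin_size: int = 500,
) -> Dict[str, List[int]]:
    """Count SNPs per frequency class in bins across [-flank, +flank) of a TSS.

    snps: (genomic position, MAF) pairs on the same chromosome as the TSS.
    Positions are flipped for minus-strand genes so that negative offsets
    always mean upstream.
    """
    if (2 * flank) % bin_size:
        raise ValueError("2 * flank must be a multiple of bin_size")
    n_bins = (2 * flank) // bin_size
    profile = {name: [0] * n_bins for name, _ in MAF_CLASSES}
    sign = 1 if strand == "+" else -1
    for pos, maf in snps:
        rel = (pos - tss) * sign
        if -flank <= rel < flank:
            profile[maf_class(maf)][(rel + flank) // bin_size] += 1
    return profile


# Toy variants, invented for illustration only.
toy = [(1000, 0.005), (1400, 0.3), (900, 0.03), (7000, 0.1)]
print(tss_profile(toy, tss=1000, flank=500, bin_size=250))
```

Summing such profiles over all annotated TSSs, separately for TSSs inside and outside CpG islands, would give per-class distributions of the kind the abstract compares.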
Affiliation(s)
- Giovanni Scala
  - Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
  - Dipartimento di Fisica, Università degli Studi di Napoli “Federico II”, Naples, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Naples, Italy
- Ornella Affinito
  - Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
  - Dipartimento di Medicina Molecolare e Biotecnologie Mediche, Università degli Studi di Napoli “Federico II”, Naples, Italy
  - Istituto di Endocrinologia ed Oncologia Sperimentale (IEOS), CNR, Naples, Italy
- Gennaro Miele
  - Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
  - Dipartimento di Fisica, Università degli Studi di Napoli “Federico II”, Naples, Italy
  - Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Naples, Italy
- Antonella Monticelli
  - Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
  - Istituto di Endocrinologia ed Oncologia Sperimentale (IEOS), CNR, Naples, Italy
- Sergio Cocozza
  - Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
  - Dipartimento di Medicina Molecolare e Biotecnologie Mediche, Università degli Studi di Napoli “Federico II”, Naples, Italy
|
25
|
Implications of human genome structural heterogeneity: functionally related genes tend to reside in organizationally similar genomic regions. BMC Genomics 2014; 15:252. [PMID: 24684786 PMCID: PMC4234528 DOI: 10.1186/1471-2164-15-252] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/15/2012] [Accepted: 03/21/2014] [Indexed: 01/30/2023] Open
Abstract
Background In an earlier study, we hypothesized that genomic segments with different sequence organization patterns (OPs) might display functional specificity despite their similar GC content. Here we tested this hypothesis by dividing the human genome into 100 kb segments, classifying these segments into five compositional groups according to GC content, and then characterizing each segment within the five groups by oligonucleotide counting (k-mer analysis; also referred to as compositional spectrum analysis, or CSA), to examine the distribution of sequence OPs in the segments. We performed the CSA on the entire DNA, i.e., its coding and non-coding parts, the latter being much more abundant in the genome than the former. Results We identified 38 OP-type clusters of segments that differ in their compositional spectrum (CS) organization. Many of the segments that shared the same OP type were enriched with genes related to the same biological processes (developmental, signaling, etc.), components of biochemical complexes, or organelles. Thirteen OP-type clusters showed significant enrichment in genes connected to specific gene-ontology terms. Some of these clusters seemed to reflect certain events during periods of horizontal gene transfer and genome expansion, and subsequent evolution of genomic regions requiring coordinated regulation. Conclusions There may be a tendency for genes that are involved in the same biological process, complex or organelle to use the same OP, even at a distance of ~100 kb from the genes. Although the intergenic DNA is non-coding, the general pattern of sequence organization (e.g., reflected in over-represented oligonucleotide “words”) may be important and may have been protected, to some extent, in the course of evolution.
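The segment-level procedure outlined in this abstract (fixed-size segments, five GC-content groups, k-mer counting per segment) can be illustrated with a short sketch. It is not the authors' implementation: the abstract does not give the GC cut-offs or the word length, so the bounds in `GC_BOUNDS`, the value of `k` and all function names are assumptions, and the clustering of spectra into OP types is not covered.

```python
from collections import Counter
from typing import Dict, Iterator, Tuple

# Assumed upper bounds (GC fraction) for five compositional groups.
# Illustrative placeholders; the abstract does not state the cut-offs.
GC_BOUNDS = (0.37, 0.41, 0.46, 0.53, 1.0)


def segments(seq: str, size: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for consecutive non-overlapping windows.

    A trailing remainder shorter than `size` is dropped.
    """
    for start in range(0, len(seq) - size + 1, size):
        yield start, seq[start:start + size]


def gc_fraction(seg: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T)."""
    seg = seg.upper()
    gc = seg.count("G") + seg.count("C")
    at = seg.count("A") + seg.count("T")
    return gc / (gc + at) if gc + at else 0.0


def gc_group(gc: float) -> int:
    """Index (0..4) of the first group whose upper bound is >= gc."""
    for i, bound in enumerate(GC_BOUNDS):
        if gc <= bound:
            return i
    return len(GC_BOUNDS) - 1


def kmer_spectrum(seg: str, k: int) -> Dict[str, float]:
    """Relative frequency of every k-mer that contains only A, C, G, T."""
    seg = seg.upper()
    counts = Counter(
        seg[i:i + k]
        for i in range(len(seg) - k + 1)
        if set(seg[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()} if total else {}


# Toy sequence with tiny windows; the study used 100 kb segments.
toy = "ATATATATGCGCGCGC"
for start, seg in segments(toy, 8):
    print(start, gc_group(gc_fraction(seg)), kmer_spectrum(seg, 2))
```

With real data, `size` would be 100000 and the per-segment spectra within each GC group would then be compared or clustered, which is the step that yields the OP types reported in the abstract.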
|
26
|
Mugal CF, Arndt PF, Ellegren H. Twisted signatures of GC-biased gene conversion embedded in an evolutionary stable karyotype. Mol Biol Evol 2013; 30:1700-12. [PMID: 23564940 PMCID: PMC3684855 DOI: 10.1093/molbev/mst067] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 12/14/2022] Open
Abstract
The genomes of many vertebrates show a characteristic heterogeneous distribution of GC content, the so-called GC isochore structure. The origin of isochores has been explained via the mechanism of GC-biased gene conversion (gBGC). However, although the isochore structure is declining in many mammalian genomes, the heterogeneity in GC content is being reinforced in the avian genome. Despite this discrepancy, which remains unexplained, examinations of individual substitution frequencies in mammals and birds are both consistent with the gBGC model of isochore evolution. On the other hand, a negative correlation between substitution and recombination rate found in the chicken genome is inconsistent with the gBGC model. It should therefore be important to consider along with gBGC other consequences of recombination on the origin and fate of mutations, as well as to account for relationships between recombination rate and other genomic features. We therefore developed an analytical model to describe the substitution patterns found in the chicken genome, and further investigated the relationships between substitution patterns and several genomic features in a rigorous statistical framework. Our analysis indicates that GC content itself, either directly or indirectly via interrelations to other genomic features, has an impact on the substitution pattern. Further, we suggest that this phenomenon is particularly visible in avian genomes due to their unusually low rate of chromosomal evolution. Because of this, interrelations between GC content and other genomic features are being reinforced, and are as such more pronounced in avian genomes as compared with other vertebrate genomes with a less stable karyotype.
Affiliation(s)
- Carina F Mugal
  - Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
|
27
|
Rao YS, Chai XW, Wang ZF, Nie QH, Zhang XQ. Impact of GC content on gene expression pattern in chicken. Genet Sel Evol 2013; 45:9. [PMID: 23557030 PMCID: PMC3641017 DOI: 10.1186/1297-9686-45-9] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 07/20/2012] [Accepted: 03/16/2013] [Indexed: 11/21/2022] Open
Abstract
Background GC content varies greatly between different genomic regions in many eukaryotes. In order to determine whether this organization named isochore organization influences gene expression patterns, the relationship between GC content and gene expression has been investigated in man and mouse. However, to date, this question is still a matter for debate. Among the avian species, chicken (Gallus gallus) is the best studied representative with a complete genome sequence. The distinctive features and organization of its sequence make it a good model to explore important issues in genome structure and evolution. Methods Only nuclear genes with complete information on protein-coding sequence with no evidence of multiple-splicing forms were included in this study. Chicken protein coding sequences, complete mRNA sequences (or full length cDNA sequences), and 5′ untranslated region sequences (5′ UTR) were downloaded from Ensembl and chicken expression data originated from a previous work. Three indices i.e. expression level, expression breadth and maximum expression level were used to measure the expression pattern of a given gene. CpG islands were identified using hgTables of the UCSC Genome Browser. Correlation analysis between variables was performed by SAS Proprietary Software Release 8.1. Results In chicken, the GC content of 5′ UTR is significantly and positively correlated with expression level, expression breadth, and maximum expression level, whereas that of coding sequences and introns and at the third coding position are negatively correlated with expression level and expression breadth, and not correlated with maximum expression level. These significant trends are independent of recombination rate, chromosome size and gene density. Furthermore, multiple linear regression analysis indicated that GC content in genes could explain approximately 10% of the variation in gene expression. 
Conclusions GC content is significantly associated with gene expression pattern and could be one of the important regulation factors in the chicken genome.
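The kind of correlation analysis summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' SAS workflow: the function names `gc_content` and `pearson_r` are hypothetical, and the sketch assumes one sequence and one expression value per gene.

```python
from math import sqrt


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / acgt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / sqrt(sxx * syy)
```

In use, `gc_content` would be applied to each gene region of interest (for example the 5′ UTR) and the resulting list passed to `pearson_r` together with the matching expression values.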
Affiliation(s)
- You Sheng Rao
- Department of Biological Technology, Jiangxi Educational Institute, Jiangxi, Nanchang 330029, China
|
28
|
Frousios K, Iliopoulos CS, Tischler G, Kossida S, Pissis SP, Arhondakis S. Transcriptome map of mouse isochores in embryonic and neonatal cortex. Genomics 2012. [PMID: 23195409 DOI: 10.1016/j.ygeno.2012.11.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
Abstract
Several studies on adult tissues agree on the presence of a positive effect of the genomic and genic base composition on mammalian gene expression. Recent literature supports the idea that during developmental processes GC-poor genomic regions are preferentially implicated. We investigate the relationship between the compositional properties of the isochores and of the genes with their respective expression activity during developmental processes. Using RNA-seq data from two distinct developmental stages of the mouse cortex, embryonic day 18 (E18) and postnatal day 7 (P7), we established for the first time a developmental-related transcriptome map of the mouse isochores. Additionally, for each stage we estimated the correlation between isochores' GC level and their expression activity, and the genes' expression patterns for each isochore family. Our analyses add evidence supporting the idea that during development GC-poor isochores are preferentially implicated, and confirm the positive effect of genes' GC level on their expression activity.
Affiliation(s)
- Kimon Frousios
- Department of Informatics, King's College London, The Strand, London WC2R 2LS, UK
- Costas S Iliopoulos
- Department of Informatics, King's College London, The Strand, London WC2R 2LS, UK; School of Mathematics and Statistics, University of Western Australia, 35 Stirling Highway, Crawley, Perth WA 6009, Australia
- German Tischler
- Lehrstuhl für Informatik 2, Universität Würzburg, Am Hubland, 97074 Würzburg, Germany
- Sophia Kossida
- Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou, Athens 115 27, Greece
- Solon P Pissis
- Florida Museum of Natural History, University of Florida, 1659 Museum Road, Gainesville, FL 32611, USA; Heidelberg Institute for Theoretical Studies, 35 Schloss-Wolfsbrunnenweg, Heidelberg D-69118, Germany
- Stilianos Arhondakis
- Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou, Athens 115 27, Greece
|
29
|
Mutational bias and translational selection shaping the codon usage pattern of tissue-specific genes in rice. PLoS One 2012; 7:e48295. [PMID: 23144748 PMCID: PMC3483185 DOI: 10.1371/journal.pone.0048295] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 06/25/2012] [Accepted: 09/24/2012] [Indexed: 11/21/2022] Open
Abstract
The regulatory mechanisms that determine which genes are specifically expressed in which tissues are still not fully elucidated, especially in plants. Using internal correspondence analysis, I first establish that tissue-specific genes exhibit significantly different synonymous codon usage in rice, although this effect is weak. The variability of synonymous codon usage between tissues accounts for 5.62% of the total codon usage variability and has arisen mainly from neutral evolutionary forces, such as GC content variation among tissues. Moreover, tissue-specific genes are under differential selective constraints, implying that natural selection also contributes to the codon usage divergence between tissues. These findings may add further evidence toward understanding the differentiation and regulation of tissue-specific gene products in plants.
|
30
|
Frenkel S, Kirzhner V, Korol A. Organizational heterogeneity of vertebrate genomes. PLoS One 2012; 7:e32076. [PMID: 22384143 PMCID: PMC3288070 DOI: 10.1371/journal.pone.0032076] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 08/15/2011] [Accepted: 01/23/2012] [Indexed: 01/06/2023] Open
Abstract
Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter, GDM), allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
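The idea of scoring k-mer occurrences can be sketched as follows. This is a simplified, exact-match illustration and not the authors' Compositional Spectra implementation; the names `kmer_spectrum` and `spectrum_distance`, and the choice of an L1 distance, are assumptions made for the example.

```python
from collections import Counter


def kmer_spectrum(seq, k):
    """Relative frequency of each overlapping k-mer made only of A, C, G, T."""
    if k < 1:
        raise ValueError("k must be at least 1")
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid  # skip windows with ambiguous bases
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def spectrum_distance(a, b):
    """L1 distance between two spectra; 0 for identical, 2 for disjoint ones."""
    keys = set(a) | set(b)
    return sum(abs(a.get(key, 0.0) - b.get(key, 0.0)) for key in keys)
```

Comparing the spectra of windows along a chromosome, or of a natural sequence against a reshuffled copy, gives a crude heterogeneity score in the spirit of the analysis described above.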
Affiliation(s)
- Abraham Korol
- Department of Evolutionary and Environmental Biology and Institute of Evolution, University of Haifa, Mount Carmel, Haifa, Israel
|
31
|
Arhondakis S, Auletta F, Bernardi G. Isochores and the regulation of gene expression in the human genome. Genome Biol Evol 2012; 3:1080-9. [PMID: 21979159 PMCID: PMC3227402 DOI: 10.1093/gbe/evr017] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 01/10/2023] Open
Abstract
It is well established that changes in the phenotype depend much more on changes in gene expression than on changes in protein-coding genes, and that cis-regulatory sequences and chromatin structure are two major factors influencing gene expression. Here, we investigated these factors at the genome-wide level by focusing on the trinucleotide patterns in the 0.1- to 25-kb regions flanking the human genes that are present in the GC-poorest L1 and GC-richest H3 isochore families, the other families exhibiting intermediate patterns. We could show 1) that the trinucleotide patterns of the 25-kb gene-flanking regions are representative of the very different patterns already reported for the whole isochores from the L1 and H3 families and, expectedly, identical in upstream and downstream locations; 2) that the patterns of the 0.1- to 0.5-kb regions in the L1 and H3 isochores are remarkably more divergent and more specific when compared with those of the 25-kb regions, as well as different in the upstream and downstream locations; and 3) that these patterns fade into the 25-kb patterns around 5 kb in both upstream and downstream locations. The 25-kb findings indicate differences in nucleosome positioning and density in different isochore families, whereas those of the 0.1- to 0.5-kb sequences indicate differences in the transcription factors that bind upstream and downstream of genes. These results indicate differences in the regulation of genes located in different isochore families, a point of functional and evolutionary relevance.
Affiliation(s)
- Stilianos Arhondakis
- Bioinformatics and Medical Informatics Team, Biomedical Research Foundation of the Academy of Athens, Athens, Greece
|
32
|
Shaw GTW, Shih ESC, Chen CH, Hwang MJ. Preservation of ranking order in the expression of human Housekeeping genes. PLoS One 2011; 6:e29314. [PMID: 22216246 PMCID: PMC3245260 DOI: 10.1371/journal.pone.0029314] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 08/18/2011] [Accepted: 11/24/2011] [Indexed: 01/26/2023] Open
Abstract
Housekeeping (HK) genes fulfill the basic needs for a cell to survive and function properly. Their ubiquitous expression, originally thought to be constant, can vary from tissue to tissue, but this variation remains largely uncharacterized and it could not be explained by previously identified properties of HK genes such as short gene length and high GC content. By analyzing microarray expression data for human genes, we uncovered a previously unnoted characteristic of HK gene expression, namely that the ranking order of their expression levels tends to be preserved from one tissue to another. Further analysis by tensor product decomposition and pathway stratification identified three main factors of the observed ranking preservation, namely that, compared to those of non-HK (NHK) genes, the expression levels of HK genes show a greater degree of dispersion (less overlap), stableness (a smaller variation in expression between tissues), and correlation of expression. Our results shed light on regulatory mechanisms of HK gene expression that are probably different for different HK genes or pathways, but are consistent and coordinated in different tissues.
Affiliation(s)
- Grace T. W. Shaw
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Edward S. C. Shih
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Chemical Biology and Molecular Biophysics Program, Taiwan International Graduate Program, Institute of Biological Chemistry, Academia Sinica, Taipei, Taiwan
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, Taiwan
- Chun-Houh Chen
- Institute of Statistical Science, Academia Sinica, Taipei, Taiwan
- Ming-Jing Hwang
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan
- Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan
- Chemical Biology and Molecular Biophysics Program, Taiwan International Graduate Program, Institute of Biological Chemistry, Academia Sinica, Taipei, Taiwan
|
33
|
Arhondakis S, Frousios K, Iliopoulos CS, Pissis SP, Tischler G, Kossida S. Transcriptome map of mouse isochores. BMC Genomics 2011; 12:511. [PMID: 22004510 PMCID: PMC3215772 DOI: 10.1186/1471-2164-12-511] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/15/2011] [Accepted: 10/17/2011] [Indexed: 12/28/2022] Open
Abstract
Background The availability of fully sequenced genomes and the implementation of transcriptome technologies have increased the studies investigating the expression profiles for a variety of tissues, conditions, and species. In this study, using RNA-seq data for three distinct tissues (brain, liver, and muscle), we investigate how base composition affects mammalian gene expression, an issue of prime practical and evolutionary interest. Results We present the transcriptome map of the mouse isochores (DNA segments with a fairly homogeneous base composition) for the three different tissues and the effects of isochores' base composition on their expression activity. Our analyses also cover the relations between the genes' expression activity and their localization in the isochore families. Conclusions This study is the first where next-generation sequencing data are used to associate the effects of both genomic and genic compositional properties to their corresponding expression activity. Our findings confirm previous results, and further support the existence of a relationship between isochores and gene expression. This relationship corroborates that isochores are primarily a product of evolutionary adaptation rather than a simple by-product of neutral evolutionary processes.
Affiliation(s)
- Stilianos Arhondakis
- Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou, 115 27, Athens, Greece.
|
34
|
Abstract
Efficient and prolonged human cystic fibrosis transmembrane conductance regulator (hCFTR) expression is a major goal for cystic fibrosis (CF) lung therapy. A hCFTR expression plasmid was optimized as a payload for compacted DNA nanoparticles formulated with polyethylene glycol (PEG)-substituted 30-mer lysine peptides. A codon-optimized and CpG-reduced hCFTR synthetic gene (CO-CFTR) was placed in a polyubiquitin C expression plasmid. Compared to hCFTR complementary DNA (cDNA), CO-CFTR produced a ninefold increased level of hCFTR protein in transfected HEK293 cells and, when compacted as DNA nanoparticles, produced a similar improvement in lung mRNA expression in Balb/c and fatty acid binding protein promoter (FABP) CF mice, although expression duration was transient. Various vector modifications were tested to extend duration of CO-CFTR expression. A novel prolonged expression (PE) element derived from the bovine growth hormone (BGH) gene 3' flanking sequence produced prolonged expression of CO-CFTR mRNA at biologically relevant levels. A time course study in the mouse lung revealed that CO-CFTR mRNA did not change significantly, with CO-CFTR/mCFTR geometric mean ratios of 94% on day 2, 71% on day 14, 53% on day 30, and 14% on day 59. Prolonged CO-CFTR expression is dependent on the orientation of the PE element and its transcription, is not specific to the UbC promoter, and is less dependent on other vector backbone elements.
|
35
|
Dong B, Zhang P, Chen X, Liu L, Wang Y, He S, Chen R. Predicting housekeeping genes based on Fourier analysis. PLoS One 2011; 6:e21012. [PMID: 21687628 PMCID: PMC3110801 DOI: 10.1371/journal.pone.0021012] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 10/22/2010] [Accepted: 05/18/2011] [Indexed: 11/19/2022] Open
Abstract
Housekeeping genes (HKGs) generally have fundamental functions in basic biochemical processes in organisms, and usually have relatively steady expression levels across various tissues. They play an important role in the normalization of microarray technology. Using Fourier analysis we transformed gene expression time-series from a Hela cell cycle gene expression dataset into Fourier spectra, and designed an effective computational method for discriminating between HKGs and non-HKGs using the support vector machine (SVM) supervised learning algorithm which can extract significant features of the spectra, providing a basis for identifying specific gene expression patterns. Using our method we identified 510 human HKGs, and then validated them by comparison with two independent sets of tissue expression profiles. Results showed that our predicted HKG set is more reliable than three previously identified sets of HKGs.
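The first step described here, turning an expression time series into a Fourier spectrum, can be sketched with a plain discrete Fourier transform. The function name `power_spectrum` and the normalization by series length are assumptions for illustration; the SVM classification step of the published method is not shown.

```python
import cmath


def power_spectrum(series):
    """Power at each non-negative frequency of a real-valued time series.

    Returns len(series) // 2 + 1 values; index 0 is the constant (mean) term.
    """
    n = len(series)
    if n == 0:
        raise ValueError("series must not be empty")
    spectrum = []
    for k in range(n // 2 + 1):
        # k-th DFT coefficient computed directly from its definition.
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(series)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum
```

A steadily expressed gene concentrates its power in the zero-frequency term, whereas a periodically expressed gene shows a peak at a higher frequency, which is the kind of feature a downstream classifier could use.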
Affiliation(s)
- Bo Dong
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Peng Zhang
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Xiaowei Chen
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Li Liu
- Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Yunfei Wang
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
- Graduate School of the Chinese Academy of Sciences, Beijing, People's Republic of China
- Shunmin He
- Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing, People's Republic of China
- Runsheng Chen
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, People's Republic of China
|
36
|
Chen B, Jia T, Ma R, Zhang B, Kang L. Evolution of hsp70 gene expression: a role for changes in AT-richness within promoters. PLoS One 2011; 6:e20308. [PMID: 21655251 PMCID: PMC3105046 DOI: 10.1371/journal.pone.0020308] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 01/18/2011] [Accepted: 04/28/2011] [Indexed: 11/19/2022] Open
Abstract
In disparate organisms adaptation to thermal stress has been linked to changes in the expression of genes encoding heat-shock proteins (Hsp). The underlying genetics, however, remain elusive. We show here that two AT-rich sequence elements in the promoter region of the hsp70 gene of the fly Liriomyza sativae that are absent in the congeneric species, Liriomyza huidobrensis, have marked cis-regulatory consequences. We studied the cis-regulatory consequences of these elements (called ATRS1 and ATRS2) by measuring the constitutive and heat-shock-induced luciferase luminescence that they drive in cells transfected with constructs carrying them modified, deleted, or intact, in the hsp70 promoter fused to the luciferase gene. The elements affected expression level markedly and in different ways: Deleting ATRS1 augmented both the constitutive and the heat-shock-induced luminescence, suggesting that this element represses transcription. Interestingly, replacing the element with random sequences of the same length and A+T content delivered the wild-type luminescence pattern, proving that the element's high A+T content is crucial for its effects. Deleting ATRS2 decreased luminescence dramatically and almost abolished heat-shock inducibility and so did replacing the element with random sequences matching the element's length and A+T content, suggesting that ATRS2's effects on transcription and heat-shock inducibility involve a common mechanism requiring at least in part the element's specific primary structure. Finally, constitutive and heat-shock luminescence were reduced strongly when two putative binding sites for the Zeste transcription factor identified within ATRS2 were altered through site-directed mutagenesis, and the heat-shock-induced luminescence increased when Zeste was over-expressed, indicating that Zeste participates in the effects mapped to ATRS2 at least in part. 
AT-rich sequences are common in promoters and our results suggest that they should play important roles in regulatory evolution since they can affect expression markedly and constrain promoter DNA in at least two different ways.
Affiliation(s)
- Bing Chen
- Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Tieliu Jia
- Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Ronghui Ma
- Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Bo Zhang
- Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Le Kang
- Institute of Zoology, Chinese Academy of Sciences, Beijing, China
|
37
|
Misawa K, Kikuno RF. Relationship between amino acid composition and gene expression in the mouse genome. BMC Res Notes 2011; 4:20. [PMID: 21272306 PMCID: PMC3038927 DOI: 10.1186/1756-0500-4-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 06/25/2010] [Accepted: 01/27/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Codon bias is a phenomenon that refers to the differences in the frequencies of synonymous codons among different genes. In many organisms, natural selection is considered to be a cause of codon bias because codon usage in highly expressed genes is biased toward optimal codons. Methods have previously been developed to predict the expression level of genes from their nucleotide sequences, which is based on the observation that synonymous codon usage shows an overall bias toward a few codons called major codons. However, the relationship between codon bias and gene expression level, as proposed by the translation-selection model, is less evident in mammals. FINDINGS We investigated the correlations between the expression levels of 1,182 mouse genes and amino acid composition, as well as between gene expression and codon preference. We found that a weak but significant correlation exists between gene expression levels and amino acid composition in mouse. In total, less than 10% of variation of expression levels is explained by amino acid components. We found the effect of codon preference on gene expression was weaker than the effect of amino acid composition, because no significant correlations were observed with respect to codon preference. CONCLUSION These results suggest that it is difficult to predict expression level from amino acid components or from codon bias in mouse.
Affiliation(s)
- Kazuharu Misawa
- Research Program for Computational Science, Research and Development Group for Next-Generation Integrated Living Matter Simulation, Fusion of Data and Analysis Research and Development Team, RIKEN, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan.
|
38
|
Zhang W, Wu W, Lin W, Zhou P, Dai L, Zhang Y, Huang J, Zhang D. Deciphering heterogeneity in pig genome assembly Sscrofa9 by isochore and isochore-like region analyses. PLoS One 2010; 5:e13303. [PMID: 20948965 PMCID: PMC2952626 DOI: 10.1371/journal.pone.0013303] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 04/27/2010] [Accepted: 09/15/2010] [Indexed: 11/18/2022] Open
Abstract
Background The isochore, a large DNA sequence with relatively small GC variance, is one of the most important structures in eukaryotic genomes. Although the isochore has been widely studied in humans and other species, little is known about its distribution in pigs. Principal Findings In this paper, we construct a map of long homogeneous genome regions (LHGRs), i.e., isochores and isochore-like regions, in pigs to provide an intuitive version of GC heterogeneity in each chromosome. The LHGR pattern study not only quantifies heterogeneities, but also reveals some primary characteristics of the chromatin organization, including the following: (1) the majority of LHGRs belong to GC-poor families and are long; (2) a high gene density tends to occur with the appearance of GC-rich LHGRs; and (3) the density of LINE repeats decreases with an increase in the GC content of LHGRs. Furthermore, a portion of LHGRs with particular GC ranges (50%–51% and 54%–55%) tend to have abnormally high gene densities, suggesting that biased gene conversion (BGC), as well as time- and energy-saving principles, could be of importance to the formation of genome organization. Conclusion This study significantly improves our knowledge of chromatin organization in the pig genome. Correlations between the different biological features (e.g., gene density and repeat density) and GC content of LHGRs provide a unique glimpse of in silico gene and repeats prediction.
Affiliation(s)
- Wenqian Zhang
- Bioinformatics Center, College of Life Science, Northwest A&F University, Xianyang, Shaanxi, China
- Wenwu Wu
- Bioinformatics Center, College of Life Science, Northwest A&F University, Xianyang, Shaanxi, China
- Wenchao Lin
- Bioinformatics Center, College of Life Science, Northwest A&F University, Xianyang, Shaanxi, China
- Pengfang Zhou
- Bioinformatics Center, College of Life Science, Northwest A&F University, Xianyang, Shaanxi, China
- Li Dai
- Bioinformatics Center, College of Life Science, Northwest A&F University, Xianyang, Shaanxi, China
- Yang Zhang
- Investigation Group of Molecular Virology, Immunology, Oncology and Systems Biology, and Bioinformatics Center, College of Veterinary Medicine, Northwest A&F University, Xianyang, Shaanxi, China
- Jingfei Huang
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, Yunnan, China
- Deli Zhang
- Investigation Group of Molecular Virology, Immunology, Oncology and Systems Biology, and Bioinformatics Center, College of Veterinary Medicine, Northwest A&F University, Xianyang, Shaanxi, China
|
39
|
Anatskaya OV, Vinogradov AE. Somatic polyploidy promotes cell function under stress and energy depletion: evidence from tissue-specific mammal transcriptome. Funct Integr Genomics 2010; 10:433-46. [PMID: 20625914 DOI: 10.1007/s10142-010-0180-5] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 03/30/2010] [Revised: 06/12/2010] [Accepted: 06/16/2010] [Indexed: 02/08/2023]
Abstract
Polyploid cells show great among-species and among-tissues diversity and relation to developmental mode, suggesting their importance in adaptive evolution and developmental programming. At the same time, excessive polyploidization is a hallmark of functional impairment, aging, growth disorders, and numerous pathologies including cancer and cardiac diseases. To shed light on this paradox and to find out how polyploidy contributes to organ functions, we review here the ploidy-associated shifts in activity of narrowly expressed (tissue specific) genes in human and mouse heart and liver, which have the reciprocal pattern of polyploidization. For this purpose, we use the modular biology approach and genome-scale cross-species comparison. It is evident from this review that heart and liver show similar traits in response to polyploidization. In both organs, polyploidy protects vitality (mainly due to the activation of sirtuin-mediated pathways), triggers the reserve adenosine-5'-triphosphate (ATP) production, and sustains tissue-specific functions by switching them to energy saving mode. In heart, the strongest effects consisted in the concerted up-regulation of contractile proteins and substitution of energy intensive proteins with energy economic ones. As a striking example, the energy intensive alpha myosin heavy chain (providing fast contraction) decreased its expression by a factor of 10, allowing a 270-fold increase of expression of beta myosin heavy chain (providing slow contraction), which has approximately threefold lower ATP-hydrolyzing activity. The liver showed the enhancement of immunity, reactive oxygen species and xenobiotic detoxication, and numerous metabolic adaptations to long-term energy depletion. Thus, somatic polyploidy may be an ingenious evolutionary instrument for fast adaptation to stress and new environments allowing trade-offs between high functional demand, stress, and energy depletion.
Affiliation(s)
- Olga V Anatskaya
- Institute of Cytology, Russian Academy of Sciences, Group of Bioinformatics and Functional Genomics, St Petersburg, Russia.
|
40
|
Tatarinova TV, Alexandrov NN, Bouck JB, Feldmann KA. GC3 biology in corn, rice, sorghum and other grasses. BMC Genomics 2010; 11:308. [PMID: 20470436 PMCID: PMC2895627 DOI: 10.1186/1471-2164-11-308] [Citation(s) in RCA: 105] [Impact Index Per Article: 7.5] [Received: 09/16/2009] [Accepted: 05/16/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The third, or wobble, position in a codon provides a high degree of possible degeneracy and is an elegant fault-tolerance mechanism. Nucleotide biases between organisms at the wobble position have been documented and correlated with the abundances of the complementary tRNAs. We and others have noticed a bias for cytosine and guanine at the third position in a subset of transcripts within a single organism. The bias is present in some plant species and warm-blooded vertebrates but not in all plants, or in invertebrates or cold-blooded vertebrates. RESULTS Here we demonstrate that in certain organisms the amount of GC at the wobble position (GC3) can be used to distinguish two classes of genes. We highlight the following features of genes with high GC3 content: they (1) provide more targets for methylation, (2) exhibit more variable expression, (3) more frequently possess upstream TATA boxes, (4) are predominant in certain classes of genes (e.g., stress responsive genes) and (5) have a GC3 content that increases from 5' to 3'. These observations led us to formulate a hypothesis to explain GC3 bimodality in grasses. CONCLUSIONS Our findings suggest that high levels of GC3 typify a class of genes whose expression is regulated through DNA methylation or are a legacy of accelerated evolution through gene conversion. We discuss the three most probable explanations for GC3 bimodality: biased gene conversion, transcriptional and translational advantage and gene methylation.
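The GC3 measure used throughout this abstract is the fraction of G or C at third codon positions. A minimal sketch, assuming an in-frame coding sequence that starts at the first base of a codon; the function name `gc3` is hypothetical and this is not the authors' pipeline:

```python
def gc3(cds):
    """Fraction of G or C at the third position of each complete codon.

    Assumes cds is in frame; a trailing partial codon is ignored and
    ambiguous bases at the third position are counted as non-GC.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    thirds = cds[2:usable:3]          # bases at positions 3, 6, 9, ...
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)
```

Computing `gc3` for every transcript in a genome and plotting the histogram is one simple way to look for the bimodality discussed above.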
Affiliation(s)
- Tatiana V Tatarinova
- Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA.
|
41
|
Vinogradov AE. Human transcriptome nexuses: basic-eukaryotic and metazoan. Genomics 2010; 95:345-54. [PMID: 20298777 DOI: 10.1016/j.ygeno.2010.03.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/15/2009] [Revised: 03/01/2010] [Accepted: 03/08/2010] [Indexed: 01/10/2023]
Abstract
Using a new approach, I analysed human transcriptome coexpression network and revealed two large-scale nexuses. Besides gene coexpression, each nexus is characterized by a combination of gene evolutionary origin, function and among-tissues expression breadth. The first nexus contains mostly genes of pre-metazoan origin, which are widely expressed and have cell-centred functions. The second nexus is enriched in genes of metazoan origin, which are expressed more narrowly and have organism-centred functions. The revealed nexuses are supported by asymmetry in distribution of transcription factor targets between them. Within the metazoan nexus, there is a subnexus that is more pronounced in the nervous tissues and is enriched in gene regulatory complexity. It mostly contains genes related to nervous system, cell communication and multicellular organism processes and development. The revealed nexuses indicate a dichotomy in the transcriptional regulation and can provide a framework for further functional genomics studies.
|
42
|
Sabbia V, Romero H, Musto H, Naya H. Composition Profile of the Human Genome at the Chromosome Level. J Biomol Struct Dyn 2009; 27:361-70. [DOI: 10.1080/07391102.2009.10507322] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 12/16/2022]
|
43
|
Monoallelic expression and tissue specificity are associated with high crossover rates. Trends Genet 2009; 25:519-22. [PMID: 19850368 DOI: 10.1016/j.tig.2009.10.001] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 06/18/2009] [Revised: 10/02/2009] [Accepted: 10/05/2009] [Indexed: 11/20/2022]
Abstract
What determines the recombination rate of a gene? Following the observation that, in humans, imprinted genes have unusually high recombination levels, we ask whether increased recombination is seen for other monoallelically expressed genes and, more generally, how transcriptional properties relate to recombination. We find that monoallelically expressed genes do have high crossover rates and discover a striking negative correlation between within-gene crossover rate and expression breadth. We hypothesise that these findings are possibly symptomatic of a more general, adverse relationship between recombination and transcription in the human genome.
|
44
|
Duret L, Galtier N. Biased gene conversion and the evolution of mammalian genomic landscapes. Annu Rev Genomics Hum Genet 2009; 10:285-311. [PMID: 19630562 DOI: 10.1146/annurev-genom-082908-150001] [Citation(s) in RCA: 468] [Impact Index Per Article: 31.2] [Indexed: 11/09/2022]
Abstract
Recombination is typically thought of as a symmetrical process resulting in large-scale reciprocal genetic exchanges between homologous chromosomes. Recombination events, however, are also accompanied by short-scale, unidirectional exchanges known as gene conversion in the neighborhood of the initiating double-strand break. A large body of evidence suggests that gene conversion is GC-biased in many eukaryotes, including mammals and human. AT/GC heterozygotes produce more GC- than AT-gametes, thus conferring a population advantage to GC-alleles in high-recombining regions. This apparently unimportant feature of our molecular machinery has major evolutionary consequences. Structurally, GC-biased gene conversion explains the spatial distribution of GC-content in mammalian genomes, the so-called isochore structure. Functionally, GC-biased gene conversion promotes the segregation and fixation of deleterious AT → GC mutations, thus increasing our genomic mutation load. Here we review the recent evidence for a GC-biased gene conversion process in mammals, and its consequences for genomic landscapes, molecular evolution, and human functional genomics.
Affiliation(s)
- Laurent Duret
- Université de Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, F-69622, Villeurbanne, France.
|
45
|
Elhaik E, Landan G, Graur D. Can GC content at third-codon positions be used as a proxy for isochore composition? Mol Biol Evol 2009; 26:1829-33. [PMID: 19443854 DOI: 10.1093/molbev/msp100] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Indexed: 11/14/2022] Open
Abstract
The isochore theory depicts the genomes of warm-blooded vertebrates as a mosaic of long genomic regions that are characterized by relatively homogeneous GC content. In the absence of genomic data, the GC content at third-codon positions of protein-coding genes (GC3) was commonly used as a proxy for the GC content of isochores. Oddly, in the postgenomic era, GC3 is still sometimes used as a proxy for the GC composition of isochores. Here, we use genic and genomic sequences from human, chimpanzee, cow, mouse, rat, chicken, and zebrafish to show that GC3 only explains a very small proportion of the variation in GC content of long genomic sequences flanking the genes (GCf), and what little correlation there is between GC3 and GCf was found to decay rapidly with distance from the gene. The coefficient of variation of GC3 was found to be much larger than that of GCf and, therefore, GC3 and GCf values are not comparable with each other. Comparisons of orthologous gene pairs from 1) human and chimpanzee and 2) mouse and rat show strong correlations between their GC3 values, but very weak correlations between their GCf values. We conclude that the GC content of third-codon position cannot be used as stand-in for isochoric composition.
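The two statistics this abstract relies on, the coefficient of variation and the proportion of variance explained, can be sketched as follows. This is a generic illustration using the population standard deviation and the squared Pearson correlation; it is not the authors' pipeline, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return pstdev(values) / m


def r_squared(xs, ys):
    """Squared Pearson correlation between paired values, e.g. GC3 and GCf per gene."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long sequences with at least two values")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the inputs has zero variance")
    return sxy ** 2 / (sxx * syy)
```

Under this sketch, a low `r_squared` between per-gene GC3 and flanking GC content, together with a much larger `coefficient_of_variation` for GC3 than for GCf, would be the pattern the abstract describes.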
Affiliation(s)
- Eran Elhaik
- Department of Biology and Biochemistry, University of Houston, TX, USA
|
46
|
Alaux C, Le Conte Y, Adams HA, Rodriguez-Zas S, Grozinger CM, Sinha S, Robinson GE. Regulation of brain gene expression in honey bees by brood pheromone. Genes Brain Behav 2009; 8:309-19. [PMID: 19220482 DOI: 10.1111/j.1601-183x.2009.00480.x] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.1] [Indexed: 01/20/2023]
Affiliation(s)
- C Alaux
- Department of Entomology, University of Illinois at Urbana-Champaign, Urbana, IL, USA.
|
47
|
Roymondal U, Das S, Sahoo S. Predicting gene expression level from relative codon usage bias: an application to Escherichia coli genome. DNA Res 2009; 16:13-30. [PMID: 19131380 PMCID: PMC2646356 DOI: 10.1093/dnares/dsn029] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.9] [Indexed: 11/13/2022] Open
Abstract
We present an expression measure of a gene, devised to predict the level of gene expression from relative codon bias (RCB). There are a number of measures currently in use that quantify codon usage in genes. Based on the hypothesis that gene expressivity and codon composition are strongly correlated, RCB has been defined to provide an intuitively meaningful measure of the extent of codon preference in a gene. We outline a simple approach to assess the strength of RCB (RCBS) in genes as a guide to their likely expression levels and illustrate this with an analysis of the Escherichia coli (E. coli) genome. Our efforts to quantitatively predict gene expression levels in E. coli met with a high level of success. Surprisingly, we observe a strong correlation between RCBS and protein length, indicating natural selection in favour of shorter genes being expressed at higher levels. The agreement of our result with high protein abundances, microarray data and radioactive data demonstrates that the genomic expression profile available in our method can be applied in a meaningful way to the study of cell physiology and also for more detailed studies of particular genes of interest.
Affiliation(s)
- Uttam Roymondal
- Department of Mathematics, Raidighi College, South 24 Parganas, Raidighi, West Bengal, India
|
48
|
Mukhopadhyay P, Basak S, Ghosh TC. Differential selective constraints shaping codon usage pattern of housekeeping and tissue-specific homologous genes of rice and arabidopsis. DNA Res 2008; 15:347-56. [PMID: 18827062 PMCID: PMC2608846 DOI: 10.1093/dnares/dsn023] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022] Open
Abstract
Intra-genomic variation between housekeeping and tissue-specific genes has always been a study of interest in higher eukaryotes. To date, however, no such investigation has been done in plants. Availability of whole genome expression data for both rice and Arabidopsis has made it possible to examine the evolutionary forces shaping codon usage patterns in both housekeeping and tissue-specific genes in plants. In the present work, we have taken 4065 rice-Arabidopsis homologous gene pairs to study evolutionary forces responsible for codon usage divergence between housekeeping and tissue-specific genes. In both rice and Arabidopsis, it is mutational bias that regulates error minimization in highly expressed genes of both housekeeping and tissue-specific genes. Our results show that, in comparison to tissue-specific genes, housekeeping genes are under strong selective constraint in plants. However, in tissue-specific genes, lowly expressed genes are under stronger selective constraint compared with highly expressed genes. We demonstrated that constraint acting on mRNA secondary structure is responsible for modulating codon usage variations in rice tissue-specific genes. Thus, different evolutionary forces must underlie the evolution of synonymous codon usage of highly expressed genes of housekeeping and tissue-specific genes in rice and Arabidopsis.
Affiliation(s)
- Pamela Mukhopadhyay
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
|
49
|
Duret L, Arndt PF. The impact of recombination on nucleotide substitutions in the human genome. PLoS Genet 2008; 4:e1000071. [PMID: 18464896 PMCID: PMC2346554 DOI: 10.1371/journal.pgen.1000071] [Citation(s) in RCA: 254] [Impact Index Per Article: 15.9] [Received: 11/05/2007] [Accepted: 04/11/2008] [Indexed: 01/19/2023] Open
Abstract
Unraveling the evolutionary forces responsible for variations of neutral substitution patterns among taxa or along genomes is a major issue for detecting selection within sequences. Mammalian genomes show large-scale regional variations of GC-content (the isochores), but the substitution processes at the origin of this structure are poorly understood. We analyzed the pattern of neutral substitutions in 1 Gb of primate non-coding regions. We show that the GC-content toward which sequences are evolving is strongly negatively correlated to the distance to telomeres and positively correlated to the rate of crossovers (R² = 47%). This demonstrates that recombination has a major impact on substitution patterns in human, driving the evolution of GC-content. The evolution of GC-content correlates much more strongly with male than with female crossover rate, which rules out selectionist models for the evolution of isochores. This effect of recombination is most probably a consequence of the neutral process of biased gene conversion (BGC) occurring within recombination hotspots. We show that the predictions of this model fit very well with the observed substitution patterns in the human genome. This model notably explains the positive correlation between substitution rate and recombination rate. Theoretical calculations indicate that variations in population size or density in recombination hotspots can have a very strong impact on the evolution of base composition. Furthermore, recombination hotspots can create strong substitution hotspots. This molecular drive affects both coding and non-coding regions. We therefore conclude that along with mutation, selection and drift, BGC is one of the major factors driving genome evolution. Our results also shed light on variations in the rate of crossover relative to non-crossover events, along chromosomes and according to sex, and also on the conservation of hotspot density between human and chimp.
Mammalian genomes show a very strong heterogeneity of base composition along chromosomes (the so-called isochores). The functional significance of these peculiar genomic landscapes is highly debated: do isochores confer some selective advantage, or are they simply the by-product of neutral evolutionary processes? To resolve this issue, we analyzed the pattern of substitution in the human genome by comparison with chimpanzee and macaque. We show that the evolution of base composition (GC-content) is essentially determined by the rate of recombination. This effect appears to be much stronger in male than in female germline, which rules out selective explanations for the evolution of isochores. We show that this impact of recombination is most probably a consequence of the process of biased gene conversion (BGC). This neutral process mimics the action of selection and can induce strong substitution hotspots within recombination hotspots, sometimes leading to the fixation of deleterious mutations. BGC appears to be one of the major factors driving genome evolution. It is therefore essential to take this process into account if we want to be able to interpret genome sequences.
Affiliation(s)
- Laurent Duret
- Laboratoire de Biométrie et Biologie Evolutive, Université de Lyon, Université Lyon 1, CNRS, UMR 5558, Villeurbanne, France
- Peter F. Arndt
- Department for Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
|
50
|
Zhu J, He F, Song S, Wang J, Yu J. How many human genes can be defined as housekeeping with current expression data? BMC Genomics 2008; 9:172. [PMID: 18416810 PMCID: PMC2396180 DOI: 10.1186/1471-2164-9-172] [Citation(s) in RCA: 98] [Impact Index Per Article: 6.1] [Received: 12/20/2007] [Accepted: 04/16/2008] [Indexed: 12/16/2022] Open
Abstract
Background: Housekeeping (HK) genes are ubiquitously expressed in all tissue/cell types and constitute a basal transcriptome for the maintenance of basic cellular functions. Partitioning transcriptomes into HK and tissue-specific (TS) genes is fundamental for studying gene expression and cellular differentiation. Although many studies have aimed at large-scale and thorough categorization of human HK genes, a meaningful consensus has yet to be reached. Results: We collected the two latest gene expression datasets (both EST and microarray data) from public databases and analyzed the gene expression profiles in 18 human tissues that have been well-documented by both data types. Benchmarked by a manually-curated HK gene collection (HK408), we demonstrated that present data from EST sampling was far from saturated, and the inadequacy has limited the gene detectability and our understanding of TS expressions. Due to a likely over-stringent threshold, microarray data showed a higher false negative rate compared with EST data, leading to a significant underestimation of HK genes. Based on EST data, we found that 40.0% of the currently annotated human genes were universally expressed in at least 16 of 18 tissues, as compared to only 5.1% specifically expressed in a single tissue. Our current EST-based estimate on human HK genes ranged from 3,140 to 6,909 in number, a ten-fold increase in comparison with previous microarray-based estimates. Conclusion: We concluded that a significant fraction of human genes, at least in the currently annotated data depositories, was broadly expressed. Our understanding of tissue-specific expression was still preliminary and required much more large-scale and high-quality transcriptomic data in future studies. The new HK gene list categorized in this study will be useful for genome-wide analyses on structural and functional features of HK genes.
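The breadth criteria quoted in this abstract (expressed in at least 16 of 18 tissues versus expressed in a single tissue) can be written as a simple classification rule. The sketch below only encodes those two thresholds; how a gene is called "expressed" in a tissue is left to the caller, and the function name, the labels, and the "intermediate" and "not detected" categories are assumptions added for illustration.

```python
def classify_expression_breadth(expressed_tissues, n_tissues=18, hk_min_tissues=16):
    """Label a gene by the number of distinct tissues in which it is detected."""
    breadth = len(set(expressed_tissues))
    if breadth > n_tissues:
        raise ValueError("more tissues listed than were surveyed")
    if breadth == 0:
        return "not detected"
    if breadth >= hk_min_tissues:
        return "housekeeping"        # broadly expressed, e.g. >= 16 of 18 tissues
    if breadth == 1:
        return "tissue-specific"     # detected in exactly one tissue
    return "intermediate"
```

For example, a gene detected in 17 of the 18 surveyed tissues would be labelled "housekeeping" under these defaults, while changing `hk_min_tissues` shifts the boundary and hence the estimated number of HK genes.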
Affiliation(s)
- Jiang Zhu
- Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China.
|