1
Werner A, Kanhere A, Wahlestedt C, Mattick JS. Natural antisense transcripts as versatile regulators of gene expression. Nat Rev Genet 2024:10.1038/s41576-024-00723-z. [PMID: 38632496 DOI: 10.1038/s41576-024-00723-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/07/2024] [Indexed: 04/19/2024]
Abstract
Long non-coding RNAs (lncRNAs) are emerging as a major class of gene products that have central roles in cell and developmental biology. Natural antisense transcripts (NATs) are an important subset of lncRNAs that are expressed from the opposite strand of protein-coding and non-coding genes and are a genome-wide phenomenon in both eukaryotes and prokaryotes. In eukaryotes, a myriad of NATs participate in regulatory pathways that affect expression of their cognate sense genes. Recent developments in the study of NATs and lncRNAs and large-scale sequencing and bioinformatics projects suggest that whether NATs regulate expression, splicing, stability or translation of the sense transcript is influenced by the pattern and degrees of overlap between the sense-antisense pair. Moreover, epigenetic gene regulatory mechanisms prevail in somatic cells whereas mechanisms dependent on the formation of double-stranded RNA intermediates are prevalent in germ cells. The modulating effects of NATs on sense transcript expression make NATs rational targets for therapeutic interventions.
Affiliation(s)
- John S Mattick
  - University of New South Wales, Sydney, New South Wales, Australia
2
Bukhnikashvili L. Overlaps Between CDS Regions of Protein-Coding Genes in the Human Genome: A Case Study on the NR1D1-THRA Gene Pair. J Mol Evol 2023; 91:963-975. [PMID: 38006429 DOI: 10.1007/s00239-023-10147-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2023] [Accepted: 11/12/2023] [Indexed: 11/27/2023]
Abstract
For several decades, it has been known that a substantial number of genes within human DNA exhibit overlap; however, the biological and evolutionary significance of these overlaps remains poorly understood. This study focused on investigating specific instances of overlap where the overlapping DNA region encompasses the coding DNA sequences (CDSs) of protein-coding genes. The results revealed that proteins encoded by overlapping CDSs exhibit greater disorder than those from nonoverlapping CDSs. Additionally, these DNA regions were identified as GC-rich. This could be partially attributed to the absence of stop codons from two distinct reading frames rather than one. Furthermore, these regions were found to harbour fewer single-nucleotide polymorphism (SNP) sites, possibly due to constraints arising from the overlapping state where mutations could affect two genes simultaneously. While elucidating these properties, the NR1D1-THRA gene pair emerged as an exceptional case with highly structured proteins and a distinctly conserved sequence across eutherian mammals. Both NR1D1 and THRA are nuclear receptors lacking a ligand-binding domain at their C-terminus, which is the region where the two genes overlap. The NR1D1 gene is involved in the regulation of circadian rhythm, while the THRA gene encodes a thyroid hormone receptor, and both play crucial roles in various physiological processes. This study suggests that, in addition to their well-established functions, the specifically overlapping CDS regions of these genes may encode protein segments with additional, yet undiscovered, biological roles.
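The link the abstract draws between GC-richness and the absence of stop codons can be illustrated with a little arithmetic. The sketch below is not the study's method; it only computes the GC fraction of the three standard stop codons (TAA, TAG, TGA) to show that they are AT-rich, so a region that must avoid them in two reading frames would plausibly drift toward higher GC content. The function name `gc_content` is a hypothetical helper introduced for illustration.

```python
# Standard stop codons in the universal genetic code.
STOP_CODONS = ("TAA", "TAG", "TGA")


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(base in "GC" for base in seq) / len(seq)


# GC fraction of each stop codon, and the mean across all three.
stop_gc = {codon: gc_content(codon) for codon in STOP_CODONS}
mean_stop_gc = sum(stop_gc.values()) / len(stop_gc)

if __name__ == "__main__":
    for codon, gc in stop_gc.items():
        print(f"{codon}: GC fraction = {gc:.3f}")
    print(f"mean GC fraction of stop codons = {mean_stop_gc:.3f}")
```

Only 2 of the 9 bases across the three stop codons are G or C, so excluding these triplets from a sequence removes AT-rich words rather than GC-rich ones.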
3
McVeigh P, McCammick E, Robb E, Brophy P, Morphew RM, Marks NJ, Maule AG. Discovery of long non-coding RNAs in the liver fluke, Fasciola hepatica. PLoS Negl Trop Dis 2023; 17:e0011663. [PMID: 37769025 PMCID: PMC10564125 DOI: 10.1371/journal.pntd.0011663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Revised: 10/10/2023] [Accepted: 09/15/2023] [Indexed: 09/30/2023] [Open access]
Abstract
Long non-coding (lnc)RNAs are a class of eukaryotic RNA that do not code for protein and are linked with transcriptional regulation, amongst a myriad of other functions. Using a custom in silico pipeline we have identified 6,436 putative lncRNA transcripts in the liver fluke parasite, Fasciola hepatica, none of which are conserved with those previously described from Schistosoma mansoni. F. hepatica lncRNAs were distinct from F. hepatica mRNAs in transcript length, coding probability, exon/intron composition, expression patterns, and genome distribution. RNA-Seq and digital droplet PCR measurements demonstrated developmentally regulated expression of lncRNAs between intra-mammalian life stages; a similar proportion of lncRNAs (14.2%) and mRNAs (12.8%) were differentially expressed (p<0.001), supporting a functional role for lncRNAs in F. hepatica life stages. While most lncRNAs (81%) were intergenic, we identified some that overlapped protein-coding loci in antisense (13%) or intronic (6%) configurations. We found no unequivocal evidence for correlated developmental expression within positionally correlated lncRNA:mRNA pairs, but global co-expression analysis identified five lncRNAs that were inversely co-regulated with 89 mRNAs, including a large number of functionally essential proteases. The presence of micro (mi)RNA binding sites in 3,135 lncRNAs indicates the potential for miRNA-based post-transcriptional regulation of lncRNAs, and/or their function as competing endogenous (ce)RNAs. The same annotation pipeline identified 24,141 putative lncRNAs in F. gigantica. This first description of lncRNAs in F. hepatica provides an avenue to future functional and comparative genomics studies that will provide a new perspective on a poorly understood aspect of parasite biology.
Affiliation(s)
- Paul McVeigh
  - School of Biological Sciences, Queen’s University Belfast, Northern Ireland, United Kingdom
- Erin McCammick
  - School of Biological Sciences, Queen’s University Belfast, Northern Ireland, United Kingdom
- Emily Robb
  - School of Biological Sciences, Queen’s University Belfast, Northern Ireland, United Kingdom
- Peter Brophy
  - Department of Life Sciences, Aberystwyth University, Wales, United Kingdom
- Russell M. Morphew
  - Department of Life Sciences, Aberystwyth University, Wales, United Kingdom
- Nikki J. Marks
  - School of Biological Sciences, Queen’s University Belfast, Northern Ireland, United Kingdom
- Aaron G. Maule
  - School of Biological Sciences, Queen’s University Belfast, Northern Ireland, United Kingdom
4
Parmar BS, Kieswetter A, Geens E, Vandewyer E, Ludwig C, Temmerman L. azyx-1 is a new gene that overlaps with zyxin and affects its translation in C. elegans, impacting muscular integrity and locomotion. PLoS Biol 2023; 21:e3002300. [PMID: 37713439 PMCID: PMC10575671 DOI: 10.1371/journal.pbio.3002300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 10/13/2023] [Accepted: 08/16/2023] [Indexed: 09/17/2023] [Open access]
Abstract
Overlapping genes are widely prevalent; however, their expression and consequences are poorly understood. Here, we describe and functionally characterize a novel zyx-1 overlapping gene, azyx-1, with distinct regulatory functions in Caenorhabditis elegans. We observed conservation of alternative open reading frames (ORFs) overlapping the 5' region of zyxin family members in several animal species, and found shared sites of azyx-1 and zyxin proteoform expression in C. elegans. In line with a standard ribosome scanning model, our results support cis regulation of zyx-1 long isoform(s) by upstream initiating azyx-1a. Moreover, we report a rare observation of trans regulation of zyx-1 by azyx-1, with evidence of increased ZYX-1 upon azyx-1 overexpression. Our results suggest a dual role for azyx-1 in influencing zyx-1 proteoform heterogeneity and highlight its impact on C. elegans muscular integrity and locomotion.
Affiliation(s)
- Bhavesh S. Parmar
  - Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
- Amanda Kieswetter
  - Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
- Ellen Geens
  - Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
- Elke Vandewyer
  - Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
- Christina Ludwig
  - Bavarian Center for Biomolecular Mass Spectrometry (BayBioMS), Technische Universität München, München, Germany
- Liesbet Temmerman
  - Animal Physiology and Neurobiology, University of Leuven (KU Leuven), Leuven, Belgium
5
Chao KH, Mao A, Salzberg SL, Pertea M. Splam: a deep-learning-based splice site predictor that improves spliced alignments. bioRxiv 2023:2023.07.27.550754. [PMID: 37546880 PMCID: PMC10402160 DOI: 10.1101/2023.07.27.550754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]
Abstract
The process of splicing messenger RNA to remove introns plays a central role in creating genes and gene variants. Here we describe Splam, a novel method for predicting splice junctions in DNA based on deep residual convolutional neural networks. Unlike some previous models, Splam looks at a relatively limited window of 400 base pairs flanking each splice site, motivated by the observation that the biological process of splicing relies primarily on signals within this window. Additionally, Splam introduces the idea of training the network on donor and acceptor pairs together, based on the principle that the splicing machinery recognizes both ends of each intron at once. We compare Splam's accuracy to recent state-of-the-art splice site prediction methods, particularly SpliceAI, another method that uses deep neural networks. Our results show that Splam is consistently more accurate than SpliceAI, with an overall accuracy of 96% at predicting human splice junctions. Splam generalizes even to non-human species, including distant ones like the flowering plant Arabidopsis thaliana. Finally, we demonstrate the use of Splam on a novel application: processing the spliced alignments of RNA-seq data to identify and eliminate errors. We show that when used in this manner, Splam yields substantial improvements in the accuracy of downstream transcriptome analysis of both poly(A) and ribo-depleted RNA-seq libraries. Overall, Splam offers a faster and more accurate approach to detecting splice junctions, while also providing a reliable and efficient solution for cleaning up erroneous spliced alignments.
Affiliation(s)
- Kuan-Hao Chao
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Alan Mao
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Steven L Salzberg
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21211, USA
- Mihaela Pertea
  - Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
6
Ryczek N, Łyś A, Makałowska I. The Functional Meaning of 5'UTR in Protein-Coding Genes. Int J Mol Sci 2023; 24:ijms24032976. [PMID: 36769304 PMCID: PMC9917990 DOI: 10.3390/ijms24032976] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 12/28/2022] [Revised: 01/20/2023] [Accepted: 01/26/2023] [Indexed: 02/05/2023] [Open access]
Abstract
As is well known, messenger RNA has many regulatory regions along its sequence. One of them is the 5' untranslated region (5'UTR), which itself contains many regulatory elements such as upstream ORFs (uORFs), internal ribosome entry sites (IRESs), microRNA binding sites, and structural components involved in the regulation of mRNA stability, pre-mRNA splicing, and translation initiation. Activation of an alternative, more upstream transcription start site leads to an extension of the 5'UTR. One consequence of 5'UTR extension may be head-to-head gene overlap. This review describes elements in the 5'UTR of protein-coding transcripts and the functional significance of 5' overlap between protein-coding genes, with implications for transcription, translation, and disease.
7
Xue S, Shen W, Cai J, Jia J, Zhao D, Zhang S, Zhao X, Ma N, Wang W, Wang B, Zhang X, Liu X. Association between rs735482 polymorphism and risk of cancer: A meta-analysis of 10 case-control studies. Medicine (Baltimore) 2022; 101:e29318. [PMID: 35905230 PMCID: PMC9333535 DOI: 10.1097/md.0000000000029318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2023] [Open access]
Abstract
Several studies have examined the relationship between the rs735482 polymorphism and the risk of some human cancers, but the findings remain controversial. We designed this meta-analysis to validate the association between the rs735482 polymorphism and cancer risk. Articles published before September 1, 2018 were searched in PubMed, Embase, Web of Science, China National Knowledge Infrastructure, Wanfang, and Chinese BioMedical databases. STATA 12.0 software was used for statistical analysis. A total of 10 studies were included in the meta-analysis, comprising 2652 cancer cases and 3536 controls. Data were directly extracted from these studies, and odds ratios (ORs) with 95% confidence intervals (CIs) were computed to estimate the strength of the association. When all eligible studies were pooled, the rs735482 polymorphism showed no significant association with cancer susceptibility in any of the five genetic models (allelic model: OR = 1.019, 95% CI: 0.916-1.134, P = .731). However, adjusted OR data showed a significantly increased cancer risk associated with rs735482 (codominant model, BB vs AA: OR = 1.353, 95% CI: 1.033-1.774, P = .028), and stratification by ethnicity indicated that rs735482 is associated with an increased risk of cancer in the Chinese group (BB vs AA, OR = 1.391, 95% CI = 1.054-1.837, P = .020; AB+BB vs AA, OR = 1.253, 95% CI = 1.011-1.551, P = .039). In contrast, ERCC1 rs735482 is associated with a decreased risk of cancer in the Italian group (AB vs AA, OR = 0.600, 95% CI = 0.402-0.859, P = .012; AB + BB vs AA, OR = 0.620, 95% CI = 0.424-0.908, P = .014). Overall, the results of this meta-analysis do not support an association between the rs735482 polymorphism and cancer risk, although stratified analysis showed that rs735482 significantly increased the risk of cancer in the Chinese group and decreased it in the Italian group. Because of the limited number of samples, larger, well-designed studies are needed to estimate this association in detail.
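To make the reported statistics concrete, here is a minimal sketch of how an odds ratio and its 95% confidence interval can be computed for a single case-control 2x2 table using the standard log-odds (Woolf) approximation. It is not the pooling procedure used in the meta-analysis, which combines several studies in STATA; the function name `odds_ratio_ci` and the example counts are hypothetical and chosen only for illustration.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: cases carrying the variant      b: cases without the variant
    c: controls carrying the variant   d: controls without the variant
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, not taken from the meta-analysis.
    or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
    print(f"OR = {or_:.3f}, 95% CI: {lo:.3f}-{hi:.3f}")
```

An interval that contains 1, as in the allelic-model result quoted in the abstract (0.916-1.134), indicates no significant association at the 5% level, whereas an interval lying entirely above or below 1 indicates increased or decreased odds.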
Affiliation(s)
- Shilin Xue
  - School of Basic Medical Sciences Peking University, Peking University Health Science Center, Beijing, China
- Wenya Shen
  - Department of Occupational and Environmental Health, School of Public Health, Hebei Medical University, Hebei Key Laboratory of Environment and Human Health, Shijiazhuang, Hebei, China
- Jianning Cai
  - Department of Epidemic Treating and Preventing, Center for Disease Prevention and Control of Shijiazhuang City, Shijiazhuang, China
- Jinhai Jia
  - Graduate School, Hebei Medical University, Shijiazhuang, China
- Dan Zhao
  - Department of Occupational and Environmental Health, School of Public Health, Hebei Medical University, Hebei Key Laboratory of Environment and Human Health, Shijiazhuang, Hebei, China
- Shan Zhang
  - Department of Occupational and Environmental Health, School of Public Health, Hebei Medical University, Hebei Key Laboratory of Environment and Human Health, Shijiazhuang, Hebei, China
- Xiujun Zhao
  - Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Hebei Province Key Laboratory of Environment and Human Health, Shijiazhuang, Hebei, China
- Ning Ma
  - Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Hebei Province Key Laboratory of Environment and Human Health, Shijiazhuang, Hebei, China
- Wenjuan Wang
  - Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Hebei Province Key Laboratory of Environment and Human Health, Shijiazhuang, Hebei, China
- Bingshuang Wang
  - Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Hebei Province Key Laboratory of Environment and Human Health, Shijiazhuang, Hebei, China
- Xiaolin Zhang
  - Department of Epidemiology and Statistics, School of Public Health, Hebei Medical University, Hebei Province Key Laboratory of Environment and Human Health, Shijiazhuang, Hebei, China
- Xuehui Liu
  - Department of Occupational and Environmental Health, School of Public Health, Hebei Medical University, Hebei Key Laboratory of Environment and Human Health, Shijiazhuang, Hebei, China
  - *Correspondence: Xuehui Liu, Department of Occupational and Environmental Health, School of Public Health, Hebei Medical University, Hebei Key Laboratory of Environment and Human Health, Shijiazhuang, China (e-mail: )
8
Pholtaisong J, Chaiyaratana N, Aporntewan C, Mutirangura A. Mononucleotide A-repeats may Play a Regulatory Role in Endothermic Housekeeping Genes. Evol Bioinform Online 2022; 18:11769343221110656. [PMID: 35860694 PMCID: PMC9290108 DOI: 10.1177/11769343221110656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2021] [Accepted: 07/02/2022] [Indexed: 11/24/2022] [Open access]
Abstract
Background: Coding and non-coding short tandem repeats (STRs) facilitate a great diversity of phenotypic traits. An imbalance of mononucleotide A-repeats around transcription start sites (TSSs) was previously found in 3 mammals: H. sapiens, M. musculus, and R. norvegicus. Principal Findings: We found that the imbalance pattern originated in some vertebrates. A similar pattern was observed in mammals and birds, but not in amphibians and reptiles. We propose that the enrichment of A-repeats upstream of TSSs is a novel hallmark of endotherms, or warm-blooded animals. Gene ontology analysis indicates that the primary functions of upstream A-repeats involve metabolism, cellular transportation, and sensory perception (smell and chemical stimulus) through housekeeping genes. Conclusions: Upstream A-repeats may play a regulatory role in the metabolic process of endothermic animals.
Affiliation(s)
- Jatuphol Pholtaisong
  - Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Pathumwan, Bangkok, Thailand
- Nachol Chaiyaratana
  - Department of Electrical and Computer Engineering, Faculty of Engineering, King Mongkut's University of Technology North Bangkok, Bangkok, Thailand
  - Division of Medical Genetics Research and Laboratory, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Chatchawit Aporntewan
  - Program in Bioinformatics and Computational Biology, Graduate School, Chulalongkorn University, Pathumwan, Bangkok, Thailand
  - Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Pathumwan, Bangkok, Thailand
  - Omics Sciences and Bioinformatics Center, Chulalongkorn University, Pathumwan, Bangkok, Thailand
- Apiwat Mutirangura
  - Center of Excellence in Molecular Genetics of Cancer and Human Diseases, Department of Anatomy, Faculty of Medicine, Chulalongkorn University, Pathumwan, Bangkok, Thailand
9
Riegger RJ, Caliskan N. Thinking Outside the Frame: Impacting Genomes Capacity by Programmed Ribosomal Frameshifting. Front Mol Biosci 2022; 9:842261. [PMID: 35281266 PMCID: PMC8915115 DOI: 10.3389/fmolb.2022.842261] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/23/2021] [Accepted: 01/26/2022] [Indexed: 01/08/2023] [Open access]
Abstract
Translation facilitates the transfer of the genetic information stored in the genome via messenger RNAs to a functional protein and is therefore one of the most fundamental cellular processes. Programmed ribosomal frameshifting (PRF) is a ubiquitous alternative translation event that is extensively used by viruses to regulate gene expression from overlapping open reading frames in a controlled manner. Recent technical advances in the translation field have enabled the identification of the precise mechanisms by which ribosomes change the reading frame on mRNAs containing cis-acting signals, and when they do so. Several studies have also begun to show that trans-acting RNA modulators can adjust the timing and efficiency of frameshifting, indicating that frameshifting can be a dynamically regulated process in cells. Here, we summarize these new findings and emphasize how they fit into our current understanding of PRF mechanisms as previously described.
Affiliation(s)
- Ricarda J. Riegger
  - Helmholtz Centre for Infection Research (HZI), Helmholtz Institute for RNA-Based Infection Research (HIRI), Würzburg, Germany
  - Graduate School of Life Sciences (GSLS), University of Würzburg, Würzburg, Germany
- Neva Caliskan
  - Helmholtz Centre for Infection Research (HZI), Helmholtz Institute for RNA-Based Infection Research (HIRI), Würzburg, Germany
  - Medical Faculty, University of Würzburg, Würzburg, Germany
  - *Correspondence: Neva Caliskan,
10
Abstract
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
11
Decrulle AL, Frénoy A, Meiller-Legrand TA, Bernheim A, Lotton C, Gutierrez A, Lindner AB. Engineering gene overlaps to sustain genetic constructs in vivo. PLoS Comput Biol 2021; 17:e1009475. [PMID: 34624014 PMCID: PMC8528312 DOI: 10.1371/journal.pcbi.1009475] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/20/2021] [Revised: 10/20/2021] [Accepted: 09/23/2021] [Indexed: 11/20/2022] [Open access]
Abstract
Evolution is often an obstacle to the engineering of stable biological systems due to the selection of mutations inactivating costly gene circuits. Gene overlaps induce important constraints on sequences and their evolution. We show that these constraints can be harnessed to increase the stability of costly genes by purging loss-of-function mutations. We combine computational and synthetic biology approaches to rationally design an overlapping reading frame expressing an essential gene within an existing gene to protect. Our algorithm succeeded in creating overlapping reading frames in 80% of E. coli genes. Experimentally, scoring mutations in both genes of such an overlapping construct, we found that a significant fraction of mutations impacting the gene to protect have a deleterious effect on the essential gene. Such an overlap thus protects a costly gene from removal by natural selection by associating the benefit of this removal with a larger or even lethal cost. In our synthetic constructs, the overlap converts many of the possible mutants into evolutionary dead-ends, reducing the evolutionary potential of the system and thus increasing its stability over time.
Genomes are translated by triplets of nucleotides on two different strands, allowing for six different reading frames. This permits the existence of gene overlaps, often observed in microbial genomes, where two different proteins are encoded on the same piece of DNA, but in different reading frames. Gene overlaps are classically considered an obstacle for both evolution and genetic engineering, as mutations in overlapping regions likely have pleiotropic effects on several genes. In 2013, we identified specific evolutionary scenarios where the decrease in evolutionary potential caused by gene overlaps could instead be advantageous and selected for. In this work, we demonstrate the use of gene overlaps in another context where reducing evolutionary potential can be useful: preventing evolution from inactivating synthetic circuits. We show that gene overlaps can be engineered to increase the evolutionary stability of genes that are costly to their hosts, by entangling these costly genes with essential genes.
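The six reading frames mentioned above can be enumerated directly. The following is a minimal sketch, not the authors' design algorithm: it splits a DNA sequence into codons for the three offsets on each strand and reports which frames are free of stop codons, which is the basic precondition for two coding sequences to share the same stretch of DNA. The names `six_frames` and `open_frames` are hypothetical helpers, and the standard stop codons are assumed.

```python
# Complement table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def six_frames(seq: str) -> dict:
    """Map frame labels (+1, +2, +3, -1, -2, -3) to lists of complete codons."""
    frames = {}
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Step through the strand in triplets, dropping any trailing partial codon.
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            frames[f"{strand}{offset + 1}"] = codons
    return frames


def open_frames(seq: str) -> list:
    """Return labels of frames that contain no stop codon."""
    return [
        label
        for label, codons in six_frames(seq).items()
        if not any(codon in STOP_CODONS for codon in codons)
    ]


if __name__ == "__main__":
    for label, codons in six_frames("ATGTAA").items():
        print(label, codons)
    print("open frames:", open_frames("ATGTAA"))
```

A region can host an overlapping gene pair only where at least two of these frames stay open over the shared span, which is why overlap constrains the sequence so strongly.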
Affiliation(s)
- Antoine Frénoy
  - Université de Paris, INSERM U1001, Paris, France
  - Université Grenoble Alpes, CNRS UMR5525, Grenoble, France
  - * E-mail: (AF); (ABL)
- Ariel B. Lindner
  - Université de Paris, INSERM U1001, Paris, France
  - Université de Paris, INSERM U1284, Center for Research and Interdisciplinarity (CRI), Paris, France
  - * E-mail: (AF); (ABL)
12
Mehravar M, Ghaemimanesh F, Poursani EM. Exon and intron sharing in opposite direction-an undocumented phenomenon in human genome-between Pou5f1 and Tcf19 genes. BMC Genomics 2021; 22:718. [PMID: 34610795 PMCID: PMC8493703 DOI: 10.1186/s12864-021-08039-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2021] [Accepted: 09/24/2021] [Indexed: 11/10/2022] [Open access]
Abstract
BACKGROUND: Overlapping genes share the same genomic regions in parallel (sense) or anti-parallel (anti-sense) orientations. These gene pairs seem to occur in all domains of life and are best known from viruses. However, the advantage and biological significance of overlapping genes are still unclear. Expressed sequence tag (EST) analysis enabled us to uncover an overlapping gene pair in the human genome. RESULTS: By using in silico analysis of previous experimental documentation, we reveal a new form of overlapping genes in the human genome, in which two genes found on opposite strands (Pou5f1 and Tcf19) share two exons and the enclosed intron, at the same positions, between the OCT4B3 and TCF19-D splice variants. CONCLUSIONS: This new form of overlapping gene expands our previous perception of splicing events and may shed more light on the complexity of gene regulation in higher organisms. Additional such genes might also be detected by EST analysis of other organisms.
Affiliation(s)
- Majid Mehravar
  - Department of Anatomy and Developmental Biology, Development and Stem Cells Program, Biomedicine Discovery Institute, Monash University, Melbourne, Australia
- Fatemeh Ghaemimanesh
  - Monoclonal Antibody Research Center, Avicenna Research Institute, ACECR, Tehran, Iran
- Ensieh M Poursani
  - Hematology, Oncology and Stem Cell Transplantation Research Center, Tehran University of Medical Sciences, Tehran, Iran
13
Assis R. No Expression Divergence despite Transcriptional Interference between Nested Protein-Coding Genes in Mammals. Genes (Basel) 2021; 12:genes12091381. [PMID: 34573363 PMCID: PMC8467205 DOI: 10.3390/genes12091381] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/16/2021] [Revised: 08/23/2021] [Accepted: 08/24/2021] [Indexed: 01/05/2023] [Open access]
Abstract
Nested protein-coding genes accumulated throughout metazoan evolution, with early analyses of human and Drosophila microarray data indicating that this phenomenon was simply due to the presence of large introns. However, a recent study employing RNA-seq data uncovered evidence of transcriptional interference driving rapid expression divergence between Drosophila nested genes, illustrating that accurate expression estimation of overlapping genes can enhance detection of their relationships. Hence, here I apply an analogous approach to strand-specific RNA-seq data from human and mouse to revisit the role of transcriptional interference in the evolution of mammalian nested genes. A genomic survey reveals that whereas mammalian nested genes indeed accrued over evolutionary time, they are retained at lower frequencies than in Drosophila. Though several properties of mammalian nested genes align with observations in Drosophila and with expectations under transcriptional interference, contrary to both, their expression divergence is not statistically different from that between unnested genes, and also does not increase after nesting. Together, these results support the hypothesis that lower selection efficiencies limit rates of gene expression evolution in mammals, leading to their reliance on immediate eradication of deleterious nested genes to avoid transcriptional interference.
Affiliation(s)
- Raquel Assis
  - Department of Electrical Engineering and Computer Science, Institute for Human Health and Disease Intervention, Florida Atlantic University, Boca Raton, FL 33431, USA
14
Rosikiewicz W, Sikora J, Skrzypczak T, Kubiak MR, Makałowska I. Promoter switching in response to changing environment and elevated expression of protein-coding genes overlapping at their 5' ends. Sci Rep 2021; 11:8984. [PMID: 33903630 PMCID: PMC8076222 DOI: 10.1038/s41598-021-87970-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2020] [Accepted: 04/07/2021] [Indexed: 11/09/2022] [Open access]
Abstract
Despite the number of studies focused on sense-antisense transcription, the key question of whether such organization evolved as a regulator of gene expression or is only a byproduct of other regulatory processes has not been elucidated to date. In this study, protein-coding sense-antisense gene pairs were analyzed with a particular focus on pairs overlapping at their 5' ends. Analyses were performed in 73 human transcription start site (TSS) libraries. The results showed that the overlap between genes is not a stable feature and depends on which TSSs are utilized in a given cell type. An analysis of gene expression did not confirm that overlap between genes causes downregulation of their expression, an observation that contradicts earlier findings. In addition, we showed that the switch from one promoter to another, leading to gene overlap, may occur in response to a changing environment of a cell or tissue. We also demonstrated that gene overlap is observed more often in transfected and cancerous cells than in normal tissues. Moreover, utilization of overlapping promoters depends on the particular state of a cell and, at least in some groups of genes, is not merely coincidental.
Affiliation(s)
- Wojciech Rosikiewicz
  - Center for Applied Bioinformatics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Jarosław Sikora
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Tomasz Skrzypczak
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
  - Center for Advanced Technology, Adam Mickiewicz University, Poznań, Poland
- Magdalena R Kubiak
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Izabela Makałowska
  - Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
|
15
|
Wright BW, Ruan J, Molloy MP, Jaschke PR. Genome Modularization Reveals Overlapped Gene Topology Is Necessary for Efficient Viral Reproduction. ACS Synth Biol 2020; 9:3079-3090. [PMID: 33044064 DOI: 10.1021/acssynbio.0c00323] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 02/08/2023]
Abstract
Sequence overlap between two genes is common across all genomes, with viruses having high proportions of these gene overlaps. Genome modularization and refactoring is the process of disrupting natural gene overlaps to separate coding sequences to enable their individual manipulation. The biological function and fitness effects of gene overlaps are not fully understood, and their effects on gene cluster and genome-level refactoring are unknown. The bacteriophage φX174 genome has ∼26% of nucleotides involved in encoding more than one gene. In this study we use an engineered φX174 phage containing a genome with all gene overlaps removed to show that gene overlap is critical to maintaining optimal viral fecundity. Through detailed phenotypic measurements we reveal that genome modularization in φX174 causes virion replication, stability, and attachment deficiencies. Quantitation of the complete phage proteome across an infection cycle reveals 30% of proteins display abnormal expression patterns. Taken together, we have for the first time comprehensively demonstrated that gene modularization severely perturbs the coordinated functioning of a bacteriophage replication cycle. This work highlights the biological importance of gene overlap in natural genomes and that reducing gene overlap disruption should be an integral part of future genome engineering projects.
Affiliation(s)
- Bradley W. Wright
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
- Juanfang Ruan
  - Electron Microscope Unit, Mark Wainwright Analytical Centre, The University of New South Wales, Sydney, NSW 2052, Australia
  - School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia
- Mark P. Molloy
  - Kolling Institute, Northern Clinical School, The University of Sydney, Sydney, NSW 2006, Australia
- Paul R. Jaschke
  - Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
|
16
|
Chen CH, Pan CY, Lin WC. Overlapping protein-coding genes in human genome and their coincidental expression in tissues. Sci Rep 2019; 9:13377. [PMID: 31527706 PMCID: PMC6746723 DOI: 10.1038/s41598-019-49802-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/22/2019] [Accepted: 08/29/2019] [Indexed: 01/23/2023]
Abstract
The completion of human genome sequences and the advancement of next-generation sequencing technologies have engendered a clear understanding of all human genes. Overlapping genes are usually observed in compact genomes, such as those of bacteria and viruses. Notably, overlapping protein-coding genes also exist in the human genome. Accordingly, we used the current Ensembl gene annotations to identify overlapping human protein-coding genes. We analysed 19,200 well-annotated protein-coding genes and determined that 4,951 of them overlapped with adjacent genes; that is, approximately a quarter of all human protein-coding genes were overlapping genes. We observed clusters of overlapping protein-coding genes ranging in size from two genes (paired overlapping genes) to 22 genes. We also divided the paired overlapping protein-coding gene groups into four subtypes. We found that the divergent overlapping gene subtype had a stronger expression association than did the 5'-tandem overlapping and 3'-tandem overlapping subtypes. The majority of paired overlapping genes exhibited comparable coincidental tissue expression profiles; however, a few overlapping gene pairs displayed distinctive tissue expression association patterns. In summary, we have carefully examined the genomic features and distributions of human overlapping protein-coding genes and found coincidental expression in tissues for most overlapping protein-coding genes.
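The annotation-based overlap search described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Gene` record, the `overlap_clusters` helper, the toy coordinates and the rule that any shared base on the same chromosome counts as overlap (regardless of strand) are all assumptions made for illustration. The final lines only re-derive the "approximately a quarter" figure from the two counts given in the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-" (not used by the overlap rule below)


def overlap_clusters(genes):
    """Group genes whose coordinate ranges chain together on one chromosome.

    Assumption: two genes overlap if they share at least one base,
    whatever their strands. Clusters are built transitively, so a gene
    joins a cluster if it overlaps any gene already in it.
    """
    clusters = []
    current, current_end, current_chrom = [], None, None
    for g in sorted(genes, key=lambda g: (g.chrom, g.start, g.end)):
        if current and g.chrom == current_chrom and g.start <= current_end:
            current.append(g)
            current_end = max(current_end, g.end)
        else:
            if len(current) > 1:          # singletons are not overlapping genes
                clusters.append(current)
            current, current_end, current_chrom = [g], g.end, g.chrom
    if len(current) > 1:
        clusters.append(current)
    return clusters


# Hypothetical toy annotation, not real Ensembl records.
genes = [
    Gene("A", "chr1", 100, 500, "+"),
    Gene("B", "chr1", 450, 900, "-"),
    Gene("C", "chr1", 1000, 1200, "+"),
    Gene("D", "chr2", 100, 300, "+"),
    Gene("E", "chr2", 200, 250, "-"),
    Gene("F", "chr2", 280, 400, "+"),
]

for cluster in overlap_clusters(genes):
    print([g.name for g in cluster])

# Fraction of overlapping genes from the counts reported in the abstract.
fraction_overlapping = 4951 / 19200
print(round(fraction_overlapping, 3))  # 0.258, i.e. roughly a quarter
```

Sorting by start coordinate lets a single pass find every cluster, because a gene can only extend the current cluster if it starts before the furthest end seen so far.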
Affiliation(s)
- Chao-Hsin Chen
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan R.O.C.
- Chao-Yu Pan
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan R.O.C.
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan R.O.C.
- Wen-Chang Lin
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan R.O.C.
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan R.O.C.
|
17
|
Rosikiewicz W, Suzuki Y, Makalowska I. OverGeneDB: a database of 5' end protein coding overlapping genes in human and mouse genomes. Nucleic Acids Res 2019; 46:D186-D193. [PMID: 29069459 PMCID: PMC5753363 DOI: 10.1093/nar/gkx948] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 08/14/2017] [Accepted: 10/20/2017] [Indexed: 01/24/2023]
Abstract
Gene overlap serves various regulatory functions at the transcriptional and post-transcriptional levels. Most current studies focus on protein-coding genes overlapping with non-protein-coding counterparts, the so-called natural antisense transcripts. Considerably less is known about the role of gene overlap in the case of two protein-coding genes. Here, we provide OverGeneDB, a database of human and mouse 5′ end protein-coding overlapping genes. The database contains 582 human and 113 mouse gene pairs that are transcribed using overlapping promoters in at least one analyzed library. Gene pairs were identified based on the analysis of the transcription start site (TSS) coordinates in 73 human and 10 mouse organs, tissues and cell lines. Besides TSS data, resources for 26 human lung adenocarcinoma cell lines also contain RNA-Seq and ChIP-Seq data for seven histone modifications and RNA Polymerase II activity. The collected data revealed that the overlap region is rarely conserved between the studied species and tissues. In ∼50% of the overlapping genes, transcription started explicitly in the overlap regions. In the remaining half of overlapping genes, transcription was initiated both from overlapping and non-overlapping TSSs. OverGeneDB is accessible at http://overgenedb.amu.edu.pl.
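A 5′ end overlap between two genes on opposite strands can be read directly from their TSS coordinates, which is why a pair may overlap in one library and not in another. The sketch below is a hypothetical illustration, not OverGeneDB code: the function names, library labels and coordinates are invented, and it assumes a head-to-head pair in which each gene extends past the other's TSS.

```python
def five_prime_overlap(tss_plus: int, tss_minus: int) -> int:
    """Length in bp of the head-to-head overlap of a gene pair.

    The plus-strand gene is transcribed towards higher coordinates from
    tss_plus, the minus-strand gene towards lower coordinates from
    tss_minus, so both transcripts cover [tss_plus, tss_minus] only when
    tss_plus <= tss_minus.
    Assumption: each gene is long enough to reach the other's TSS.
    """
    return max(0, tss_minus - tss_plus + 1)


def libraries_with_overlap(tss_by_library):
    """Names of the libraries in which the pair overlaps at its 5' ends."""
    return [
        library
        for library, (tss_plus, tss_minus) in tss_by_library.items()
        if five_prime_overlap(tss_plus, tss_minus) > 0
    ]


# Hypothetical TSS usage of one gene pair in three libraries:
# (TSS of the plus-strand gene, TSS of the minus-strand gene).
tss_by_library = {
    "library_1": (1000, 1200),  # overlap of 201 bp
    "library_2": (1300, 1200),  # alternative plus-strand TSS, no overlap
    "library_3": (1150, 1200),  # overlap of 51 bp
}

overlapping = libraries_with_overlap(tss_by_library)
print(overlapping)

# A pair would qualify for a database of this kind if it overlaps
# in at least one analyzed library.
print(len(overlapping) >= 1)
```

Evaluating the pair per library, rather than once per annotation, mirrors the abstract's criterion of overlapping promoters "in at least one analyzed library".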
Affiliation(s)
- Wojciech Rosikiewicz
  - Department of Integrative Genomics, Institute of Anthropology, Faculty of Biology, Adam Mickiewicz University in Poznan, 61-712 Poznan, Poland
- Yutaka Suzuki
  - Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, 272-8562, Japan
- Izabela Makalowska
  - Department of Integrative Genomics, Institute of Anthropology, Faculty of Biology, Adam Mickiewicz University in Poznan, 61-712 Poznan, Poland
|
18
|
Landscape of Overlapping Gene Expression in the Equine Placenta. Genes (Basel) 2019; 10:genes10070503. [PMID: 31269762 PMCID: PMC6678446 DOI: 10.3390/genes10070503] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/21/2019] [Revised: 06/26/2019] [Accepted: 06/28/2019] [Indexed: 02/07/2023]
Abstract
Increasing evidence suggests that overlapping genes are much more common in eukaryotic genomes than previously thought. These different-strand overlapping genes are potential sense–antisense (SAS) pairs, which might have regulatory effects on each other. In the present study, we identified the SAS loci in the equine genome using previously generated stranded, paired-end RNA sequencing data from the equine chorioallantois. We identified a total of 1261 overlapping loci. The ratio of the number of overlapping regions to chromosomal length was numerically highest on chromosome 11, followed by chromosomes 13 and 12. These results show that overlapping transcription is distributed throughout the equine genome, but that distributions differ for each chromosome. Next, we evaluated the expression patterns of SAS pairs during the course of gestation. Overall, the expression of the sense and antisense genes within each pair was positively correlated. We further provide a list of SAS pairs with both positive and negative correlation in their expression patterns throughout gestation. This study characterizes the landscape of sense and antisense gene expression in the placenta for the first time and provides a resource that will enable researchers to elucidate the mechanisms of sense/antisense regulation during pregnancy.
|
19
|
Schlub TE, Buchmann JP, Holmes EC. A Simple Method to Detect Candidate Overlapping Genes in Viruses Using Single Genome Sequences. Mol Biol Evol 2019; 35:2572-2581. [PMID: 30099499 PMCID: PMC6188560 DOI: 10.1093/molbev/msy155] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
Overlapping genes in viruses maximize the coding capacity of their genomes and allow the generation of new genes without major increases in genome size. Despite their importance, the evolution and function of overlapping genes are often not well understood, in part due to difficulties in their detection. In addition, most bioinformatic approaches for the detection of overlapping genes require the comparison of multiple genome sequences that may not be available in metagenomic surveys of virus biodiversity. We introduce a simple new method for identifying candidate functional overlapping genes using single virus genome sequences. Our method uses randomization tests to estimate the expected length of open reading frames and then identifies overlapping open reading frames that significantly exceed this length and are thus predicted to be functional. We applied this method to 2548 reference RNA virus genomes and find that it has both high sensitivity and low false discovery for genes that overlap by at least 50 nucleotides. Notably, this analysis provided evidence for 29 previously undiscovered functional overlapping genes, some of which are coded in the antisense direction suggesting there are limitations in our current understanding of RNA virus replication.
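The idea of a randomization test for open reading frame length can be sketched in a few lines of Python. This is a simplified stand-in and not the published method: it shuffles individual nucleotides (a cruder null model than one that respects the known gene's coding structure), it treats an ORF as any stop-free run of codons without requiring a start codon, and the function names, default parameters and example sequence are hypothetical.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_stop_free_run(seq: str, frame: int) -> int:
    """Length in codons of the longest stop-free run in the given frame (0, 1 or 2)."""
    best = run = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            run = 0
        else:
            run += 1
            best = max(best, run)
    return best


def orf_length_pvalue(seq: str, frame: int, n_perm: int = 1000, seed: int = 0):
    """Empirical p-value that a shuffled sequence yields a run at least as long.

    Assumption: the null distribution comes from shuffling single
    nucleotides, which preserves base composition only.
    """
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    observed = longest_stop_free_run(seq, frame)
    bases = list(seq)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(bases)
        if longest_stop_free_run("".join(bases), frame) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical sequence: frame 1 is scanned as the candidate overlapping frame.
example = "ATGGCTCAGGCACCGGCTCAGCCAGCTGGCGCACCAGCGCAGGCTCCGTAATTGATAG"
observed, p_value = orf_length_pvalue(example, frame=1, n_perm=200)
print(observed, p_value)
```

A candidate frame would be flagged only when its observed length is rarely matched by the shuffled sequences, which is the sense in which an ORF "significantly exceeds" the expected length.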
Affiliation(s)
- Timothy E Schlub
  - Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
- Jan P Buchmann
  - Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW, Australia
- Edward C Holmes
  - Marie Bashir Institute for Infectious Diseases and Biosecurity, Charles Perkins Centre, School of Life and Environmental Sciences and Sydney Medical School, The University of Sydney, Sydney, NSW, Australia
|
20
|
Puustusmaa M, Abroi A. cRegions-a tool for detecting conserved cis-elements in multiple sequence alignment of diverged coding sequences. PeerJ 2019; 6:e6176. [PMID: 30647994 PMCID: PMC6330207 DOI: 10.7717/peerj.6176] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/18/2018] [Accepted: 11/27/2018] [Indexed: 12/31/2022]
Abstract
Identifying cis-acting elements and understanding regulatory mechanisms of a gene is crucial to fully understand the molecular biology of an organism. In general, it is difficult to identify previously uncharacterised cis-acting elements with an unknown consensus sequence. The task is especially problematic with viruses containing regions of limited or no similarity to other previously characterised sequences. Fortunately, the fast increase in the number of sequenced genomes allows us to detect some of these elusive cis-elements. In this work, we introduce a web-based tool called cRegions. It was developed to identify regions within a protein-coding sequence where the conservation in the amino acid sequence is caused by the conservation in the nucleotide sequence. The cRegion can be the first step in discovering novel cis-acting sequences from diverged protein-coding genes. The results can be used as a basis for future experimental analysis. We applied cRegions on the non-structural and structural polyproteins of alphaviruses as an example and successfully detected all known cis-acting elements. In this publication and in previous work, we have shown that cRegions is able to detect a wide variety of functional elements in DNA and RNA viruses. These functional elements include splice sites, stem-loops, overlapping reading frames, internal promoters, ribosome frameshifting signals and other embedded elements with yet unknown function. The cRegions web tool is available at http://bioinfo.ut.ee/cRegions/.
Affiliation(s)
- Mikk Puustusmaa
  - Department of Bioinformatics, Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia
- Aare Abroi
  - Institute of Technology, University of Tartu, Tartu, Estonia
|
21
|
Sas-Chen A, Schwartz S. Misincorporation signatures for detecting modifications in mRNA: Not as simple as it sounds. Methods 2018; 156:53-59. [PMID: 30359724 DOI: 10.1016/j.ymeth.2018.10.011] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 07/31/2018] [Revised: 10/04/2018] [Accepted: 10/21/2018] [Indexed: 12/11/2022]
Abstract
Post-transcriptional modification on mRNA has become a field of intense interest in recent years, and next-generation sequencing based technologies are constantly emerging to detect an increasing number of modifications at a transcriptome-wide level. Some of these approaches are based on identification of misincorporation events induced by reverse transcriptase at modified sites. Although conceptually trivial, sensitive and specific identification of such events is a challenge prone to a surprising number of artifacts, which can lead to substantially inflated estimates of the abundance of diverse modifications. Here we discuss the sources of some of these artifacts and delineate approaches to overcome them.
Affiliation(s)
- Aldema Sas-Chen
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
- Schraga Schwartz
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
|
22
|
Willis S, Masel J. Gene Birth Contributes to Structural Disorder Encoded by Overlapping Genes. Genetics 2018; 210:303-313. [PMID: 30026186 PMCID: PMC6116962 DOI: 10.1534/genetics.118.301249] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 06/11/2018] [Accepted: 07/18/2018] [Indexed: 11/18/2022]
Abstract
The same nucleotide sequence can encode two protein products in different reading frames. Overlapping gene regions encode higher levels of intrinsic structural disorder (ISD) than nonoverlapping genes (39% vs. 25% in our viral dataset). This might be because of the intrinsic properties of the genetic code, because one member per pair was recently born de novo in a process that favors high ISD, or because high ISD relieves increased evolutionary constraint imposed by dual-coding. Here, we quantify the relative contributions of these three alternative hypotheses. We estimate that the recency of de novo gene birth explains [Formula: see text] or more of the elevation in ISD in overlapping regions of viral genes. While the two reading frames within a same-strand overlapping gene pair have markedly different ISD tendencies that must be controlled for, their effects cancel out to make no net contribution to ISD. The remaining elevation of ISD in the older members of overlapping gene pairs, presumed due to the need to alleviate evolutionary constraint, was already present prior to the origin of the overlap. Same-strand overlapping gene birth events can occur in two different frames, favoring high ISD either in the ancestral gene or in the novel gene; surprisingly, most de novo gene birth events contained completely within the body of an ancestral gene favor high ISD in the ancestral gene (23 phylogenetically independent events vs. 1). This can be explained by mutation bias favoring the frame with more start codons and fewer stop codons.
Affiliation(s)
- Sara Willis
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721
- Joanna Masel
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721
|
23
|
Liu K, Hou S, Dai J, Sun Z. PyMut: A Web Tool for Overlapping Gene Loss-of-Function Mutation Design. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1107-1110. [PMID: 26661787 DOI: 10.1109/tcbb.2015.2505290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Loss-of-function study is an effective approach to researching gene function. However, most such studies currently ignore an important problem, which we call the "off-target" problem in this paper: if the target gene is an overlapping gene (a gene whose expressible nucleotides overlap with those of another gene), a loss-of-function mutation made by deleting the complete open reading frame (ORF) may also cause the gene it overlaps to lose function, resulting in a phenotype that may be rather different from that of a single-gene deletion. Therefore, in such studies the loss-of-function mutations should be carefully designed to guarantee that only the function of the target gene is abolished. In this paper, we present PyMut, an easy-to-use web tool for biologists to design such mutations. To the best of our knowledge, PyMut is the first tool that aims to solve the "off-target" problem for overlapping genes. Our web server is freely available at http://www.bioinfo.tsinghua.edu.cn/~liuke/PyMut/index.html.
|
24
|
Maguire G. Amyotrophic lateral sclerosis as a protein level, non-genomic disease: Therapy with S2RM exosome released molecules. World J Stem Cells 2017; 9:187-202. [PMID: 29312526 PMCID: PMC5745587 DOI: 10.4252/wjsc.v9.i11.187] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/29/2017] [Revised: 08/10/2017] [Accepted: 09/04/2017] [Indexed: 02/06/2023]
Abstract
Amyotrophic lateral sclerosis (ALS) is a rapidly progressing neurodegenerative disease that leads to death. No effective treatments are currently available. Based on data from epidemiological, etiological, laboratory, and clinical studies, I offer a new way of thinking about ALS and its treatment. This paper describes a host of extrinsic factors, including the exposome, that disrupt the extracellular matrix and protein function such that a spreading, prion-like disease leads to neurodegeneration in the motor tracts. A treatment regimen is described using the stem cell released molecules from a number of types of adult stem cells to provide tissue dependent molecules that restore homeostasis, including proteostasis, in the ALS patient. Because stem cells themselves as a therapeutic are cumbersome and expensive, and when implanted in a host cause aging of the host tissue and often fail to engraft or remain viable, only the S2RM molecules are used. Rebuilding of the extracellular matrix and repair of the dysfunctional proteins in the ALS patient ensues.
Affiliation(s)
- Greg Maguire
  - BioRegenerative Sciences, Inc., La Jolla, CA 92037, United States
|
25
|
Zinad HS, Natasya I, Werner A. Natural Antisense Transcripts at the Interface between Host Genome and Mobile Genetic Elements. Front Microbiol 2017; 8:2292. [PMID: 29209299 PMCID: PMC5701935 DOI: 10.3389/fmicb.2017.02292] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 07/31/2017] [Accepted: 11/06/2017] [Indexed: 12/26/2022]
Abstract
Non-coding RNAs are involved in epigenetic processes, playing a role in the regulation of gene expression at the transcriptional and post-transcriptional levels. A particular group of ncRNA are natural antisense transcripts (NATs); these are transcribed in the opposite direction to protein coding transcripts and are widespread in eukaryotes. Their abundance, evidence of phylogenetic conservation and an increasing number of well-characterized examples of antisense-mediated gene regulation are indicative of essential biological roles of NATs. There is evidence to suggest that they interfere with their corresponding sense transcript to elicit concordant and discordant regulation. The main mechanisms involved include transcriptional interference as well as dsRNA formation. Sense–antisense hybrid formation can trigger RNA interference, RNA editing or protein kinase R. However, the exact molecular mechanisms elicited by NATs in the context of these regulatory roles are currently poorly understood. Several examples confirm that ectopic expression of antisense transcripts trigger epigenetic silencing of the related sense transcript. Genomic approaches suggest that the antisense transcriptome carries a broader biological significance which goes beyond the physiological regulation of the directly related sense transcripts. Because NATs show evidence of conservation we speculate that they played a role in evolution, with early eukaryotes gaining selective advantage through the regulatory effects. With the surge of genome and transcriptome sequencing projects, there is promise of a more comprehensive understanding of the biological role of NATs and the regulatory mechanisms involved.
Affiliation(s)
- Hany S Zinad
  - RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Inas Natasya
  - RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Andreas Werner
  - RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Newcastle upon Tyne, United Kingdom
|
26
|
Savisaar R, Hurst LD. Both Maintenance and Avoidance of RNA-Binding Protein Interactions Constrain Coding Sequence Evolution. Mol Biol Evol 2017; 34:1110-1126. [PMID: 28138077 PMCID: PMC5400389 DOI: 10.1093/molbev/msx061] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 01/07/2023]
Abstract
While the principal force directing coding sequence (CDS) evolution is selection on protein function, to ensure correct gene expression CDSs must also maintain interactions with RNA-binding proteins (RBPs). Understanding how our genes are shaped by these RNA-level pressures is necessary for diagnostics and for improving transgenes. However, the evolutionary impact of the need to maintain RBP interactions remains unresolved. Are coding sequences constrained by the need to specify RBP binding motifs? If so, what proportion of mutations are affected? Might sequence evolution also be constrained by the need not to specify motifs that might attract unwanted binding, for instance because it would interfere with exon definition? Here, we have scanned human CDSs for motifs that have been experimentally determined to be recognized by RBPs. We observe two sets of motifs-those that are enriched over nucleotide-controlled null and those that are depleted. Importantly, the depleted set is enriched for motifs recognized by non-CDS binding RBPs. Supporting the functional relevance of our observations, we find that motifs that are more enriched are also slower-evolving. The net effect of this selection to preserve is a reduction in the over-all rate of synonymous evolution of 2-3% in both primates and rodents. Stronger motif depletion, on the other hand, is associated with stronger selection against motif gain in evolution. The challenge faced by our CDSs is therefore not only one of attracting the right RBPs but also of avoiding the wrong ones, all while also evolving under selection pressures related to protein structure.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, United Kingdom
|
27
|
Savisaar R, Hurst LD. Estimating the prevalence of functional exonic splice regulatory information. Hum Genet 2017; 136:1059-1078. [PMID: 28405812 PMCID: PMC5602102 DOI: 10.1007/s00439-017-1798-3] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 01/26/2017] [Accepted: 04/04/2017] [Indexed: 12/14/2022]
Abstract
In addition to coding information, human exons contain sequences necessary for correct splicing. These elements are known to be under purifying selection and their disruption can cause disease. However, the density of functional exonic splicing information remains profoundly uncertain. Several groups have experimentally investigated how mutations at different exonic positions affect splicing. They have found splice information to be distributed widely in exons, with one estimate putting the proportion of splicing-relevant nucleotides at >90%. These results suggest that splicing could place a major pressure on exon evolution. However, analyses of sequence conservation have concluded that the need to preserve splice regulatory signals only slightly constrains exon evolution, with a resulting decrease in the average human rate of synonymous evolution of only 1–4%. Why do these two lines of research come to such different conclusions? Among other reasons, we suggest that the methods are measuring different things: one assays the density of sites that affect splicing, the other the density of sites whose effects on splicing are visible to selection. In addition, the experimental methods typically consider short exons, thereby enriching for nucleotides close to the splice junction, such sites being enriched for splice-control elements. By contrast, in part owing to correction for nucleotide composition biases and to the assumption that constraint only operates on exon ends, the conservation-based methods can be overly conservative.
Affiliation(s)
- Rosina Savisaar
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
- Laurence D Hurst
  - The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK
|
28
|
The Evolution and Expression Pattern of Human Overlapping lncRNA and Protein-coding Gene Pairs. Sci Rep 2017; 7:42775. [PMID: 28344339 PMCID: PMC5366806 DOI: 10.1038/srep42775] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 11/16/2016] [Accepted: 01/13/2017] [Indexed: 12/27/2022]
Abstract
A long non-coding RNA overlapping with a protein-coding gene (lncRNA-coding pair) is a special type of overlapping gene pair. Protein-coding overlapping genes have been well studied and increasing attention has been paid to lncRNAs. By studying lncRNA-coding pairs in the human genome, we showed that lncRNA-coding pairs were more likely to be generated by overprinting, and that genes in lncRNA-coding pairs were retained with higher priority than non-overlapping genes. In addition, the preference for the overlapping configurations preserved during evolution depended on the origin of the lncRNA-coding pairs. Further investigation showed that the promotion of splicing of embedded protein-coding partners by lncRNAs was a unilateral interaction, whereas the improvement of gene expression by the presence of an overlapping partner was bidirectional, and this effect decreased with increasing evolutionary age of the genes. Additionally, the expression of lncRNA-coding pairs showed an overall positive correlation, and the expression correlation was associated with their overlapping configurations, local genomic environment and the evolutionary age of the genes. Comparison of the expression correlation of lncRNA-coding pairs between normal and cancer samples found that lineage-specific pairs including old protein-coding genes may play an important role in tumorigenesis. This work presents a systematic and comprehensive view of the evolution and expression pattern of human lncRNA-coding pairs.
|
29
|
Saha D, Podder S, Ghosh TC. Overlapping Regions in HIV-1 Genome Act as Potential Sites for Host-Virus Interaction. Front Microbiol 2016; 7:1735. [PMID: 27867372 PMCID: PMC5095123 DOI: 10.3389/fmicb.2016.01735] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/05/2016] [Accepted: 10/17/2016] [Indexed: 01/05/2023] Open
Abstract
For more than a decade, overlapping genes in RNA viruses have been a subject of research exploring the effects of gene overlap on the evolution and function of viral genomes, such as genome size compaction. Additionally, overlapping regions (OVRs) are reported to encode an elevated degree of protein intrinsic disorder (PID) in unspliced RNA viruses. With the aim of exploring the roles of OVRs in HIV-1 pathogenesis, we have carried out an in-depth analysis of the association of gene overlapping with PID in 35 HIV-1 M subtypes. Our study reveals an over-representation of PID in the OVRs of HIV-1 genomes. These disordered residues harbor several vital structural features, such as short linear motifs (SLiMs) and protein phosphorylation (PP) sites, which were previously shown to be involved in extensive host–virus interaction. Moreover, SLiMs in OVRs are found to have greater functional potential than those in non-overlapping regions. Although the density of experimentally verified SLiMs involved in host–virus interaction, which reside in 9 HIV-1 genes, does not show any bias toward clustering in OVRs, tat and rev, two important proteins, mediate host–pathogen interaction through their experimentally verified SLiMs, which are mostly localized in OVRs. Finally, our analysis suggests that the acquisition of SLiMs in OVRs is mutually exclusive of the occurrence of disordered residues, while the enrichment of PPs in OVRs is solely dependent on PID and not on overlapping coding frames. Thus, OVRs of HIV-1 genomes could be demarcated as potential molecular recognition sites during host–virus interaction.
Affiliation(s)
- Deeya Saha
- Bioinformatics Centre, Bose Institute Kolkata, India
- Soumita Podder
- Department of Microbiology, Raiganj University Raiganj, India
|
30
|
Klasberg S, Bitard-Feildel T, Mallet L. Computational Identification of Novel Genes: Current and Future Perspectives. Bioinform Biol Insights 2016; 10:121-31. [PMID: 27493475 PMCID: PMC4970615 DOI: 10.4137/bbi.s39950] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 04/15/2016] [Revised: 05/31/2016] [Accepted: 06/05/2016] [Indexed: 12/31/2022] Open
Abstract
While it has long been thought that all genomic novelties are derived from the existing material, many genes lacking homology to known genes were found in recent genome projects. Some of these novel genes were proposed to have evolved de novo, ie, out of noncoding sequences, whereas some have been shown to follow a duplication and divergence process. Their discovery called for an extension of the historical hypotheses about gene origination. Besides the theoretical breakthrough, increasing evidence accumulated that novel genes play important roles in evolutionary processes, including adaptation and speciation events. Different techniques are available to identify genes and classify them as novel. Their classification as novel is usually based on their similarity to known genes, or lack thereof, detected by comparative genomics or against databases. Computational approaches are further prime methods that can be based on existing models or leveraging biological evidences from experiments. Identification of novel genes remains however a challenging task. With the constant software and technologies updates, no gold standard, and no available benchmark, evaluation and characterization of genomic novelty is a vibrant field. In this review, the classical and state-of-the-art tools for gene prediction are introduced. The current methods for novel gene detection are presented; the methodological strategies and their limits are discussed along with perspective approaches for further studies.
Affiliation(s)
- Steffen Klasberg
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany
- Tristan Bitard-Feildel
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany
- Ludovic Mallet
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany
|
31
|
Piatek MJ, Henderson V, Zynad HS, Werner A. Natural antisense transcription from a comparative perspective. Genomics 2016; 108:56-63. [PMID: 27241791 PMCID: PMC4996343 DOI: 10.1016/j.ygeno.2016.05.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 02/16/2016] [Revised: 05/08/2016] [Accepted: 05/25/2016] [Indexed: 12/28/2022]
Abstract
Natural antisense transcripts (NATs) can interfere with the expression of complementary sense transcripts with exquisite specificity. We have previously cloned NATs of Slc34a loci (encoding Na-phosphate transporters) from fish and mouse. Here we report the cloning of a human SLC34A1-related NAT that represents an alternatively spliced PFN3 transcript (Profilin3). The transcript is predominantly expressed in testis. Phylogenetic comparison suggests two distinct mechanisms producing Slc34a-related NATs: Alternative splicing of a transcript from a protein coding downstream gene (Pfn3, human/mouse) and transcription from the bi-directional promoter (Rbpja, zebrafish). Expression analysis suggested independent regulation of the complementary Slc34a mRNAs. Analysis of randomly selected bi-directionally transcribed human/mouse loci revealed limited phylogenetic conservation and independent regulation of NATs. They were reduced on X chromosomes and clustered in regions that escape inactivation. Locus structure and expression pattern suggest a NATs-associated regulatory mechanisms in testis unrelated to the physiological role of the sense transcript encoded protein.
Affiliation(s)
- Monica J Piatek
- RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom
- Victoria Henderson
- RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom
- Hany S Zynad
- RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom
- Andreas Werner
- RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom
|
32
|
Brandes N, Linial M. Gene overlapping and size constraints in the viral world. Biol Direct 2016; 11:26. [PMID: 27209091 PMCID: PMC4875738 DOI: 10.1186/s13062-016-0128-3] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 02/04/2016] [Accepted: 05/06/2016] [Indexed: 12/19/2022] Open
Abstract
Background: Viruses are the simplest replicating units, characterized by a limited number of coding genes and an exceptionally high rate of overlapping genes. We sought a unified evolutionary explanation that accounts for their genome sizes, gene overlapping and capsid properties.
Results: We performed an unbiased statistical analysis of ~100 families within ~400 genera that comprise the currently known viral world. We found that the volume utilization of capsids is often low, and greatly varies among viral families. Furthermore, although viruses span three orders of magnitude in genome length, they almost never have over 1500 overlapping nucleotides, or over four significantly overlapping genes per virus.
Conclusions: Our findings undermine the generality of the compression theory, which emphasizes optimal packing and length dependency to explain overlapping genes and capsid size in viral genomes. Instead, we propose that gene novelty and evolution exploration offer better explanations to size constraints and gene overlapping in all viruses.
Reviewers: This article was reviewed by Arne Elofsson and David Kreil.
Affiliation(s)
- Nadav Brandes
- Einstein Institute of Mathematics, The Edmond J. Safra Campus, The Hebrew University of Jerusalem, Jerusalem, Israel
- Michal Linial
- Department of Biological Chemistry, Room A-530, Institute of Life Sciences, The Edmond J. Safra Campus, The Hebrew University of Jerusalem, 91904, Jerusalem, Israel
|
33
|
Luberg K, Park R, Aleksejeva E, Timmusk T. Novel transcripts reveal a complex structure of the human TRKA gene and imply the presence of multiple protein isoforms. BMC Neurosci 2015; 16:78. [PMID: 26581861 PMCID: PMC4652384 DOI: 10.1186/s12868-015-0215-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 02/20/2015] [Accepted: 11/09/2015] [Indexed: 11/21/2022] Open
Abstract
Background: Tropomyosin-related kinase A (TRKA) is a nerve growth factor (NGF) receptor that belongs to the tyrosine kinase receptor family. It is critical for the correct development of many types of neurons including pain-mediating sensory neurons and also controls proliferation, differentiation and survival of many neuronal and non-neuronal cells. TRKA (also known as NTRK1) gene is a target of alternative splicing which can result in several different protein isoforms. Presently, three human isoforms (TRKAI, TRKAII and TRKAIII) and two rat isoforms (TRKA L0 and TRKA L1) have been described.
Results: We show here that human TRKA gene is overlapped by two genes and spans 67 kb, almost three times the size that has been previously described. Numerous transcription initiation sites from eight different 5′ exons and a sophisticated splicing pattern among exons encoding the extracellular part of TRKA receptor indicate that there might be a large variety of alternative protein isoforms. TrkA genes in rat and mouse appear to be considerably shorter, are not overlapped by other genes and display more straightforward splicing patterns. We describe the expression profile of alternatively spliced TRKA transcripts in different tissues of human, rat and mouse, as well as analyze putative endogenous TRKA protein isoforms in human SH-SY5Y and rat PC12 cells. We also characterize a selection of novel putative protein isoforms by portraying their phosphorylation, glycosylation and intracellular localization patterns. Our findings show that an isoform comprising mainly of TRKA kinase domain is capable of entering the nucleus.
Conclusions: Results obtained in this study refer to the existence of a multitude of TRKA mRNA and protein isoforms, with some putative proteins possessing very distinct properties.
Affiliation(s)
- Kristi Luberg
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, 12618, Tallinn, Estonia; Competence Center for Cancer Research, Tallinn, Estonia
- Rahel Park
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, 12618, Tallinn, Estonia; Competence Center for Cancer Research, Tallinn, Estonia; VIB lab for Systems Biology & CMPG Lab for Genetics and Genomics, Leuven, Belgium
- Elina Aleksejeva
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, 12618, Tallinn, Estonia; Competence Center for Cancer Research, Tallinn, Estonia; French National Institute for Agricultural Research, Paris, France
- Tõnis Timmusk
- Department of Gene Technology, Tallinn University of Technology, Akadeemia tee 15, 12618, Tallinn, Estonia; Competence Center for Cancer Research, Tallinn, Estonia
|
34
|
Milligan MJ, Lipovich L. Pseudogene-derived lncRNAs: emerging regulators of gene expression. Front Genet 2015; 5:476. [PMID: 25699073 PMCID: PMC4316772 DOI: 10.3389/fgene.2014.00476] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 09/11/2014] [Accepted: 12/25/2014] [Indexed: 01/11/2023] Open
Abstract
In the more than one decade since the completion of the Human Genome Project, the prevalence of non-protein-coding functional elements in the human genome has emerged as a key revelation in post-genomic biology. Highlighted by the ENCODE (Encyclopedia of DNA Elements) and FANTOM (Functional Annotation of Mammals) consortia, these elements include tens of thousands of pseudogenes, as well as comparably numerous long non-coding RNA (lncRNA) genes. Pseudogene transcription and function remain insufficiently understood. However, the field is of great importance for human disease due to the high sequence similarity between pseudogenes and their parental protein-coding genes, which generates the potential for sequence-specific regulation. Recent case studies have established essential and coordinated roles of both pseudogenes and lncRNAs in development and disease in metazoan systems, including functional impacts of lncRNA transcription at pseudogene loci on the regulation of the pseudogenes’ parental genes. This review synthesizes the nascent evidence for regulatory modalities jointly exerted by lncRNAs and pseudogenes in human disease, and for recent evolutionary origins of these systems.
Affiliation(s)
- Michael J Milligan
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA
- Leonard Lipovich
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI, USA
|
35
|
Wei X, Zhang J. A simple method for estimating the strength of natural selection on overlapping genes. Genome Biol Evol 2014; 7:381-90. [PMID: 25552532 PMCID: PMC4316641 DOI: 10.1093/gbe/evu294] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Indexed: 12/20/2022] Open
Abstract
Overlapping genes, where one DNA sequence codes for two proteins with different reading frames, are not uncommon in viruses and cellular organisms. Estimating the direction and strength of natural selection acting on overlapping genes is important for understanding their functionality, origin, evolution, maintenance, and potential interaction. However, the standard methods for estimating synonymous (dS) and nonsynonymous (dN) nucleotide substitution rates are inapplicable here because a nucleotide change can be simultaneously synonymous and nonsynonymous when both reading frames involved are considered. We have developed a simple method that can estimate dN/dS and test for the action of natural selection in each relevant reading frame of the overlapping genes. Our method is an extension of the modified Nei-Gojobori method previously developed for nonoverlapping genes. We confirmed the reliability of our method using extensive computer simulation. Applying this method, we studied the longest human sense–antisense overlapping gene pair, LRRC8E and ENSG00000214248. Although LRRC8E (leucine-rich repeat containing eight family, member E) is known to regulate cell size, the function of ENSG00000214248 is unknown. Our analysis revealed purifying selection on ENSG00000214248 and suggested that it originated in the common ancestor of bony vertebrates.
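The difficulty described in this abstract, that one nucleotide change can be synonymous in one reading frame and nonsynonymous in another, can be illustrated with a minimal Python sketch. It is not the authors' estimation method: the toy sequence, the function name `classify_change`, and the restriction to two reading frames on the same strand are assumptions made only for illustration.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_change(seq, pos, new_base, frame):
    """Classify a substitution at 0-based `pos` as seen from `frame` (0, 1 or 2).

    Returns "synonymous", "nonsynonymous", or None when the position is not
    covered by a complete codon in that frame.
    """
    if pos < frame:
        return None
    start = frame + 3 * ((pos - frame) // 3)  # first base of the affected codon
    if start + 3 > len(seq):
        return None
    old_codon = seq[start:start + 3]
    offset = pos - start
    new_codon = old_codon[:offset] + new_base + old_codon[offset + 1:]
    same = CODON_TABLE[old_codon] == CODON_TABLE[new_codon]
    return "synonymous" if same else "nonsynonymous"


# Hypothetical toy sequence read in two overlapping frames.
seq = "ATGCTGGCA"
print(classify_change(seq, 5, "C", 0))  # frame 0: CTG -> CTC
print(classify_change(seq, 5, "C", 1))  # frame 1: TGG -> TCG
```

In this toy case the same G-to-C change leaves leucine unchanged in frame 0 but replaces tryptophan with serine in frame 1, which is why a single dN/dS count per site is ambiguous for overlapping genes.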
Affiliation(s)
- Xinzhu Wei
- Department of Ecology and Evolutionary Biology, University of Michigan
- Jianzhi Zhang
- Department of Ecology and Evolutionary Biology, University of Michigan
|
36
|
Overlapping genes: a new strategy of thermophilic stress tolerance in prokaryotes. Extremophiles 2014; 19:345-53. [PMID: 25503326 DOI: 10.1007/s00792-014-0720-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/09/2014] [Accepted: 12/01/2014] [Indexed: 12/29/2022]
Abstract
Overlapping genes (OGs) are a focus of recent research. However, the significance of OGs in prokaryotic genomes remained unexplored. As an adaptation to high temperature, thermophiles were shown to eliminate their intergenic regions. Therefore, it could be possible that prokaryotes would increase their OG content to adapt to high temperature. To test this hypothesis, we carried out a comparative study on OG frequency of 256 prokaryotic genomes comprising both thermophiles and non-thermophiles. It was found that thermophiles exhibit a higher frequency of overlapping genes than non-thermophiles. Moreover, overlap frequency was found to correlate with optimal growth temperature (OGT) in prokaryotes. Long overlap frequency was found to hold a positive correlation with OGT, resulting in an abundance of long overlaps in thermophiles compared to non-thermophiles. On the other hand, short overlap (1-4 nucleotides) frequency (SOF) did not yield any direct correlation with OGT. However, the correlations of SOF with CAIavg (extent of variation of codon usage bias measured as the mean of codon adaptation index of all genes in a given genome) and IG% (proportion of intergenic regions) indicate that short overlaps might upregulate the aforementioned factors (CAIavg and IG%), which are already known to be vital forces for thermophilic adaptation. From this evidence, we propose that the OG content bears a strong link to thermophily. Long overlaps are important for genome compaction and short overlaps are important to uphold high CAIavg. Our findings will help in better understanding the significance of overlapping gene content in prokaryotic genomes.
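The distinction this abstract draws between short (1-4 nucleotide) and long overlaps can be made concrete with a small Python sketch. The coordinate convention (1-based, inclusive start and end), the restriction to neighbouring genes after sorting, the disregard of strand, and the function name `count_overlaps` are all assumptions for illustration, not the procedure used in the study.

```python
def count_overlaps(genes, short_max=4):
    """Count short and long overlaps between neighbouring genes.

    `genes` is a list of (start, end) pairs, 1-based and inclusive.
    Overlaps of 1..short_max nucleotides count as "short", longer as "long".
    """
    ordered = sorted(genes)
    counts = {"short": 0, "long": 0}
    for (_, prev_end), (start, end) in zip(ordered, ordered[1:]):
        # Shared nucleotides between a gene and the next one by start position.
        overlap = min(prev_end, end) - start + 1
        if overlap <= 0:
            continue  # genes are separated by an intergenic region
        counts["short" if overlap <= short_max else "long"] += 1
    return counts


# Hypothetical gene coordinates on one replicon.
genes = [(1, 100), (97, 200), (150, 300), (400, 500)]
print(count_overlaps(genes))
```

Dividing such counts by the number of genes in a genome would give per-genome overlap frequencies of the kind that are compared against optimal growth temperature above.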
|
37
|
Single-strand conformational polymorphism analysis of a common single nucleotide variation in WRAP53 gene, rs2287499, and evaluating its association in relation to breast cancer risk and prognosis among Iranian-Azeri population. Med Oncol 2014; 31:168. [PMID: 25134915 DOI: 10.1007/s12032-014-0168-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/15/2014] [Accepted: 08/08/2014] [Indexed: 12/18/2022]
Abstract
The WRAP53 (WD40-encoding RNA antisense to p53) gene encodes an antisense RNA, essential for p53 stabilization and induction upon DNA damage. Single nucleotide polymorphisms (SNPs) in WRAP53 have been associated with risk of cancer, which strengthens the role of WRAP53 in the pathogenesis of human malignancies. In fact, WRAP53 has been considered as a candidate cancer susceptibility gene. Accordingly, we performed a study to examine the association of a frequent genetic variation in WRAP53, rs2287499 (C/G), with breast cancer risk and prognosis among Iranian-Azeri population. A case-control association study, including 206 cases and 203 controls from Iranian-Azeri population, was conducted. Genomic DNA was extracted from peripheral blood and tumor samples by salting-out method. SNP genotyping was carried out by polymerase chain reaction-based single-strand conformational polymorphism (PCR-SSCP) technique. The sequence variation of SSCP banding patterns was determined by sequencing. The collected data were analyzed through Statistical Package for the Social Sciences software, using Chi-square (χ²) or Fisher's exact tests, with a significance level of 0.05. No significant differences in the allele and genotype frequencies between cases and controls were detected. Similarly, no significant associations between genotypes and clinicopathological data were observed. Concisely, no significant overall associations between rs2287499 and breast cancer risk and prognosis were detected in the studied population. The rs2287499 SNP is not associated with breast cancer predisposition in Iranian-Azeri women; it also cannot be used as a molecular biomarker to predict breast cancer prognosis in Iranian-Azeri population.
|
38
|
Sigurgeirsson B, Emanuelsson O, Lundeberg J. Analysis of stranded information using an automated procedure for strand specific RNA sequencing. BMC Genomics 2014; 15:631. [PMID: 25070246 PMCID: PMC4247151 DOI: 10.1186/1471-2164-15-631] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 02/25/2014] [Accepted: 07/10/2014] [Indexed: 01/19/2023] Open
Abstract
Background: Strand specific RNA sequencing is rapidly replacing conventional cDNA sequencing as an approach for assessing information about the transcriptome. Alongside improved laboratory protocols the development of bioinformatical tools is steadily progressing. In the current procedure the Illumina TruSeq library preparation kit is used, along with additional reagents, to make stranded libraries in an automated fashion which are then sequenced on Illumina HiSeq 2000. By the use of freely available bioinformatical tools we show, through quality metrics, that the protocol is robust and reproducible. We further highlight the practicality of strand specific libraries by comparing expression of strand specific libraries to non-stranded libraries, by looking at known antisense transcription of pseudogenes and by identifying novel transcription. Furthermore, two ribosomal depletion kits, RiboMinus and RiboZero, are compared and two sequence aligners, Tophat2 and STAR, are also compared.
Results: The non-stranded Illumina TruSeq kit can be adapted to generate strand specific libraries and can be used to access detailed information on the transcriptome. The RiboZero kit is very effective in removing ribosomal RNA from total RNA and the STAR aligner produces high mapping yield in a short time. Strand specific data gives more detailed and correct results than does non-stranded data as we show when estimating expression values and in assembling transcripts. Even well annotated genomes need improvements and corrections which can be achieved using strand specific data.
Conclusions: Researchers in the field should strive to use strand specific data; it allows for more confidence in the data analysis and is less likely to lead to false conclusions. If faced with analysing non-stranded data, researchers should be well aware of the caveats of that approach.
Affiliation(s)
- Joakim Lundeberg
- Science for Life Laboratory, School of Biotechnology, Royal Institute of Technology (KTH), Tomtebodavägen 23A, 17165 Solna, Stockholm, Sweden
|
39
|
|
40
|
Lee YCG, Chang HH. The evolution and functional significance of nested gene structures in Drosophila melanogaster. Genome Biol Evol 2014; 5:1978-85. [PMID: 24084778 PMCID: PMC3814207 DOI: 10.1093/gbe/evt149] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 11/17/2022] Open
Abstract
Nearly 10% of the genes in the genome of Drosophila melanogaster are in nested structures, in which one gene is completely nested within the intron of another gene (nested and including gene, respectively). Even though the coding sequences and untranslated regions of these nested/including gene pairs do not overlap, their intimate structures and the possibility of shared regulatory sequences raise questions about the evolutionary forces governing the origination and subsequent functional and evolutionary impacts of these structures. In this study, we show that nested genes experience weaker evolutionary constraint, have faster rates of protein evolution, and are expressed in fewer tissues than other genes, while including genes show the opposite patterns. Surprisingly, despite completely overlapping with each other, nested and including genes are less likely to display correlated gene expression and biological function than the nearby yet nonoverlapping genes. Interestingly, significantly fewer nested genes are transcribed from the same strand as the including gene. We found that same-strand nested genes are more likely to be single-exon genes. In addition, same-strand including genes are less likely to have known lethal or sterile phenotypes than opposite-strand including genes only when the corresponding nested genes have introns. These results support our hypothesis that selection against potential erroneous mRNA splicing when nested and including genes are on the same strand plays an important role in the evolution of nested gene structures.
Affiliation(s)
- Yuh Chwen G Lee
- Center for Population Biology and Department of Evolution and Ecology, University of California
|
41
|
Zeng XC, Liu Y, Shi W, Zhang L, Luo X, Nie Y, Yang Y. Genome-wide search and comparative genomic analysis of the trypsin inhibitor-like cysteine-rich domain-containing peptides. Peptides 2014; 53:106-14. [PMID: 23973966 DOI: 10.1016/j.peptides.2013.08.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 06/21/2013] [Revised: 08/13/2013] [Accepted: 08/13/2013] [Indexed: 11/24/2022]
Abstract
It has been shown that peptides containing a trypsin inhibitor-like cysteine-rich (TIL) domain are able to inhibit proteinase activities, and thus play important roles in various biological processes, such as immune response and anticoagulation. However, only a limited number of TIL peptides have been identified and characterized so far, and little is known about the evolutionary relationships of the genes encoding them. BmKAPi is a TIL domain-containing peptide that was identified from Mesobuthus martensii Karsch. Here, we conducted genome-wide searches for new peptides that are homologous to BmKAPi or possess a cysteine pattern similar to that of BmKAPi. As a result, we identified a total of 80 different TIL peptides from 34 species of arthropods. We found that these peptides can be classified into seven evolutionarily distinct groups. Furthermore, we cloned the genomic sequence of BmKAPi; the genomic sequences of the majority of other TIL peptides were also identified from the GenBank database using bioinformatical approaches. Through phylogenetic and comparative genomic analysis, we found that 26 intron gain events occurred in the genes of the TIL peptides; however, no instances of intron loss were observed. Moreover, we found that alternative splicing contributes to the diversification of the TIL peptides. Interestingly, four genes of the TIL domain-containing peptides overlap in a DNA region located on chromosome LG B15 of Bombus terrestris. These data suggest that the evolution of the TIL peptide genes is dynamic and has been dominated by intron gain.
Affiliation(s)
- Xian-Chun Zeng
- State Key Laboratory of Biogeology and Environmental Geology & Department of Biological Science and Technology, School of Environmental Studies, China University of Geosciences (Wuhan), Wuhan 430074, People's Republic of China
- Yichen Liu
- State Key Laboratory of Biogeology and Environmental Geology & Department of Biological Science and Technology, School of Environmental Studies, China University of Geosciences (Wuhan), Wuhan 430074, People's Republic of China
- Wanxia Shi
- State Key Laboratory of Biogeology and Environmental Geology & Department of Biological Science and Technology, School of Environmental Studies, China University of Geosciences (Wuhan), Wuhan 430074, People's Republic of China
- Lei Zhang
- State Key Laboratory of Biogeology and Environmental Geology & Department of Biological Science and Technology, School of Environmental Studies, China University of Geosciences (Wuhan), Wuhan 430074, People's Republic of China
- Xuesong Luo
- State Key Laboratory of Biogeology and Environmental Geology & Department of Biological Science and Technology, School of Environmental Studies, China University of Geosciences (Wuhan), Wuhan 430074, People's Republic of China
- Yao Nie
- State Key Laboratory of Biogeology and Environmental Geology & Department of Biological Science and Technology, School of Environmental Studies, China University of Geosciences (Wuhan), Wuhan 430074, People's Republic of China
- Ye Yang
- State Key Laboratory of Biogeology and Environmental Geology & Department of Biological Science and Technology, School of Environmental Studies, China University of Geosciences (Wuhan), Wuhan 430074, People's Republic of China
|
42
|
Yin J, Vogel U, Wang H, Ma Y, Wang C, Liang D, Liu J, Yue L, Zhao Y, Ma J. HapMap-based study identifies risk sub-region on chromosome 19q13.3 in relation to lung cancer among Chinese. Cancer Epidemiol 2013; 37:923-9. [DOI: 10.1016/j.canep.2013.09.016] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/27/2013] [Accepted: 09/22/2013] [Indexed: 10/26/2022]
|
43
|
Luo Y, Battistuzzi F, Lin K. Evolutionary dynamics of overlapped genes in Salmonella. PLoS One 2013; 8:e81016. [PMID: 24312259 PMCID: PMC3843671 DOI: 10.1371/journal.pone.0081016] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/14/2013] [Accepted: 10/16/2013] [Indexed: 11/19/2022] Open
Abstract
Presence of overlapping genes (OGs) is a common phenomenon in bacterial genomes. Most frequently, overlapping genes share coding regions with as few as one nucleotide to as many as thousands of nucleotides. Overlapping genes are often co-regulated, transcriptionally and translationally. Overlapping genes are also subject to the whims of evolution, as the gene overlap is known to be disrupted in some species/strains and participating genes are sometimes lost in independent lineages. Therefore, a better understanding of evolutionary patterns and rates of the disruption of overlapping genes is an important component of genome structure and evolution of gene function. In this study, we investigate the fate of ancestrally overlapping genes in complete genomes from 15 contemporary strains of Salmonella species. We find that the fates of overlapping genes inside and outside operons are distinctly different. A larger fraction of overlapping genes inside operons conserves their overlap as compared to gene pairs outside of the operons (average 0.89 vs. 0.83 per genome). However, when overlapping genes in the operons separate, one partner is lost more frequently than in those separated genes outside of operons (average 0.02 vs. 0.01 per genome). We also investigate the fate of a pan set of overlapping genes at the present and ancestral nodes over a phylogenetic tree based on genome sequence data, respectively. We propose that co-regulation plays important roles on the fates of genes. Furthermore, a vast majority of disruptions occurred prior to the common ancestor of all 15 Salmonella strains, which enables us to obtain an estimate of disruptions between Salmonella and E. coli.
Affiliation(s)
- Yingqin Luo
- Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, Arizona, United States of America
- Center for Infectious Diseases and Vaccinology, The Biodesign Institute, Arizona State University, Tempe, Arizona, United States of America
- Fabia Battistuzzi
- Department of Biological Sciences, Oakland University, Rochester, Michigan, United States of America
- Kui Lin
- College of Life Sciences, Beijing Normal University, Beijing, China
|
44
|
Wood EJ, Chin-Inmanu K, Jia H, Lipovich L. Sense-antisense gene pairs: sequence, transcription, and structure are not conserved between human and mouse. Front Genet 2013; 4:183. [PMID: 24133500 PMCID: PMC3783845 DOI: 10.3389/fgene.2013.00183] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 06/04/2013] [Accepted: 08/29/2013] [Indexed: 01/25/2023] Open
Abstract
Previous efforts to characterize conservation between the human and mouse genomes focused largely on sequence comparisons. These studies are inherently limited because they don't account for gene structure differences, which may exist despite genomic sequence conservation. Recent high-throughput transcriptome studies have revealed widespread and extensive overlaps between genes, and transcripts, encoded on both strands of the genomic sequence. This overlapping gene organization, which produces sense-antisense (SAS) gene pairs, is capable of effecting regulatory cascades through established mechanisms. We present an evolutionary conservation assessment of SAS pairs, on three levels: genomic, transcriptomic, and structural. From a genome-wide dataset of human SAS pairs, we first identified orthologous loci in the mouse genome, then assessed their transcription in the mouse, and finally compared the genomic structures of SAS pairs expressed in both species. We found that approximately half of human SAS loci have single orthologous locations in the mouse genome; however, only half of those orthologous locations have SAS transcriptional activity in the mouse. This suggests that high human-mouse gene conservation overlooks widespread distinctions in SAS pair incidence and expression. We compared gene structures at orthologous SAS loci, finding frequent differences in gene structure between human and orthologous mouse SAS pair members. Our categorization of human SAS pairs with respect to mouse conservation of expression as well as structure points to limitations of mouse models. Gene structure differences, including at SAS loci, may account for some of the phenotypic distinctions between primates and rodents. Genes in non-conserved SAS pairs may contribute to evolutionary lineage-specific regulatory outcomes.
Affiliation(s)
- Emily J Wood
- Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
|
45
|
Yin J, Guo L, Wang C, Wang H, Ma Y, Liu J, Liang D, Ma J, Zhao Y. Effects of PPP1R13L and CD3EAP variants on lung cancer susceptibility among nonsmoking Chinese women. Gene 2013; 524:228-31. [DOI: 10.1016/j.gene.2013.04.017] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 01/30/2013] [Revised: 03/30/2013] [Accepted: 04/04/2013] [Indexed: 11/26/2022]
|
46
|
Behura SK, Severson DW. Overlapping genes of Aedes aegypti: evolutionary implications from comparison with orthologs of Anopheles gambiae and other insects. BMC Evol Biol 2013; 13:124. [PMID: 23777277 PMCID: PMC3689595 DOI: 10.1186/1471-2148-13-124] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/18/2012] [Accepted: 06/12/2013] [Indexed: 11/11/2022] Open
Abstract
Background: Although gene overlapping is a common feature of prokaryote and mitochondria genomes, such genes have also been identified in many eukaryotes. The overlapping genes in eukaryotes are extensively rearranged even between closely related species. In this study, we investigated retention and rearrangement of positionally overlapping genes between the mosquitoes Aedes aegypti (dengue virus vector) and Anopheles gambiae (malaria vector). The overlapping gene pairs of A. aegypti were further compared with orthologs of other selected insects to conduct several hypothesis driven investigations relating to the evolution and rearrangement of overlapping genes. Results: The results show that as much as ~10% of the predicted genes of A. aegypti and A. gambiae are localized in positional overlapping manner. Furthermore, the study shows that differential abundance of introns and simple sequence repeats have significant association with positional rearrangement of overlapping genes between the two species. Gene expression analysis further suggests that antisense transcripts generated from the oppositely oriented overlapping genes are differentially regulated and may have important regulatory functions in these mosquitoes. Our data further shows that synonymous and non-synonymous mutations have differential but non-significant effect on overlapping localization of orthologous genes in other insect genomes. Conclusion: Gene overlapping in insects may be a species-specific evolutionary process as evident from non-dependency of gene overlapping with species phylogeny. Based on the results, our study suggests that overlapping genes may have played an important role in genome evolution of insects.
Affiliation(s)
- Susanta K Behura
- Eck Institute for Global Health, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
|
47
|
Ho MR, Tsai KW, Lin WC. A unified framework of overlapping genes: towards the origination and endogenic regulation. Genomics 2012; 100:231-9. [PMID: 22766524 DOI: 10.1016/j.ygeno.2012.06.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 01/27/2012] [Revised: 06/21/2012] [Accepted: 06/25/2012] [Indexed: 11/27/2022]
Abstract
Overlapping genes are pairs of adjacent genes whose genomic regions partially overlap. They are notable by their potential intricate regulation, such as cis-regulation of nested gene-promoter configurations, and post-transcriptional regulation of natural antisense transcripts. The originations and consequent detailed regulation remain obscure. Herein, we propose a unified framework comprising biological classification rules followed by extensive analyses, namely, exon-sharing analysis, a human-mouse conservation study, and transcriptome analysis of hundreds of microarrays and transcriptome sequencing data (mRNA-Seq). We demonstrate that the tail-to-tail architecture would result from sharing functional elements in 3'-untranslated regions (3'-UTRs) of pre-existing genes. Dissimilarly, we illustrate that the other gene overlaps would originate from a new gene arising in a pre-existing gene locus. Interestingly, these types of coupled overlapping genes may influence each other synergistically or competitively during transcription, depending on the promoter configurations. This framework discloses distinctive characteristics of overlapping genes to be a foundation for a further comprehensive understanding of them.
Affiliation(s)
- Meng-Ru Ho
- Biodiversity Research Center, Academia Sinica, Taipei 115, Taiwan
|
48
|
McNamara A. Can we measure memes? Frontiers in Evolutionary Neuroscience 2011; 3:1. [PMID: 21720531 PMCID: PMC3118481 DOI: 10.3389/fnevo.2011.00001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 12/22/2010] [Accepted: 05/12/2011] [Indexed: 11/13/2022]
Abstract
Memes are the fundamental unit of cultural evolution and have been left upon the periphery of cognitive neuroscience due to their inexact definition and the consequent presumption that they are impossible to measure. Here it is argued that although a precise definition of memes is rather difficult it does not preclude highly controlled experiments studying the neural substrates of their initiation and replication. In this paper, memes are termed as either internally or externally represented (i-memes/e-memes) in relation to whether they are represented as a neural substrate within the central nervous system or in some other form within our environment. It is argued that neuroimaging technology is now sufficiently advanced to image the connectivity profiles of i-memes and critically, to measure changes to i-memes over time, i.e., as they evolve. It is argued that it is wrong to simply pass off memes as an alternative term for "stimulus" and "learnt associations" as it does not accurately account for the way in which natural stimuli may dynamically "evolve" as clearly observed in our cultural lives.
Affiliation(s)
- Adam McNamara
- Department of Psychology, University of Surrey, Surrey, UK
|
49
|
Li S, Shih CH, Kohn MH. Functional and evolutionary correlates of gene constellations in the Drosophila melanogaster genome that deviate from the stereotypical gene architecture. BMC Genomics 2010; 11:322. [PMID: 20497561 PMCID: PMC2891614 DOI: 10.1186/1471-2164-11-322] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/30/2009] [Accepted: 05/24/2010] [Indexed: 01/19/2023] Open
Abstract
Background: The biological dimensions of genes are manifold. These include genomic properties, (e.g., X/autosomal linkage, recombination) and functional properties (e.g., expression level, tissue specificity). Multiple properties, each generally of subtle influence individually, may affect the evolution of genes or merely be (auto-)correlates. Results of multidimensional analyses may reveal the relative importance of these properties on the evolution of genes, and therefore help evaluate whether these properties should be considered during analyses. While numerous properties are now considered during studies, most work still assumes the stereotypical solitary gene as commonly depicted in textbooks. Here, we investigate the Drosophila melanogaster genome to determine whether deviations from the stereotypical gene architecture correlate with other properties of genes. Results: Deviations from the stereotypical gene architecture were classified as the following gene constellations: Overlapping genes were defined as those that overlap in the 5-prime, exonic, or intronic regions. Chromatin co-clustering genes were defined as genes that co-clustered within 20 kb of transcriptional territories. If this scheme is applied the stereotypical gene emerges as a rare occurrence (7.5%), slightly varied schemes yielded between ~1%-50%. Moreover, when following our scheme, paired-overlapping genes and chromatin co-clustering genes accounted for 50.1 and 42.4% of the genes analyzed, respectively. Gene constellation was a correlate of a number of functional and evolutionary properties of genes, but its statistical effect was ~1-2 orders of magnitude lower than the effects of recombination, chromosome linkage and protein function. Analysis of datasets on male reproductive proteins showed these were biased in their representation of gene constellations and evolutionary rate Ka/Ks estimates, but these biases did not overwhelm the biologically meaningful observation of high evolutionary rates of male reproductive genes. Conclusion: Given the rarity of the solitary stereotypical gene, and the abundance of gene constellations that deviate from it, the presence of gene constellations, while once thought to be exceptional in large Eukaryote genomes, might have broader relevance to the understanding and study of the genome. However, according to our definition, while gene constellations can be significant correlates of functional properties of genes, they generally are weak correlates of the evolution of genes. Thus, the need for their consideration would depend on the context of studies.
Affiliation(s)
- Shuwei Li
- Department of Ecology and Evolutionary Biology, Rice University, 6100 Main Street, MS 170, Houston, Texas 77005, USA
|
50
|
Salato VK, Rediske NW, Zhang C, Hastings ML, Munroe SH. An exonic splicing enhancer within a bidirectional coding sequence regulates alternative splicing of an antisense mRNA. RNA Biol 2010; 7:179-90. [PMID: 20200494 DOI: 10.4161/rna.7.2.11182] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 12/31/2022] Open
Abstract
The discovery of increasing numbers of genes with overlapping sequences highlights the problem of expression in the context of constraining regulatory elements from more than one gene. This study identifies regulatory sequences encompassed within two genes that overlap in an antisense orientation at their 3' ends. The genes encode the alpha-thyroid hormone receptor gene (TRalpha or NR1A1) and Rev-erbalpha (NR1D1). In mammals TRalpha pre-mRNAs are alternatively spliced to yield mRNAs encoding functionally antagonistic proteins: TRalpha1, an authentic thyroid hormone receptor; and TRalpha2, a non-hormone-binding variant that acts as a repressor. TRalpha2-specific splicing requires two regulatory elements that overlap with Rev-erbalpha sequences. Functional mapping of these elements reveals minimal splicing enhancer elements that have evolved within the constraints of the overlapping Rev-erbalpha sequence. These results provide insight into the evolution of regulatory elements within the context of bidirectional coding sequences. They also demonstrate the ability of the genetic code to accommodate multiple layers of information within a given sequence, an important property of the code recently suggested on theoretical grounds.
Affiliation(s)
- Valerie K Salato
- Department of Biological Sciences, Marquette University, Milwaukee, WI, USA
|