1
Kamzeeva PN, Alferova VA, Korshun VA, Varizhuk AM, Aralov AV. 5'-UTR G-Quadruplex-Mediated Translation Regulation in Eukaryotes: Current Understanding and Methodological Challenges. Int J Mol Sci 2025; 26:1187. [PMID: 39940956 PMCID: PMC11818886 DOI: 10.3390/ijms26031187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2024] [Revised: 01/23/2025] [Accepted: 01/28/2025] [Indexed: 02/16/2025] Open
Abstract
RNA G-quadruplexes (rG4s) in 5'-UTRs represent complex regulatory elements capable of both inhibiting and activating mRNA translation through diverse mechanisms in eukaryotes. This review analyzes the evolution of our understanding of 5'-UTR rG4-mediated translation regulation, from early discoveries of simple translation inhibitors to the current recognition of their multifaceted regulatory roles. We discuss canonical and non-canonical rG4 structures, their interactions with regulatory proteins, including helicases and FMRP, and their function in both cap-dependent and IRES-mediated translation. Special attention is given to the synergistic effects between rG4s and upstream open reading frames (uORFs), stress-responsive translation regulation, and their role in repeat-associated non-AUG (RAN) translation linked to neurodegenerative diseases. We critically evaluate methodological challenges in the field, including limitations of current detection methods, reporter system artifacts, and the necessity to verify rG4 presence in endogenous transcripts. Recent technological advances, including genome editing and high-throughput sequencing approaches, have revealed that rG4 effects are more complex and context-dependent than initially thought. This review highlights the importance of developing more robust methodologies for studying rG4s at endogenous levels and carefully reevaluating previously identified targets, while emphasizing their potential as therapeutic targets in various diseases.
Affiliation(s)
- Polina N. Kamzeeva
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Miklukho-Maklaya 16/10, 117997 Moscow, Russia
- Vera A. Alferova
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Miklukho-Maklaya 16/10, 117997 Moscow, Russia
- Vladimir A. Korshun
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Miklukho-Maklaya 16/10, 117997 Moscow, Russia
- Anna M. Varizhuk
- Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine, Federal Medical Biological Agency, 119435 Moscow, Russia
- Andrey V. Aralov
- Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Miklukho-Maklaya 16/10, 117997 Moscow, Russia
- RUDN University, 117198 Moscow, Russia
2
Kim SH, Marinov GK, Greenleaf WJ. KAS-ATAC reveals the genome-wide single-stranded accessible chromatin landscape of the human genome. Genome Res 2025; 35:124-134. [PMID: 39572230 PMCID: PMC11789636 DOI: 10.1101/gr.279621.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Accepted: 11/19/2024] [Indexed: 01/24/2025]
Abstract
Gene regulation in most eukaryotes involves two fundamental processes: alterations in genome packaging by nucleosomes, with active cis-regulatory elements (CREs) generally characterized by open-chromatin configuration, and transcriptional activation. Mapping these physical properties and biochemical activities, through profiling chromatin accessibility and active transcription, is a key tool for understanding the logic and mechanisms of transcription and its regulation. However, the relationship between these two states has not been accessible to simultaneous measurement. To this end, we developed KAS-ATAC, a combination of the kethoxal-assisted ssDNA sequencing (KAS-seq) and assay for transposase-accessible chromatin using sequencing (ATAC-seq) methods for mapping single-stranded DNA (and thus active transcription) and chromatin accessibility, respectively, enabling the genome-wide identification of DNA fragments that are simultaneously accessible and contain ssDNA. We use KAS-ATAC to evaluate levels of active transcription over different CRE classes, to estimate absolute levels of transcribed accessible DNA over CREs, to map nucleosomal configurations associated with RNA polymerase activities, and to assess transcription factor association with transcribed DNA through transcription factor binding site (TFBS) footprinting. We observe lower levels of transcription over distal enhancers compared with promoters and distinct nucleosomal configurations around transcription initiation sites associated with active transcription. We find that most TFs associate equally with transcribed and nontranscribed DNA, but a few factors specifically do not exhibit footprints over ssDNA-containing fragments. We anticipate KAS-ATAC to continue to derive useful insights into chromatin organization and transcriptional regulation in other contexts in the future.
Affiliation(s)
- Samuel H Kim
- Cancer Biology Programs, School of Medicine, Stanford University, Stanford, California 94305, USA
- Georgi K Marinov
- Department of Genetics, School of Medicine, Stanford University, Stanford, California 94305, USA
- William J Greenleaf
- Department of Genetics, School of Medicine, Stanford University, Stanford, California 94305, USA
- Department of Applied Physics, Stanford University, Stanford, California 94305, USA
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, California 94305, USA
- Chan Zuckerberg Biohub, San Francisco, California 94158, USA
3
Alfonso-Gonzalez C, Hilgers V. (Alternative) transcription start sites as regulators of RNA processing. Trends Cell Biol 2024; 34:1018-1028. [PMID: 38531762 DOI: 10.1016/j.tcb.2024.02.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 02/20/2024] [Accepted: 02/23/2024] [Indexed: 03/28/2024]
Abstract
Alternative transcription start site usage (ATSS) is a widespread regulatory strategy that enables genes to choose between multiple genomic loci for initiating transcription. This mechanism is tightly controlled during development and is often altered in disease states. In this review, we examine the growing evidence highlighting a role for transcription start sites (TSSs) in the regulation of mRNA isoform selection during and after transcription. We discuss how the choice of transcription initiation sites influences RNA processing and the importance of this crosstalk for cell identity and organism function. We also speculate on possible mechanisms underlying the integration of transcriptional and post-transcriptional processes.
Affiliation(s)
- Carlos Alfonso-Gonzalez
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany; Faculty of Biology, Albert Ludwigs University, 79104 Freiburg, Germany; International Max Planck Research School for Molecular and Cellular Biology (IMPRS-MCB), 79108 Freiburg, Germany
- Valérie Hilgers
- Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg, Germany.
4
Zhao L, Gong F, Lou K, Wang L, Wang J, Sun H, Wang D, Shi Y, Wang Z. Retrotransposon involves in photoperiodic spermatogenesis in Brandt's voles (Lasiopodomys brandtii) by co-transcription with flagellar genes. Int J Biol Macromol 2024; 281:136224. [PMID: 39362423 DOI: 10.1016/j.ijbiomac.2024.136224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 09/10/2024] [Accepted: 09/30/2024] [Indexed: 10/05/2024]
Abstract
Photoperiod is a pivotal factor affecting spermatogenesis in seasonally breeding animals. Transposable elements have regulatory functions during spermatogenesis. However, whether they also function in photoperiodic spermatogenesis in seasonally breeding animals is unknown. To explore this, we first annotated 5,501,822 transposons in the whole genome of Brandt's voles (Lasiopodomys brandtii) and found that LINEs were the most abundant, comprising 16.61% of the genome, followed by SINEs (10.13%), LTRs (7.54%), and DNA transposons (0.70%). Subsequently, we exposed male Brandt's voles to long-photoperiod (LP, 16 h/day) and short-photoperiod (SP, 8 h/day) conditions from their embryonic stages, and obtained testis transcriptomes at 4 and 10 weeks after birth. Differential expression and Pearson correlation analyses indicated strongly positive correlations between the expression of differentially expressed retrotransposons and that of the adjacent genes. KO, KEGG and GSEA results showed that sperm flagellar genes, such as Dnah1, Dnah2, Dnah17, and Dnali1, were the most enriched near the retrotransposons. RT-PCR results showed that SINE/Alu_1213291 was co-transcribed with the Dnali1 gene. Our findings are the first to reveal a regulatory function of transposons in photoperiodic spermatogenesis, providing insights into the role of photoperiod in seasonal reproduction in wild animals.
Affiliation(s)
- Lijuan Zhao
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, Henan, China
- Fanglei Gong
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, Henan, China
- Kang Lou
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, Henan, China
- Lewen Wang
- State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China; Western Agricultural Research Center, Chinese Academy of Agriculture Science, Changji 831100, China
- Jingou Wang
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, Henan, China
- Hong Sun
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, Henan, China; Centre for Sport Nutrition and Health, School of Physical Education (Main Campus), Zhengzhou University, Zhengzhou 450001, Henan, China
- Dawei Wang
- State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China; Western Agricultural Research Center, Chinese Academy of Agriculture Science, Changji 831100, China.
- Yuhua Shi
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, Henan, China.
- Zhenlong Wang
- School of Life Sciences, Zhengzhou University, Zhengzhou 450001, Henan, China.
5
Lebherz MK, Fouks B, Schmidt J, Bornberg-Bauer E, Grandchamp A. DNA Transposons Favor De Novo Transcript Emergence Through Enrichment of Transcription Factor Binding Motifs. Genome Biol Evol 2024; 16:evae134. [PMID: 38934893 PMCID: PMC11264136 DOI: 10.1093/gbe/evae134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Revised: 06/11/2024] [Accepted: 06/15/2024] [Indexed: 06/28/2024] Open
Abstract
De novo genes emerge from noncoding regions of genomes via succession of mutations. Among others, such mutations activate transcription and create a new open reading frame (ORF). Although the mechanisms underlying ORF emergence are well documented, relatively little is known about the mechanisms enabling new transcription events. Yet, in many species a continuum between absent and very prominent transcription has been reported for essentially all regions of the genome. In this study, we searched for de novo transcripts by using newly assembled genomes and transcriptomes of seven inbred lines of Drosophila melanogaster, originating from six European and one African population. This setup allowed us to detect sample specific de novo transcripts, and compare them to their homologous nontranscribed regions in other samples, as well as genic and intergenic control sequences. We studied the association with transposable elements (TEs) and the enrichment of transcription factor motifs upstream of de novo emerged transcripts and compared them with regulatory elements. We found that de novo transcripts overlap with TEs more often than expected by chance. The emergence of new transcripts correlates with regions of high guanine-cytosine content and TE expression. Moreover, upstream regions of de novo transcripts are highly enriched with regulatory motifs. Such motifs are more enriched in new transcripts overlapping with TEs, particularly DNA TEs, and are more conserved upstream de novo transcripts than upstream their 'nontranscribed homologs'. Overall, our study demonstrates that TE insertion is important for transcript emergence, partly by introducing new regulatory motifs from DNA TE families.
Affiliation(s)
- Bertrand Fouks
- CEFE, Univ Montpellier, CNRS, EPHE, IRD, Montpellier, France
- UMR AGAP Institut, Univ Montpellier, CIRAD, INRAE, Institut Agro, F-34398, Montpellier, France
- CIRAD, UMR AGAP Institut, F-34398, Montpellier, France
- Julian Schmidt
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Department of Protein Evolution, Max Planck Institute for Biology, Tübingen, Germany
- Anna Grandchamp
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
6
Wudarski J, Aliabadi S, Gulia-Nuss M. Arthropod promoters for genetic control of disease vectors. Trends Parasitol 2024; 40:619-632. [PMID: 38824066 PMCID: PMC11223965 DOI: 10.1016/j.pt.2024.04.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 04/15/2024] [Accepted: 04/15/2024] [Indexed: 06/03/2024]
Abstract
Vector-borne diseases (VBDs) impose devastating effects on human health and a heavy financial burden. Malaria, Lyme disease, and dengue fever are just a few examples of VBDs that cause severe illnesses. The current strategies to control VBDs consist mainly of environmental modification and chemical use, and to a small extent, genetic approaches. The genetic approaches, including transgenesis/genome modification and gene-drive technologies, provide the basis for developing new tools for VBD prevention by suppressing vector populations or reducing their capacity to transmit pathogens. The regulatory elements such as promoters are required for a robust sex-, tissue-, and stage-specific transgene expression. As discussed in this review, information on the regulatory elements is available for mosquito vectors but is scant for other vectors.
Affiliation(s)
- Jakub Wudarski
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, USA
- Simindokht Aliabadi
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, USA
- Monika Gulia-Nuss
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, NV, USA.
7
Lykoskoufis NMR, Planet E, Ongen H, Trono D, Dermitzakis ET. Transposable elements mediate genetic effects altering the expression of nearby genes in colorectal cancer. Nat Commun 2024; 15:749. [PMID: 38272908 PMCID: PMC10811328 DOI: 10.1038/s41467-023-42405-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2021] [Accepted: 10/10/2023] [Indexed: 01/27/2024] Open
Abstract
Transposable elements (TEs) are prevalent repeats in the human genome, play a significant role in the regulome, and their disruption can contribute to tumorigenesis. However, TE influence on gene expression in cancer remains unclear. Here, we analyze 275 normal colon and 276 colorectal cancer samples from the SYSCOL cohort, discovering 10,231 and 5,199 TE-expression quantitative trait loci (eQTLs) in normal and tumor tissues, respectively, of which 376 are colorectal cancer specific eQTLs, likely due to methylation changes. Tumor-specific TE-eQTLs show greater enrichment of transcription factors, compared to shared TE-eQTLs suggesting specific regulation of their expression in tumor. Bayesian networks reveal 1,766 TEs as mediators of genetic effects, altering the expression of 1,558 genes, including 55 known cancer driver genes and show that tumor-specific TE-eQTLs trigger the driver capability of TEs. These insights expand our knowledge of cancer drivers, deepening our understanding of tumorigenesis and presenting potential avenues for therapeutic interventions.
Affiliation(s)
- Nikolaos M R Lykoskoufis
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211, Geneva, Switzerland.
- Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211, Geneva, Switzerland.
- Swiss Institute of Bioinformatics, 1211, Geneva, Switzerland.
- NGS-AI JSR Life Sciences, Route de la Corniche 3, 1066, Epalinges, Switzerland.
- Evarist Planet
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
- Halit Ongen
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211, Geneva, Switzerland
- Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211, Geneva, Switzerland
- Swiss Institute of Bioinformatics, 1211, Geneva, Switzerland
- Didier Trono
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
- Emmanouil T Dermitzakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211, Geneva, Switzerland.
- Institute for Genetics and Genomics in Geneva (iGE3), University of Geneva, 1211, Geneva, Switzerland.
- Swiss Institute of Bioinformatics, 1211, Geneva, Switzerland.
8
Zhu Y, Vvedenskaya IO, Sze SH, Nickels BE, Kaplan CD. Quantitative analysis of transcription start site selection reveals control by DNA sequence, RNA polymerase II activity and NTP levels. Nat Struct Mol Biol 2024; 31:190-202. [PMID: 38177677 PMCID: PMC10928753 DOI: 10.1038/s41594-023-01171-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 11/03/2023] [Indexed: 01/06/2024]
Abstract
Transcription start site (TSS) selection is a key step in gene expression and occurs at many promoter positions over a wide range of efficiencies. Here we develop a massively parallel reporter assay to quantitatively dissect contributions of promoter sequence, nucleoside triphosphate substrate levels and RNA polymerase II (Pol II) activity to TSS selection by 'promoter scanning' in Saccharomyces cerevisiae (Pol II MAssively Systematic Transcript End Readout, 'Pol II MASTER'). Using Pol II MASTER, we measure the efficiency of Pol II initiation at 1,000,000 individual TSS sequences in a defined promoter context. Pol II MASTER confirms proposed critical qualities of S. cerevisiae TSS -8, -1 and +1 positions, quantitatively, in a controlled promoter context. Pol II MASTER extends quantitative analysis to surrounding sequences and determines that they tune initiation over a wide range of efficiencies. These results enabled the development of a predictive model for initiation efficiency based on sequence. We show that genetic perturbation of Pol II catalytic activity alters initiation efficiency mostly independently of TSS sequence, but selectively modulates preference for the initiating nucleotide. Intriguingly, we find that Pol II initiation efficiency is directly sensitive to guanosine-5'-triphosphate levels at the first five transcript positions and to cytosine-5'-triphosphate and uridine-5'-triphosphate levels at the second position genome wide. These results suggest individual nucleoside triphosphate levels can have transcript-specific effects on initiation, representing a cryptic layer of potential regulation at the level of Pol II biochemical properties. The results establish Pol II MASTER as a method for quantitative dissection of transcription initiation in eukaryotes.
Affiliation(s)
- Yunye Zhu
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Irina O Vvedenskaya
- Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
- Sing-Hoi Sze
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, USA
- Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
- Bryce E Nickels
- Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
- Craig D Kaplan
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA.
9
Oliveira DS, Fablet M, Larue A, Vallier A, Carareto CA, Rebollo R, Vieira C. ChimeraTE: a pipeline to detect chimeric transcripts derived from genes and transposable elements. Nucleic Acids Res 2023; 51:9764-9784. [PMID: 37615575 PMCID: PMC10570057 DOI: 10.1093/nar/gkad671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Revised: 07/25/2023] [Accepted: 08/09/2023] [Indexed: 08/25/2023] Open
Abstract
Transposable elements (TEs) produce structural variants and are considered an important source of genetic diversity. Notably, TE-gene fusion transcripts, i.e. chimeric transcripts, have been associated with adaptation in several species. However, the identification of these chimeras remains hindered due to the lack of detection tools at a transcriptome-wide scale, and to the reliance on a reference genome, even though different individuals/cells/strains have different TE insertions. Therefore, we developed ChimeraTE, a pipeline that uses paired-end RNA-seq reads to identify chimeric transcripts through two different modes. Mode 1 is the reference-guided approach that employs canonical genome alignment, and Mode 2 identifies chimeras derived from fixed or insertionally polymorphic TEs without any reference genome. We have validated both modes using RNA-seq data from four Drosophila melanogaster wild-type strains. We found ∼1.12% of all genes generating chimeric transcripts, most of them from TE-exonized sequences. Approximately 23% of all detected chimeras were absent from the reference genome, indicating that TEs belonging to chimeric transcripts may be recent, polymorphic insertions. ChimeraTE is the first pipeline able to automatically uncover chimeric transcripts without a reference genome, consisting of two running Modes that can be used as a tool to investigate the contribution of TEs to transcriptome plasticity.
Affiliation(s)
- Daniel S Oliveira
- São Paulo State University (Unesp), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, SP, Brazil
- Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR5558, Villeurbanne, Rhone-Alpes, 69100, France
- Marie Fablet
- Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR5558, Villeurbanne, Rhone-Alpes, 69100, France
- Institut Universitaire de France (IUF), Paris, Île-de-France, F-75231, France
- Anaïs Larue
- Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR5558, Villeurbanne, Rhone-Alpes, 69100, France
- Univ Lyon, INRAE, INSA-Lyon, BF2I, UMR 203, 69621 Villeurbanne, France
- Agnès Vallier
- Univ Lyon, INRAE, INSA-Lyon, BF2I, UMR 203, 69621 Villeurbanne, France
- Claudia M A Carareto
- São Paulo State University (Unesp), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, SP, Brazil
- Rita Rebollo
- Univ Lyon, INRAE, INSA-Lyon, BF2I, UMR 203, 69621 Villeurbanne, France
- Cristina Vieira
- Laboratoire de Biométrie et Biologie Evolutive, Université Lyon 1, CNRS, UMR5558, Villeurbanne, Rhone-Alpes, 69100, France
10
Coronado-Zamora M, González J. Transposons contribute to the functional diversification of the head, gut, and ovary transcriptomes across Drosophila natural strains. Genome Res 2023; 33:1541-1553. [PMID: 37793782 PMCID: PMC10620055 DOI: 10.1101/gr.277565.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Accepted: 08/08/2023] [Indexed: 10/06/2023]
Abstract
Transcriptomes are dynamic, with cells, tissues, and body parts expressing particular sets of transcripts. Transposable elements (TEs) are a known source of transcriptome diversity; however, studies often focus on a particular type of chimeric transcript, analyze single body parts or cell types, or are based on incomplete TE annotations from a single reference genome. In this work, we have implemented a method based on de novo transcriptome assembly that minimizes the potential sources of errors while identifying a comprehensive set of gene-TE chimeras. We applied this method to the head, gut, and ovary dissected from five Drosophila melanogaster natural strains, with individual reference genomes available. We found that ∼19% of body part-specific transcripts are gene-TE chimeras. Overall, chimeric transcripts contribute a mean of 43% to the total gene expression, and they provide protein domains for DNA binding, catalytic activity, and DNA polymerase activity. Our comprehensive data set is a rich resource for follow-up analysis. Moreover, because TEs are present in virtually all species sequenced to date, their role in spatially restricted transcript expression is likely not exclusive to the species analyzed in this work.
Affiliation(s)
- Josefa González
- Institute of Evolutionary Biology, CSIC, UPF, Barcelona 08003, Spain
11
Fan K, Pfister E, Weng Z. Toward a comprehensive catalog of regulatory elements. Hum Genet 2023; 142:1091-1111. [PMID: 36935423 DOI: 10.1007/s00439-023-02519-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Accepted: 01/03/2023] [Indexed: 03/21/2023]
Abstract
Regulatory elements are the genomic regions that interact with transcription factors to control cell-type-specific gene expression in different cellular environments. A precise and complete catalog of functional elements encoded by the human genome is key to understanding mammalian gene regulation. Here, we review the current state of regulatory element annotation. We first provide an overview of assays for characterizing functional elements, including genome, epigenome, transcriptome, three-dimensional chromatin interaction, and functional validation assays. We then discuss computational methods for defining regulatory elements, including peak-calling and other statistical modeling methods. Finally, we introduce several high-quality lists of regulatory element annotations and suggest potential future directions.
Affiliation(s)
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, 368 Plantation Street, ASC5-1069, Worcester, MA, 01605, USA
- Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, MA, 02138, USA
- Edith Pfister
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, 368 Plantation Street, ASC5-1069, Worcester, MA, 01605, USA
- Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, 368 Plantation Street, ASC5-1069, Worcester, MA, 01605, USA.
12
Hitz BC, Lee JW, Jolanki O, Kagda MS, Graham K, Sud P, Gabdank I, Strattan JS, Sloan CA, Dreszer T, Rowe LD, Podduturi NR, Malladi VS, Chan ET, Davidson JM, Ho M, Miyasato S, Simison M, Tanaka F, Luo Y, Whaling I, Hong EL, Lee BT, Sandstrom R, Rynes E, Nelson J, Nishida A, Ingersoll A, Buckley M, Frerker M, Kim DS, Boley N, Trout D, Dobin A, Rahmanian S, Wyman D, Balderrama-Gutierrez G, Reese F, Durand NC, Dudchenko O, Weisz D, Rao SSP, Blackburn A, Gkountaroulis D, Sadr M, Olshansky M, Eliaz Y, Nguyen D, Bochkov I, Shamim MS, Mahajan R, Aiden E, Gingeras T, Heath S, Hirst M, Kent WJ, Kundaje A, Mortazavi A, Wold B, Cherry JM. The ENCODE Uniform Analysis Pipelines. RESEARCH SQUARE 2023:rs.3.rs-3111932. [PMID: 37503119 PMCID: PMC10371165 DOI: 10.21203/rs.3.rs-3111932/v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 07/29/2023]
Abstract
The Encyclopedia of DNA elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19000 functional genomics experiments across more than 1000 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines in order to promote data provenance and reproducibility as well as allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/) is publicly available in GitHub, with images available on Dockerhub (https://hub.docker.com), enabling access to a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud allows small labs the ability to use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections - a prerequisite for successful integrative analyses.
Affiliation(s)
- Benjamin C Hitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Jin-Wook Lee
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Otto Jolanki
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Meenakshi S Kagda
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Keenan Graham
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Paul Sud
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Idan Gabdank
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- J Seth Strattan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Cricket A Sloan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Timothy Dreszer
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Laurence D Rowe
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Nikhil R Podduturi
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Venkat S Malladi
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Esther T Chan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Jean M Davidson
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Marcus Ho
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Stuart Miyasato
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Matt Simison
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Forrest Tanaka
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Yunhai Luo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Ian Whaling
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Eurie L Hong
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
- Brian T Lee
- Genomics Institute, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Richard Sandstrom
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Eric Rynes
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Jemma Nelson
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Andrew Nishida
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Alyssa Ingersoll
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Michael Buckley
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Mark Frerker
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Daniel S Kim
- Department of Genetics, Department of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
| | - Nathan Boley
- Department of Genetics, Department of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
| | - Diane Trout
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125 USA
| | - Alex Dobin
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sorena Rahmanian
- Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
| | - Dana Wyman
- Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
| | | | - Fairlie Reese
- Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
| | - Neva C Durand
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Department of Computer Science, Rice University, Houston, TX 77030, USA
| | - Olga Dudchenko
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - David Weisz
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Suhas S P Rao
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Department of Medicine, University of California San Francisco, San Francisco, CA 94143, USA
| | - Alyssa Blackburn
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
| | - Dimos Gkountaroulis
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
| | - Mahdi Sadr
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Moshe Olshansky
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Yossi Eliaz
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Dat Nguyen
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Ivan Bochkov
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Muhammad Saad Shamim
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA
| | - Ragini Mahajan
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Department of BioSciences, Rice University, Houston, TX 77005, USA
| | - Erez Aiden
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
| | - Tom Gingeras
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Simon Heath
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain. Universitat Pompeu Fabra, Barcelona, Spain
| | - Martin Hirst
- Michael Smith Laboratories, University of British Columbia, British Columbia, Canada
| | - W James Kent
- Genomics Institute, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Anshul Kundaje
- Department of Genetics, Department of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
| | - Ali Mortazavi
- Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
| | - Barbara Wold
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125 USA
| | - J Michael Cherry
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
|
13
|
Vilar JMG, Saiz L. Multi-landmark alignment of genomic signals reveals conserved expression patterns across transcription start sites. Sci Rep 2023; 13:10835. [PMID: 37407625 DOI: 10.1038/s41598-023-37140-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 06/16/2023] [Indexed: 07/07/2023] Open
Abstract
The prevalent one-dimensional alignment of genomic signals to a reference landmark is a cornerstone of current methods to study transcription and its DNA-dependent processes but it is prone to mask potential relations among multiple DNA elements. We developed a systematic approach to align genomic signals to multiple locations simultaneously by expanding the dimensionality of the genomic-coordinate space. We analyzed transcription in human and uncovered a complex dependence on the relative position of neighboring transcription start sites (TSSs) that is consistently conserved among cell types. The dependence ranges from enhancement to suppression of transcription depending on the relative distances to the TSSs, their intragenic position, and the transcriptional activity of the gene. Our results reveal a conserved hierarchy of alternative TSS usage within a previously unrecognized level of genomic organization and provide a general methodology to analyze complex functional relationships among multiple types of DNA elements.
Affiliation(s)
- Jose M G Vilar
- Biofisika Institute (CSIC, UPV/EHU), University of the Basque Country (UPV/EHU), P.O. Box 644, 48080, Bilbao, Spain.
- IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain.
| | - Leonor Saiz
- Department of Biomedical Engineering, University of California, 451 E. Health Sciences Drive, Davis, CA, 95616, USA.
|
14
|
Pagni S, Custodio HM, Frankish A, Mudge JM, Mills JD, Sisodiya SM. SCN1A: bioinformatically informed revised boundaries for promoter and enhancer regions. Hum Mol Genet 2023; 32:1753-1763. [PMID: 36715146 PMCID: PMC10162429 DOI: 10.1093/hmg/ddad015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Revised: 01/06/2023] [Accepted: 01/24/2023] [Indexed: 01/31/2023] Open
Abstract
Pathogenic variations in the sodium voltage-gated channel alpha subunit 1 (SCN1A) gene are responsible for multiple epilepsy phenotypes, including Dravet syndrome, febrile seizures (FS) and genetic epilepsy with FS plus. Phenotypic heterogeneity is a hallmark of SCN1A-related epilepsies, the causes of which are yet to be clarified. Genetic variation in the non-coding regulatory regions of SCN1A could be one potential causal factor. However, a comprehensive understanding of the SCN1A regulatory landscape is currently lacking. Here, we summarized the current state of knowledge of SCN1A regulation, providing details on its promoter and enhancer regions. We then integrated currently available data on SCN1A promoters by extracting information related to the SCN1A locus from genome-wide repositories and clearly defined the promoter and enhancer regions of SCN1A. Further, we explored the cellular specificity of differential SCN1A promoter usage. We also reviewed and integrated the available human brain-derived enhancer databases and mouse-derived data to provide a comprehensive computationally developed summary of SCN1A brain-active enhancers. By querying genome-wide data repositories, extracting SCN1A-specific data and integrating the different types of independent evidence, we created a comprehensive catalogue that better defines the regulatory landscape of SCN1A, which could be used to explore the role of SCN1A regulatory regions in disease.
Affiliation(s)
- Susanna Pagni
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Chalfont Centre for Epilepsy, Bucks SL9 0RJ, UK
| | - Helena Martins Custodio
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Chalfont Centre for Epilepsy, Bucks SL9 0RJ, UK
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - Jonathan M Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge, UK
| | - James D Mills
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Chalfont Centre for Epilepsy, Bucks SL9 0RJ, UK
- Amsterdam UMC, Department of (Neuro) Pathology, Amsterdam Neuroscience, University of Amsterdam, Amsterdam, 1105 AZ The Netherlands
| | - Sanjay M Sisodiya
- Department of Clinical and Experimental Epilepsy, UCL Queen Square Institute of Neurology, London WC1N 3BG, UK
- Chalfont Centre for Epilepsy, Bucks SL9 0RJ, UK
|
15
|
Hitz BC, Lee JW, Jolanki O, Kagda MS, Graham K, Sud P, Gabdank I, Strattan JS, Sloan CA, Dreszer T, Rowe LD, Podduturi NR, Malladi VS, Chan ET, Davidson JM, Ho M, Miyasato S, Simison M, Tanaka F, Luo Y, Whaling I, Hong EL, Lee BT, Sandstrom R, Rynes E, Nelson J, Nishida A, Ingersoll A, Buckley M, Frerker M, Kim DS, Boley N, Trout D, Dobin A, Rahmanian S, Wyman D, Balderrama-Gutierrez G, Reese F, Durand NC, Dudchenko O, Weisz D, Rao SSP, Blackburn A, Gkountaroulis D, Sadr M, Olshansky M, Eliaz Y, Nguyen D, Bochkov I, Shamim MS, Mahajan R, Aiden E, Gingeras T, Heath S, Hirst M, Kent WJ, Kundaje A, Mortazavi A, Wold B, Cherry JM. The ENCODE Uniform Analysis Pipelines. bioRxiv 2023:2023.04.04.535623. [PMID: 37066421 PMCID: PMC10104020 DOI: 10.1101/2023.04.04.535623] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 04/18/2023]
Abstract
The Encyclopedia of DNA elements (ENCODE) project is a collaborative effort to create a comprehensive catalog of functional elements in the human genome. The current database comprises more than 19000 functional genomics experiments across more than 1000 cell lines and tissues using a wide array of experimental techniques to study the chromatin structure, regulatory and transcriptional landscape of the Homo sapiens and Mus musculus genomes. All experimental data, metadata, and associated computational analyses created by the ENCODE consortium are submitted to the Data Coordination Center (DCC) for validation, tracking, storage, and distribution to community resources and the scientific community. The ENCODE project has engineered and distributed uniform processing pipelines in order to promote data provenance and reproducibility as well as allow interoperability between genomic resources and other consortia. All data files, reference genome versions, software versions, and parameters used by the pipelines are captured and available via the ENCODE Portal. The pipeline code, developed using Docker and Workflow Description Language (WDL; https://openwdl.org/) is publicly available in GitHub, with images available on Dockerhub (https://hub.docker.com), enabling access to a diverse range of biomedical researchers. ENCODE pipelines maintained and used by the DCC can be installed to run on personal computers, local HPC clusters, or in cloud computing environments via Cromwell. Access to the pipelines and data via the cloud allows small labs the ability to use the data or software without access to institutional compute clusters. Standardization of the computational methodologies for analysis and quality control leads to comparable results from different ENCODE collections - a prerequisite for successful integrative analyses.
Affiliation(s)
- Benjamin C Hitz
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Jin-Wook Lee
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Otto Jolanki
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Meenakshi S Kagda
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Keenan Graham
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Paul Sud
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Idan Gabdank
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - J Seth Strattan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Cricket A Sloan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Timothy Dreszer
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Laurence D Rowe
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Nikhil R Podduturi
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Venkat S Malladi
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Esther T Chan
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Jean M Davidson
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Marcus Ho
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Stuart Miyasato
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Matt Simison
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Forrest Tanaka
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Yunhai Luo
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Ian Whaling
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Eurie L Hong
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Brian T Lee
- Genomics Institute, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Richard Sandstrom
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Eric Rynes
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Jemma Nelson
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Andrew Nishida
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Alyssa Ingersoll
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Michael Buckley
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Mark Frerker
- Altius Institute for Biomedical Sciences, 2211 Elliott Avenue, 6th Floor, Seattle, WA 98121, USA
| | - Daniel S Kim
- Dept. of Genetics, Dept. of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
| | - Nathan Boley
- Dept. of Genetics, Dept. of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
| | - Diane Trout
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125 USA
| | - Alex Dobin
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Sorena Rahmanian
- Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
| | - Dana Wyman
- Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
| | | | - Fairlie Reese
- Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
| | - Neva C Durand
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Department of Computer Science, Rice University, Houston, TX 77030, USA
| | - Olga Dudchenko
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - David Weisz
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Suhas S P Rao
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
- Department of Medicine, University of California San Francisco, San Francisco, CA 94143, USA
| | - Alyssa Blackburn
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
| | - Dimos Gkountaroulis
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
| | - Mahdi Sadr
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Moshe Olshansky
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Yossi Eliaz
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Dat Nguyen
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Ivan Bochkov
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Muhammad Saad Shamim
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Department of Bioengineering, Rice University, Houston, TX 77030, USA
- Medical Scientist Training Program, Baylor College of Medicine, Houston, TX 77030, USA
| | - Ragini Mahajan
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
- Department of BioSciences, Rice University, Houston, TX 77005, USA
| | - Erez Aiden
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77030, USA
| | - Tom Gingeras
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
| | - Simon Heath
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain. Universitat Pompeu Fabra, Barcelona, Spain
| | - Martin Hirst
- Michael Smith Laboratories, University of British Columbia, British Columbia, Canada
| | - W James Kent
- Genomics Institute, School of Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
| | - Anshul Kundaje
- Dept. of Genetics, Dept. of Computer Science, Stanford University, 240 Pasteur Drive, Palo Alto, CA 94304, USA
| | - Ali Mortazavi
- Center for Complex Biological Systems, University of California, Irvine, Irvine, CA 92697, USA
| | - Barbara Wold
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, 91125 USA
| | - J Michael Cherry
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
|
16
|
Hamamoto K, Umemura Y, Makino S, Fukaya T. Dynamic interplay between non-coding enhancer transcription and gene activity in development. Nat Commun 2023; 14:826. [PMID: 36805453 PMCID: PMC9941499 DOI: 10.1038/s41467-023-36485-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/24/2022] [Accepted: 02/03/2023] [Indexed: 02/22/2023] Open
Abstract
Non-coding transcription at the intergenic regulatory regions is a prevalent feature of metazoan genomes, but its biological function remains uncertain. Here, we devise a live-imaging system that permits simultaneous visualization of gene activity along with intergenic non-coding transcription at single-cell resolution in Drosophila. Quantitative image analysis reveals that elongation of RNA polymerase II across the internal core region of enhancers leads to suppression of transcriptional bursting from linked genes. Super-resolution imaging and genome-editing analysis further demonstrate that enhancer transcription antagonizes molecular crowding of transcription factors, thereby interrupting the formation of a transcription hub at the gene locus. We also show that a certain class of developmental enhancers are structurally optimized to co-activate gene transcription together with non-coding transcription effectively. We suggest that enhancer function is flexibly tunable through the modulation of hub formation via surrounding non-coding transcription during development.
Affiliation(s)
- Kota Hamamoto
- Laboratory of Transcription Dynamics, Research Center for Biological Visualization, Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
| | - Yusuke Umemura
- Laboratory of Transcription Dynamics, Research Center for Biological Visualization, Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
| | - Shiho Makino
- Laboratory of Transcription Dynamics, Research Center for Biological Visualization, Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
| | - Takashi Fukaya
- Laboratory of Transcription Dynamics, Research Center for Biological Visualization, Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
|
17
|
Gasparotto E, Burattin FV, Di Gioia V, Panepuccia M, Ranzani V, Marasca F, Bodega B. Transposable Elements Co-Option in Genome Evolution and Gene Regulation. Int J Mol Sci 2023; 24:2610. [PMID: 36768929 PMCID: PMC9917352 DOI: 10.3390/ijms24032610] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/23/2022] [Revised: 01/26/2023] [Accepted: 01/28/2023] [Indexed: 01/31/2023] Open
Abstract
The genome is no longer deemed as a fixed and inert item but rather as a moldable matter that is continuously evolving and adapting. Within this frame, Transposable Elements (TEs), ubiquitous, mobile, repetitive elements, are considered an alive portion of the genomes to date, whose functions, although long considered "dark", are now coming to light. Here we will review that, besides the detrimental effects that TE mobilization can induce, TEs have shaped genomes in their current form, promoting genome sizing, genomic rearrangements and shuffling of DNA sequences. Although TEs are mostly represented in the genomes by evolutionarily old, short, degenerated, and sedentary fossils, they have been thoroughly co-opted by the hosts as a prolific and original source of regulatory instruments for the control of gene transcription and genome organization in the nuclear space. For these reasons, the deregulation of TE expression and/or activity is implicated in the onset and progression of several diseases. It is likely that we have just revealed the outermost layers of TE functions. Further studies on this portion of the genome are required to unlock novel regulatory functions that could also be exploited for diagnostic and therapeutic approaches.
Affiliation(s)
- Erica Gasparotto
- Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
- SEMM, European School of Molecular Medicine, 20139 Milan, Italy
| | - Filippo Vittorio Burattin
- Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
- Department of Biosciences, University of Milan, 20133 Milan, Italy
| | - Valeria Di Gioia
- Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
- SEMM, European School of Molecular Medicine, 20139 Milan, Italy
| | - Michele Panepuccia
- Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
| | - Valeria Ranzani
- Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
| | - Federica Marasca
- Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
- Department of Clinical Sciences and Community Health, University of Milan, 20122 Milan, Italy
| | - Beatrice Bodega
- Fondazione INGM, Istituto Nazionale di Genetica Molecolare “Enrica e Romeo Invernizzi”, 20122 Milan, Italy
- Department of Biosciences, University of Milan, 20133 Milan, Italy
|
18
|
Murray A, Mendieta JP, Vollmers C, Schmitz RJ. Simple and accurate transcriptional start site identification using Smar2C2 and examination of conserved promoter features. Plant J 2022; 112:583-596. [PMID: 36030508 PMCID: PMC9827901 DOI: 10.1111/tpj.15957] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/11/2022] [Revised: 08/12/2022] [Accepted: 08/22/2022] [Indexed: 06/15/2023]
Abstract
The precise and accurate identification and quantification of transcriptional start sites (TSSs) is key to understanding the control of transcription. The core promoter consists of the TSS and proximal non-coding sequences, which are critical in transcriptional regulation. Therefore, the accurate identification of TSSs is important for understanding the molecular regulation of transcription. Existing protocols for TSS identification are challenging and expensive, leaving high-quality data available for a small subset of organisms. This sparsity of data impairs study of TSS usage across tissues or in an evolutionary context. To address these shortcomings, we developed Smart-Seq2 Rolling Circle to Concatemeric Consensus (Smar2C2), which identifies and quantifies TSSs and transcription termination sites. Smar2C2 incorporates unique molecular identifiers that allowed for the identification of as many as 70 million sites, with no known upper limit. We have also generated TSS data sets from as little as 40 pg of total RNA, which was the smallest input tested. In this study, we used Smar2C2 to identify TSSs in Glycine max (soybean), Oryza sativa (rice), Sorghum bicolor (sorghum), Triticum aestivum (wheat) and Zea mays (maize) across multiple tissues. This wide panel of plant TSSs facilitated the identification of evolutionarily conserved features, such as novel patterns in the dinucleotides that compose the initiator element (Inr), that correlated with promoter expression levels across all species examined. We also discovered sequence variations in known promoter motifs that are positioned reliably close to the TSS, such as differences in the TATA box and in the Inr that may prove significant to our understanding and control of transcription initiation. Smar2C2 allows for the easy study of these critical sequences, providing a tool to facilitate discovery.
Affiliation(s)
- Andrew Murray
- Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
- Chris Vollmers
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA

19
Yao L, Liang J, Ozer A, Leung AKY, Lis JT, Yu H. A comparison of experimental assays and analytical methods for genome-wide identification of active enhancers. Nat Biotechnol 2022; 40:1056-1065. [PMID: 35177836 PMCID: PMC9288987 DOI: 10.1038/s41587-022-01211-7] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 06/03/2021] [Accepted: 01/06/2022] [Indexed: 01/15/2023]
Abstract
Mounting evidence supports the idea that transcriptional patterns serve as more specific identifiers of active enhancers than histone marks; however, the optimal strategy to identify active enhancers both experimentally and computationally has not been determined. Here, we compared 13 genome-wide RNA sequencing (RNA-seq) assays in K562 cells and show that nuclear run-on followed by cap-selection assay (GRO/PRO-cap) has advantages in enhancer RNA detection and active enhancer identification. We also introduce a tool, peak identifier for nascent transcript starts (PINTS), to identify active promoters and enhancers genome wide and pinpoint the precise location of 5' transcription start sites. Finally, we compiled a comprehensive enhancer candidate compendium based on the detected enhancer RNA (eRNA) transcription start sites (TSSs) available in 120 cell and tissue types, which can be accessed at https://pints.yulab.org. With knowledge of the best available assays and pipelines, this large-scale annotation of candidate enhancers will pave the way for selection and characterization of their functions in a time- and labor-efficient manner.
Affiliation(s)
- Li Yao
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Jin Liang
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Abdullah Ozer
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Alden King-Yung Leung
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- John T Lis
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Haiyuan Yu
- Department of Computational Biology, Cornell University, Ithaca, NY, USA
- Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA

20
Tombácz D, Kakuk B, Torma G, Csabai Z, Gulyás G, Tamás V, Zádori Z, Jefferson VA, Meyer F, Boldogkői Z. In-Depth Temporal Transcriptome Profiling of an Alphaherpesvirus Using Nanopore Sequencing. Viruses 2022; 14:1289. [PMID: 35746760 PMCID: PMC9229804 DOI: 10.3390/v14061289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2022] [Revised: 06/05/2022] [Accepted: 06/08/2022] [Indexed: 12/10/2022] Open
Abstract
In this work, a long-read sequencing (LRS) technique based on the Oxford Nanopore Technologies MinION platform was used for the quantification and kinetic characterization of the poly(A) fraction of the bovine alphaherpesvirus type 1 (BoHV-1) lytic transcriptome across a 12-h infection period. Amplification-based LRS techniques frequently generate artefactual transcription reads and are biased towards the production of shorter amplicons. To avoid these undesired effects, we applied direct cDNA sequencing, an amplification-free technique. Here, we show that a single promoter can produce multiple transcription start sites whose distribution patterns differ among the viral genes but are similar in the same gene at different timepoints. Our investigations revealed that the circ gene is expressed with immediate–early (IE) kinetics by utilizing a special mechanism based on the use of the promoter of another IE gene (bicp4) for transcriptional control. Furthermore, we detected an overlap between the initiation of DNA replication and transcription from the bicp22 gene, which suggests an interaction between the two molecular machineries. This study developed a generally applicable LRS-based method for the time-course characterization of transcriptomes of any organism.
Affiliation(s)
- Dóra Tombácz
- Department of Medical Biology, Albert Szent-Györgyi Medical School, University of Szeged, Somogyi u. 4, 6720 Szeged, Hungary; (D.T.); (B.K.); (G.T.); (Z.C.); (G.G.)
- Balázs Kakuk
- Department of Medical Biology, Albert Szent-Györgyi Medical School, University of Szeged, Somogyi u. 4, 6720 Szeged, Hungary; (D.T.); (B.K.); (G.T.); (Z.C.); (G.G.)
- Gábor Torma
- Department of Medical Biology, Albert Szent-Györgyi Medical School, University of Szeged, Somogyi u. 4, 6720 Szeged, Hungary; (D.T.); (B.K.); (G.T.); (Z.C.); (G.G.)
- Zsolt Csabai
- Department of Medical Biology, Albert Szent-Györgyi Medical School, University of Szeged, Somogyi u. 4, 6720 Szeged, Hungary; (D.T.); (B.K.); (G.T.); (Z.C.); (G.G.)
- Gábor Gulyás
- Department of Medical Biology, Albert Szent-Györgyi Medical School, University of Szeged, Somogyi u. 4, 6720 Szeged, Hungary; (D.T.); (B.K.); (G.T.); (Z.C.); (G.G.)
- Vivien Tamás
- Institute for Veterinary Medical Research, Centre for Agricultural Research, Hungária krt. 21, 1143 Budapest, Hungary; (V.T.); (Z.Z.)
- Zoltán Zádori
- Institute for Veterinary Medical Research, Centre for Agricultural Research, Hungária krt. 21, 1143 Budapest, Hungary; (V.T.); (Z.Z.)
- Victoria A. Jefferson
- Department of Biochemistry & Molecular Biology, Entomology & Plant Pathology, Mississippi State University, 408 Dorman P.O. Box 9655, 32 Creelman St., Starkville, MS 39762, USA; (V.A.J.); (F.M.)
- Florencia Meyer
- Department of Biochemistry & Molecular Biology, Entomology & Plant Pathology, Mississippi State University, 408 Dorman P.O. Box 9655, 32 Creelman St., Starkville, MS 39762, USA; (V.A.J.); (F.M.)
- Zsolt Boldogkői
- Department of Medical Biology, Albert Szent-Györgyi Medical School, University of Szeged, Somogyi u. 4, 6720 Szeged, Hungary; (D.T.); (B.K.); (G.T.); (Z.C.); (G.G.)

21
Levo M, Raimundo J, Bing XY, Sisco Z, Batut PJ, Ryabichko S, Gregor T, Levine MS. Transcriptional coupling of distant regulatory genes in living embryos. Nature 2022; 605:754-760. [PMID: 35508662 PMCID: PMC9886134 DOI: 10.1038/s41586-022-04680-7] [Citation(s) in RCA: 91] [Impact Index Per Article: 30.3] [Received: 03/26/2021] [Accepted: 03/23/2022] [Indexed: 02/01/2023]
Abstract
The prevailing view of metazoan gene regulation is that individual genes are independently regulated by their own dedicated sets of transcriptional enhancers. Past studies have reported long-range gene-gene associations [1-3], but their functional importance in regulating transcription remains unclear. Here we used quantitative single-cell live imaging methods to provide a demonstration of co-dependent transcriptional dynamics of genes separated by large genomic distances in living Drosophila embryos. We find extensive physical and functional associations of distant paralogous genes, including co-regulation by shared enhancers and co-transcriptional initiation over distances of nearly 250 kilobases. Regulatory interconnectivity depends on promoter-proximal tethering elements, and perturbations in these elements uncouple transcription and alter the bursting dynamics of distant genes, suggesting a role of genome topology in the formation and stability of co-transcriptional hubs. Transcriptional coupling is detected throughout the fly genome and encompasses a broad spectrum of conserved developmental processes, suggesting a general strategy for long-range integration of gene activity.
Affiliation(s)
- Michal Levo
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- João Raimundo
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Xin Yang Bing
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Zachary Sisco
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Philippe J Batut
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Sergey Ryabichko
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Thomas Gregor
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Joseph Henry Laboratories of Physics, Princeton University, Princeton, NJ, USA
- Department of Developmental and Stem Cell Biology, UMR3738, Institut Pasteur, Paris, France
- Michael S Levine
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA

22
Huang T, Li J, Wang SM. Core promoter mutation contributes to abnormal gene expression in bladder cancer. BMC Cancer 2022; 22:68. [PMID: 35033028 PMCID: PMC8761283 DOI: 10.1186/s12885-022-09178-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/19/2021] [Accepted: 01/06/2022] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Bladder cancer is one of the deadliest cancers. It has a distinct gene expression signature, indicating that altered gene expression plays an important role in its etiology. However, the mechanism by which regulatory disorder causes altered expression in bladder cancer remains elusive. The core promoter controls transcriptional initiation. We hypothesized that mutations in core promoters could cause abnormal transcriptional initiation and thereby altered gene expression in bladder cancer. METHODS In this study, we performed a genome-wide characterization of core promoter mutations in 77 Spanish bladder cancer cases. RESULTS We identified 69 recurrent somatic mutations in 61 core promoters of 62 genes and 28 recurrent germline mutations in 20 core promoters of 21 genes, including TERT, the only gene previously known to carry core promoter mutations in bladder cancer, and many oncogenes and tumor suppressors. In RNA-seq data from bladder cancer, we observed altered expression of the core promoter-mutated genes. We further validated the effects of core promoter mutations on gene expression using a luciferase reporter gene assay. We also identified potential drugs targeting the core promoter-mutated genes. CONCLUSIONS Our data highlight that core promoter mutations contribute to bladder cancer development by altering gene expression.
Affiliation(s)
- Teng Huang
- Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, Ministry of Education Frontiers Science Center for Precision Oncology, University of Macau, Taipa, Macau
- Jiaheng Li
- Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, Ministry of Education Frontiers Science Center for Precision Oncology, University of Macau, Taipa, Macau
- San Ming Wang
- Cancer Center and Institute of Translational Medicine, Faculty of Health Sciences, Ministry of Education Frontiers Science Center for Precision Oncology, University of Macau, Taipa, Macau

23
Abstract
Transcription start site (TSS) usage is a critical factor in the regulation of gene expression. A number of methods for global TSS mapping have been developed, but barriers of expense, technical difficulty, time, and/or cost have limited their broader adoption. To address these issues, we developed Survey of TRanscription Initiation at Promoter Elements with high-throughput sequencing (STRIPE-seq). Requiring only three enzymatic steps with intervening bead cleanups, a STRIPE-seq library can be prepared from as little as 50 ng total RNA in ~5 h at a cost of ~$12 (US). In addition to profiling TSS usage, STRIPE-seq provides information on transcript levels that can be used for differential expression analysis. Thanks to its simplicity and low cost, we envision that STRIPE-seq could be employed by any molecular biology laboratory interested in profiling transcription initiation.
Affiliation(s)
- Gabriel E Zentner
- Department of Biology, Indiana University, Bloomington, IN, USA
- Indiana University Melvin and Bren Simon Comprehensive Cancer Center, Indianapolis, IN, USA
- eGenesis, Inc., Cambridge, MA, USA

24
Moore JE, Zhang XO, Elhajjajy SI, Fan K, Pratt HE, Reese F, Mortazavi A, Weng Z. Integration of high-resolution promoter profiling assays reveals novel, cell type-specific transcription start sites across 115 human cell and tissue types. Genome Res 2021; 32:389-402. [PMID: 34949670 PMCID: PMC8805725 DOI: 10.1101/gr.275723.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/12/2021] [Accepted: 12/19/2021] [Indexed: 12/02/2022]
Abstract
Accurate transcription start site (TSS) annotations are essential for understanding transcriptional regulation and its role in human disease. Gene collections such as GENCODE contain annotations for tens of thousands of TSSs, but not all of these annotations are experimentally validated nor do they contain information on cell type–specific usage. Therefore, we sought to generate a collection of experimentally validated TSSs by integrating RNA Annotation and Mapping of Promoters for the Analysis of Gene Expression (RAMPAGE) data from 115 cell and tissue types, which resulted in a collection of approximately 50 thousand representative RAMPAGE peaks. These peaks are primarily proximal to GENCODE-annotated TSSs and are concordant with other transcription assays. Because RAMPAGE uses paired-end reads, we were then able to connect peaks to transcripts by analyzing the genomic positions of the 3′ ends of read mates. Using this paired-end information, we classified the vast majority (37 thousand) of our RAMPAGE peaks as verified TSSs, updating TSS annotations for 20% of GENCODE genes. We also found that these updated TSS annotations are supported by epigenomic and other transcriptomic data sets. To show the utility of this RAMPAGE rPeak collection, we intersected it with the NHGRI/EBI genome-wide association study (GWAS) catalog and identified new candidate GWAS genes. Overall, our work shows the importance of integrating experimental data to further refine TSS annotations and provides a valuable resource for the biological community.
Affiliation(s)
- Kaili Fan
- University of Massachusetts Chan Medical School
- Zhiping Weng
- University of Massachusetts Chan Medical School

25
Li L, Waymack R, Gad M, Wunderlich Z. Two promoters integrate multiple enhancer inputs to drive wild-type knirps expression in the Drosophila melanogaster embryo. Genetics 2021; 219:iyab154. [PMID: 34849867 PMCID: PMC8664596 DOI: 10.1093/genetics/iyab154] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/06/2021] [Accepted: 09/12/2021] [Indexed: 11/13/2022] Open
Abstract
Proper development depends on precise spatiotemporal gene expression patterns. Most developmental genes are regulated by multiple enhancers and often by multiple core promoters that generate similar transcripts. We hypothesize that multiple promoters may be required either because enhancers prefer a specific promoter or because multiple promoters serve as a redundancy mechanism. To test these hypotheses, we studied the expression of the knirps locus in the early Drosophila melanogaster embryo, which is mediated by multiple enhancers and core promoters. We found that one of these promoters resembles a typical "sharp" developmental promoter, while the other resembles a "broad" promoter usually associated with housekeeping genes. Using synthetic reporter constructs, we found that some, but not all, enhancers in the locus show a preference for one promoter, indicating that promoters provide both redundancy and specificity. By analyzing the reporter dynamics, we identified specific burst properties during the transcription process, namely burst size and frequency, that are most strongly tuned by the combination of promoter and enhancer. Using locus-sized reporters, we discovered that enhancers with no promoter preference in a synthetic setting have a preference in the locus context. Our results suggest that the presence of multiple promoters in a locus is due both to enhancer preference and a need for redundancy and that "broad" promoters with dispersed transcription start sites are common among developmental genes. They also imply that it can be difficult to extrapolate expression measurements from synthetic reporters to the locus context, where other variables shape a gene's overall expression pattern.
Affiliation(s)
- Lily Li
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA
- Rachel Waymack
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA
- Mario Gad
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA
- Zeba Wunderlich
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA 92697, USA
- Department of Biology, Boston University, Boston, MA 02215, USA

26
Lu Z, Berry K, Hu Z, Zhan Y, Ahn TH, Lin Z. TSSr: an R package for comprehensive analyses of TSS sequencing data. NAR Genom Bioinform 2021; 3:lqab108. [PMID: 34805991 PMCID: PMC8598296 DOI: 10.1093/nargab/lqab108] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/04/2021] [Revised: 10/05/2021] [Accepted: 10/27/2021] [Indexed: 12/13/2022] Open
Abstract
Transcription initiation is regulated in a highly organized fashion to ensure proper cellular functions. Accurate identification of transcription start sites (TSSs) and quantitative characterization of transcription initiation activities are fundamental steps for studies of regulated transcription and core promoter structures. Several high-throughput techniques have been developed to sequence the very 5' end of RNA transcripts (TSS sequencing) on the genome scale. Bioinformatics tools are essential for processing, analysis, and visualization of TSS sequencing data. Here, we present TSSr, an R package that provides rich functions for mapping TSSs and characterizing the structures and activities of core promoters based on all types of TSS sequencing data. Specifically, TSSr implements several newly developed algorithms for accurately identifying TSSs from mapped sequencing reads and inferring core promoters, which are prerequisites for subsequent functional analyses of TSS data. Furthermore, TSSr enables users to export various types of TSS data that can be visualized in a genome browser for inspection of promoter activities in association with other genomic features, and to generate publication-ready TSS graphs. These user-friendly features could greatly facilitate studies of transcription initiation based on TSS sequencing data. The source code and detailed documentation of TSSr can be freely accessed at https://github.com/Linlab-slu/TSSr.
Affiliation(s)
- Zhaolian Lu
- Department of Biology, Saint Louis University, St. Louis, MO 63103, USA
- Keenan Berry
- Program of Bioinformatics and Computational Biology, Saint Louis University, St. Louis, MO 63103, USA
- Zhenbin Hu
- Department of Biology, Saint Louis University, St. Louis, MO 63103, USA
- Yu Zhan
- Department of Biology, Saint Louis University, St. Louis, MO 63103, USA
- Tae-Hyuk Ahn
- Program of Bioinformatics and Computational Biology, Saint Louis University, St. Louis, MO 63103, USA
- Zhenguo Lin
- Department of Biology, Saint Louis University, St. Louis, MO 63103, USA

27
Wolfe JC, Mikheeva LA, Hagras H, Zabet NR. An explainable artificial intelligence approach for decoding the enhancer histone modifications code and identification of novel enhancers in Drosophila. Genome Biol 2021; 22:308. [PMID: 34749786 PMCID: PMC8574042 DOI: 10.1186/s13059-021-02532-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 10/30/2020] [Accepted: 10/29/2021] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Enhancers are non-coding regions of the genome that control the activity of target genes. Recent efforts to identify active enhancers experimentally and in silico have proven effective. While these tools can predict the locations of enhancers with a high degree of accuracy, the mechanisms underpinning the activity of enhancers are often unclear. RESULTS Using machine learning (ML) and a rule-based explainable artificial intelligence (XAI) model, we demonstrate that we can predict the location of known enhancers in Drosophila with a high degree of accuracy. Most importantly, we use the rules of the XAI model to provide insight into the underlying combinatorial histone modifications code of enhancers. In addition, we identified a large set of putative enhancers that display the same epigenetic signature as enhancers identified experimentally. These putative enhancers are enriched in nascent transcription, divergent transcription and have 3D contacts with promoters of transcribed genes. However, they display only intermediary enrichment of mediator and cohesin complexes compared to previously characterised active enhancers. We also found that 10-15% of the predicted enhancers display similar characteristics to super enhancers observed in other species. CONCLUSIONS Here, we applied an explainable AI model to predict enhancers with high accuracy. Most importantly, we identified that different combinations of epigenetic marks characterise different groups of enhancers. Finally, we discovered a large set of putative enhancers which display similar characteristics with previously characterised active enhancers.
Affiliation(s)
- Jareth C Wolfe
- School of Life Sciences, University of Essex, Colchester, CO4 3SQ, UK
- School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, UK
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 2AT, London, UK
- Liudmila A Mikheeva
- School of Life Sciences, University of Essex, Colchester, CO4 3SQ, UK
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 2AT, London, UK
- Department of Mathematical Sciences, University of Essex, Colchester, CO4 3SQ, UK
- Hani Hagras
- School of Computer Science and Electronic Engineering, University of Essex, Colchester, CO4 3SQ, UK
- Nicolae Radu Zabet
- School of Life Sciences, University of Essex, Colchester, CO4 3SQ, UK
- Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, E1 2AT, London, UK

28
Modzelewski AJ, Shao W, Chen J, Lee A, Qi X, Noon M, Tjokro K, Sales G, Biton A, Anand A, Speed TP, Xuan Z, Wang T, Risso D, He L. A mouse-specific retrotransposon drives a conserved Cdk2ap1 isoform essential for development. Cell 2021; 184:5541-5558.e22. [PMID: 34644528 PMCID: PMC8787082 DOI: 10.1016/j.cell.2021.09.021] [Citation(s) in RCA: 54] [Impact Index Per Article: 13.5] [Received: 03/23/2021] [Revised: 07/26/2021] [Accepted: 09/14/2021] [Indexed: 12/13/2022]
Abstract
Retrotransposons mediate gene regulation in important developmental and pathological processes. Here, we characterized the transient retrotransposon induction during preimplantation development of eight mammals. Induced retrotransposons exhibit similar preimplantation profiles across species, conferring gene regulatory activities, particularly through long terminal repeat (LTR) retrotransposon promoters. A mouse-specific MT2B2 retrotransposon promoter generates an N-terminally truncated Cdk2ap1ΔN that peaks in preimplantation embryos and promotes proliferation. In contrast, the canonical Cdk2ap1 peaks in mid-gestation and represses cell proliferation. This MT2B2 promoter, whose deletion abolishes Cdk2ap1ΔN production, reduces cell proliferation and impairs embryo implantation, is developmentally essential. Intriguingly, Cdk2ap1ΔN is evolutionarily conserved in sequence and function yet is driven by different promoters across mammals. The distinct preimplantation Cdk2ap1ΔN expression in each mammalian species correlates with the duration of its preimplantation development. Hence, species-specific transposon promoters can yield evolutionarily conserved, alternative protein isoforms, bestowing them with new functions and species-specific expression to govern essential biological divergence.
Affiliation(s)
- Andrew J Modzelewski
- Division of Cellular and Developmental Biology, MCB Department, University of California, Berkeley, Berkeley, CA 94720, USA
- Wanqing Shao
- Department of Genetics, Edison Family Center for Genome Science and System Biology, McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63110, USA
- Jingqi Chen
- Division of Cellular and Developmental Biology, MCB Department, University of California, Berkeley, Berkeley, CA 94720, USA
- Angus Lee
- Division of Cellular and Developmental Biology, MCB Department, University of California, Berkeley, Berkeley, CA 94720, USA
- Xin Qi
- Division of Cellular and Developmental Biology, MCB Department, University of California, Berkeley, Berkeley, CA 94720, USA
- Mackenzie Noon
- Division of Cellular and Developmental Biology, MCB Department, University of California, Berkeley, Berkeley, CA 94720, USA
- Kristy Tjokro
- Division of Cellular and Developmental Biology, MCB Department, University of California, Berkeley, Berkeley, CA 94720, USA
- Gabriele Sales
- Department of Biology, University of Padova, Padova 35122, Italy
- Anne Biton
- Department of Statistics, University of California, Berkeley, Berkeley, CA 94720, USA; Bioinformatics and Biostatistics, Department of Computational Biology, USR 3756 CNRS, Institut Pasteur, Paris 75015, France
- Aparna Anand
- Department of Genetics, Edison Family Center for Genome Science and System Biology, McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63110, USA
- Terence P Speed
- Bioinformatics Division, WEHI, Parkville, VIC 3052, Australia
- Zhenyu Xuan
- Department of Biological Sciences, The University of Texas at Dallas, Richardson, TX 75080, USA
- Ting Wang
- Department of Genetics, Edison Family Center for Genome Science and System Biology, McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63110, USA
- Davide Risso
- Department of Statistical Sciences, University of Padova, Padova 35122, Italy
- Lin He
- Division of Cellular and Developmental Biology, MCB Department, University of California, Berkeley, Berkeley, CA 94720, USA

29
Abstract
Transcription start site (TSS) selection influences transcript stability and translation as well as protein sequence. Alternative TSS usage is pervasive in organismal development, is a major contributor to transcript isoform diversity in humans, and is frequently observed in human diseases including cancer. In this review, we discuss the breadth of techniques that have been used to globally profile TSSs and the resulting insights into gene regulation, as well as future prospects in this area of inquiry.
Affiliation(s)
- Gabriel E. Zentner
- Department of Biology, Indiana University, Bloomington, IN 47401, USA
- Indiana University Melvin and Bren Simon Comprehensive Cancer Center, Indianapolis, IN 46202, USA

30
Ullastres A, Merenciano M, González J. Regulatory regions in natural transposable element insertions drive interindividual differences in response to immune challenges in Drosophila. Genome Biol 2021; 22:265. [PMID: 34521452 PMCID: PMC8439047 DOI: 10.1186/s13059-021-02471-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 10/29/2020] [Accepted: 08/19/2021] [Indexed: 02/08/2023] Open
Abstract
BACKGROUND Variation in gene expression underlies interindividual variability in relevant traits including immune response. However, the genetic variation responsible for these gene expression changes remains largely unknown. Among the non-coding variants that could be relevant, transposable element insertions are promising candidates as they have been shown to be a rich and diverse source of cis-regulatory elements. RESULTS In this work, we use a population genetics approach to identify transposable element insertions likely to increase the tolerance of Drosophila melanogaster to bacterial infection by affecting the expression of immune-related genes. We identify 12 insertions associated with allele-specific expression changes in immune-related genes. We experimentally validate three of these insertions including one likely to be acting as a silencer, one as an enhancer, and one with a dual role as enhancer and promoter. The direction in the change of gene expression associated with the presence of several of these insertions is consistent with an increased survival to infection. Indeed, for one of the insertions, we show that this is the case by analyzing both natural populations and CRISPR/Cas9 mutants in which the insertion is deleted from its native genomic context. CONCLUSIONS We show that transposable elements contribute to gene expression variation in response to infection in D. melanogaster and that this variation is likely to affect their survival capacity. Because the role of transposable elements as regulatory elements is not restricted to Drosophila, transposable elements are likely to play a role in immune response in other organisms as well.
Affiliation(s)
- Anna Ullastres
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003, Barcelona, Spain
- Miriam Merenciano
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003, Barcelona, Spain
- Josefa González
- Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Passeig Marítim de la Barceloneta 37-49, 08003, Barcelona, Spain

31
Khetan S, Kales S, Kursawe R, Jillette A, Ulirsch JC, Reilly SK, Ucar D, Tewhey R, Stitzel ML. Functional characterization of T2D-associated SNP effects on baseline and ER stress-responsive β cell transcriptional activation. Nat Commun 2021; 12:5242. [PMID: 34475398 PMCID: PMC8413311 DOI: 10.1038/s41467-021-25514-6] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 02/13/2020] [Accepted: 08/10/2021] [Indexed: 11/08/2022] Open
Abstract
Genome-wide association studies (GWAS) have linked single nucleotide polymorphisms (SNPs) at >250 loci in the human genome to type 2 diabetes (T2D) risk. For each locus, identifying the functional variant(s) among multiple SNPs in high linkage disequilibrium is critical to understand molecular mechanisms underlying T2D genetic risk. Using massively parallel reporter assays (MPRA), we test the cis-regulatory effects of SNPs associated with T2D and altered in vivo islet chromatin accessibility in MIN6 β cells under steady state and pathophysiologic endoplasmic reticulum (ER) stress conditions. We identify 1,982/6,621 (29.9%) SNP-containing elements that activate transcription in MIN6 and 879 SNP alleles that modulate MPRA activity. Multiple T2D-associated SNPs alter the activity of short interspersed nuclear element (SINE)-containing elements that are strongly induced by ER stress. We identify 220 functional variants at 104 T2D association signals, narrowing 54 signals to a single candidate SNP. Together, this study identifies elements driving β cell steady state and ER stress-responsive transcriptional activation, nominates causal T2D SNPs, and uncovers potential roles for repetitive elements in β cell transcriptional stress response and T2D genetics.
Affiliation(s)
- Shubham Khetan
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Department of Genetics and Genome Sciences, University of Connecticut, Farmington, CT, USA
- Susan Kales
- The Jackson Laboratory for Mammalian Genetics, Bar Harbor, ME, USA
- Romy Kursawe
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Jacob C Ulirsch
- Program in Biological and Biomedical Sciences, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Duygu Ucar
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Department of Genetics and Genome Sciences, University of Connecticut, Farmington, CT, USA
- Institute of Systems Genomics, University of Connecticut, Farmington, CT, USA
- Ryan Tewhey
- The Jackson Laboratory for Mammalian Genetics, Bar Harbor, ME, USA
- Graduate School of Biomedical Sciences and Engineering, University of Maine, Orono, ME, USA
- Tufts University School of Medicine, Boston, MA, USA
- Michael L Stitzel
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Department of Genetics and Genome Sciences, University of Connecticut, Farmington, CT, USA
- Institute of Systems Genomics, University of Connecticut, Farmington, CT, USA
32
Guerrini MM, Oguchi A, Suzuki A, Murakawa Y. Cap analysis of gene expression (CAGE) and noncoding regulatory elements. Semin Immunopathol 2021; 44:127-136. [PMID: 34468849 DOI: 10.1007/s00281-021-00886-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/01/2021] [Accepted: 08/13/2021] [Indexed: 01/06/2023]
Abstract
Cap analysis of gene expression (CAGE) was developed to detect the 5' end of RNA. Trapping of the RNA 5'-cap structure enables the enrichment and selective sequencing of complete transcripts. Upscaled high-throughput versions of CAGE have enabled the genome-wide identification of transcription start sites, including transcriptionally active promoters and enhancers. CAGE sequencing can be exploited to draw comprehensive maps of active genomic regulatory elements in a cell type- and activation-specific manner. The cells of the immune system are among the best candidates to be analyzed in humans, since they are easily accessible. In this review, we discuss how CAGE data are instrumental for integrative analyses with quantitative trait loci and omics data, and their usefulness in the mechanistic interpretation of the effects of genetic variations over the entire human genome. Integrating CAGE data with the currently available omics information will contribute to better understanding of the genome-wide association study variants that lie outside of annotated genes, deepening our knowledge on human diseases, and enabling the targeted design of more specific therapeutic interventions.
Affiliation(s)
- Matteo Maurizio Guerrini
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
- Akiko Oguchi
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Akari Suzuki
- Laboratory for Autoimmune Diseases, RIKEN Center for Integrative Medical Sciences, Yokohama, Kanagawa, Japan
- Yasuhiro Murakawa
- RIKEN-IFOM Joint Laboratory for Cancer Genomics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- IFOM-the FIRC Institute of Molecular Oncology, Milan, Italy
33
Stockwell PA, Lynch-Sutherland CF, Chatterjee A, Macaulay EC, Eccles MR. RepExpress: A Novel Pipeline for the Quantification and Characterization of Transposable Element Expression from RNA-seq Data. Curr Protoc 2021; 1:e206. [PMID: 34387946 DOI: 10.1002/cpz1.206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
Transposable elements (TEs) are key regulators of both development and disease; however, their repetitive nature presents substantial computational challenges to their analysis. Due to a lack of computational tools and suitable analysis frameworks, TE expression is often not quantified at the locus level. Therefore, we have developed RepExpress, a novel pipeline that enables locus-level TE quantification and characterization. RepExpress enables the characterization of TE expression in a genomic context, and is the first tool focusing on the identification of tissue-specific TE-derived and TE-regulated genes. RepExpress identifies expressed TEs overlapping with annotated genomic features and enables tissue-specific profiles of TE-derived genes. TEs that are expressed with no overlap with any known genomic features are characterized by the closest downstream genomic feature enabling identification of novel TE-gene regulatory relationships. RepExpress takes standard RNA-seq data as input and performs genomic alignment optimized for TEs. Our novel pipeline quantifies expression of both TEs and genes using featureCounts and Stringtie, respectively. RepExpress then filters expressed repeats and characterizes their genomic context, enabling the identification of TEs that overlap with genes, or that may be influencing gene expression. Here, we describe RepExpress, and provide a step-by-step protocol detailing its workflow. We also discuss other TE analysis tools and their applicability to addressing different biological questions. © 2021 Wiley Periodicals LLC. Basic Protocol: RepExpress workflow.
Affiliation(s)
- Peter A Stockwell
- Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
- Aniruddha Chatterjee
- Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Auckland, New Zealand
- Erin C Macaulay
- Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
- Michael R Eccles
- Department of Pathology, Dunedin School of Medicine, University of Otago, Dunedin, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Auckland, New Zealand
34
Tissue-specific expression of p73 and p63 isoforms in human tissues. Cell Death Dis 2021; 12:745. [PMID: 34315849 PMCID: PMC8316356 DOI: 10.1038/s41419-021-04017-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/09/2021] [Revised: 06/25/2021] [Accepted: 06/30/2021] [Indexed: 12/13/2022]
Abstract
p73 and p63 are members of the p53 family that exhibit overlapping and distinct functions in development and homeostasis. The evaluation of p73 and p63 isoform expression across human tissue can provide greater insight to the functional interactions between family members. We determined the mRNA isoform expression patterns of TP73 and TP63 across a panel of 36 human tissues and protein expression within the highest-expressing tissues. TP73 and TP63 expression significantly correlated across tissues. In tissues with concurrent mRNA expression, nuclear co-expression of both proteins was observed in a majority of cells. Using GTEx data, we quantified p73 and p63 isoform expression in human tissue and identified that the α-isoforms of TP73 and TP63 were the predominant isoform expressed in nearly all tissues. Further, we identified a previously unreported p73 mRNA product encoded by exons 4 to 14. In sum, these data provide the most comprehensive tissue-specific atlas of p73 and p63 protein and mRNA expression patterns in human and murine samples, indicating coordinate expression of these transcription factors in the majority of tissues in which they are expressed.
35
Fabry MH, Falconio FA, Joud F, Lythgoe EK, Czech B, Hannon GJ. Maternally inherited piRNAs direct transient heterochromatin formation at active transposons during early Drosophila embryogenesis. eLife 2021; 10:e68573. [PMID: 34236313 PMCID: PMC8352587 DOI: 10.7554/elife.68573] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/19/2021] [Accepted: 07/07/2021] [Indexed: 12/12/2022] Open
Abstract
The PIWI-interacting RNA (piRNA) pathway controls transposon expression in animal germ cells, thereby ensuring genome stability over generations. In Drosophila, piRNAs are intergenerationally inherited through the maternal lineage, and this has demonstrated importance in the specification of piRNA source loci and in silencing of I- and P-elements in the germ cells of daughters. Maternally inherited Piwi protein enters somatic nuclei in early embryos prior to zygotic genome activation and persists therein for roughly half of the time required to complete embryonic development. To investigate the role of the piRNA pathway in the embryonic soma, we created a conditionally unstable Piwi protein. This enabled maternally deposited Piwi to be cleared from newly laid embryos within 30 min and well ahead of the activation of zygotic transcription. Examination of RNA and protein profiles over time, and correlation with patterns of H3K9me3 deposition, suggests a role for maternally deposited Piwi in attenuating zygotic transposon expression in somatic cells of the developing embryo. In particular, robust deposition of piRNAs targeting roo, an element whose expression is mainly restricted to embryonic development, results in the deposition of transient heterochromatic marks at active roo insertions. We hypothesize that roo, an extremely successful mobile element, may have adopted a lifestyle of expression in the embryonic soma to evade silencing in germ cells.
Affiliation(s)
- Martin H Fabry
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
- Federica A Falconio
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
- Fadwa Joud
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
- Emily K Lythgoe
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
- Benjamin Czech
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
- Gregory J Hannon
- CRUK Cambridge Institute, University of Cambridge, Li Ka Shing Centre, Cambridge, United Kingdom
36
Grencewicz DJ, Romigh T, Thacker S, Abbas A, Jaini R, Luse D, Eng C. Redefining the PTEN promoter: Identification of novel upstream transcription start regions. Hum Mol Genet 2021; 30:2135-2148. [PMID: 34218272 DOI: 10.1093/hmg/ddab175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Revised: 06/19/2021] [Accepted: 06/23/2021] [Indexed: 11/14/2022] Open
Abstract
Germline mutation of PTEN is causally observed in Cowden syndrome (CS) and is one of the most common, penetrant risk genes for autism spectrum disorder (ASD). However, the majority of individuals who present with CS-like clinical features are PTEN-mutation negative. Reassessment of PTEN promoter regulation may help explain abnormal PTEN dosage, as only the minimal promoter and coding regions are currently included in diagnostic PTEN mutation analysis. Therefore, we reanalyzed the architecture of the PTEN promoter using next-generation sequencing datasets. Specifically, run-on sequencing assays identified two additional transcription start regions (TSRs) at -2053 and -1906 basepairs from the canonical start of PTEN, thus extending the PTEN 5'UTR and redefining the PTEN promoter. We show that these novel upstream TSRs are active in cancer cell lines, human cancer, and normal tissue. Further, these TSRs can produce novel PTEN transcripts due to the introduction of new splice donors at -2041, -1826, and -1355, which may allow for splicing out of the PTEN 5'UTR or the first and second exon in upstream-initiated transcripts. Combining ENCODE ChIP-seq and pertinent literature, we also compile and analyze all transcription factors (TFs) binding at the redefined PTEN locus. Enrichment analyses suggest that TFs binding specifically to the upstream TSRs may be implicated in inflammatory processes. Together, these data redefine the architecture of the PTEN promoter, an important step toward a comprehensive model of PTEN transcription regulation, a basis for future investigations into the new promoters' role in disease pathogenesis.
Affiliation(s)
- Dennis J Grencewicz
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Todd Romigh
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Stetson Thacker
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Ata Abbas
- Division of Hematology and Oncology, Department of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Developmental Therapeutics Program, CASE Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Ritika Jaini
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Developmental Therapeutics Program, CASE Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Germline High Risk Focus Group, CASE Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Donal Luse
- Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Charis Eng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH, 44195, USA
- Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Germline High Risk Focus Group, CASE Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Center for Personalized Genetic Healthcare, Cleveland Clinic Community Care and Population Health, Cleveland, OH 44195, USA
- Department of Genetics and Genome Sciences, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
37
Policastro RA, McDonald DJ, Brendel VP, Zentner GE. Flexible analysis of TSS mapping data and detection of TSS shifts with TSRexploreR. NAR Genom Bioinform 2021; 3:lqab051. [PMID: 34250478 PMCID: PMC8265037 DOI: 10.1093/nargab/lqab051] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/16/2021] [Revised: 04/29/2021] [Accepted: 05/18/2021] [Indexed: 12/13/2022] Open
Abstract
Heterogeneity in transcription initiation has important consequences for transcript stability and translation, and shifts in transcription start site (TSS) usage are prevalent in various developmental, metabolic, and disease contexts. Accordingly, numerous methods for global TSS profiling have been developed, including most recently Survey of TRanscription Initiation at Promoter Elements with high-throughput sequencing (STRIPE-seq), a method to profile transcription start sites (TSSs) on a genome-wide scale with significant cost and time savings compared to previous methods. In anticipation of more widespread adoption of STRIPE-seq and related methods for construction of promoter atlases and studies of differential gene expression, we built TSRexploreR, an R package for end-to-end analysis of TSS mapping data. TSRexploreR provides functions for TSS and transcription start region (TSR) detection, normalization, correlation, visualization, and differential TSS/TSR analyses. TSRexploreR is highly interoperable, accepting the data structures of TSS and TSR sets generated by several existing tools for processing and alignment of TSS mapping data, such as CAGEr for Cap Analysis of Gene Expression (CAGE) data. Lastly, TSRexploreR implements a novel approach for the detection of shifts in TSS distribution.
Affiliation(s)
- Daniel J McDonald
- Department of Statistics, Indiana University, Bloomington, IN 47405, USA
- Volker P Brendel
- Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Department of Computer Science, Indiana University, Bloomington, IN 47405, USA
- Gabriel E Zentner
- Department of Biology, Indiana University, Bloomington, IN 47405, USA
- Indiana University Melvin and Bren Simon Comprehensive Cancer Center, Indianapolis, IN 46202, USA
38
Rosikiewicz W, Sikora J, Skrzypczak T, Kubiak MR, Makałowska I. Promoter switching in response to changing environment and elevated expression of protein-coding genes overlapping at their 5' ends. Sci Rep 2021; 11:8984. [PMID: 33903630 PMCID: PMC8076222 DOI: 10.1038/s41598-021-87970-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/11/2020] [Accepted: 04/07/2021] [Indexed: 11/09/2022] Open
Abstract
Despite the number of studies focused on sense-antisense transcription, the key question of whether such organization evolved as a regulator of gene expression or if this is only a byproduct of other regulatory processes has not been elucidated to date. In this study, protein-coding sense-antisense gene pairs were analyzed with a particular focus on pairs overlapping at their 5' ends. Analyses were performed in 73 human transcription start site libraries. The results of our studies showed that the overlap between genes is not a stable feature and depends on which TSSs are utilized in a given cell type. An analysis of gene expression did not confirm that overlap between genes causes downregulation of their expression. This observation contradicts earlier findings. In addition, we showed that the switch from one promoter to another, leading to genes overlap, may occur in response to changing environment of a cell or tissue. We also demonstrated that in transfected and cancerous cells genes overlap is observed more often in comparison with normal tissues. Moreover, utilization of overlapping promoters depends on particular state of a cell and, at least in some groups of genes, is not merely coincidental.
Affiliation(s)
- Wojciech Rosikiewicz
- Center for Applied Bioinformatics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Jarosław Sikora
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Tomasz Skrzypczak
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Center for Advanced Technology, Adam Mickiewicz University, Poznań, Poland
- Magdalena R Kubiak
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Izabela Makałowska
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
39
Integrated genomic analysis reveals key features of long undecoded transcript isoform-based gene repression. Mol Cell 2021; 81:2231-2245.e11. [PMID: 33826921 PMCID: PMC8153250 DOI: 10.1016/j.molcel.2021.03.013] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 11/08/2019] [Revised: 02/05/2021] [Accepted: 03/09/2021] [Indexed: 12/31/2022]
Abstract
Long undecoded transcript isoforms (LUTIs) represent a class of non-canonical mRNAs that downregulate gene expression through the combined act of transcriptional and translational repression. While single gene studies revealed important aspects of LUTI-based repression, how these features affect gene regulation on a global scale is unknown. Using transcript leader and direct RNA sequencing, here, we identify 74 LUTI candidates that are specifically induced in meiotic prophase. Translational repression of these candidates appears to be ubiquitous and is dependent on upstream open reading frames. However, LUTI-based transcriptional repression is variable. In only 50% of the cases, LUTI transcription causes downregulation of the protein-coding transcript isoform. Higher LUTI expression, enrichment of histone 3 lysine 36 trimethylation, and changes in nucleosome position are the strongest predictors of LUTI-based transcriptional repression. We conclude that LUTIs downregulate gene expression in a manner that integrates translational repression, chromatin state changes, and the magnitude of LUTI expression.
40
Zhang XO, Pratt H, Weng Z. Investigating the Potential Roles of SINEs in the Human Genome. Annu Rev Genomics Hum Genet 2021; 22:199-218. [PMID: 33792357 DOI: 10.1146/annurev-genom-111620-100736] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/09/2022]
Abstract
Short interspersed nuclear elements (SINEs) are nonautonomous retrotransposons that occupy approximately 13% of the human genome. They are transcribed by RNA polymerase III and can be retrotranscribed and inserted back into the genome with the help of other autonomous retroelements. Because they are preferentially located close to or within gene-rich regions, they can regulate gene expression by various mechanisms that act at both the DNA and the RNA levels. In this review, we summarize recent findings on the involvement of SINEs in different types of gene regulation and discuss the potential regulatory functions of SINEs that are in close proximity to genes, Pol III-transcribed SINE RNAs, and embedded SINE sequences within Pol II-transcribed genes in the human genome. These discoveries illustrate how the human genome has exapted some SINEs into functional regulatory elements.
Affiliation(s)
- Xiao-Ou Zhang
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
- Current affiliation: School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Henry Pratt
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
- Zhiping Weng
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA
41
Goszczynski DE, Halstead MM, Islas-Trejo AD, Zhou H, Ross PJ. Transcription initiation mapping in 31 bovine tissues reveals complex promoter activity, pervasive transcription, and tissue-specific promoter usage. Genome Res 2021; 31:732-744. [PMID: 33722934 PMCID: PMC8015843 DOI: 10.1101/gr.267336.120] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/12/2020] [Accepted: 02/01/2021] [Indexed: 01/04/2023]
Abstract
Characterizing transcription start sites is essential for understanding the regulatory mechanisms that control gene expression. Recently, a new bovine genome assembly (ARS-UCD1.2) with high continuity, accuracy, and completeness was released; however, the functional annotation of the bovine genome lacks precise transcription start sites and contains a low number of transcripts in comparison to human and mouse. By using the RAMPAGE approach, this study identified transcription start sites at high resolution in a large collection of bovine tissues. We found several known and novel transcription start sites attributed to promoters of protein-coding and lncRNA genes that were validated through experimental and in silico evidence. With these findings, the annotation of transcription start sites in cattle reached a level comparable to the mouse and human genome annotations. In addition, we identified and characterized transcription start sites for antisense transcripts derived from bidirectional promoters, potential lncRNAs, mRNAs, and pre-miRNAs. We also analyzed the quantitative aspects of RAMPAGE to produce a promoter activity atlas, reaching highly reproducible results comparable to traditional RNA-seq. Coexpression networks revealed considerable use of tissue-specific promoters, especially between brain and testicle, which expressed several genes in common from alternate loci. Furthermore, regions surrounding coexpressed modules were enriched in binding factor motifs representative of each tissue. The comprehensive annotation of promoters in such a large collection of tissues will substantially contribute to our understanding of gene expression in cattle and other mammalian species, shortening the gap between genotypes and phenotypes.
Affiliation(s)
- Daniel E Goszczynski
- Department of Animal Science, University of California, Davis, California 95616, USA
- Michelle M Halstead
- Department of Animal Science, University of California, Davis, California 95616, USA
- Alma D Islas-Trejo
- Department of Animal Science, University of California, Davis, California 95616, USA
- Huaijun Zhou
- Department of Animal Science, University of California, Davis, California 95616, USA
- Pablo J Ross
- Department of Animal Science, University of California, Davis, California 95616, USA
42
Liu Q, Jiang F, Zhang J, Li X, Kang L. Transcription initiation of distant core promoters in a large-sized genome of an insect. BMC Biol 2021; 19:62. [PMID: 33785021 PMCID: PMC8011201 DOI: 10.1186/s12915-021-01004-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/25/2020] [Accepted: 03/16/2021] [Indexed: 12/30/2022] Open
Abstract
Background: Core promoters have a substantial influence on various steps of transcription, including initiation, elongation, termination, polyadenylation, and finally, translation. The characterization of core promoters is crucial for exploring the regulatory code of transcription initiation. However, the current understanding of insect core promoters is focused on those of Diptera (especially Drosophila) species with small genome sizes.
Results: Here, we present an analysis of the transcription start sites (TSSs) in the migratory locust, Locusta migratoria, which has a genome size of 6.5 Gb. The genomic differences, including lower precision of transcription initiation and fewer constraints on the distance from transcription factor binding sites or regulatory elements to TSSs, were revealed in locusts compared with Drosophila insects. Furthermore, we found a distinct bimodal log distribution of the distances from the start codons to the core promoters of locust genes. We found stricter constraints on the exon length of mRNA leaders and widespread expression activity of the distant core promoters in locusts compared with fruit flies. We further compared core promoters in seven arthropod species across a broad range of genome sizes to reinforce our results on the emergence of distant core promoters in large-sized genomes.
Conclusions: In summary, our results provide novel insights into the effects of genome size expansion on distant transcription initiation.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-021-01004-5.
Affiliation(s)
- Qing Liu
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China.,Sino-Danish College, University of Chinese Academy of Sciences, Beijing, China.,Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Feng Jiang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China.,CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China
| | - Jie Zhang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
| | - Xiao Li
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
| | - Le Kang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China. .,CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing, China. .,State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China.
43
Aptardi predicts polyadenylation sites in sample-specific transcriptomes using high-throughput RNA sequencing and DNA sequence. Nat Commun 2021; 12:1652. [PMID: 33712618 PMCID: PMC7955126 DOI: 10.1038/s41467-021-21894-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/12/2020] [Accepted: 02/18/2021] [Indexed: 02/01/2023] Open
Abstract
Annotation of polyadenylation sites from short-read RNA sequencing alone is a challenging computational task. Other algorithms rooted in DNA sequence predict potential polyadenylation sites; however, in vivo expression of a particular site varies based on a myriad of conditions. Here, we introduce aptardi (alternative polyadenylation transcriptome analysis from RNA-Seq data and DNA sequence information), which leverages both DNA sequence and RNA sequencing in a machine learning paradigm to predict expressed polyadenylation sites. Specifically, as input aptardi takes DNA nucleotide sequence, genome-aligned RNA-Seq data, and an initial transcriptome. The program evaluates these initial transcripts to identify expressed polyadenylation sites in the biological sample and refines transcript 3'-ends accordingly. The average precision of the aptardi model is twice that of a standard transcriptome assembler. In particular, the recall of the aptardi model (the proportion of true polyadenylation sites detected by the algorithm) is improved by over three-fold. Also, the model, trained using the Human Brain Reference RNA commercial standard, performs well when applied to RNA-sequencing samples from different tissues and different mammalian species. Finally, aptardi's input is simple to compile and its output is easily amenable to downstream analyses such as quantitation and differential expression.
44
Genomic Space of MGMT in Human Glioma Revisited: Novel Motifs, Regulatory RNAs, NRF1, 2, and CTCF Involvement in Gene Expression. Int J Mol Sci 2021; 22:2492. [PMID: 33801310 PMCID: PMC7958331 DOI: 10.3390/ijms22052492] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/08/2021] [Revised: 02/18/2021] [Accepted: 02/25/2021] [Indexed: 01/08/2023] Open
Abstract
Background: The molecular regulation of increased MGMT expression in human brain tumors, the associated regulatory elements, and linkages of these to its epigenetic silencing are not understood. Because the heightened expression or non-expression of MGMT plays a pivotal role in glioma therapeutics, we applied bioinformatics and experimental tools to identify the regulatory elements in the MGMT and neighboring EBF3 gene loci. Results: Extensive genome database analyses showed that the MGMT genomic space harbored many undescribed RNA regulatory sequences and recognition motifs. We extended the MGMT exon-1 promoter to 2019 bp to include five overlapping alternate promoters. Consensus sequences in the revised promoter for (a) the transcription factors CTCF, NRF1/NRF2, GAF, (b) the genetic switch MYC/MAX/MAD, and (c) two well-defined p53 response elements in MGMT intron-1, were identified. A putative protein-coding or non-coding RNA sequence was located in the extended 3′ UTR of the MGMT transcript. Eleven non-coding RNA loci coding for miRNAs, antisense RNA, and lncRNAs were identified in the MGMT-EBF3 region and six of these showed validated potential for curtailing the expression of both MGMT and EBF3 genes. ChIP analysis verified the binding site in the MGMT promoter for CTCF, which regulates genomic methylation and chromatin looping. CTCF depletion by a pool of specific siRNAs and shRNAs led to a significant attenuation of MGMT expression in human GBM cell lines. Computational analysis of the ChIP sequence data in ENCODE showed the presence of NRF1 in the MGMT promoter and this occurred only in MGMT-proficient cell lines. Further, an enforced NRF2 expression markedly augmented the MGMT mRNA and protein levels in glioma cells. Conclusions: We provide the first evidence for several new regulatory components in the MGMT gene locus which predict complex transcriptional and posttranscriptional controls with potential for new therapeutic avenues.
45
The Regulation and Functions of Endogenous Retrovirus in Embryo Development and Stem Cell Differentiation. Stem Cells Int 2021; 2021:6660936. [PMID: 33727936 PMCID: PMC7937486 DOI: 10.1155/2021/6660936] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 10/29/2020] [Accepted: 02/19/2021] [Indexed: 11/17/2022] Open
Abstract
Endogenous retroviruses (ERVs) are repetitive sequences in the genome, belonging to the retrotransposon family. During the course of life, ERVs are associated with multiple aspects of chromatin and transcriptional regulation in development and pathological conditions. In mammalian embryos, ERVs are extensively activated in early embryo development, but with a highly restricted spatial-temporal pattern; and they are drastically silenced during differentiation with exceptions in extraembryonic tissue and germlines. The dynamic activation pattern of ERVs raises questions about how ERVs are regulated in the life cycle and whether they are functionally important to cell fate decision during early embryo and somatic cell development. Therefore, in this review, we focus on the evidence demonstrating the regulation and functions of ERVs during stem cell differentiation, which suggests that ERV activation is not a passive result of cell fate transition but a product of active epigenetic and transcriptional regulation during mammalian development and stem cell differentiation.
46
Markus BM, Waldman BS, Lorenzi HA, Lourido S. High-Resolution Mapping of Transcription Initiation in the Asexual Stages of Toxoplasma gondii. Front Cell Infect Microbiol 2021; 10:617998. [PMID: 33553008 PMCID: PMC7854901 DOI: 10.3389/fcimb.2020.617998] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/15/2020] [Accepted: 12/03/2020] [Indexed: 12/13/2022] Open
Abstract
Toxoplasma gondii is a common parasite of humans and animals, causing life-threatening disease in the immunocompromised, fetal abnormalities when contracted during gestation, and recurrent ocular lesions in some patients. Central to the prevalence and pathogenicity of this protozoan is its ability to adapt to a broad range of environments, and to differentiate between acute and chronic stages. These processes are underpinned by a major rewiring of gene expression, yet the mechanisms that regulate transcription in this parasite are only partially characterized. Deciphering these mechanisms requires a precise and comprehensive map of transcription start sites (TSSs); however, Toxoplasma TSSs have remained incompletely defined. To address this challenge, we used 5'-end RNA sequencing to genomically assess transcription initiation in both acute and chronic stages of Toxoplasma. Here, we report an in-depth analysis of transcription initiation at promoters, and provide empirically-defined TSSs for 7603 (91%) protein-coding genes, of which only 1840 concur with existing gene models. Comparing data from acute and chronic stages, we identified instances of stage-specific alternative TSSs that putatively generate mRNA isoforms with distinct 5' termini. Analysis of the nucleotide content and nucleosome occupancy around TSSs allowed us to examine the determinants of TSS choice, and outline features of Toxoplasma promoter architecture. We also found pervasive divergent transcription at Toxoplasma promoters, clustered within the nucleosomes of highly-symmetrical phased arrays, underscoring chromatin contributions to transcription initiation. Corroborating previous observations, we asserted that Toxoplasma 5' leaders are among the longest of any eukaryote studied thus far, displaying a median length of approximately 800 nucleotides.
Further highlighting the utility of a precise TSS map, we pinpointed motifs associated with transcription initiation, including the binding sites of the master regulator of chronic-stage differentiation, BFD1, and a novel motif with a similar positional arrangement present at 44% of Toxoplasma promoters. This work provides a critical resource for functional genomics in Toxoplasma, and lays down a foundation to study the interactions between genomic sequences and the regulatory factors that control transcription in this parasite.
Affiliation(s)
- Benedikt M. Markus
- Whitehead Institute for Biomedical Research, Cambridge, MA, United States
- Faculty of Biology, University of Freiburg, Freiburg, Germany
- Benjamin S. Waldman
- Whitehead Institute for Biomedical Research, Cambridge, MA, United States
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, United States
- Sebastian Lourido
- Whitehead Institute for Biomedical Research, Cambridge, MA, United States
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, United States
47
Chia M, Li C, Marques S, Pelechano V, Luscombe NM, van Werven FJ. High-resolution analysis of cell-state transitions in yeast suggests widespread transcriptional tuning by alternative starts. Genome Biol 2021; 22:34. [PMID: 33446241 PMCID: PMC7807719 DOI: 10.1186/s13059-020-02245-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/02/2020] [Accepted: 12/15/2020] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND: The start and end sites of messenger RNAs (TSSs and TESs) are highly regulated, often in a cell-type-specific manner. Yet the contribution of transcript diversity in regulating gene expression remains largely elusive. We perform an integrative analysis of multiple highly synchronized cell-fate transitions and quantitative genomic techniques in Saccharomyces cerevisiae to identify regulatory functions associated with transcribing alternative isoforms. RESULTS: Cell-fate transitions feature widespread elevated expression of alternative TSS and, to a lesser degree, TES usage. These dynamically regulated alternative TSSs are located mostly upstream of canonical TSSs, but also within gene bodies possibly encoding for protein isoforms. Increased upstream alternative TSS usage is linked to various effects on canonical TSS levels, which range from co-activation to repression. We identified two key features linked to these outcomes: an interplay between alternative and canonical promoter strengths, and distance between alternative and canonical TSSs. These two regulatory properties give a plausible explanation of how locally transcribed alternative TSSs control gene transcription. Additionally, we find that specific chromatin modifiers Set2, Set3, and FACT play an important role in mediating gene repression via alternative TSSs, further supporting that the act of upstream transcription drives the local changes in gene transcription. CONCLUSIONS: The integrative analysis of multiple cell-fate transitions suggests the presence of a regulatory control system of alternative TSSs that is important for dynamic tuning of gene expression. Our work provides a framework for understanding how TSS heterogeneity governs eukaryotic gene expression, particularly during cell-fate changes.
Affiliation(s)
- Minghao Chia
- The Francis Crick Institute, London, UK
- Genome Institute of Singapore, 60 Biopolis Street, Genome, #02-01, Singapore, 138672, Singapore
- Cai Li
- The Francis Crick Institute, London, UK
- School of Life Sciences, Sun Yat-sen University, Guangzhou, China
- Sueli Marques
- SciLifeLab, Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
- Vicente Pelechano
- SciLifeLab, Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
- Nicholas M Luscombe
- The Francis Crick Institute, London, UK
- Okinawa Institute of Science & Technology Graduate University, Okinawa, 904-0495, Japan
- UCL Genetics Institute, University College London, London, WC1E 6BT, UK
48
Navarro Gonzalez J, Zweig AS, Speir ML, Schmelter D, Rosenbloom K, Raney BJ, Powell CC, Nassar LR, Maulding N, Lee CM, Lee BT, Hinrichs A, Fyfe A, Fernandes J, Diekhans M, Clawson H, Casper J, Benet-Pagès A, Barber GP, Haussler D, Kuhn R, Haeussler M, Kent W. The UCSC Genome Browser database: 2021 update. Nucleic Acids Res 2021; 49:D1046-D1057. [PMID: 33221922 PMCID: PMC7779060 DOI: 10.1093/nar/gkaa1070] [Citation(s) in RCA: 325] [Impact Index Per Article: 81.3] [Received: 09/21/2020] [Revised: 10/19/2020] [Accepted: 11/18/2020] [Indexed: 12/11/2022] Open
Abstract
For more than two decades, the UCSC Genome Browser database (https://genome.ucsc.edu) has provided high-quality genomics data visualization and genome annotations to the research community. As the field of genomics grows and more data become available, new modes of display are required to accommodate new technologies. New features released this past year include a Hi-C heatmap display, a phased family trio display for VCF files, and various track visualization improvements. Striving to keep data up-to-date, new updates to gene annotations include GENCODE Genes, NCBI RefSeq Genes, and Ensembl Genes. New data tracks added for human and mouse genomes include the ENCODE registry of candidate cis-regulatory elements, promoters from the Eukaryotic Promoter Database, and NCBI RefSeq Select and Matched Annotation from NCBI and EMBL-EBI (MANE). Within weeks of learning about the outbreak of coronavirus, UCSC released a genome browser, with detailed annotation tracks, for the SARS-CoV-2 RNA reference assembly.
Affiliation(s)
- Ann S Zweig
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Matthew L Speir
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Daniel Schmelter
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Kate R Rosenbloom
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Brian J Raney
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Conner C Powell
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Luis R Nassar
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Nathan D Maulding
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Christopher M Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Brian T Lee
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Angie S Hinrichs
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Alastair C Fyfe
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Jason D Fernandes
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Mark Diekhans
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Hiram Clawson
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Jonathan Casper
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Anna Benet-Pagès
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Medical Genetics Center (MGZ), Munich, Germany
- Galt P Barber
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- David Haussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Howard Hughes Medical Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Robert M Kuhn
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Maximilian Haeussler
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- W James Kent
- Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
49
Pachganov S, Murtazalieva K, Zarubin A, Taran T, Chartier D, Tatarinova TV. Prediction of Rice Transcription Start Sites Using TransPrise: A Novel Machine Learning Approach. Methods Mol Biol 2021; 2238:261-274. [PMID: 33471337 DOI: 10.1007/978-1-0716-1068-8_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
As the interest in genetic resequencing increases, so does the need for effective mathematical, computational, and statistical approaches. One of the difficult problems in genome annotation is determination of precise positions of transcription start sites. In this paper, we present TransPrise, an efficient deep learning tool for predicting positions of eukaryotic transcription start sites. TransPrise offers significant improvement over existing promoter-prediction methods. To illustrate this, we compared predictions of TransPrise with the TSSPlant approach for the well-annotated genome of Oryza sativa. Using a computer with a graphics processing unit, the run time of TransPrise is 250 min on a genome 374 Mb long. We provide the full basis for the comparison and encourage users to freely access a set of our computational tools to facilitate and streamline their own analyses. The ready-to-use Docker image with all the necessary packages, models, and code as well as the source code of the TransPrise algorithm are available at http://compubioverne.group/. The source code is ready to use and to be customized to predict TSS in any eukaryotic organism.
Affiliation(s)
- Stepan Pachganov
- Ugra Research Institute of Information Technologies, Khanty-Mansiysk, Russia
- Alexei Zarubin
- Tomsk National Research Medical Center of the Russian Academy of Sciences, Research Institute of Medical Genetics, Tomsk, Russia
- Duane Chartier
- International Center for Art Intelligence, Inc, Los Angeles, CA, USA
- Tatiana V Tatarinova
- Vavilov Institute of General Genetics, Moscow, Russia
- Department of Biology, University of La Verne, La Verne, CA, USA
- A.A. Kharkevich Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, Russia
- Siberian Federal University, Krasnoyarsk, Russia
50
Gupta H, Chandratre K, Sinha S, Huang T, Wu X, Cui J, Zhang MQ, Wang SM. Highly diversified core promoters in the human genome and their effects on gene expression and disease predisposition. BMC Genomics 2020; 21:842. [PMID: 33256598 PMCID: PMC7706239 DOI: 10.1186/s12864-020-07222-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/19/2020] [Accepted: 11/09/2020] [Indexed: 12/14/2022] Open
Abstract
Background: The core promoter controls transcription initiation. However, little is known about core promoter diversity in the human genome and its relationship with diseases. We hypothesized that, as a functionally important component of the genome, the core promoter in the human genome could be under evolutionary selection, as reflected by its high diversification in order to adjust gene expression for better adaptation to different environments. Results: Applying the “Exome-based Variant Detection in Core-promoters” method, we analyzed human core-promoter diversity by using the 2682 exome data sets of 25 worldwide human populations sequenced by the 1000 Genomes Project. Collectively, we identified 31,996 variants in the core promoter region (−100 to +100) of 12,509 human genes (https://dbhcpd.fhs.um.edu.mo). Analyzing the rich variation data identified highly ethnic-specific patterns of core promoter variation between different ethnic populations, the genes with highly variable core promoters, the motifs affected by the variants, and their involved functional pathways. An eQTL test revealed that 12% of core promoter variants can significantly alter gene expression level. Comparing with GWAS data, we located 163 variants among the GWAS-identified traits associated with multiple diseases; half of these variants can alter gene expression. Conclusion: Data from our study reveal the highly diversified nature of the core promoter in the human genome and highlight that core promoter variation could play important roles not only in gene expression regulation but also in disease predisposition. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-020-07222-5.
Affiliation(s)
- Hemant Gupta
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Khyati Chandratre
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Siddharth Sinha
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Teng Huang
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Xiaobing Wu
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China
- Jian Cui
- Eppley Institute for Cancer Research, University of Nebraska Medical Center, Omaha, NE, 68198, USA
- Michael Q Zhang
- Department of Biological Sciences, Center for Systems Biology, University of Texas at Dallas, Richardson, TX, 75080, USA
- San Ming Wang
- Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau, SAR, China