1
Delobel D, Nishiyori-Sueki H, Nisoli I, Kawaji H, Robbe P, Carninci P, Takahashi H. Protocol for direct cDNA cap analysis of gene expression for paired-end patterned flow cell sequencing. STAR Protoc 2025; 6:103594. [PMID: 39921863 PMCID: PMC11851279 DOI: 10.1016/j.xpro.2024.103594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2024] [Revised: 11/12/2024] [Accepted: 12/31/2024] [Indexed: 02/10/2025] Open
Abstract
Cap analysis of gene expression (CAGE) is a technique for mapping the 5' ends of RNA transcripts, and thus the transcription start sites (TSSs), of both coding and non-coding genes. Here, we present a protocol for using CAGE on Illumina patterned flow cell technology with dual indexes on mouse and human samples. We describe steps for sequencing, automated data processing, and a complete analytical framework, ensuring CAGE operability for the determination of TSSs and enhancers on the ever-evolving Illumina sequencing platforms.
Affiliation(s)
- Diane Delobel
  - RIKEN Center for Integrative Medical Sciences (IMS), Yokohama 230-0045, Japan
- Ilaria Nisoli
  - Human Technopole Research Center for Genomics, 20157 Milan, Italy
- Hideya Kawaji
  - Tokyo Metropolitan Institute of Medical Science, Research Center for Genome & Medical Sciences, Tokyo 156-8506, Japan
- Pauline Robbe
  - RIKEN Center for Integrative Medical Sciences (IMS), Yokohama 230-0045, Japan
- Piero Carninci
  - RIKEN Center for Integrative Medical Sciences (IMS), Yokohama 230-0045, Japan; Human Technopole Research Center for Genomics, 20157 Milan, Italy
- Hazuki Takahashi
  - RIKEN Center for Integrative Medical Sciences (IMS), Yokohama 230-0045, Japan

2
Kose C, Lindsey-Boltz LA, Sancar A, Jiang Y. Genome-wide analysis of transcription-coupled repair reveals novel transcription events in Caenorhabditis elegans. PLoS Genet 2024; 20:e1011365. [PMID: 39028758 PMCID: PMC11290646 DOI: 10.1371/journal.pgen.1011365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 07/31/2024] [Accepted: 07/08/2024] [Indexed: 07/21/2024] Open
Abstract
Bulky DNA adducts such as those induced by ultraviolet light are removed from the genomes of multicellular organisms by nucleotide excision repair, which occurs through two distinct mechanisms: global repair, which requires the DNA damage recognition factor XPC (xeroderma pigmentosum complementation group C), and transcription-coupled repair (TCR), which does not. TCR is initiated when elongating RNA polymerase II encounters DNA damage, and thus analysis of genome-wide excision repair in XPC mutants, which repair only by TCR, provides a unique opportunity to map transcription events missed by methods that depend on capturing RNA transcription products and are thus limited by the stability and/or modifications (5'-capping or 3'-polyadenylation) of those products. Here, we have performed eXcision Repair-sequencing (XR-seq) in the model organism Caenorhabditis elegans to generate genome-wide repair maps in a wild-type strain with normal excision repair, a strain lacking TCR (csb-1), and a strain that only repairs by TCR (xpc-1). Analysis of the intersections of the xpc-1 XR-seq repair maps with RNA-mapping datasets (RNA-seq, long- and short-capped RNA-seq) reveals previously unrecognized sites of transcription and further enhances our understanding of the genome of this important model organism.
Affiliation(s)
- Cansu Kose
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, United States of America
- Laura A. Lindsey-Boltz
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, United States of America
- Aziz Sancar
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, United States of America
- Yuchao Jiang
  - Department of Statistics, College of Arts and Sciences, Texas A&M University, College Station, Texas, United States of America
  - Department of Biology, College of Arts and Sciences, Texas A&M University, College Station, Texas, United States of America
  - Department of Biomedical Engineering, College of Engineering, Texas A&M University, College Station, Texas, United States of America

3
Carbonell-Sala S, Perteghella T, Lagarde J, Nishiyori H, Palumbo E, Arnan C, Takahashi H, Carninci P, Uszczynska-Ratajczak B, Guigó R. CapTrap-seq: a platform-agnostic and quantitative approach for high-fidelity full-length RNA sequencing. Nat Commun 2024; 15:5278. [PMID: 38937428 PMCID: PMC11211341 DOI: 10.1038/s41467-024-49523-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 06/10/2024] [Indexed: 06/29/2024] Open
Abstract
Long-read RNA sequencing is essential to produce accurate and exhaustive annotation of eukaryotic genomes. Despite advancements in throughput and accuracy, achieving reliable end-to-end identification of RNA transcripts remains a challenge for long-read sequencing methods. To address this limitation, we develop CapTrap-seq, a cDNA library preparation method, which combines the Cap-trapping strategy with oligo(dT) priming to detect 5' capped, full-length transcripts. In our study, we evaluate the performance of CapTrap-seq alongside other widely used RNA-seq library preparation protocols in human and mouse tissues, employing both ONT and PacBio sequencing technologies. To explore the quantitative capabilities of CapTrap-seq and its accuracy in reconstructing full-length RNA molecules, we implement a capping strategy for synthetic RNA spike-in sequences that mimics the natural 5'cap formation. Our benchmarks, incorporating the Long-read RNA-seq Genome Annotation Assessment Project (LRGASP) data, demonstrate that CapTrap-seq is a competitive, platform-agnostic RNA library preparation method for generating full-length transcript sequences.
Affiliation(s)
- Sílvia Carbonell-Sala
  - Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Tamara Perteghella
  - Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Julien Lagarde
  - Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Flomics Biotech, SL, Carrer de Roc Boronat 31, 08005, Barcelona, Catalonia, Spain
- Hiromi Nishiyori
  - Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences (IMS), Yokohama, Kanagawa, Japan
- Emilio Palumbo
  - Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Carme Arnan
  - Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Hazuki Takahashi
  - Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences (IMS), Yokohama, Kanagawa, Japan
- Piero Carninci
  - Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences (IMS), Yokohama, Kanagawa, Japan
  - Human Technopole, Milan, Italy
- Barbara Uszczynska-Ratajczak
  - Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Department of Computational Biology of Noncoding RNA, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Roderic Guigó
  - Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Universitat Pompeu Fabra, Barcelona, Catalonia, Spain

4
Kose C, Lindsey-Boltz LA, Sancar A, Jiang Y. Genome-wide analysis of transcription-coupled repair reveals novel transcription events in Caenorhabditis elegans. bioRxiv 2024:2023.10.12.562083. [PMID: 37904932 PMCID: PMC10614815 DOI: 10.1101/2023.10.12.562083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2023]
Abstract
Bulky DNA adducts such as those induced by ultraviolet light are removed from the genomes of multicellular organisms by nucleotide excision repair, which occurs through two distinct mechanisms: global repair, which requires the DNA damage recognition factor XPC (xeroderma pigmentosum complementation group C), and transcription-coupled repair (TCR), which does not. TCR is initiated when elongating RNA polymerase II encounters DNA damage, and thus analysis of genome-wide excision repair in XPC mutants, which repair only by TCR, provides a unique opportunity to map transcription events missed by methods that depend on capturing RNA transcription products and are thus limited by the stability and/or modifications (5'-capping or 3'-polyadenylation) of those products. Here, we have performed eXcision Repair-sequencing (XR-seq) in the model organism Caenorhabditis elegans to generate genome-wide repair maps from a wild-type strain with normal excision repair, a strain lacking TCR (csb-1), and a strain that only repairs by TCR (xpc-1). Analysis of the intersections of the xpc-1 XR-seq repair maps with RNA-mapping datasets (RNA-seq, long- and short-capped RNA-seq) reveals previously unrecognized sites of transcription and further enhances our understanding of the genome of this important model organism.
Affiliation(s)
- Cansu Kose
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC, USA
- Laura A. Lindsey-Boltz
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC, USA
- Aziz Sancar
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, NC, USA
- Yuchao Jiang
  - Department of Statistics, College of Arts and Sciences, Texas A&M University, College Station, TX 77843, USA
  - Department of Biology, College of Arts and Sciences, Texas A&M University, College Station, TX 77843
  - Department of Biomedical Engineering, College of Engineering, Texas A&M University, College Station, TX 77843

5
Deviatiiarov R, Nagai H, Ismagulov G, Stupina A, Wada K, Ide S, Toji N, Zhang H, Sukparangsi W, Intarapat S, Gusev O, Sheng G. Dosage compensation of Z sex chromosome genes in avian fibroblast cells. Genome Biol 2023; 24:213. [PMID: 37730643 PMCID: PMC10510239 DOI: 10.1186/s13059-023-03055-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/17/2022] [Accepted: 09/08/2023] [Indexed: 09/22/2023] Open
Abstract
In birds, sex is genetically determined; however, the molecular mechanism is not well-understood. The avian Z sex chromosome (chrZ) lacks whole chromosome inactivation, in contrast to the mammalian chrX. To investigate chrZ dosage compensation and its role in sex specification, we use a highly quantitative method and analyze transcriptional activities of male and female fibroblast cells from seven bird species. Our data indicate that three fourths of chrZ genes are strictly compensated across Aves, similar to mammalian chrX. We also present a complete list of non-compensated chrZ genes and identify Ribosomal Protein S6 (RPS6) as a conserved sex-dimorphic gene in birds.
Affiliation(s)
- Ruslan Deviatiiarov
  - International Research Center for Medical Sciences, Kumamoto University, Kumamoto, Japan
  - Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russian Federation
  - Graduate School of Medicine, Juntendo University, Tokyo, Japan
  - Life Improvement by Future Technologies Institute, Moscow, Russian Federation
- Hiroki Nagai
  - International Research Center for Medical Sciences, Kumamoto University, Kumamoto, Japan
- Galym Ismagulov
  - International Research Center for Medical Sciences, Kumamoto University, Kumamoto, Japan
- Anastasia Stupina
  - Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russian Federation
- Kazuhiro Wada
  - Department of Biological Sciences, Faculty of Science, Hokkaido University, Sapporo, Japan
- Shinji Ide
  - Kumamoto City Zoo and Botanical Garden, Kumamoto, Japan
- Noriyuki Toji
  - Department of Biological Sciences, Faculty of Science, Hokkaido University, Sapporo, Japan
- Heng Zhang
  - Graduate School of Life Science, Hokkaido University, Sapporo, Japan
- Woranop Sukparangsi
  - Department of Biology, Faculty of Science, Burapha University, Chonburi, Thailand
- Oleg Gusev
  - Regulatory Genomics Research Center, Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan, Russian Federation
  - Graduate School of Medicine, Juntendo University, Tokyo, Japan
  - Life Improvement by Future Technologies Institute, Moscow, Russian Federation
- Guojun Sheng
  - International Research Center for Medical Sciences, Kumamoto University, Kumamoto, Japan

6
Abstract
Within the next decade, the genomes of 1.8 million eukaryotic species will be sequenced. Identifying genes in these sequences is essential to understand the biology of the species. This is challenging due to the transcriptional complexity of eukaryotic genomes, which encode hundreds of thousands of transcripts of multiple types. Among these, a small set of protein-coding mRNAs play a disproportionately large role in defining phenotypes. Due to their sequence conservation, orthology can be established, making it possible to define the universal catalog of eukaryotic protein-coding genes. This catalog should substantially contribute to uncovering the genomic events underlying the emergence of eukaryotic phenotypes. This piece briefly reviews the basics of protein-coding gene prediction, discusses challenges in finalizing annotation of the human genome, and proposes strategies for producing annotations across the eukaryotic Tree of Life. This lays the groundwork for obtaining the catalog of all genes: the Earth's code of life.
Affiliation(s)
- Roderic Guigó
  - Bioinformatics and Genomics, Center for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Dr. Aiguader 88, 08003 Barcelona, Catalonia
  - Universitat Pompeu Fabra (UPF), Barcelona, Catalonia

7
Carbonell-Sala S, Lagarde J, Nishiyori H, Palumbo E, Arnan C, Takahashi H, Carninci P, Uszczynska-Ratajczak B, Guigó R. CapTrap-Seq: A platform-agnostic and quantitative approach for high-fidelity full-length RNA transcript sequencing. bioRxiv 2023:2023.06.16.543444. [PMID: 37398314 PMCID: PMC10312720 DOI: 10.1101/2023.06.16.543444] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 07/04/2023]
Abstract
Long-read RNA sequencing is essential to produce accurate and exhaustive annotation of eukaryotic genomes. Despite advancements in throughput and accuracy, achieving reliable end-to-end identification of RNA transcripts remains a challenge for long-read sequencing methods. To address this limitation, we developed CapTrap-seq, a cDNA library preparation method, which combines the Cap-trapping strategy with oligo(dT) priming to detect 5'capped, full-length transcripts, together with the data processing pipeline LyRic. We benchmarked CapTrap-seq and other popular RNA-seq library preparation protocols in a number of human tissues using both ONT and PacBio sequencing. To assess the accuracy of the transcript models produced, we introduced a capping strategy for synthetic RNA spike-in sequences that mimics the natural 5'cap formation in RNA spike-in molecules. We found that the vast majority (up to 90%) of transcript models that LyRic derives from CapTrap-seq reads are full-length. This makes it possible to produce highly accurate annotations with minimal human intervention.

8
Balaratnam S, Torrey ZR, Calabrese DR, Banco MT, Yazdani K, Liang X, Fullenkamp CR, Seshadri S, Holewinski RJ, Andresson T, Ferré-D'Amaré AR, Incarnato D, Schneekloth JS. Investigating the NRAS 5' UTR as a target for small molecules. Cell Chem Biol 2023; 30:643-657.e8. [PMID: 37257453 PMCID: PMC11623308 DOI: 10.1016/j.chembiol.2023.05.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/30/2022] [Revised: 02/24/2023] [Accepted: 05/10/2023] [Indexed: 06/02/2023]
Abstract
Neuroblastoma RAS (NRAS) is an oncogene that is deregulated and highly mutated in cancers including melanomas and acute myeloid leukemias. The 5' untranslated region (5' UTR) of the NRAS mRNA contains a G-quadruplex (G4) that regulates translation. Here we report a novel class of small molecule that binds to the G4 structure located in the 5' UTR of the NRAS mRNA. We used a small molecule microarray screen to identify molecules that selectively bind to the NRAS-G4 with submicromolar affinity. One compound inhibited the translation of NRAS in vitro but showed only moderate effects on NRAS levels in cellulo. Rapid Amplification of cDNA Ends and RT-PCR analysis revealed that the predominant NRAS transcript does not possess the G4 structure. Thus, although NRAS transcripts lack a G4 in many cell lines, the concept of targeting folded regions within 5' UTRs to control translation remains a highly attractive strategy.
Affiliation(s)
- Sumirtha Balaratnam
  - Chemical Biology Laboratory, National Cancer Institute, Frederick, MD 21702, USA
- Zachary R Torrey
  - Chemical Biology Laboratory, National Cancer Institute, Frederick, MD 21702, USA
- David R Calabrese
  - Chemical Biology Laboratory, National Cancer Institute, Frederick, MD 21702, USA
- Michael T Banco
  - Biochemistry and Biophysics Center, National Heart, Lung and Blood Institute, Bethesda, MD 20892, USA
- Kamyar Yazdani
  - Chemical Biology Laboratory, National Cancer Institute, Frederick, MD 21702, USA
- Xiao Liang
  - Chemical Biology Laboratory, National Cancer Institute, Frederick, MD 21702, USA
- Srinath Seshadri
  - Chemical Biology Laboratory, National Cancer Institute, Frederick, MD 21702, USA
- Ronald J Holewinski
  - Protein Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc, Frederick, MD 21702, USA
- Thorkell Andresson
  - Protein Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc, Frederick, MD 21702, USA
- Adrian R Ferré-D'Amaré
  - Biochemistry and Biophysics Center, National Heart, Lung and Blood Institute, Bethesda, MD 20892, USA
- Danny Incarnato
  - Department of Molecular Genetics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Groningen, the Netherlands
- John S Schneekloth
  - Chemical Biology Laboratory, National Cancer Institute, Frederick, MD 21702, USA

9
Gavgani HN, Grotewold E, Gray J. Methodology for Constructing a Knowledgebase for Plant Gene Regulation Information. Methods Mol Biol 2023; 2698:277-300. [PMID: 37682481 DOI: 10.1007/978-1-0716-3354-0_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/09/2023]
Abstract
The amount of biological data is growing at a rapid pace as many high-throughput omics technologies and data pipelines are developed. This is resulting in the growth of databases for DNA and protein sequences, gene expression, protein accumulation, structural, and localization information. The diversity and multi-omics nature of such bioinformatic data requires well-designed databases for flexible organization and presentation. Besides general-purpose online bioinformatic databases, users need narrowly focused online databases to quickly access a meaningful collection of related data for their research. Here, we describe the methodology used to implement a plant gene regulatory knowledgebase, with data, query, and tool features, as well as the ability to expand to accommodate future datasets. We exemplify this methodology for the GRASSIUS knowledgebase, but it is applicable to developing and updating similar plant gene regulatory knowledgebases. GRASSIUS organizes and presents gene regulatory data from grass species with a central focus on maize (Zea mays). The main class of data presented include not only the families of transcription factors (TFs) and co-regulators (CRs) but also protein-DNA interaction data, where available.
Affiliation(s)
- Hadi Nayebi Gavgani
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
  - Dandelions Therapeutics Inc., San Francisco, CA, USA
- Erich Grotewold
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- John Gray
  - Department of Biological Sciences, University of Toledo, Toledo, OH, USA

10
Navigating the Multiverse of Antisense RNAs: The Transcription- and RNA-Dependent Dimension. Noncoding RNA 2022; 8:ncrna8060074. [PMID: 36412909 PMCID: PMC9680235 DOI: 10.3390/ncrna8060074] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/28/2022] [Revised: 10/21/2022] [Accepted: 10/23/2022] [Indexed: 12/14/2022] Open
Abstract
Evidence accumulated over the past decades shows that the number of identified antisense transcripts is continuously increasing, promoting them from transcriptional noise to real genes with specific functions. Indeed, recent studies have begun to unravel the complexity of the antisense RNA (asRNA) world, starting from the multidimensional mechanisms that they can exert in physiological and pathological conditions. In this review, we discuss the multiverse of the molecular functions of asRNAs, describing their action through transcription-dependent and RNA-dependent mechanisms. Then, we report the workflow and methodologies to study and functionally characterize single asRNA candidates.

11
Xu J, Pratt HE, Moore JE, Gerstein MB, Weng Z. Building integrative functional maps of gene regulation. Hum Mol Genet 2022; 31:R114-R122. [PMID: 36083269 PMCID: PMC9585680 DOI: 10.1093/hmg/ddac195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2022] [Revised: 08/03/2022] [Accepted: 08/09/2022] [Indexed: 11/13/2022] Open
Abstract
Every cell in the human body inherits a copy of the same genetic information. The three billion base pairs of DNA in the human genome, and the roughly 50 000 coding and non-coding genes they contain, must thus encode all the complexity of human development and cell and tissue type diversity. Differences in gene regulation, or the modulation of gene expression, enable individual cells to interpret the genome differently to carry out their specific functions. Here we discuss recent and ongoing efforts to build gene regulatory maps, which aim to characterize the regulatory roles of all sequences in a genome. Many researchers and consortia have identified such regulatory elements using functional assays and evolutionary analyses; we discuss the results, strengths and shortcomings of their approaches. We also discuss new techniques the field can leverage and emerging challenges it will face while striving to build gene regulatory maps of ever-increasing resolution and comprehensiveness.
Affiliation(s)
- Jinrui Xu
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Henry E Pratt
  - Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Jill E Moore
  - Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA
- Mark B Gerstein
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
  - Department of Computer Science, Yale University, New Haven, CT 06520, USA
  - Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
- Zhiping Weng
  - Program in Bioinformatics and Integrative Biology, UMass Chan Medical School, Worcester, MA 01605, USA

12
Global patterns of enhancer activity during sea urchin embryogenesis assessed by eRNA profiling. Genome Res 2021; 31:1680-1692. [PMID: 34330790 PMCID: PMC8415375 DOI: 10.1101/gr.275684.121] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/22/2021] [Accepted: 07/28/2021] [Indexed: 11/25/2022]
Abstract
We used capped analysis of gene expression with sequencing (CAGE-seq) to profile eRNA expression and enhancer activity during embryogenesis of a model echinoderm: the sea urchin, Strongylocentrotus purpuratus. We identified more than 18,000 enhancers that were active in mature oocytes and developing embryos and documented a burst of enhancer activation during cleavage and early blastula stages. We found that a large fraction (73.8%) of all enhancers active during the first 48 h of embryogenesis were hyperaccessible no later than the 128-cell stage and possibly even earlier. Most enhancers were located near gene bodies, and temporal patterns of eRNA expression tended to parallel those of nearby genes. Furthermore, enhancers near lineage-specific genes contained signatures of inputs from developmental gene regulatory networks deployed in those lineages. A large fraction (60%) of sea urchin enhancers previously shown to be active in transgenic reporter assays was associated with eRNA expression. Moreover, a large fraction (50%) of a representative subset of enhancers identified by eRNA profiling drove tissue-specific gene expression in isolation when tested by reporter assays. Our findings provide an atlas of developmental enhancers in a model sea urchin and support the utility of eRNA profiling as a tool for enhancer discovery and regulatory biology. The data generated in this study are available at Echinobase, the public database of information related to echinoderm genomics.

13
Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network. Nat Commun 2021; 12:3297. [PMID: 34078885 PMCID: PMC8172540 DOI: 10.1038/s41467-021-23143-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 07/15/2020] [Accepted: 04/13/2021] [Indexed: 02/04/2023] Open
Abstract
Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.

14
Hashimoto M, Saito Y, Nakagawa R, Ogahara I, Takagi S, Takata S, Amitani H, Endo M, Yuki H, Ramilowski JA, Severin J, Manabe RI, Watanabe T, Ozaki K, Kaneko A, Kajita H, Fujiki S, Sato K, Honma T, Uchida N, Fukami T, Okazaki Y, Ohara O, Shultz LD, Yamada M, Taniguchi S, Vyas P, de Hoon M, Momozawa Y, Ishikawa F. Combined inhibition of XIAP and BCL2 drives maximal therapeutic efficacy in genetically diverse aggressive acute myeloid leukemia. Nat Cancer 2021; 2:340-356. [DOI: 10.1038/s43018-021-00177-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/17/2020] [Accepted: 01/22/2021] [Indexed: 01/18/2023]

15
Efremova LN, Strelnikova SR, Gazizova GR, Minkina EA, Komakhin RA. A Synthetic Strong and Constitutive Promoter Derived from the Stellaria media pro-SmAMP1 and pro-SmAMP2 Promoters for Effective Transgene Expression in Plants. Genes (Basel) 2020; 11:E1407. [PMID: 33256091 PMCID: PMC7760760 DOI: 10.3390/genes11121407] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/13/2020] [Revised: 11/22/2020] [Accepted: 11/24/2020] [Indexed: 01/05/2023] Open
Abstract
Synthetic promoters are vital for genetic engineering-based strategies for crop improvement, but effective methodologies for their creation and systematic testing are lacking. We report here on the comparative analysis of the promoters pro-SmAMP1 and pro-SmAMP2 from Stellaria media ANTIMICROBIAL PEPTIDE1 (AMP1) and ANTIMICROBIAL PEPTIDE2 (AMP2). These promoters are more effective than the well-known Cauliflower mosaic virus 35S promoter. Although these promoters share about 94% identity, the pro-SmAMP1 promoter demonstrated stronger transient expression of a reporter gene in Agrobacterium infiltration of Nicotiana benthamiana leaves, while the pro-SmAMP2 promoter was more effective for the selection of transgenic tobacco (Nicotiana tabacum) cells when driving a selectable marker. Using the cap analysis of gene expression method, we detected no differences in the structure of the transcription start sites for either promoter in transgenic plants. For both promoters, we used fine-scale deletion analysis to identify 160 bp-long sequences that retain the unique properties of each promoter. With the use of chimeric promoters and directed mutagenesis, we demonstrated that the superiority of the pro-SmAMP1 promoter for Agrobacterium-mediated infiltration is caused by the proline-inducible ACTCAT cis-element strictly positioned relative to the TATA box in the core promoter. Surprisingly, the ACTCAT cis-element not only activated but also suppressed the efficiency of the pro-SmAMP1 promoter under proline stress. The absence of the ACTCAT cis-element and CAANNNNATC motif (negative regulator) in the pro-SmAMP2 promoter provided a more constitutive gene expression profile and better selection of transgenic cells on selective medium. We created a new synthetic promoter that enjoys high effectiveness both in transient expression and in selection of transgenic cells. 
Intact promoters with differing properties and high degrees of sequence identity may thus be used as a basis for the creation of new synthetic promoters for precise and coordinated gene expression.
Affiliation(s)
- Larisa N. Efremova
  - All-Russia Research Institute of Agricultural Biotechnology, Moscow 127550, Russia
- Svetlana R. Strelnikova
  - All-Russia Research Institute of Agricultural Biotechnology, Moscow 127550, Russia
- Guzel R. Gazizova
  - Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420008, Russia
- Elena A. Minkina
  - Institute of Fundamental Medicine and Biology, Kazan Federal University, Kazan 420008, Russia
- Roman A. Komakhin
  - All-Russia Research Institute of Agricultural Biotechnology, Moscow 127550, Russia
16
Miller JM, Meki MH, Ou Q, George SA, Gams A, Abouleisa RRE, Tang XL, Ahern BM, Giridharan GA, El-Baz A, Hill BG, Satin J, Conklin DJ, Moslehi J, Bolli R, Ribeiro AJS, Efimov IR, Mohamed TMA. Heart slice culture system reliably demonstrates clinical drug-related cardiotoxicity. Toxicol Appl Pharmacol 2020; 406:115213. [PMID: 32877659 PMCID: PMC7554180 DOI: 10.1016/j.taap.2020.115213] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 07/08/2020] [Revised: 08/20/2020] [Accepted: 08/22/2020] [Indexed: 02/07/2023]
Abstract
The limited availability of human heart tissue and its complex cell composition are major limiting factors for the reliable testing of drug efficacy and toxicity. Recently, we developed functional human and pig heart slice biomimetic culture systems that preserve the viability and functionality of 300 μm heart slices for up to 6 days. Here, we evaluated the reliability of this culture system for testing the cardiotoxicity of anti-cancer drugs. We tested three anti-cancer drugs (doxorubicin, trastuzumab, and sunitinib) with different known mechanisms of cardiotoxicity at three concentrations and assessed the effect of these drugs on heart slice viability, structure, function, and gene expression. Slices incubated with any of these drugs for 48 h showed diminished viability as well as loss of cardiomyocyte structure and function. Mechanistically, RNA sequencing of doxorubicin-treated tissues demonstrated a significant downregulation of cardiac genes and upregulation of oxidative stress responses. Trastuzumab treatment downregulated cardiac muscle contraction-related genes, consistent with its clinically known effect on cardiomyocytes. Interestingly, sunitinib treatment resulted in significant downregulation of angiogenesis-related genes, in line with its mechanism of action. Similar to hiPS-derived cardiomyocytes, heart slices recapitulated the expected toxicity of doxorubicin and trastuzumab; however, slices were superior in detecting sunitinib cardiotoxicity and mechanism in the clinically relevant concentration range of 0.1-1 μM. These results indicate that heart slice culture models have the potential to become a reliable platform for testing and elucidating mechanisms of drug cardiotoxicity.
Affiliation(s)
- Jessica M Miller
  - Institute of Molecular Cardiology, Department of Medicine, University of Louisville, KY, USA; Department of Bioengineering, University of Louisville, KY, USA
- Moustafa H Meki
  - Institute of Molecular Cardiology, Department of Medicine, University of Louisville, KY, USA; Department of Bioengineering, University of Louisville, KY, USA
- Qinghui Ou
  - Institute of Molecular Cardiology, Department of Medicine, University of Louisville, KY, USA
- Sharon A George
  - Department of Biomedical Engineering, The George Washington University, Washington, DC, USA
- Anna Gams
  - Department of Biomedical Engineering, The George Washington University, Washington, DC, USA
- Riham R E Abouleisa
  - Institute of Molecular Cardiology, Department of Medicine, University of Louisville, KY, USA
- Xian-Liang Tang
  - Institute of Molecular Cardiology, Department of Medicine, University of Louisville, KY, USA
- Brooke M Ahern
  - Department of Physiology, University of Kentucky, KY, USA
- Ayman El-Baz
  - Department of Bioengineering, University of Louisville, KY, USA
- Bradford G Hill
  - Envirome Institute, Diabetes and Obesity Center, Department of Medicine, University of Louisville, KY, USA
- Jonathan Satin
  - Department of Physiology, University of Kentucky, KY, USA
- Daniel J Conklin
  - Envirome Institute, Diabetes and Obesity Center, Department of Medicine, University of Louisville, KY, USA
- Javid Moslehi
  - Division of Cardiology, Cardio-Oncology Program, Vanderbilt University Medical Center, 2220 Pierce Avenue, Nashville, USA
- Roberto Bolli
  - Institute of Molecular Cardiology, Department of Medicine, University of Louisville, KY, USA
- Alexandre J S Ribeiro
  - U.S. Food and Drug Administration, Center for Drug Evaluation and Research, Office of Translational Science, Office of Clinical Pharmacology, Division of Applied Regulatory Science, Silver Spring, MD, USA
- Igor R Efimov
  - Department of Biomedical Engineering, The George Washington University, Washington, DC, USA
- Tamer M A Mohamed
  - Institute of Molecular Cardiology, Department of Medicine, University of Louisville, KY, USA; Department of Bioengineering, University of Louisville, KY, USA; Envirome Institute, Diabetes and Obesity Center, Department of Medicine, University of Louisville, KY, USA; Department of Pharmacology and Toxicology, University of Louisville, KY, USA; Institute of Cardiovascular Sciences, University of Manchester, UK; Faculty of Pharmacy, Zagazig University, Egypt
17
del Valle Morales D, Schoenberg D. Analyzing (Re)Capping of mRNA Using Transcript Specific 5' End Sequencing. Bio Protoc 2020; 10:e3791. [DOI: 10.21769/bioprotoc.3791] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/25/2020] [Revised: 08/27/2020] [Accepted: 08/31/2020] [Indexed: 11/02/2022] Open