1
Seki M, Kuze Y, Zhang X, Kurotani KI, Notaguchi M, Nishio H, Kudoh H, Suzaki T, Yoshida S, Sugano S, Matsushita T, Suzuki Y. An improved method for the highly specific detection of transcription start sites. Nucleic Acids Res 2024; 52:e7. [PMID: 37994784 PMCID: PMC10810191 DOI: 10.1093/nar/gkad1116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 10/17/2023] [Accepted: 11/06/2023] [Indexed: 11/24/2023] Open
Abstract
Precise detection of the transcriptional start site (TSS) is key to characterizing the transcriptional regulation of genes and to annotating newly sequenced genomes. Here, we describe the development of an improved method, designated 'TSS-seq2.' This method is an iterative improvement of TSS-seq, a previously published enzymatic cap-structure conversion method to detect TSSs in base sequences. By modifying the original procedure, including by introducing split ligation at the key cap-selection step, the yield and the accuracy of the reaction have been substantially improved. For example, TSS-seq2 can be conducted using as little as 5 ng of total RNA with an overall accuracy of 96%; this yields a less-biased and more precise detection of TSSs. We then applied TSS-seq2 for TSS analysis of four plant species that had not yet been analyzed by any previous TSS method.
Affiliation(s)
- Masahide Seki
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Yuta Kuze
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
- Xiang Zhang
  - Division of Biological Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
- Ken-ichi Kurotani
  - Bioscience and Biotechnology Center, Nagoya University, Aichi, Japan
- Michitaka Notaguchi
  - Bioscience and Biotechnology Center, Nagoya University, Aichi, Japan
  - Department of Botany, Graduate School of Science, Kyoto University, Kyoto, Japan
  - Graduate School of Bioagricultural Sciences, Nagoya University, Aichi, Nagoya, Japan
- Haruki Nishio
  - Data Science and AI Innovation Research Promotion Center, Shiga University, Shiga, Japan
- Hiroshi Kudoh
  - Center for Ecological Research, Kyoto University, Shiga, Japan
- Takuya Suzaki
  - Faculty of Life and Environmental Sciences, University of Tsukuba, Ibaraki, Japan
  - Tsukuba Plant-Innovation Research Center, University of Tsukuba, Ibaraki, Japan
- Satoko Yoshida
  - Division of Biological Science, Graduate School of Science and Technology, Nara Institute of Science and Technology, Nara, Japan
- Sumio Sugano
  - Institute of Kashiwa-no-ha Omics Gate, Chiba, Japan
  - Future Medicine Education and Research Organization, Chiba University, Chiba, Japan
- Tomonao Matsushita
  - Department of Botany, Graduate School of Science, Kyoto University, Kyoto, Japan
- Yutaka Suzuki
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
2
Sokolova K, Theesfeld CL, Wong AK, Zhang Z, Dolinski K, Troyanskaya OG. Atlas of primary cell-type-specific sequence models of gene expression and variant effects. CELL REPORTS METHODS 2023; 3:100580. [PMID: 37703883 PMCID: PMC10545936 DOI: 10.1016/j.crmeth.2023.100580] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/15/2023] [Revised: 05/05/2023] [Accepted: 08/18/2023] [Indexed: 09/15/2023]
Abstract
Human biology is rooted in highly specialized cell types programmed by a common genome, 98% of which is outside of genes. Genetic variation in the enormous noncoding space is linked to the majority of disease risk. To address the problem of linking these variants to expression changes in primary human cells, we introduce ExPectoSC, an atlas of modular deep-learning-based models for predicting cell-type-specific gene expression directly from sequence. We provide models for 105 primary human cell types covering 7 organ systems, demonstrate their accuracy, and then apply them to prioritize relevant cell types for complex human diseases. The resulting atlas of sequence-based gene expression and variant effects is publicly available in a user-friendly interface and readily extensible to any primary cell types. We demonstrate the accuracy of our approach through systematic evaluations and apply the models to prioritize ClinVar clinical variants of uncertain significance, verifying our top predictions experimentally.
Affiliation(s)
- Ksenia Sokolova
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Chandra L Theesfeld
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Aaron K Wong
  - Flatiron Institute, Simons Foundation, New York City, NY 10001, USA
- Zijun Zhang
  - Flatiron Institute, Simons Foundation, New York City, NY 10001, USA; Division of Artificial Intelligence in Medicine, Cedars-Sinai Medical Center, 116 N. Robertson Boulevard, Los Angeles, CA 90048, USA
- Kara Dolinski
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA
- Olga G Troyanskaya
  - Department of Computer Science, Princeton University, Princeton, NJ 08544, USA; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08544, USA; Flatiron Institute, Simons Foundation, New York City, NY 10001, USA
3
Salavati M, Clark R, Becker D, Kühn C, Plastow G, Dupont S, Moreira GCM, Charlier C, Clark EL. Improving the annotation of the cattle genome by annotating transcription start sites in a diverse set of tissues and populations using Cap Analysis Gene Expression sequencing. G3 (BETHESDA, MD.) 2023; 13:jkad108. [PMID: 37216666 PMCID: PMC10411599 DOI: 10.1093/g3journal/jkad108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 02/27/2023] [Accepted: 05/09/2023] [Indexed: 05/24/2023]
Abstract
Understanding the genomic control of tissue-specific gene expression and regulation can help to inform the application of genomic technologies in farm animal breeding programs. The fine mapping of promoters [transcription start sites (TSS)] and enhancers (divergent amplifying segments of the genome local to TSS) in different populations of cattle across a wide diversity of tissues provides information to locate and understand the genomic drivers of breed- and tissue-specific characteristics. To this aim, we used Cap Analysis Gene Expression (CAGE) sequencing, of 24 different tissues from 3 populations of cattle, to define TSS and their coexpressed short-range enhancers (<1 kb) in the ARS-UCD1.2_Btau5.0.1Y reference genome (1000bulls run9) and analyzed tissue and population specificity of expressed promoters. We identified 51,295 TSS and 2,328 TSS-Enhancer regions shared across the 3 populations (dairy, beef-dairy cross, and Canadian Kinsella composite cattle from 2 individuals, 1 of each sex, per population). Cross-species comparative analysis of CAGE data from 7 other species, including sheep, revealed a set of TSS and TSS-Enhancers that were specific to cattle. The CAGE data set will be combined with other transcriptomic information for the same tissues to create a new high-resolution map of transcript diversity across tissues and populations in cattle for the BovReg project. Here we provide the CAGE data set and annotation tracks for TSS and TSS-Enhancers in the cattle genome. This new annotation information will improve our understanding of the drivers of gene expression and regulation in cattle and help to inform the application of genomic technologies in breeding programs.
Affiliation(s)
- Mazdak Salavati
  - The Roslin Institute, University of Edinburgh, Edinburgh EH25 9RG, UK
- Richard Clark
  - Edinburgh Clinical Research Facility, Genetics Core, University of Edinburgh, Edinburgh EH4 2XU, UK
- Doreen Becker
  - Institute of Genome Biology, Research Institute for Farm Animal Biology (FBN), Dummerstorf 18196, Germany
- Christa Kühn
  - Institute of Genome Biology, Research Institute for Farm Animal Biology (FBN), Dummerstorf 18196, Germany
  - Faculty of Agricultural and Environmental Sciences, University Rostock, Rostock 18059, Germany
- Graham Plastow
  - Department of Agricultural, Food and Nutritional Science, Livestock Gentec, University of Alberta, Edmonton T6G 2H1, Canada
- Sébastien Dupont
  - Unit of Animal Genomics, GIGA Institute, University of Liège, Liège 4000, Belgium
- Carole Charlier
  - Unit of Animal Genomics, GIGA Institute, University of Liège, Liège 4000, Belgium
  - Faculty of Veterinary Medicine, University of Liège, Liège 4000, Belgium
4
Adamopoulos PG, Tsiakanikas P, Stolidi I, Scorilas A. A versatile 5′ RACE-Seq methodology for the accurate identification of the 5′ termini of mRNAs. BMC Genomics 2022; 23:163. [PMID: 35219290 PMCID: PMC8881849 DOI: 10.1186/s12864-022-08386-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/29/2021] [Accepted: 02/14/2022] [Indexed: 12/12/2022] Open
Abstract
Background: Technological advancements in the era of massively parallel sequencing have enabled the functional dissection of the human transcriptome. However, 5′ ends of mRNAs are significantly underrepresented in these datasets, hindering the efficient analysis of the complex human transcriptome. The implementation of the template-switching mechanism at the reverse transcription stage along with 5′ rapid amplification of cDNA ends (RACE) constitutes the most prominent and efficient strategy to specify the actual 5′ ends of cDNAs. In the current study, we developed a 5′ RACE-seq method by coupling a custom template-switching and 5′ RACE assay with targeted nanopore sequencing, to accurately unveil 5′ termini of mRNA targets.
Results: The optimization of the described 5′ RACE-seq method was accomplished using the human BCL2L12 as a control gene. We found that the selection of hybrid DNA/RNA template-switching oligonucleotides, as well as the complete separation of the cDNA extension incubation from the template-switching process, significantly increases the overall efficiency of the downstream 5′ RACE. Collectively, our results support the existence of two distinct 5′ termini for BCL2L12, in complete accordance with the results derived from both direct RNA and PCR-cDNA sequencing approaches from Oxford Nanopore Technologies. As proof of concept, we implemented the described 5′ RACE-seq methodology to investigate the 5′ UTRs of several members of the kallikrein-related peptidase (KLK) gene family. Our results confirmed the existence of multiple annotated 5′ UTRs of the human KLK gene family members, but also identified novel, previously uncharacterized ones.
Conclusions: In this work we present an in-house developed 5′ RACE-seq method, based on the template-switching mechanism and targeted nanopore sequencing. This approach enables the broad and in-depth study of 5′ UTRs of any mRNA of interest, by offering a tremendous sequencing depth, while significantly reducing the cost per reaction compared to commercially available kits.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08386-y.
5
Lu Z, Berry K, Hu Z, Zhan Y, Ahn TH, Lin Z. TSSr: an R package for comprehensive analyses of TSS sequencing data. NAR Genom Bioinform 2021; 3:lqab108. [PMID: 34805991 PMCID: PMC8598296 DOI: 10.1093/nargab/lqab108] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/04/2021] [Revised: 10/05/2021] [Accepted: 10/27/2021] [Indexed: 12/13/2022] Open
Abstract
Transcription initiation is regulated in a highly organized fashion to ensure proper cellular functions. Accurate identification of transcription start sites (TSSs) and quantitative characterization of transcription initiation activities are fundamental steps for studies of regulated transcription and core promoter structures. Several high-throughput techniques have been developed to sequence the very 5' end of RNA transcripts (TSS sequencing) on the genome scale. Bioinformatics tools are essential for processing, analysis, and visualization of TSS sequencing data. Here, we present TSSr, an R package that provides rich functions for mapping TSSs and characterizing the structures and activities of core promoters based on all types of TSS sequencing data. Specifically, TSSr implements several newly developed algorithms for accurately identifying TSSs from mapped sequencing reads and inferring core promoters, which are prerequisites for subsequent functional analyses of TSS data. Furthermore, TSSr also enables users to export various types of TSS data that can be visualized in a genome browser for inspection of promoter activities in association with other genomic features, and to generate publication-ready TSS graphs. These user-friendly features could greatly facilitate studies of transcription initiation based on TSS sequencing data. The source code and detailed documentation of TSSr can be freely accessed at https://github.com/Linlab-slu/TSSr.
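As a rough illustration of what "identifying TSSs from mapped sequencing reads" can involve, the sketch below counts read 5' ends per position and merges nearby positions into clusters. It is a generic distance-based approach written for this listing, not the algorithm implemented in TSSr; the function name `cluster_tss` and the thresholds `max_gap` and `min_count` are hypothetical.

```python
from collections import Counter


def cluster_tss(read_starts, max_gap=20, min_count=2):
    """Group read 5' ends into clusters by simple distance-based merging.

    read_starts: iterable of (chrom, strand, position) tuples, one per read.
    max_gap and min_count are illustrative thresholds, not TSSr defaults.
    """
    counts = Counter(read_starts)
    # Drop weakly supported positions, then sort so that positions on the
    # same chromosome and strand are adjacent and ordered by coordinate.
    kept = sorted(key for key, n in counts.items() if n >= min_count)

    clusters = []
    for chrom, strand, pos in kept:
        n = counts[(chrom, strand, pos)]
        last = clusters[-1] if clusters else None
        same_unit = (
            last is not None
            and last["chrom"] == chrom
            and last["strand"] == strand
            and pos - last["end"] <= max_gap
        )
        if same_unit:
            # Extend the open cluster and update its dominant position.
            last["end"] = pos
            last["total"] += n
            if n > last["peak_count"]:
                last["peak"], last["peak_count"] = pos, n
        else:
            clusters.append({
                "chrom": chrom, "strand": strand,
                "start": pos, "end": pos,
                "peak": pos, "peak_count": n, "total": n,
            })
    return clusters


if __name__ == "__main__":
    # Toy input: each tuple is the 5' end of one mapped read.
    reads = (
        [("chr1", "+", 100)] * 5
        + [("chr1", "+", 105)] * 3
        + [("chr1", "+", 300)] * 2
        + [("chr1", "-", 102)] * 4
        + [("chr1", "+", 400)]  # singleton, filtered by min_count
    )
    for cluster in cluster_tss(reads):
        print(cluster)
```

The dominant position (`peak`) of each cluster is the kind of value usually reported as the representative TSS of a core promoter; real tools add normalization and noise filtering that this sketch omits.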
Affiliation(s)
- Zhaolian Lu
  - Department of Biology, Saint Louis University, St. Louis, MO 63103, USA
- Keenan Berry
  - Program of Bioinformatics and Computational Biology, Saint Louis University, St. Louis, MO 63103, USA
- Zhenbin Hu
  - Department of Biology, Saint Louis University, St. Louis, MO 63103, USA
- Yu Zhan
  - Department of Biology, Saint Louis University, St. Louis, MO 63103, USA
- Tae-Hyuk Ahn
  - Program of Bioinformatics and Computational Biology, Saint Louis University, St. Louis, MO 63103, USA
- Zhenguo Lin
  - Department of Biology, Saint Louis University, St. Louis, MO 63103, USA
6
Chowdhary A, Satagopam V, Schneider R. Long Non-coding RNAs: Mechanisms, Experimental, and Computational Approaches in Identification, Characterization, and Their Biomarker Potential in Cancer. Front Genet 2021; 12:649619. [PMID: 34276764 PMCID: PMC8281131 DOI: 10.3389/fgene.2021.649619] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 01/05/2021] [Accepted: 04/20/2021] [Indexed: 01/09/2023] Open
Abstract
Long non-coding RNAs are a diverse class of non-coding RNA molecules >200 base pairs in length with various functions such as gene regulation, dosage compensation, and epigenetic regulation. Dysregulation and genomic variations of several lncRNAs have been implicated in several diseases. Their tissue- and developmental-stage-specific expression makes them viable indicators of the physiological states of cells. Here we present a comprehensive review of the molecular mechanisms and functions, state-of-the-art experimental and computational pipelines, and challenges involved in the identification and functional annotation of lncRNAs and their prospects as biomarkers. We also illustrate the application of co-expression networks on the TCGA-LIHC dataset for putative functional predictions of lncRNAs having a therapeutic potential in hepatocellular carcinoma (HCC).
Affiliation(s)
- Anshika Chowdhary
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Venkata Satagopam
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Reinhard Schneider
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
7
Salavati M, Caulton A, Clark R, Gazova I, Smith TPL, Worley KC, Cockett NE, Archibald AL, Clarke SM, Murdoch BM, Clark EL. Global Analysis of Transcription Start Sites in the New Ovine Reference Genome (Oar rambouillet v1.0). Front Genet 2020; 11:580580. [PMID: 33193703 PMCID: PMC7645153 DOI: 10.3389/fgene.2020.580580] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/06/2020] [Accepted: 09/09/2020] [Indexed: 02/04/2023] Open
Abstract
The overall aim of the Ovine FAANG project is to provide a comprehensive annotation of the new highly contiguous sheep reference genome sequence (Oar rambouillet v1.0). Mapping of transcription start sites (TSS) is a key first step in understanding transcript regulation and diversity. Using 56 tissue samples collected from the reference ewe Benz2616, we have performed a global analysis of TSS and TSS-Enhancer clusters using Cap Analysis Gene Expression (CAGE) sequencing. CAGE measures RNA expression by 5' cap-trapping and has been specifically designed to allow the characterization of TSS within promoters to single-nucleotide resolution. We have adapted an analysis pipeline that uses TagDust2 for clean-up and trimming, Bowtie2 for mapping, CAGEfightR for clustering, and the Integrative Genomics Viewer (IGV) for visualization. Mapping of CAGE tags indicated that the expression levels of CAGE tag clusters varied across tissues. Expression profiles across tissues were validated using corresponding polyA+ mRNA-Seq data from the same samples. After removal of CAGE tags with <10 read counts, 39.3% of TSS overlapped with 5' ends of 31,113 transcripts that had been previously annotated by NCBI (out of a total of 56,308 from the NCBI annotation). For 25,195 of the transcripts, previously annotated by NCBI, no TSS meeting stringent criteria were identified. A further 14.7% of TSS mapped to within 50 bp of annotated promoter regions. Intersecting these predicted TSS regions with annotated promoter regions (±50 bp) revealed 46% of the predicted TSS were "novel" and previously un-annotated. Using whole-genome bisulfite sequencing data from the same tissues, we were able to determine that a proportion of these "novel" TSS were hypo-methylated (32.2%) indicating that they are likely to be reproducible rather than "noise". This global analysis of TSS in sheep will significantly enhance the annotation of gene models in the new ovine reference assembly. 
Our analyses provide one of the highest resolution annotations of transcript regulation and diversity in a livestock species to date.
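The comparison of predicted TSS against annotated 5' ends described in this abstract can be sketched as a nearest-neighbour lookup with a ±50 bp window. The code below is a toy illustration on a single chromosome and strand, not the authors' pipeline (which used TagDust2, Bowtie2, and CAGEfightR); the function name `classify_tss` and the labels are hypothetical. The first assertion only checks the arithmetic reported in the abstract.

```python
import bisect

# Figures quoted in the abstract: annotated transcripts with and without
# a supporting TSS should add up to the NCBI annotation total.
assert 31_113 + 25_195 == 56_308


def classify_tss(tss_positions, annotated_starts, window=50):
    """Label each TSS by its distance to the nearest annotated 5' end.

    Coordinates are assumed to lie on one chromosome and one strand.
    Returns a dict mapping position to "exact", "near" (within the
    window), or "novel" (farther than the window, or no annotation).
    """
    annotated = sorted(annotated_starts)
    labels = {}
    for pos in tss_positions:
        # The nearest annotated start is either just before or just
        # after the insertion point found by binary search.
        i = bisect.bisect_left(annotated, pos)
        distances = [
            abs(pos - annotated[j])
            for j in (i - 1, i)
            if 0 <= j < len(annotated)
        ]
        if not distances:
            labels[pos] = "novel"
            continue
        nearest = min(distances)
        if nearest == 0:
            labels[pos] = "exact"
        elif nearest <= window:
            labels[pos] = "near"
        else:
            labels[pos] = "novel"
    return labels


if __name__ == "__main__":
    predicted = [100, 130, 500]
    annotated = [100, 160]
    print(classify_tss(predicted, annotated))
```

A genome-wide version would group positions by chromosome and strand first; interval libraries or `bedtools` style tools are the usual choice at that scale.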
Affiliation(s)
- Mazdak Salavati
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, United Kingdom
  - Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
- Alex Caulton
  - AgResearch, Invermay Agricultural Centre, Mosgiel, New Zealand
  - Genetics Otago, Department of Biochemistry, University of Otago, Dunedin, New Zealand
- Richard Clark
  - Genetics Core, Edinburgh Clinical Research Facility, The University of Edinburgh, Edinburgh, United Kingdom
- Iveta Gazova
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, United Kingdom
  - MRC Human Genetics Unit, The University of Edinburgh, Edinburgh, United Kingdom
- Timothy P. L. Smith
  - USDA, Agricultural Research Service, U.S. Meat Animal Research Center, Clay Center, NE, United States
- Kim C. Worley
  - Baylor College of Medicine, Houston, TX, United States
- Noelle E. Cockett
  - Department of Animal, Dairy and Veterinary Sciences, Utah State University, Logan, UT, United States
- Alan L. Archibald
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, United Kingdom
- Brenda M. Murdoch
  - Department of Animal, Veterinary and Food Sciences, University of Idaho, Moscow, ID, United States
- Emily L. Clark
  - The Roslin Institute, Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Edinburgh, United Kingdom
  - Centre for Tropical Livestock Genetics and Health (CTLGH), Roslin Institute, University of Edinburgh, Midlothian, United Kingdom
8
Wang J, Li B, Marques S, Steinmetz LM, Wei W, Pelechano V. TIF-Seq2 disentangles overlapping isoforms in complex human transcriptomes. Nucleic Acids Res 2020; 48:e104. [PMID: 32816037 PMCID: PMC7544212 DOI: 10.1093/nar/gkaa691] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/12/2020] [Revised: 07/17/2020] [Accepted: 08/07/2020] [Indexed: 12/17/2022] Open
Abstract
Eukaryotic transcriptomes are complex, involving thousands of overlapping transcripts. The interleaved nature of the transcriptomes limits our ability to identify regulatory regions, and in some cases can lead to misinterpretation of gene expression. To improve the understanding of the overlapping transcriptomes, we have developed an optimized method, TIF-Seq2, able to sequence simultaneously the 5' and 3' ends of individual RNA molecules at single-nucleotide resolution. We investigated the transcriptome of a well-characterized human cell line (K562) and identified thousands of unannotated transcript isoforms. By focusing on transcripts that are challenging to investigate with RNA-Seq, we accurately defined the boundaries of lowly expressed unannotated and read-through transcripts putatively encoding fusion genes. We validated our results by targeted long-read sequencing and standard RNA-Seq for chronic myeloid leukaemia patient samples. Taking advantage of TIF-Seq2, we explored transcription regulation among overlapping units and investigated their crosstalk. We show that most overlapping upstream transcripts use poly(A) sites within the first 2 kb of the downstream transcription units. Our work shows that, by pairing the 5' and 3' ends of each RNA, TIF-Seq2 can improve the annotation of complex genomes, facilitate accurate assignment of promoters to genes and easily identify transcriptionally fused genes.
Affiliation(s)
- Jingwen Wang
  - SciLifeLab, Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
- Bingnan Li
  - SciLifeLab, Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
- Sueli Marques
  - SciLifeLab, Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
- Lars M Steinmetz
  - European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Heidelberg, Germany
  - Stanford Genome Technology Center, Stanford University, Palo Alto, CA, USA
  - Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
- Wu Wei
  - Stanford Genome Technology Center, Stanford University, Palo Alto, CA, USA
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
  - Center for Biomedical Informatics, Shanghai Engineering Research Center for Big Data in Pediatric Precision Medicine, Shanghai Children's Hospital, Shanghai Jiao Tong University, Shanghai 200040, China
- Vicent Pelechano
  - SciLifeLab, Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solna, Sweden
9
Rodríguez-Martínez M, Boissiére T, Noe Gonzalez M, Litchfield K, Mitter R, Walker J, Kjœr S, Ismail M, Downward J, Swanton C, Svejstrup JQ. Evidence That STK19 Is Not an NRAS-dependent Melanoma Driver. Cell 2020; 181:1395-1405.e11. [PMID: 32531245 PMCID: PMC7298618 DOI: 10.1016/j.cell.2020.04.014] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 01/13/2020] [Revised: 03/18/2020] [Accepted: 04/10/2020] [Indexed: 12/20/2022]
Abstract
STK19 was proposed to be a cancer driver, and recent work by Yin et al. (2019) in Cell suggested that the frequently recurring STK19 D89N substitution represents a gain-of-function change, allowing increased phosphorylation of NRAS to enhance melanocyte transformation. Here we show that the STK19 gene has been incorrectly annotated, and that the expressed protein is 110 amino acids shorter than indicated by current databases. The "cancer driving" STK19 D89N substitution is thus outside the coding region. We also fail to detect evidence of the mutation affecting STK19 expression; instead, it is a UV signature mutation, found in the promoter of other genes as well. Furthermore, STK19 is exclusively nuclear and chromatin-associated, while no evidence for it being a kinase was found. The data in this Matters Arising article raise fundamental questions about the recently proposed role for STK19 in melanoma progression via a function as an NRAS kinase, suggested by Yin et al. (2019) in Cell. See also the response by Yin et al. (2020), published in this issue.
Affiliation(s)
- Marta Rodríguez-Martínez
  - Mechanisms of Transcription Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Thierry Boissiére
  - Mechanisms of Transcription Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Melvin Noe Gonzalez
  - Mechanisms of Transcription Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Kevin Litchfield
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Richard Mitter
  - Bioinformatics and Biostatistics, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Jane Walker
  - Mechanisms of Transcription Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Svend Kjœr
  - Structural Biology Science Technology Platform, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Mohamed Ismail
  - Oncogene Biology Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Julian Downward
  - Oncogene Biology Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Charles Swanton
  - Cancer Evolution and Genome Instability Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Jesper Q Svejstrup
  - Mechanisms of Transcription Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
10
Abugessaisa I, Noguchi S, Hasegawa A, Kondo A, Kawaji H, Carninci P, Kasukawa T. refTSS: A Reference Data Set for Human and Mouse Transcription Start Sites. J Mol Biol 2019; 431:2407-2422. [DOI: 10.1016/j.jmb.2019.04.045] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 12/30/2018] [Revised: 04/25/2019] [Accepted: 04/29/2019] [Indexed: 01/22/2023]
11
Babarinde IA, Li Y, Hutchins AP. Computational Methods for Mapping, Assembly and Quantification for Coding and Non-coding Transcripts. Comput Struct Biotechnol J 2019; 17:628-637. [PMID: 31193391 PMCID: PMC6526290 DOI: 10.1016/j.csbj.2019.04.012] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 02/25/2019] [Revised: 04/24/2019] [Accepted: 04/29/2019] [Indexed: 12/17/2022] Open
Abstract
The measurement of gene expression has long provided significant insight into biological functions. The development of high-throughput short-read sequencing technology has revealed transcriptional complexity at an unprecedented scale, and informed almost all areas of biology. However, as researchers have sought to gather more insights from the data, these new technologies have also increased the computational analysis burden. In this review, we describe typical computational pipelines for RNA-Seq analysis and discuss their strengths and weaknesses for the assembly, quantification and analysis of coding and non-coding RNAs. We also discuss the assembly of transposable elements into transcripts, and the difficulty these repetitive elements pose. In summary, RNA-Seq is a powerful technology that is likely to remain a key asset in the biologist's toolkit.
Affiliation(s)
- Andrew P. Hutchins
  - Department of Biology, Southern University of Science and Technology, 1088 Xueyuan Lu, Shenzhen, China