251
|
Zamora-Ballesteros C, Pinto G, Amaral J, Valledor L, Alves A, Diez JJ, Martín-García J. Dual RNA-Sequencing Analysis of Resistant (Pinus pinea) and Susceptible (Pinus radiata) Hosts during Fusarium circinatum Challenge. Int J Mol Sci 2021; 22:5231. [PMID: 34063405 PMCID: PMC8156185 DOI: 10.3390/ijms22105231] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/23/2021] [Revised: 05/11/2021] [Accepted: 05/13/2021] [Indexed: 12/13/2022]
Abstract
Fusarium circinatum causes one of the most important diseases of conifers worldwide, the pine pitch canker (PPC). However, no effective field intervention measures aiming to control or eradicate PPC are available. Due to the variation in host genetic resistance, the development of resistant varieties is postulated as a viable and promising strategy. By using an integrated approach, this study aimed to identify differences in the molecular responses and physiological traits of the highly susceptible Pinus radiata and the highly resistant Pinus pinea to F. circinatum at an early stage of infection. Dual RNA-Seq analysis also allowed the evaluation of pathogen behavior when infecting each pine species. No significant changes in the physiological analysis were found upon pathogen infection, although transcriptional reprogramming was observed mainly in the resistant species. The transcriptome profiling of P. pinea revealed an early perception of the pathogen infection together with a strong and coordinated defense activation through the reinforcement and lignification of the cell wall, the antioxidant activity, the induction of PR genes, and the biosynthesis of defense hormones. On the contrary, P. radiata had a weaker response, possibly due to impaired perception of the fungal infection that led to a reduced downstream defense signaling. Fusarium circinatum showed a different transcriptomic profile depending on the pine species being infected. While in P. pinea, the pathogen focused on the degradation of plant cell walls, active uptake of the plant nutrients was shown in P. radiata. These findings present useful knowledge for the development of breeding programs to manage PPC.
Affiliation(s)
- Cristina Zamora-Ballesteros
- Sustainable Forest Management Research Institute, University of Valladolid—INIA, 34004 Palencia, Spain; (J.J.D.); (J.M.-G.)
- Department of Vegetal Production and Forest Resources, University of Valladolid, 34004 Palencia, Spain
- Gloria Pinto
- Centre for Environmental and Marine Studies, CESAM, Department of Biology, University of Aveiro, 3810-193 Aveiro, Portugal; (G.P.); (J.A.); (A.A.)
- Joana Amaral
- Centre for Environmental and Marine Studies, CESAM, Department of Biology, University of Aveiro, 3810-193 Aveiro, Portugal; (G.P.); (J.A.); (A.A.)
- Luis Valledor
- Department of Organisms and Systems Biology, University of Oviedo, 33071 Oviedo, Spain;
- Artur Alves
- Centre for Environmental and Marine Studies, CESAM, Department of Biology, University of Aveiro, 3810-193 Aveiro, Portugal; (G.P.); (J.A.); (A.A.)
- Julio J. Diez
- Sustainable Forest Management Research Institute, University of Valladolid—INIA, 34004 Palencia, Spain; (J.J.D.); (J.M.-G.)
- Department of Vegetal Production and Forest Resources, University of Valladolid, 34004 Palencia, Spain
- Jorge Martín-García
- Sustainable Forest Management Research Institute, University of Valladolid—INIA, 34004 Palencia, Spain; (J.J.D.); (J.M.-G.)
- Department of Vegetal Production and Forest Resources, University of Valladolid, 34004 Palencia, Spain
|
252
|
Chung PY, Shoji K, Izumi N, Tomari Y. Dynamic subcellular compartmentalization ensures fidelity of piRNA biogenesis in silkworms. EMBO Rep 2021; 22:e51342. [PMID: 33973704 DOI: 10.15252/embr.202051342] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/17/2020] [Revised: 03/31/2021] [Accepted: 04/12/2021] [Indexed: 11/09/2022]
Abstract
PIWI-interacting RNAs (piRNAs) guide PIWI proteins to silence transposable elements and safeguard fertility in germ cells. Many protein factors required for piRNA biogenesis localize to perinuclear ribonucleoprotein (RNP) condensates named nuage, where target silencing and piRNA amplification are thought to occur. In mice, some of the piRNA factors are found in discrete cytoplasmic foci called processing bodies (P-bodies). However, the dynamics and biological significance of such compartmentalization of the piRNA pathway remain unclear. Here, by analyzing the subcellular localization of functional mutants of piRNA factors, we show that piRNA factors are actively compartmentalized into nuage and P-bodies in silkworm cells. Proper demixing of nuage and P-bodies requires target cleavage by the PIWI protein Siwi and ATP hydrolysis by the DEAD-box helicase BmVasa, disruption of which leads to promiscuous overproduction of piRNAs deriving from non-transposable elements. Our study highlights a role of dynamic subcellular compartmentalization in ensuring the fidelity of piRNA biogenesis.
Affiliation(s)
- Pui Yuen Chung
- Laboratory of RNA Function, Institute for Quantitative Biosciences, The University of Tokyo, Tokyo, Japan; Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
- Keisuke Shoji
- Laboratory of RNA Function, Institute for Quantitative Biosciences, The University of Tokyo, Tokyo, Japan
- Natsuko Izumi
- Laboratory of RNA Function, Institute for Quantitative Biosciences, The University of Tokyo, Tokyo, Japan
- Yukihide Tomari
- Laboratory of RNA Function, Institute for Quantitative Biosciences, The University of Tokyo, Tokyo, Japan; Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
|
253
|
Dias GB, Altammami MA, El-Shafie HAF, Alhoshani FM, Al-Fageeh MB, Bergman CM, Manee MM. Haplotype-resolved genome assembly enables gene discovery in the red palm weevil Rhynchophorus ferrugineus. Sci Rep 2021; 11:9987. [PMID: 33976235 PMCID: PMC8113489 DOI: 10.1038/s41598-021-89091-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/16/2020] [Accepted: 04/08/2021] [Indexed: 01/22/2023]
Abstract
The red palm weevil Rhynchophorus ferrugineus (Coleoptera: Curculionidae) is an economically-important invasive species that attacks multiple species of palm trees around the world. A better understanding of gene content and function in R. ferrugineus has the potential to inform pest control strategies and thereby mitigate economic and biodiversity losses caused by this species. Using 10x Genomics linked-read sequencing, we produced a haplotype-resolved diploid genome assembly for R. ferrugineus from a single heterozygous individual with modest sequencing coverage (∼62x). Benchmarking against conserved single-copy Arthropod orthologs suggests both pseudo-haplotypes in our R. ferrugineus genome assembly are highly complete with respect to gene content, and do not suffer from haplotype-induced duplication artifacts present in a recently published hybrid assembly for this species. Annotation of the larger pseudo-haplotype in our assembly provides evidence for 23,413 protein-coding loci in R. ferrugineus, including over 13,000 predicted proteins annotated with Gene Ontology terms and over 6000 loci independently supported by high-quality Iso-Seq transcriptomic data. Our assembly also includes 95% of R. ferrugineus chemosensory, detoxification and neuropeptide-related transcripts identified previously using RNA-seq transcriptomic data, and provides a platform for the molecular analysis of these and other functionally-relevant genes that can help guide management of this widespread insect pest.
Affiliation(s)
- Guilherme B Dias
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA.
- Musaad A Altammami
- National Center for Biotechnology, King Abdulaziz City for Science and Technology, Riyadh, 11442, Saudi Arabia
- Hamadttu A F El-Shafie
- Date Palm Research Center of Excellence, King Faisal University, Al-Ahsa, 31982, Saudi Arabia
- Fahad M Alhoshani
- National Center for Biotechnology, King Abdulaziz City for Science and Technology, Riyadh, 11442, Saudi Arabia
- Mohamed B Al-Fageeh
- Life Sciences and Environment Research Institute, King Abdulaziz City for Science and Technology, Riyadh, 11442, Saudi Arabia
- Casey M Bergman
- Department of Genetics and Institute of Bioinformatics, University of Georgia, Athens, GA, 30602, USA
- Manee M Manee
- National Center for Biotechnology, King Abdulaziz City for Science and Technology, Riyadh, 11442, Saudi Arabia.
|
254
|
RiboDoc: A Docker-based package for ribosome profiling analysis. Comput Struct Biotechnol J 2021; 19:2851-2860. [PMID: 34093996 PMCID: PMC8141510 DOI: 10.1016/j.csbj.2021.05.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/17/2021] [Revised: 05/03/2021] [Accepted: 05/05/2021] [Indexed: 11/29/2022]
Abstract
Ribosome profiling (RiboSeq) has emerged as a powerful technique for studying the genome-wide regulation of translation in various cells. Several steps in the biological protocol have been improved, but the bioinformatics part of RiboSeq suffers from a lack of standardization, preventing the straightforward and complete reproduction of published results. Too many published studies provide insufficient detail about the bioinformatics pipeline used. The broad range of questions that can be asked with RiboSeq makes it difficult to use a single bioinformatics tool. Indeed, many scripts have been published for addressing diverse questions. Here (https://github.com/equipeGST/RiboDoc), we propose a unique tool (for use with multiple operating systems, OS) to standardize the general steps that must be performed systematically in RiboSeq analysis, together with the statistical analysis and quality control of the sample. The data generated can then be exploited with more specific tools. We hope that this tool will help to standardize bioinformatics analysis pipelines in the field of translation.
|
255
|
Nanopore Sequencing and Hi-C Based De Novo Assembly of Trachidermus fasciatus Genome. Genes (Basel) 2021; 12:genes12050692. [PMID: 34066304 PMCID: PMC8148166 DOI: 10.3390/genes12050692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2021] [Revised: 04/25/2021] [Accepted: 04/27/2021] [Indexed: 11/17/2022]
Abstract
Trachidermus fasciatus is a roughskin sculpin fish widespread across the coastal areas of East Asia. Due to environmental destruction and overfishing, the population of this species is under threat. In order to protect this endangered species, it is important to have the genome sequenced. Reference genomes are essential for studying population genetics, domestic farming, and genetic resource protection. However, currently, no reference genome is available for Trachidermus fasciatus, and this has greatly hindered the research on this species. In this study, we integrated nanopore long-read sequencing, Illumina short-read sequencing, and Hi-C methods to thoroughly assemble the Trachidermus fasciatus genome. Our results provided a chromosome-level high-quality genome assembly with a predicted genome size of 542.6 Mbp (2n = 40) and a scaffold N50 of 24.9 Mbp. The BUSCO value for genome assembly completeness was higher than 96%, and the single-base accuracy was 99.997%. Based on EVM-StringTie genome annotation, a total of 19,147 protein-coding genes were identified, including 35,093 mRNA transcripts. In addition, a novel gene-finding strategy named RNR was introduced, and in total, 51 (82) novel genes (transcripts) were identified. Lastly, we present here the first reference genome for Trachidermus fasciatus; this sequence is expected to greatly facilitate future research on this species.
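The scaffold N50 quoted in the abstract above is a contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. As a minimal sketch of that definition (the function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper or its pipeline), it can be computed as follows:

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Hypothetical helper for illustration; not part of the cited study.
    """
    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the cumulative sum to at least
        # half of the assembly size defines the N50.
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    # Toy lengths (arbitrary units), not real assembly data.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

With the toy input, the total length is 54, and the three longest scaffolds (10, 9, 8) sum to 27, which reaches half of the total, so the function returns 8.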
|
256
|
Duckett DJ, Sullivan J, Pirro S, Carstens BC. Genomic Resources for the North American Water Vole (Microtus richardsoni) and the Montane Vole (Microtus montanus). GIGABYTE 2021; 2021:gigabyte19. [PMID: 36824326 PMCID: PMC9631978 DOI: 10.46471/gigabyte.19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/22/2021] [Accepted: 05/04/2021] [Indexed: 11/09/2022]
Abstract
Voles of the genus Microtus are important research organisms, yet genomic resources are lacking. Such resources would benefit future studies of immunology, phylogeography, cryptic diversity, and more. We sequenced and assembled nuclear genomes from two subspecies of water vole (Microtus richardsoni) and from the montane vole (Microtus montanus). The water vole genomes were sequenced with Illumina and 10× Chromium plus Illumina sequencing, resulting in assemblies with ∼1,600,000 and ∼30,000 scaffolds, respectively. The montane vole was also assembled into ∼13,000 scaffolds using Illumina sequencing. Mitochondrial genome assemblies were also performed for both species. Structural and functional annotation for the best water vole nuclear genome resulted in ∼24,500 annotated genes, with 83% of these having functional annotations. Assembly quality statistics for our nuclear assemblies fall within the range of genomes previously published in the genus Microtus, making the water vole and montane vole genomes useful additions to currently available genomic resources.
Affiliation(s)
- Drew J. Duckett
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, 1315 Kinnear Rd., Columbus, OH 43212, USA, Corresponding author. E-mail:
- Jack Sullivan
- Department of Biological Sciences, University of Idaho, Box 443051, Moscow, ID 83844-3051, USA
- Stacy Pirro
- Iridian Genomes, Inc., 6213 Swords Way, Bethesda, MD 20817, USA
- Bryan C. Carstens
- Department of Evolution, Ecology, and Organismal Biology, The Ohio State University, 1315 Kinnear Rd., Columbus, OH 43212, USA
|
257
|
Agostini F, Zagalak J, Attig J, Ule J, Luscombe NM. Intergenic RNA mainly derives from nascent transcripts of known genes. Genome Biol 2021; 22:136. [PMID: 33952325 PMCID: PMC8097831 DOI: 10.1186/s13059-021-02350-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/07/2020] [Accepted: 04/12/2021] [Indexed: 12/16/2022]
Abstract
BACKGROUND Eukaryotic genomes undergo pervasive transcription, leading to the production of many types of stable and unstable RNAs. Transcription is not restricted to regions with annotated gene features but includes almost any genomic context. Currently, the source and function of most RNAs originating from intergenic regions in the human genome remain unclear. RESULTS We hypothesize that many intergenic RNAs can be ascribed to the presence of as-yet unannotated genes or the "fuzzy" transcription of known genes that extends beyond the annotated boundaries. To elucidate the contributions of these two sources, we assemble a dataset of more than 2.5 billion publicly available RNA-seq reads across 5 human cell lines and multiple cellular compartments to annotate transcriptional units in the human genome. About 80% of transcripts from unannotated intergenic regions can be attributed to the fuzzy transcription of existing genes; the remaining transcripts originate mainly from putative long non-coding RNA loci that are rarely spliced. We validate the transcriptional activity of these intergenic RNAs using independent measurements, including transcriptional start sites, chromatin signatures, and genomic occupancies of RNA polymerase II in various phosphorylation states. We also analyze the nuclear localization and sensitivities of intergenic transcripts to nucleases to illustrate that they tend to be rapidly degraded either on-chromatin by XRN2 or off-chromatin by the exosome. CONCLUSIONS We provide a curated atlas of intergenic RNAs that distinguishes between alternative processing of well-annotated genes from independent transcriptional units based on the combined analysis of chromatin signatures, nuclear RNA localization, and degradation pathways.
Affiliation(s)
- Julian Zagalak
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Jan Attig
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
- Jernej Ule
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
- Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, Queen Square, London, WC1N 3BG, UK
- Nicholas M Luscombe
- The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
- UCL Genetics Institute, Department of Genetics, Environment and Evolution, University College London, Gower Street, London, WC1E 6BT, UK
- Okinawa Institute of Science & Technology Graduate University, 1919-1 Tancha, Onna-son, Kunigami-gun, Okinawa, 904-0495, Japan
|
258
|
Long Intergenic Non-Coding RNAs in the Mammary Parenchyma and Fat Pad of Pre-Weaning Heifer Calves: Identification and Functional Analysis. Animals (Basel) 2021; 11:ani11051268. [PMID: 33924848 PMCID: PMC8145500 DOI: 10.3390/ani11051268] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2021] [Revised: 04/10/2021] [Accepted: 04/24/2021] [Indexed: 11/17/2022]
Abstract
An enhanced plane of nutrition at the pre-weaning stage can promote the development of the mammary gland, especially in heifer calves. Although several genes are involved in this process, long intergenic non-coding RNAs (lincRNAs), which are regarded as key regulators in the regulatory network, are still largely unknown. We identified and characterized 534 putative lincRNAs based on published RNA-seq data from heifer calves in two groups: a group fed enhanced milk replacer (EH, 1.13 kg/day, including 28% crude protein, 25% fat) and a group fed restricted milk replacer (R, 0.45 kg/day, including 20% crude protein, 20% fat). Sub-samples from the mammary parenchyma (PAR) and mammary fat pad (MFP) were harvested from heifer calves. Based on the quantitative trait loci (QTLs) of these lincRNAs, neighboring and co-expressed genes were used to predict their function. By comparing EH vs R, 79 lincRNAs (61 upregulated, 18 downregulated) and 86 lincRNAs (54 upregulated, 32 downregulated) were differentially expressed in MFP and PAR, respectively. In MFP, some differentially expressed lincRNAs (DELs) are involved in lipid metabolism pathways, while in PAR, some DELs are involved in cell proliferation pathways. Taken together, this study explored the potential regulatory mechanism of lincRNAs in the mammary gland development of calves under different planes of nutrition.
|
259
|
Lassance JM, Ding BJ, Löfstedt C. Evolution of the codling moth pheromone via an ancient gene duplication. BMC Biol 2021; 19:83. [PMID: 33892710 PMCID: PMC8063362 DOI: 10.1186/s12915-021-01001-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/20/2020] [Accepted: 03/07/2021] [Indexed: 12/18/2022]
Abstract
BACKGROUND Defining the origin of genetic novelty is central to our understanding of the evolution of novel traits. Diversification among fatty acid desaturase (FAD) genes has played a fundamental role in the introduction of structural variation in fatty acyl derivatives. Because of its central role in generating diversity in insect semiochemicals, the FAD gene family has become a model to study how gene family expansions can contribute to the evolution of lineage-specific innovations. Here we used the codling moth (Cydia pomonella) as a study system to decipher the proximate mechanism underlying the production of the ∆8∆10 signature structure of olethreutine moths. Biosynthesis of the codling moth sex pheromone, (E8,E10)-dodecadienol (codlemone), involves two consecutive desaturation steps, the first of which is unusual in that it generates an E9 unsaturation. The second step is also atypical: it generates a conjugated diene system from the E9 monoene C12 intermediate via 1,4-desaturation. RESULTS Here we describe the characterization of the FAD gene acting in codlemone biosynthesis. We identify 27 FAD genes corresponding to the various functional classes identified in insects and Lepidoptera. These genes are distributed across the C. pomonella genome in tandem arrays or isolated genes, indicating that the FAD repertoire consists of both ancient and recent duplications and expansions. Using transcriptomics, we show large divergence in expression domains: some genes appear ubiquitously expressed across tissue and developmental stages; others appear more restricted in their expression pattern. Functional assays using heterologous expression systems reveal that one gene, Cpo_CPRQ, which is prominently and exclusively expressed in the female pheromone gland, encodes an FAD that possesses both E9 and ∆8∆10 desaturation activities. 
Phylogenetically, Cpo_CPRQ clusters within the Lepidoptera-specific ∆10/∆11 clade of FADs, a classic reservoir of unusual desaturase activities in moths. CONCLUSIONS Our integrative approach shows that the evolution of the signature pheromone structure of olethreutine moths relied on a gene belonging to an ancient gene expansion. Members of other expanded FAD subfamilies do not appear to play a role in chemical communication. This advises for caution when postulating the consequences of lineage-specific expansions based on genomics alone.
Affiliation(s)
- Jean-Marc Lassance
- Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Department of Organismic and Evolutionary Biology, Harvard University, 16 Divinity Avenue, Cambridge, MA, 02138, USA
- Bao-Jian Ding
- Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Christer Löfstedt
- Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden.
|
260
|
Wu Q, Luo Y, Wu X, Bai X, Ye X, Liu C, Wan Y, Xiang D, Li Q, Zou L, Zhao G. Identification of the specific long-noncoding RNAs involved in night-break mediated flowering retardation in Chenopodium quinoa. BMC Genomics 2021; 22:284. [PMID: 33874907 PMCID: PMC8056640 DOI: 10.1186/s12864-021-07605-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/04/2020] [Accepted: 04/08/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND Night-break (NB) has been proven to repress flowering of short-day plants (SDPs). Long-noncoding RNAs (lncRNAs) play key roles in plant flowering. However, investigation of the relationship between lncRNAs and NB responses is still limited, especially in Chenopodium quinoa, an important short-day coarse cereal. RESULTS In this study, we performed strand-specific RNA-seq of leaf samples collected from quinoa seedlings treated by SD and NB. A total of 4914 high-confidence lncRNAs were identified, out of which 91 lncRNAs showed specific responses to SD and NB. Based on the expression profiles, we identified 17 positive- and 7 negative-flowering lncRNAs. Co-expression network analysis indicated that 1653 mRNAs were the common targets of both types of flowering lncRNAs. By mapping these targets to the known flowering pathways in model plants, we found that some pivotal flowering homologs, including 2 florigen encoding genes (FT (FLOWERING LOCUS T) and TSF (TWIN SISTER of FT) homologs), 3 circadian clock related genes (EARLY FLOWERING 3 (ELF3), LATE ELONGATED HYPOCOTYL (LHY) and ELONGATED HYPOCOTYL 5 (HY5) homologs), 2 photoreceptor genes (PHYTOCHROME A (PHYA) and CRYPTOCHROME1 (CRY1) homologs), 1 B-BOX type CONSTANS (CO) homolog and 1 RELATED TO ABI3/VP1 (RAV1) homolog, were specifically affected by NB and competed by the positive and negative-flowering lncRNAs. We speculated that these potential flowering lncRNAs may mediate quinoa NB responses by modifying the expression of the floral homologous genes. CONCLUSIONS Together, the findings in this study will deepen our understanding of the roles of lncRNAs in NB responses, and provide valuable information for functional characterization in the future.
Affiliation(s)
- Qi Wu
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China.
- Yiming Luo
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
- Xiaoyong Wu
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
- Xue Bai
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
- Xueling Ye
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
- Changying Liu
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
- Yan Wan
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
- Dabing Xiang
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
- Qiang Li
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
- Liang Zou
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
- Gang Zhao
- Key Laboratory of Coarse Cereal Processing, Ministry of Agriculture and Rural Affairs, Sichuan Engineering & Technology Research Center of Coarse Cereal Industralization, School of Food and Biological Engineering, Chengdu University, Chengluo road 2025, Shiling town, Longquanyi District, Chengdu, 610106, Sichuan Province, P.R. China
|
261
|
Vanamamalai VK, Garg P, Kolluri G, Gandham RK, Jali I, Sharma S. Transcriptomic analysis to infer key molecular players involved during host response to NDV challenge in Gallus gallus (Leghorn & Fayoumi). Sci Rep 2021; 11:8486. [PMID: 33875770 PMCID: PMC8055681 DOI: 10.1038/s41598-021-88029-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/26/2020] [Accepted: 03/22/2021] [Indexed: 11/09/2022]
Abstract
Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides. They are involved in the regulation of various biological activities. The Leghorn and Fayoumi breeds of Gallus gallus are known to have differential resistance to Newcastle Disease Virus (NDV) infection. Differentially expressed genes thought to be involved in this pattern of resistance have already been studied. Here we report the analysis of transcriptomic data from the Harderian gland of Gallus gallus to study the lncRNAs involved in the regulation of these genes. Using bioinformatics approaches, a total of 37,411 lncRNAs were extracted and 359 lncRNAs were differentially expressed. Functional annotation using co-expression analysis revealed the involvement of lncRNAs in the regulation of various pathways. We also identified 1232 quantitative trait loci (QTLs) associated with the genes interacting with lncRNAs. Additionally, we identified the role of lncRNAs as putative microRNA precursors, and the interaction of differentially expressed genes with transcription factors and microRNAs. Our study revealed the role of lncRNAs during host response against NDV infection, which would facilitate future experiments in unravelling regulatory mechanisms of development in the genetic improvement of the susceptible breeds of Gallus gallus.
Affiliation(s)
- Venkata Krishna Vanamamalai
- National Institute of Animal Biotechnology (NIAB), Opposite Journalist Colony, Near Gowlidoddi Extended Q City Road, Gachibowli, Hyderabad, Telangana, 500032, India
| | - Priyanka Garg
- National Institute of Animal Biotechnology (NIAB), Opposite Journalist Colony, Near Gowlidoddi Extended Q City Road, Gachibowli, Hyderabad, Telangana, 500032, India
| | - Gautham Kolluri
- ICAR-Central Avian Research Institute, Izatnagar, Bareilly, Uttar Pradesh, 243122, India
| | - Ravi Kumar Gandham
- National Institute of Animal Biotechnology (NIAB), Opposite Journalist Colony, Near Gowlidoddi Extended Q City Road, Gachibowli, Hyderabad, Telangana, 500032, India
| | - Itishree Jali
- National Institute of Animal Biotechnology (NIAB), Opposite Journalist Colony, Near Gowlidoddi Extended Q City Road, Gachibowli, Hyderabad, Telangana, 500032, India
| | - Shailesh Sharma
- National Institute of Animal Biotechnology (NIAB), Opposite Journalist Colony, Near Gowlidoddi Extended Q City Road, Gachibowli, Hyderabad, Telangana, 500032, India.
|
262
|
Transcriptional profiles in Strongyloides stercoralis males reveal deviations from the Caenorhabditis sex determination model. Sci Rep 2021; 11:8254. [PMID: 33859232 PMCID: PMC8050236 DOI: 10.1038/s41598-021-87478-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/23/2021] [Accepted: 03/30/2021] [Indexed: 02/02/2023] Open
Abstract
The human and canine parasitic nematode Strongyloides stercoralis utilizes an XX/XO sex determination system, with parasitic females reproducing by mitotic parthenogenesis and free-living males and females reproducing sexually. However, the genes controlling S. stercoralis sex determination and male development are unknown. We observed precocious development of rhabditiform males in permissive hosts treated with corticosteroids, suggesting that steroid hormones can regulate male development. To examine differences in transcript abundance between free-living adult males and other developmental stages, we utilized RNA-Seq. We found two clusters of S. stercoralis-specific genes encoding predicted transmembrane proteins that are only expressed in free-living males. We additionally identified homologs of several genes important for sex determination in Caenorhabditis species, including mab-3, tra-1, fem-2, and sex-1, which may have similar functions. However, we identified three paralogs of gld-1; Ss-qki-1 transcripts were highly abundant in adult males, while Ss-qki-2 and Ss-qki-3 transcripts were highly abundant in adult females. We also identified paralogs of pumilio domain-containing proteins with sex-specific transcripts. Intriguingly, her-1 appears to have been lost in several parasite lineages, and we were unable to identify homologs of tra-2 outside of Caenorhabditis species. Together, our data suggest that different mechanisms control male development in S. stercoralis and Caenorhabditis species.
|
263
|
Klein A, Husselmann LHH, Williams A, Bell L, Cooper B, Ragar B, Tabb DL. Proteomic Identification and Meta-Analysis in Salvia hispanica RNA-Seq de novo Assemblies. PLANTS (BASEL, SWITZERLAND) 2021; 10:765. [PMID: 33919777 PMCID: PMC8070742 DOI: 10.3390/plants10040765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/15/2021] [Revised: 03/26/2021] [Accepted: 03/28/2021] [Indexed: 11/24/2022]
Abstract
While proteomics has demonstrated its value for model organisms and for organisms with mature genome sequence annotations, proteomics has been of less value in nonmodel organisms that are unaccompanied by genome sequence annotations. This project sought to determine the value of RNA-Seq experiments as a basis for establishing a set of protein sequences to represent a nonmodel organism, in this case, the pseudocereal chia. Assembling four publicly available chia RNA-Seq datasets produced transcript sequence sets with a high BUSCO completeness, though the number of transcript sequences and Trinity "genes" varied considerably among them. After six-frame translation, ProteinOrtho detected substantial numbers of orthologs among other species within the taxonomic order Lamiales. These protein sequence databases demonstrated a good identification efficiency for three different LC-MS/MS proteomics experiments, though a seed proteome showed considerable variability in the identification of peptides based on seed protein sequence inclusion. If a proteomics experiment emphasizes a particular tissue, an RNA-Seq experiment incorporating that same tissue is more likely to support a database search identification of that proteome.
Affiliation(s)
- Ashwil Klein
- Department of Biotechnology, University of the Western Cape, Bellville 7535, South Africa
| | - Lizex H. H. Husselmann
- Department of Biotechnology, University of the Western Cape, Bellville 7535, South Africa
| | - Achmat Williams
- Department of Biotechnology, University of the Western Cape, Bellville 7535, South Africa
| | - Liam Bell
- Centre for Proteomic and Genomic Research, Cape Town 7925, South Africa
| | - Bret Cooper
- USDA Agricultural Research Service, Beltsville, MD 20705, USA
| | - Brent Ragar
- Departments of Internal Medicine and Pediatrics, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02150, USA
| | - David L. Tabb
- Department of Biotechnology, University of the Western Cape, Bellville 7535, South Africa
- Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town 7500, South Africa
- Centre for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch 7602, South Africa
|
264
|
Zhang W, Xu M, Wang J, Wang S, Wang X, Yang J, Gao L, Gan S. Comparative Transcriptome Analysis of Key Genes and Pathways Activated in Response to Fat Deposition in Two Sheep Breeds With Distinct Tail Phenotype. Front Genet 2021; 12:639030. [PMID: 33897762 PMCID: PMC8060577 DOI: 10.3389/fgene.2021.639030] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/08/2020] [Accepted: 03/08/2021] [Indexed: 01/21/2023] Open
Abstract
Fat tail in sheep presents a valuable energy reserve that has historically facilitated adaptation to harsh environments. However, in modern intensive and semi-intensive sheep industry systems, breeds with leaner tails are more desirable. In the present study, RNA sequencing (RNA-Seq) was applied to determine the transcriptome profiles of tail fat tissues in two Chinese sheep breeds, fat-rumped Altay sheep and thin-tailed Xinjiang fine wool (XFW) sheep, with extreme fat tail phenotype difference. Then the differentially expressed genes (DEGs) and their sequence variations were further analyzed. In total, 21,527 genes were detected, among which 3,965 displayed significant expression variations in tail fat tissues of the two sheep breeds (P < 0.05), including 707 upregulated and 3,258 downregulated genes. Gene Ontology (GO) analysis disclosed that 198 DEGs were related to fat metabolism. In Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, the majority of DEGs were significantly enriched in "adipocytokine signaling," "PPAR signaling," and "metabolic pathways" (P < 0.05); moreover, some genes were involved in multiple pathways. Among the 198 DEGs, 22 genes were markedly up- or downregulated in tail fat tissue of Altay sheep, indicating that these genes might be closely related to the fat tail trait of this breed. A total of 41,724 and 42,193 SNPs were detected in the transcriptomic data of tail fat tissues obtained from Altay and XFW sheep, respectively. The distribution of seven SNPs in the coding regions of the 22 candidate genes was further investigated in populations of three sheep breeds with distinct tail phenotypes. In particular, the g.18167532T/C (Oar_v3.1) mutation of the ATP-binding cassette transporter A1 (ABCA1) gene and g.57036072G/T (Oar_v3.1) mutation of the solute carrier family 27 member 2 (SLC27A2) gene showed significantly different distributions and were closely associated with tail phenotype (P < 0.05). 
The present study provides transcriptomic evidence explaining the differences in fat- and thin-tailed sheep breeds and reveals numerous DEGs and SNPs associated with tail phenotype. Our data provide a valuable theoretical basis for selection of lean-tailed sheep breeds.
Affiliation(s)
- Wei Zhang
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, China
- Xinjiang Agricultural Vocational Technical College, Changji, China
| | - Mengsi Xu
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, China
| | - Juanjuan Wang
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, China
| | - Shiyin Wang
- Xinjiang Agricultural Vocational Technical College, Changji, China
| | - Xinhua Wang
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, China
| | - Jingquan Yang
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, China
| | - Lei Gao
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, China
| | - Shangquan Gan
- State Key Laboratory of Sheep Genetic Improvement and Healthy Production, Xinjiang Academy of Agricultural and Reclamation Sciences, Shihezi, China
|
265
|
Eriksen RL, Padgitt-Cobb LK, Townsend MS, Henning JA. Gene expression for secondary metabolite biosynthesis in hop (Humulus lupulus L.) leaf lupulin glands exposed to heat and low-water stress. Sci Rep 2021; 11:5138. [PMID: 33664420 PMCID: PMC7970847 DOI: 10.1038/s41598-021-84691-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/09/2020] [Accepted: 02/05/2021] [Indexed: 01/31/2023] Open
Abstract
Hops are valued for their secondary metabolites, including bitter acids, flavonoids, oils, and polyphenols, that impart flavor in beer. Previous studies have shown that hop yield and bitter acid content decline with increased temperatures and low-water stress. We looked at physiological traits and differential gene expression in leaf, stem, and root tissue from hop (Humulus lupulus) cv. USDA Cascade in plants exposed to high temperature stress, low-water stress, and a compound treatment of both high temperature and low-water stress for six weeks. The stress conditions imposed in these experiments caused substantial changes to the transcriptome, with significant reductions in the expression of numerous genes involved in secondary metabolite biosynthesis. Of the genes involved in bitter acid production, the critical gene valerophenone synthase (VPS) experienced significant reductions in expression levels across stress treatments, suggesting stress-induced lability in this gene and/or its regulatory elements may be at least partially responsible for previously reported declines in bitter acid content. We also identified a number of transcripts with homology to genes shown to affect abiotic stress tolerance in other plants that may be useful as markers for breeding improved abiotic stress tolerance in hop. Lastly, we provide the first transcriptome from hop root tissue.
Affiliation(s)
- Renée L. Eriksen
- grid.512836.b, 0000 0001 2205 063X, USDA Agricultural Research Service, Forage Seed and Cereal Research Unit, 3450 SW Campus Way, Corvallis, OR 97331 USA
| | - Lillian K. Padgitt-Cobb
- grid.4391.f, 0000 0001 2112 1969, Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR 97331 USA
| | - M. Shaun Townsend
- grid.4391.f, 0000 0001 2112 1969, Department of Crop and Soil Science, Oregon State University, Corvallis, OR 97331 USA
| | - John A. Henning
- grid.512836.b, 0000 0001 2205 063X, USDA Agricultural Research Service, Forage Seed and Cereal Research Unit, 3450 SW Campus Way, Corvallis, OR 97331 USA
|
266
|
Gottschalk C, Zhang S, Schwallier P, Rogers S, Bukovac MJ, van Nocker S. Genetic mechanisms associated with floral initiation and the repressive effect of fruit on flowering in apple (Malus x domestica Borkh). PLoS One 2021; 16:e0245487. [PMID: 33606701 PMCID: PMC7894833 DOI: 10.1371/journal.pone.0245487] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/24/2020] [Accepted: 01/02/2021] [Indexed: 11/19/2022] Open
Abstract
Many apple cultivars are subject to biennial fluctuations in flowering and fruiting. It is believed that this phenomenon is caused by a repressive effect of developing fruit on the initiation of flowers in the apex of proximal bourse shoots. However, the genetic pathways of floral initiation are incompletely described in apple, and the biological nature of floral repression by fruit is currently unknown. In this study, we characterized the transcriptional landscape of bourse shoot apices in the biennial cultivar, 'Honeycrisp', during the period of floral initiation, in trees bearing a high fruit load and in trees without fruit. Trees with high fruit load produced almost exclusively vegetative growth in the subsequent year, whereas the trees without fruit produced flowers on the majority of the potential flowering nodes. Using RNA-based sequence data, we documented gene expression at high resolution, identifying >11,000 transcripts that had not been previously annotated, and characterized expression profiles associated with vegetative growth and flowering. We also conducted a census of genes related to known flowering genes, organized the phylogenetic and syntenic relationships of these genes, and compared expression among homeologs. Several genes closely related to AP1, FT, FUL, LFY, and SPLs were more strongly expressed in apices from non-bearing, floral-determined trees, consistent with their presumed floral-promotive roles. In contrast, a homolog of TFL1 exhibited strong and persistent up-regulation only in apices from bearing, vegetative-determined trees, suggesting a role in floral repression. Additionally, we identified four GIBBERELLIC ACID (GA) 2 OXIDASE genes that were expressed to relatively high levels in apices from bearing trees. These results define the flowering-related transcriptional landscape in apple, and strongly support previous studies implicating both gibberellins and TFL1 as key components in repression of flowering by fruit.
Affiliation(s)
- Chris Gottschalk
- Department of Horticulture, Plant and Soil Science Building, Michigan State University, East Lansing, Michigan, United States of America
| | - Songwen Zhang
- Department of Horticulture, Plant and Soil Science Building, Michigan State University, East Lansing, Michigan, United States of America
| | - Phil Schwallier
- Michigan State University Extension, East Lansing, Michigan, United States of America
| | - Sean Rogers
- Department of Horticulture, Plant and Soil Science Building, Michigan State University, East Lansing, Michigan, United States of America
| | - Martin J. Bukovac
- Department of Horticulture, Plant and Soil Science Building, Michigan State University, East Lansing, Michigan, United States of America
| | - Steve van Nocker
- Department of Horticulture, Plant and Soil Science Building, Michigan State University, East Lansing, Michigan, United States of America
|
267
|
Chialva C, Blein T, Crespi M, Lijavetzky D. Insights into long non-coding RNA regulation of anthocyanin carrot root pigmentation. Sci Rep 2021; 11:4093. [PMID: 33603038 PMCID: PMC7892999 DOI: 10.1038/s41598-021-83514-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/14/2020] [Accepted: 02/04/2021] [Indexed: 01/31/2023] Open
Abstract
Carrot (Daucus carota L.) is one of the most cultivated vegetables in the world and is of great importance in the human diet. Its storage organs can accumulate large quantities of anthocyanins, metabolites that confer the purple pigmentation to carrot tissues and whose biosynthesis is well characterized. Long non-coding RNAs (lncRNAs) play critical roles in regulating gene expression in various biological processes in plants. In this study, we used high-throughput stranded RNA-seq to identify and analyze the expression profiles of lncRNAs in phloem and xylem root samples from two genotypes with a strong difference in anthocyanin production. We discovered and annotated 8484 new genes, including 2095 new protein-coding and 6373 non-coding transcripts. Moreover, we identified 639 differentially expressed lncRNAs between the phenotypically contrasting genotypes, including some detected only in a particular tissue. We then established correlations between lncRNAs and anthocyanin biosynthesis genes in order to identify a molecular framework for the differential expression of the pathway between genotypes. A specific natural antisense transcript linked to DcMYB7, a key transcription factor in anthocyanin biosynthesis, suggested how the regulation of this pathway may have evolved between genotypes.
Affiliation(s)
- Constanza Chialva
- grid.507426.2, Facultad de Ciencias Agrarias, Instituto de Biología Agrícola de Mendoza (IBAM), UNCuyo, CONICET, Almirante Brown 500, M5528AHB Chacras de Coria, Mendoza Argentina
| | - Thomas Blein
- grid.4444.0, 0000 0001 2112 9282, Institute of Plant Sciences Paris-Saclay (IPS2), CNRS, INRA, University Paris-Saclay and University of Paris, Batiment 630, Gif Sur Yvette, France
| | - Martin Crespi
- grid.4444.0, 0000 0001 2112 9282, Institute of Plant Sciences Paris-Saclay (IPS2), CNRS, INRA, University Paris-Saclay and University of Paris, Batiment 630, Gif Sur Yvette, France
| | - Diego Lijavetzky
- grid.507426.2, Facultad de Ciencias Agrarias, Instituto de Biología Agrícola de Mendoza (IBAM), UNCuyo, CONICET, Almirante Brown 500, M5528AHB Chacras de Coria, Mendoza Argentina
|
268
|
Wang C, Wallerman O, Arendt ML, Sundström E, Karlsson Å, Nordin J, Mäkeläinen S, Pielberg GR, Hanson J, Ohlsson Å, Saellström S, Rönnberg H, Ljungvall I, Häggström J, Bergström TF, Hedhammar Å, Meadows JRS, Lindblad-Toh K. A novel canine reference genome resolves genomic architecture and uncovers transcript complexity. Commun Biol 2021; 4:185. [PMID: 33568770 PMCID: PMC7875987 DOI: 10.1038/s42003-021-01698-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 08/12/2020] [Accepted: 12/17/2020] [Indexed: 12/13/2022] Open
Abstract
We present GSD_1.0, a high-quality domestic dog reference genome with chromosome length scaffolds and contiguity increased 55-fold over CanFam3.1. Annotation with generated and existing long and short read RNA-seq, miRNA-seq and ATAC-seq, revealed that 32.1% of lifted over CanFam3.1 gaps harboured previously hidden functional elements, including promoters, genes and miRNAs in GSD_1.0. A catalogue of canine "dark" regions was made to facilitate mapping rescue. Alignment in these regions is difficult, but we demonstrate that they harbour trait-associated variation. Key genomic regions were completed, including the Dog Leucocyte Antigen (DLA), T Cell Receptor (TCR) and 366 COSMIC cancer genes. 10x linked-read sequencing of 27 dogs (19 breeds) uncovered 22.1 million SNPs, indels and larger structural variants. Subsequent intersection with protein coding genes showed that 1.4% of these could directly influence gene products, and so provide a source of normal or aberrant phenotypic modifications.
Affiliation(s)
- Chao Wang
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden.
| | - Ola Wallerman
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Maja-Louise Arendt
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Department of Veterinary Clinical Sciences, University of Copenhagen, Frederiksberg D, Denmark
| | - Elisabeth Sundström
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Åsa Karlsson
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Jessika Nordin
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Suvi Mäkeläinen
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Gerli Rosengren Pielberg
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Jeanette Hanson
- Department of Clinical Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Åsa Ohlsson
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Sara Saellström
- Department of Clinical Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Henrik Rönnberg
- Department of Clinical Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Ingrid Ljungvall
- Department of Clinical Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Jens Häggström
- Department of Clinical Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Tomas F Bergström
- Department of Animal Breeding and Genetics, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Åke Hedhammar
- Department of Clinical Sciences, Swedish University of Agricultural Sciences, Uppsala, Sweden
| | - Jennifer R S Meadows
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
| | - Kerstin Lindblad-Toh
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
|
269
|
Dohlman AB, Arguijo Mendoza D, Ding S, Gao M, Dressman H, Iliev ID, Lipkin SM, Shen X. The cancer microbiome atlas: a pan-cancer comparative analysis to distinguish tissue-resident microbiota from contaminants. Cell Host Microbe 2021; 29:281-298.e5. [PMID: 33382980 PMCID: PMC7878430 DOI: 10.1016/j.chom.2020.12.001] [Citation(s) in RCA: 96] [Impact Index Per Article: 32.0] [Received: 06/12/2020] [Revised: 09/29/2020] [Accepted: 12/01/2020] [Indexed: 12/30/2022]
Abstract
Studying the microbial composition of internal organs and their associations with disease remains challenging due to the difficulty of acquiring clinical biopsies. We designed a statistical model to analyze the prevalence of species across sample types from The Cancer Genome Atlas (TCGA), revealing that species equiprevalent across sample types are predominantly contaminants, bearing unique signatures from each TCGA-designated sequencing center. Removing such species mitigated batch effects and isolated the tissue-resident microbiome, which was validated by original matched TCGA samples. Gene copies and nucleotide variants can further distinguish mixed-evidence species. We, thus, present The Cancer Microbiome Atlas (TCMA), a collection of curated, decontaminated microbial compositions of oropharyngeal, esophageal, gastrointestinal, and colorectal tissues. This led to the discovery of prognostic species and blood signatures of mucosal barrier injuries and enabled systematic matched microbe-host multi-omic analyses, which will help guide future studies of the microbiome's role in human health and disease.
Affiliation(s)
- Anders B Dohlman
- Department of Biomedical Engineering, Center for Genomics and Computational Biology, Duke Microbiome Center, Duke University, Durham, NC 27708, USA.
| | - Diana Arguijo Mendoza
- Department of Biomedical Engineering, Center for Genomics and Computational Biology, Duke Microbiome Center, Duke University, Durham, NC 27708, USA
| | - Shengli Ding
- Department of Biomedical Engineering, Center for Genomics and Computational Biology, Duke Microbiome Center, Duke University, Durham, NC 27708, USA
| | - Michael Gao
- Duke Institute for Health Innovation, Duke University, Durham, NC 27701, USA
| | - Holly Dressman
- Department of Molecular Genetics and Microbiology, Director of Duke Microbiome Center, Duke University, Durham, NC 27708, USA
| | - Iliyan D Iliev
- Department of Medicine, Weill Cornell Medical College, Cornell University, New York City, NY 10065, USA
| | - Steven M Lipkin
- Department of Medicine, Weill Cornell Medical College, Cornell University, New York City, NY 10065, USA
| | - Xiling Shen
- Department of Biomedical Engineering, Center for Genomics and Computational Biology, Duke Microbiome Center, Duke University, Durham, NC 27708, USA.
|
270
|
Bryzghalov O, Makałowska I, Szcześniak MW. lncEvo: automated identification and conservation study of long noncoding RNAs. BMC Bioinformatics 2021; 22:59. [PMID: 33563213 PMCID: PMC7871587 DOI: 10.1186/s12859-021-03991-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/09/2020] [Accepted: 02/01/2021] [Indexed: 12/04/2022] Open
Abstract
BACKGROUND Long noncoding RNAs represent a large class of transcripts with two common features: they exceed an arbitrary length threshold of 200 nt and are assumed to not encode proteins. Although a growing body of evidence indicates that the vast majority of lncRNAs are potentially nonfunctional, hundreds of them have already been revealed to perform essential gene regulatory functions or to be linked to a number of cellular processes, including those associated with the etiology of human diseases. To better understand the biology of lncRNAs, it is essential to perform a more in-depth study of their evolution. In contrast to protein-encoding transcripts, however, they do not show the strong sequence conservation that usually results from purifying selection; therefore, software that is typically used to resolve the evolutionary relationships of protein-encoding genes and transcripts is not applicable to the study of lncRNAs. RESULTS To tackle this issue, we developed lncEvo, a computational pipeline that consists of three modules: (1) transcriptome assembly from RNA-Seq data, (2) prediction of lncRNAs, and (3) a conservation study, that is, a genome-wide comparison of lncRNA transcriptomes between two species of interest, including a search for orthologs. Importantly, one can choose to apply lncEvo solely for transcriptome assembly or lncRNA prediction, without calling the conservation-related part. CONCLUSIONS lncEvo is an all-in-one tool built with the Nextflow framework, utilizing state-of-the-art software and algorithms with customizable trade-offs between speed and sensitivity, ease of use and built-in reporting functionalities. The source code of the pipeline is freely available for academic and nonacademic use under the MIT license at https://gitlab.com/spirit678/lncrna_conservation_nf .
Affiliation(s)
- Oleksii Bryzghalov
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University in Poznan, Uniwersytetu Poznanskiego 6, 61-614, Poznan, Poland
| | - Izabela Makałowska
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University in Poznan, Uniwersytetu Poznanskiego 6, 61-614, Poznan, Poland
| | - Michał Wojciech Szcześniak
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University in Poznan, Uniwersytetu Poznanskiego 6, 61-614, Poznan, Poland.
|
271
|
Gunnarsson S, Prabakaran S. In silico identification of novel open reading frames in Plasmodium falciparum oocyte and salivary gland sporozoites using proteogenomics framework. Malar J 2021; 20:71. [PMID: 33546698 PMCID: PMC7866754 DOI: 10.1186/s12936-021-03598-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/11/2020] [Accepted: 01/16/2021] [Indexed: 11/25/2022] Open
Abstract
Background Plasmodium falciparum causes the deadliest form of malaria, which remains one of the most prevalent infectious diseases. Unfortunately, the only licensed vaccine showed limited protection and resistance to anti-malarial drug is increasing, which can be largely attributed to the biological complexity of the parasite’s life cycle. The progression from one developmental stage to another in P. falciparum involves drastic changes in gene expressions, where its infectivity to human hosts varies greatly depending on the stage. Approaches to identify candidate genes that are responsible for the development of infectivity to human hosts typically involve differential gene expression analysis between stages. However, the detection may be limited to annotated proteins and open reading frames (ORFs) predicted using restrictive criteria. Methods The above problem is particularly relevant for P. falciparum; whose genome annotation is relatively incomplete given its clinical significance. In this work, systems proteogenomics approach was used to address this challenge, as it allows computational detection of unannotated, novel Open Reading Frames (nORFs), which are neglected by conventional analyses. Two pairs of transcriptome/proteome were obtained from a previous study where one was collected in the mosquito-infectious oocyst sporozoite stage, and the other in the salivary gland sporozoite stage with human infectivity. They were then re-analysed using the proteogenomics framework to identify nORFs in each stage. Results Translational products of nORFs that map to antisense, intergenic, intronic, 3′ UTR and 5′ UTR regions, as well as alternative reading frames of canonical proteins were detected. Some of these nORFs also showed differential expression between the two life cycle stages studied. 
Their regulatory roles were explored through further bioinformatics analyses including the expression regulation on the parent reference genes, in silico structure prediction, and gene ontology term enrichment analysis. Conclusion The identification of nORFs in P. falciparum sporozoites highlights the biological complexity of the parasite. Although the analyses are solely computational, these results provide a starting point for further experimental validation of the existence and functional roles of these nORFs,
Affiliation(s)
- Sophie Gunnarsson
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Sudhakaran Prabakaran
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
|
272
|
Cui Y, Wu B, Peng A, Song X, Chen X. The Genome of Banana Leaf Blight Pathogen Fusarium sacchari str. FS66 Harbors Widespread Gene Transfer From Fusarium oxysporum. Frontiers in Plant Science 2021; 12:629859. [PMID: 33613610 PMCID: PMC7889605 DOI: 10.3389/fpls.2021.629859] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/16/2020] [Accepted: 01/12/2021] [Indexed: 06/12/2023]
Abstract
Fusarium species have been identified as pathogens causing many different plant diseases, and here we report an emerging banana leaf blight (BLB) caused by F. sacchari (Fs) discovered in Guangdong, China. From the symptomatic tissues collected in the field, a fungal isolate was obtained, which induced similar symptoms on healthy banana seedlings after inoculation. Koch's postulates were fulfilled after the re-isolation of the pathogen. Phylogenetic analysis of two gene segments and the whole genome sequence identified the pathogen as belonging to Fs, and it was named Fs str. FS66. A 45.74 Mb genome of FS66 was acquired through de novo assembly using long-read sequencing data, and its contig N50 (1.97 Mb) is more than 10-fold larger than that of the previously available genome for the species. Based on transcriptome sequencing and ab initio gene annotation, a total of 14,486 protein-encoding genes and 418 non-coding RNAs were predicted. A total of 48 metabolite biosynthetic gene clusters, including the fusaric acid biosynthesis gene cluster, were predicted in silico in the FS66 genome. Comparison between FS66 and 11 other Fusarium genomes identified tens to hundreds of genes specifically gained and lost in FS66, including some previously correlated with Fusarium pathogenicity. The FS66 genome also harbors widespread gene transfer on the core chromosomes, putatively from the F. oxysporum species complex (FOSC), including 30 genes involved in Fusarium pathogenicity/virulence. This study not only reports the BLB caused by Fs, but also provides important information and clues for further understanding of the genome evolution among pathogenic Fusarium species.
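For readers unfamiliar with the contig N50 statistic quoted above, the following minimal sketch shows the usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function name and the toy lengths are assumptions for illustration and are not taken from the FS66 assembly.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    total assembly size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")


# Toy example: total is 54, and 10 + 9 + 8 = 27 reaches half of it.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A higher N50 for the same total size indicates that the assembly is split into fewer, longer pieces, which is the sense in which the abstract compares the two genomes.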
Affiliation(s)
- Yiping Cui
  - Guangdong Provincial Key Laboratory of High Technology for Plant Protection, Plant Protection Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Bo Wu
  - School of Computing, Clemson University, Clemson, SC, United States
- Aitian Peng
  - Guangdong Provincial Key Laboratory of High Technology for Plant Protection, Plant Protection Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Xiaobing Song
  - Guangdong Provincial Key Laboratory of High Technology for Plant Protection, Plant Protection Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
- Xia Chen
  - Guangdong Provincial Key Laboratory of High Technology for Plant Protection, Plant Protection Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
|
273
|
Bennett M, Ulitsky I, Alloza I, Vandenbroeck K, Miscianinov V, Mahmoud AD, Ballantyne M, Rodor J, Baker AH. Novel Transcript Discovery Expands the Repertoire of Pathologically-Associated, Long Non-Coding RNAs in Vascular Smooth Muscle Cells. Int J Mol Sci 2021; 22:1484. [PMID: 33540814 PMCID: PMC7867340 DOI: 10.3390/ijms22031484] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/13/2021] [Revised: 01/28/2021] [Accepted: 01/29/2021] [Indexed: 01/23/2023] Open
Abstract
Vascular smooth muscle cells (VSMCs) provide vital contractile force within blood vessel walls, yet can also propagate cardiovascular pathologies through proliferative and pro-inflammatory activities. Such phenotypes are driven, in part, by the diverse effects of long non-coding RNAs (lncRNAs) on gene expression. However, lncRNA characterisation in VSMCs in pathological states is hampered by incomplete lncRNA representation in reference annotation. We aimed to improve lncRNA representation in such contexts by assembling non-reference transcripts in RNA sequencing datasets describing VSMCs stimulated in vitro with cytokines, growth factors, or mechanical stress, as well as those isolated from atherosclerotic plaques. All transcripts were then subjected to a rigorous lncRNA prediction pipeline. We substantially improved coverage of lncRNAs responding to pro-mitogenic stimuli, with non-reference lncRNAs contributing 21-32% for each dataset. We also demonstrate non-reference lncRNAs were biased towards enriched expression within VSMCs, and transcription from enhancer sites, suggesting particular relevance to VSMC processes, and the regulation of neighbouring protein-coding genes. Both VSMC-enriched and enhancer-transcribed lncRNAs were large components of lncRNAs responding to pathological stimuli, yet without novel transcript discovery 33-46% of these lncRNAs would remain hidden. Our comprehensive VSMC lncRNA repertoire allows proper prioritisation of candidates for characterisation and exemplifies a strategy to broaden our knowledge of lncRNA across a range of disease states.
MESH Headings
- Aorta/cytology
- Coronary Vessels/cytology
- Cytokines/pharmacology
- Datasets as Topic
- Enhancer Elements, Genetic
- Gene Expression Profiling
- Humans
- Intercellular Signaling Peptides and Proteins/pharmacology
- Muscle, Smooth, Vascular/cytology
- Muscle, Smooth, Vascular/metabolism
- Myocytes, Smooth Muscle/drug effects
- Myocytes, Smooth Muscle/metabolism
- Plaque, Atherosclerotic/metabolism
- RNA, Long Noncoding/analysis
- RNA, Long Noncoding/isolation & purification
- RNA-Seq
- Stress, Mechanical
- Transcription, Genetic/drug effects
- Transcriptome
Affiliation(s)
- Matthew Bennett
  - Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh EH16 4TJ, UK
- Igor Ulitsky
  - Department of Biological Regulation, Weizmann Institute of Science, Rehovot 76100, Israel
- Iraide Alloza
  - Inflammation & Biomarkers Group, Biocruces Bizkaia Health Research Institute, Cruces Plaza, 48903 Barakaldo, Spain
- Koen Vandenbroeck
  - Inflammation & Biomarkers Group, Biocruces Bizkaia Health Research Institute, Cruces Plaza, 48903 Barakaldo, Spain
  - Ikerbasque, Basque Foundation for Science, 3 María Díaz Haroko Kalea, 48013 Bilbao, Spain
- Vladislav Miscianinov
  - Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh EH16 4TJ, UK
- Amira Dia Mahmoud
  - Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh EH16 4TJ, UK
- Margaret Ballantyne
  - Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh EH16 4TJ, UK
- Julie Rodor
  - Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh EH16 4TJ, UK
- Andrew H. Baker
  - Centre for Cardiovascular Science, Queen’s Medical Research Institute, University of Edinburgh, 47 Little France Crescent, Edinburgh EH16 4TJ, UK
|
274
|
Li Z, Liu L, Jiang S, Li Q, Feng C, Du Q, Zou D, Xiao J, Zhang Z, Ma L. LncExpDB: an expression database of human long non-coding RNAs. Nucleic Acids Res 2021; 49:D962-D968. [PMID: 33045751 PMCID: PMC7778919 DOI: 10.1093/nar/gkaa850] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 08/15/2020] [Revised: 09/12/2020] [Accepted: 09/22/2020] [Indexed: 12/14/2022] Open
Abstract
Expression profiles of long non-coding RNAs (lncRNAs) across diverse biological conditions provide significant insights into their biological functions, interacting targets as well as transcriptional reliability. However, a comprehensive resource that systematically characterizes the expression landscape of human lncRNAs by integrating their expression profiles across a wide range of biological conditions is lacking. Here, we present LncExpDB (https://bigd.big.ac.cn/lncexpdb), an expression database of human lncRNAs that is devoted to providing comprehensive expression profiles of lncRNA genes, exploring their expression features and capacities, identifying featured genes with potentially important functions, and building interactions with protein-coding genes across various biological contexts/conditions. Based on comprehensive integration and stringent curation, LncExpDB currently houses expression profiles of 101 293 high-quality human lncRNA genes derived from 1977 samples of 337 biological conditions across nine biological contexts. Consequently, LncExpDB estimates lncRNA genes' expression reliability and capacities, identifies 25 191 featured genes, and further obtains 28 443 865 lncRNA-mRNA interactions. Moreover, user-friendly web interfaces enable interactive visualization of expression profiles across various conditions and easy exploration of featured lncRNAs and their interacting partners in specific contexts. Collectively, LncExpDB features comprehensive integration and curation of lncRNA expression profiles and thus will serve as a fundamental resource for functional studies on human lncRNAs.
Affiliation(s)
- Zhao Li
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Lin Liu
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Shuai Jiang
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Qianpeng Li
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Changrui Feng
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Qiang Du
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Dong Zou
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Jingfa Xiao
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Zhang Zhang
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Lina Ma
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
|
275
|
Varabyou A, Salzberg SL, Pertea M. Effects of transcriptional noise on estimates of gene and transcript expression in RNA sequencing experiments. Genome Res 2021; 31:301-308. [PMID: 33361112 PMCID: PMC7849408 DOI: 10.1101/gr.266213.120] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/21/2020] [Accepted: 12/18/2020] [Indexed: 12/25/2022]
Abstract
RNA sequencing is widely used to measure gene expression across a vast range of animal and plant tissues and conditions. Most studies of computational methods for gene expression analysis use simulated data to evaluate the accuracy of these methods. These simulations typically include reads generated from known genes at varying levels of expression. Until now, simulations did not include reads from noisy transcripts, which might include erroneous transcription, erroneous splicing, and other processes that affect transcription in living cells. Here we examine the effects of realistic amounts of transcriptional noise on the ability of leading computational methods to assemble and quantify the genes and transcripts in an RNA sequencing experiment. We show that the inclusion of noise leads to systematic errors in the ability of these programs to measure expression, including systematic underestimates of transcript abundance levels and large increases in the number of false-positive genes and transcripts. Our results also suggest that alignment-free computational methods sometimes fail to detect transcripts expressed at relatively low levels.
Affiliation(s)
- Ales Varabyou
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland 21211, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Steven L Salzberg
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland 21211, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
  - Department of Biostatistics, Johns Hopkins University, Baltimore, Maryland 21205, USA
- Mihaela Pertea
  - Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland 21211, USA
  - Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
276
|
Erady C, Boxall A, Puntambekar S, Suhas Jagannathan N, Chauhan R, Chong D, Meena N, Kulkarni A, Kasabe B, Prathivadi Bhayankaram K, Umrania Y, Andreani A, Nel J, Wayland MT, Pina C, Lilley KS, Prabakaran S. Pan-cancer analysis of transcripts encoding novel open-reading frames (nORFs) and their potential biological functions. NPJ Genom Med 2021; 6:4. [PMID: 33495453 PMCID: PMC7835362 DOI: 10.1038/s41525-020-00167-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/06/2020] [Accepted: 11/18/2020] [Indexed: 12/13/2022] Open
Abstract
Uncharacterized and unannotated open-reading frames, which we refer to as novel open reading frames (nORFs), may sometimes encode peptides that remain unexplored for novel therapeutic opportunities. To our knowledge, no systematic identification and characterization of transcripts encoding nORFs or their translation products in cancer, or in any other physiological process has been performed. We use our curated nORFs database (nORFs.org), together with RNA-Seq data from The Cancer Genome Atlas (TCGA) and Genotype-Expression (GTEx) consortiums, to identify transcripts containing nORFs that are expressed frequently in cancer or matched normal tissue across 22 cancer types. We show nORFs are subject to extensive dysregulation at the transcript level in cancer tissue and that a small subset of nORFs are associated with overall patient survival, suggesting that nORFs may have prognostic value. We also show that nORF products can form protein-like structures with post-translational modifications. Finally, we perform in silico screening for inhibitors against nORF-encoded proteins that are disrupted in stomach and esophageal cancer, showing that they can potentially be targeted by inhibitors. We hope this work will guide and motivate future studies that perform in-depth characterization of nORF functions in cancer and other diseases.
Affiliation(s)
- Chaitanya Erady
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Adam Boxall
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Shraddha Puntambekar
  - Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- N Suhas Jagannathan
  - Cancer and Stem Cell Biology Programme, and Centre for Computational Biology, Duke-NUS Medical School, Singapore, 169857, Singapore
- Ruchi Chauhan
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- David Chong
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Narendra Meena
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Apurv Kulkarni
  - Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- Bhagyashri Kasabe
  - Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- Yagnesh Umrania
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Adam Andreani
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Jean Nel
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Matthew T Wayland
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge, CB2 3EJ, UK
- Cristina Pina
  - Department of Haematology, Cambridge Biomedical Campus, Cambridge, CB2 0PT, UK
- Kathryn S Lilley
  - Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Sudhakaran Prabakaran
  - Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
|
277
|
Know your enemy - transcriptome of myxozoan Tetracapsuloides bryosalmonae reveals potential drug targets against proliferative kidney disease in salmonids. Parasitology 2021; 148:726-739. [PMID: 33478602 PMCID: PMC8056827 DOI: 10.1017/s003118202100010x] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/06/2022]
Abstract
The myxozoan Tetracapsuloides bryosalmonae is a widely spread endoparasite that causes proliferative kidney disease (PKD) in salmonid fish. We developed an in silico pipeline to separate transcripts of T. bryosalmonae from the kidney tissue of its natural vertebrate host, brown trout (Salmo trutta). After stringent filtering, we constructed a partial transcriptome assembly of T. bryosalmonae, comprising 3427 transcripts. Based on homology-restricted searches of the assembled parasite transcriptome and Atlantic salmon (Salmo salar) proteome, we identified four protein targets (Endoglycoceramidase, Legumain-like protease, Carbonic anhydrase 2, Pancreatic lipase-related protein 2) for the development of anti-parasitic drugs against T. bryosalmonae. Earlier work on these proteins in parasitic protists and helminths suggests that the identified anti-parasitic drug targets represent promising chemotherapeutic candidates also against T. bryosalmonae, and strengthens the view that the known inhibitors can be effective in evolutionarily distant organisms. In addition, we identified differentially expressed T. bryosalmonae genes between moderately and severely infected fish, indicating an increased abundance of T. bryosalmonae sporogonic stages in fish with low parasite load. In conclusion, this study paves the way for future genomic research in T. bryosalmonae and represents an important step towards the development of effective drugs against PKD.
|
278
|
Karpe SD, Tiwari V, Ramanathan S. InsectOR-Webserver for sensitive identification of insect olfactory receptor genes from non-model genomes. PLoS One 2021; 16:e0245324. [PMID: 33465132 PMCID: PMC7815150 DOI: 10.1371/journal.pone.0245324] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 05/03/2020] [Accepted: 12/29/2020] [Indexed: 11/18/2022] Open
Abstract
Insect Olfactory Receptors (ORs) are a diverse family of membrane protein receptors responsible for most insect olfactory perception and communication, and hence they are of utmost importance for developing repellents or pesticides. Accurate gene prediction of insect ORs from newly sequenced genomes is an important but challenging task. We have developed a dedicated webserver, 'insectOR', to predict and validate insect OR genes using multiple gene prediction algorithms, accompanied by relevant validations. It is possible to employ this server nearly automatically and perform rapid prediction of the OR gene loci from thousands of OR-protein-to-genome alignments, resolve gene boundaries for tandem OR genes and refine them further to provide more complete OR gene models. InsectOR outperformed the popular genome annotation pipelines (MAKER and NCBI eukaryotic genome annotation) in terms of overall sensitivity at base, exon and locus level, when tested on two distantly related insect genomes. It displayed more than 95% nucleotide-level precision in both tests. Finally, given the same input data and parameters, InsectOR missed fewer than 2% of gene loci, in contrast to 55% of loci missed by MAKER for Drosophila melanogaster. The webserver is freely available on the web at http://caps.ncbs.res.in/insectOR/ and the basic package can be downloaded from https://github.com/sdk15/insectOR for local use. This tool will allow biologists to perform quick preliminary identification of insect olfactory receptor genes from newly sequenced genomes and also assist in their further detailed annotation. Its usage can also be extended to other divergent gene families.
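The base-level sensitivity and precision figures mentioned above can be understood with a small sketch. The code below is not taken from InsectOR; it assumes gene models are given as half-open (start, end) intervals on a single sequence, and the function names are hypothetical.

```python
def covered_bases(intervals):
    """Collect every base position covered by half-open (start, end) intervals."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end))
    return bases


def base_level_metrics(reference, predicted):
    """Return (sensitivity, precision) at nucleotide level.

    sensitivity = shared bases / reference bases
    precision   = shared bases / predicted bases
    """
    ref = covered_bases(reference)
    pred = covered_bases(predicted)
    shared = len(ref & pred)
    sensitivity = shared / len(ref) if ref else 0.0
    precision = shared / len(pred) if pred else 0.0
    return sensitivity, precision
```

A prediction that covers only half of a reference gene but nothing outside it would score 0.5 sensitivity and 1.0 precision, which is why the two values are reported separately.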
Affiliation(s)
- Snehal Dilip Karpe
  - National Centre for Biological Sciences (NCBS), TIFR, Bengaluru, Karnataka, India
- Vikas Tiwari
  - National Centre for Biological Sciences (NCBS), TIFR, Bengaluru, Karnataka, India
- Sowdhamini Ramanathan
  - National Centre for Biological Sciences (NCBS), TIFR, Bengaluru, Karnataka, India
|
279
|
Gershman A, Romer TG, Fan Y, Razaghi R, Smith WA, Timp W. De novo genome assembly of the tobacco hornworm moth (Manduca sexta). G3 (Bethesda, Md.) 2021; 11:jkaa047. [PMID: 33561252 PMCID: PMC8022704 DOI: 10.1093/g3journal/jkaa047] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/15/2020] [Accepted: 11/20/2020] [Indexed: 01/24/2023]
Abstract
The tobacco hornworm, Manduca sexta, is a lepidopteran insect that is used extensively as a model system for studying insect biology, development, neuroscience, and immunity. However, current studies rely on the highly fragmented reference genome Msex_1.0, which was created using now-outdated technologies and is hindered by a variety of deficiencies and inaccuracies. We present a new reference genome for M. sexta, JHU_Msex_v1.0, applying a combination of modern technologies in a de novo assembly to increase continuity, accuracy, and completeness. The assembly is 470 Mb and is ∼20× more continuous than the original assembly, with scaffold N50 > 14 Mb. We annotated the assembly by lifting over existing annotations and supplementing with additional supporting RNA-based data for a total of 25,256 genes. The new reference assembly is accessible in annotated form for public use. We demonstrate that improved continuity of the M. sexta genome improves resequencing studies and benefits future research on M. sexta as a model organism.
Affiliation(s)
- Ariel Gershman
  - Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, 21287, USA
- Tatiana G Romer
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Yunfan Fan
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Roham Razaghi
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Wendy A Smith
  - Department of Biology, Northeastern University, Boston, MA, 02215, USA
- Winston Timp
  - Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD, 21287, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
|
280
|
Chen S, Guo X, He X, Di R, Zhang X, Zhang J, Wang X, Chu M. Insight Into Pituitary lncRNA and mRNA at Two Estrous Stages in Small Tail Han Sheep With Different FecB Genotypes. Front Endocrinol (Lausanne) 2021; 12:789564. [PMID: 35178025 PMCID: PMC8844552 DOI: 10.3389/fendo.2021.789564] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/05/2021] [Accepted: 12/27/2021] [Indexed: 12/21/2022] Open
Abstract
The pituitary is a remarkably dynamic organ with roles in hormone (FSH and LH) synthesis and secretion. In animals with the FecB (fecundity Booroola) mutation, the pituitary experiences hormone fluctuations during the follicular-luteal transition, which is implicated in the expression and regulation of many genes and regulators. Long non-coding RNAs (lncRNAs) are a novel type of regulatory factor in the reproductive process. Nevertheless, the expression patterns of lncRNAs and their roles in FecB-mediated follicular development and ovulation remain obscure. Thus, we profiled the pituitary transcriptome during the follicular (F, 45 h after removal of vaginal sponges) and luteal (L, 216 h after removal of vaginal sponges) phases in FecB-mutant homozygous (BB) and wild-type (WW) Small Tail Han sheep. We identified 78 differentially expressed genes (DEGs) and 41 differentially expressed lncRNAs (DELs) between BB_F and BB_L, 32 DEGs and 26 DELs between BB_F and WW_F, 16 DEGs and 29 DELs between BB_L and WW_L, and 50 DEGs and 18 DELs between WW_F and WW_L. The results of real-time quantitative PCR (RT-qPCR) correlated well with the transcriptome data. In both the follicular and luteal phases, DEGs (GRID2, glutamate ionotropic receptor delta type subunit 2; ST14, ST14 transmembrane serine protease matriptase) were enriched in hormone synthesis, secretion, and action. MSTRG.47470 and MSTRG.101530 were the trans-regulated elements of ID1 (inhibitor of DNA binding 1, HLH protein) and the DEG ID3 (inhibitor of DNA binding 3, HLH protein), and EEF2 (eukaryotic translation elongation factor 2), respectively; these factors might be involved in melatonin and peptide hormone secretion. In the FecB-mediated follicular phase, MSTRG.125392 targeted seizure-related 6 homolog like (SEZ6L), and MSTRG.125394 and MSTRG.83276 targeted the DEG KCNQ3 (potassium voltage-gated channel subfamily Q member 3) in cis, while MSTRG.55861 targeted FKBP4 (FKBP prolyl isomerase 4) in trans. In the FecB-mediated luteal phase, LOC105613905, MSTRG.81536, and MSTRG.150434 modulated TGFB1, SMAD3, and OXT, respectively, in trans. We postulated that the FecB mutation in pituitary tissue elevated the expression of certain genes associated with pituitary development and hormone secretion. Furthermore, this study provides new insights into how the pituitary regulates follicular development and ovulation, illustrated by the effect of the FecB mutation.
Affiliation(s)
- Si Chen
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Xiaofei Guo
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Institute of Animal Husbandry and Veterinary Medicine, Tianjin Academy of Agricultural Sciences, Tianjin, China
| | - Xiaoyun He
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Ran Di
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| | - Xiaosheng Zhang
- Institute of Animal Husbandry and Veterinary Medicine, Tianjin Academy of Agricultural Sciences, Tianjin, China
| | - Jinlong Zhang
- Institute of Animal Husbandry and Veterinary Medicine, Tianjin Academy of Agricultural Sciences, Tianjin, China
| | - Xiangyu Wang
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- *Correspondence: Xiangyu Wang; Mingxing Chu
| | - Mingxing Chu
- Key Laboratory of Animal Genetics and Breeding and Reproduction of Ministry of Agriculture and Rural Affairs, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
- *Correspondence: Xiangyu Wang; Mingxing Chu
| |
|
281
|
Wang H, Liu S, Dai X, Yang Y, Luo Y, Gao Y, Liu X, Wei W, Wang H, Xu X, Reddy ASN, Jaiswal P, Li W, Liu B, Gu L. PSDX: A Comprehensive Multi-Omics Association Database of Populus trichocarpa With a Focus on the Secondary Growth in Response to Stresses. Frontiers in Plant Science 2021; 12:655565. [PMID: 34122478 PMCID: PMC8195342 DOI: 10.3389/fpls.2021.655565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2021] [Accepted: 04/26/2021] [Indexed: 05/16/2023]
Abstract
Populus trichocarpa (P. trichocarpa) is a model tree for the investigation of wood formation. In recent years, researchers have generated a large amount of high-throughput sequencing data in P. trichocarpa. However, no comprehensive database that provides multi-omics associations for the investigation of secondary growth in response to diverse stresses has been reported. Therefore, we developed a public repository that presents comprehensive measurements of gene expression and post-transcriptional regulation by integrating 144 RNA-Seq, 33 ChIP-seq, and six single-molecule real-time (SMRT) isoform sequencing (Iso-seq) libraries prepared from tissues subjected to different stresses. All the samples from different studies were analyzed to obtain gene expression, co-expression networks, and differentially expressed genes (DEGs) using unified parameters, which allowed comparison of results from different studies and treatments. In addition to gene expression, we also identified and deposited pre-processed data about alternative splicing (AS), alternative polyadenylation (APA) and alternative transcription initiation (ATI). The post-transcriptional regulation, differential expression, and co-expression network datasets were integrated into a new P. trichocarpa Stem Differentiating Xylem (PSDX) database (http://forestry.fafu.edu.cn/db/SDX), which further highlights gene families of RNA-binding proteins and stress-related genes. The PSDX also provides tools for data query and visualization, a genome browser, and a BLAST option for sequence-based query. Much of the data is also available for bulk download. The availability of PSDX contributes to research on secondary growth in response to stresses in P. trichocarpa, which will provide new insights that can be useful for the improvement of stress tolerance in woody plants.
Affiliation(s)
- Huiyuan Wang
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Sheng Liu
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Xiufang Dai
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, China
| | - Yongkang Yang
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Yunjun Luo
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Yubang Gao
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Xuqing Liu
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Wentao Wei
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Huihui Wang
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Xi Xu
- College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Anireddy S. N. Reddy
- Department of Biology and Program in Cell and Molecular Biology, Colorado State University, Fort Collins, CO, United States
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, United States
| | - Wei Li
- State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, Harbin, China
| | - Bo Liu
- College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, China
- *Correspondence: Bo Liu
| | - Lianfeng Gu
- Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, China
- *Correspondence: Lianfeng Gu
| |
|
282
|
Lim CS, Sozzi V, Littlejohn M, Yuen LK, Warner N, Betz-Stablein B, Luciani F, Revill PA, Brown CM. Quantitative analysis of the splice variants expressed by the major hepatitis B virus genotypes. Microb Genom 2021; 7:mgen000492. [PMID: 33439114 PMCID: PMC8115900 DOI: 10.1099/mgen.0.000492] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/20/2020] [Accepted: 11/23/2020] [Indexed: 12/13/2022] Open
Abstract
Hepatitis B virus (HBV) is a major human pathogen that causes liver diseases. The main HBV RNAs are unspliced transcripts that encode the key viral proteins. Recent studies have shown that some of the HBV spliced transcript isoforms are predictive of liver cancer, yet the roles of these spliced transcripts remain elusive. Furthermore, there are nine major HBV genotypes common in different regions of the world, and these genotypes may express different spliced transcript isoforms. To systematically study the HBV splice variants, we transfected human hepatoma cells, Huh7, with four HBV genotypes (A2, B2, C2 and D3), followed by deep RNA-sequencing. We found that 13–28% of HBV RNAs were splice variants, which were reproducibly detected across independent biological replicates. These comprised 6 novel and 10 previously identified splice variants. In particular, a novel, singly spliced transcript was detected in genotypes A2 and D3 at high levels. The biological relevance of these splice variants was supported by their identification in HBV-positive liver biopsy and serum samples, and in HBV-infected primary human hepatocytes. Interestingly, the levels of HBV splice variants varied across the genotypes, but the spliced pregenomic RNAs SP1 and SP9 were the two most abundant splice variants. Counterintuitively, these singly spliced SP1 and SP9 variants had a suboptimal 5' splice site, supporting the idea that splicing of HBV RNAs is tightly controlled by the viral post-transcriptional regulatory RNA element.
Affiliation(s)
- Chun Shen Lim
- Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
| | - Vitina Sozzi
- Victorian Infectious Diseases Reference Laboratory, Royal Melbourne Hospital at the Peter Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
| | - Margaret Littlejohn
- Victorian Infectious Diseases Reference Laboratory, Royal Melbourne Hospital at the Peter Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
| | - Lilly K.W. Yuen
- Victorian Infectious Diseases Reference Laboratory, Royal Melbourne Hospital at the Peter Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
| | - Nadia Warner
- Victorian Infectious Diseases Reference Laboratory, Royal Melbourne Hospital at the Peter Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
| | - Brigid Betz-Stablein
- Systems Medicine, School of Medical Sciences, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
- Present address: Dermatology Research Centre, Diamantina Institute, University of Queensland, Brisbane, Queensland, Australia
| | - Fabio Luciani
- Systems Medicine, School of Medical Sciences, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
| | - Peter A. Revill
- Victorian Infectious Diseases Reference Laboratory, Royal Melbourne Hospital at the Peter Doherty Institute for Infection and Immunity, Melbourne, Victoria, Australia
- Department of Microbiology and Immunology, University of Melbourne, Melbourne, Victoria, Australia
| | - Chris M. Brown
- Department of Biochemistry, School of Biomedical Sciences, University of Otago, Dunedin, New Zealand
| |
|
283
|
Kahlon PS, Seta SM, Zander G, Scheikl D, Hückelhoven R, Joosten MHAJ, Stam R. Population studies of the wild tomato species Solanum chilense reveal geographically structured major gene-mediated pathogen resistance. Proc Biol Sci 2020; 287:20202723. [PMID: 33352079 DOI: 10.1098/rspb.2020.2723] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022] Open
Abstract
Natural plant populations encounter strong pathogen pressure and defence-associated genes are known to be under selection dependent on the pressure by the pathogens. Here, we use populations of the wild tomato Solanum chilense to investigate natural resistance against Cladosporium fulvum, a well-known ascomycete pathogen of domesticated tomatoes. Host populations used are from distinct geographical origins and share a defined evolutionary history. We show that distinct populations of S. chilense differ in resistance against the pathogen. Screening for major resistance gene-mediated pathogen recognition throughout the whole species showed clear geographical differences between populations and complete loss of pathogen recognition in the south of the species range. In addition, we observed high complexity in a homologues of Cladosporium resistance (Hcr) locus, underlying the recognition of C. fulvum, in central and northern populations. Our findings show that major gene-mediated recognition specificity is diverse in a natural plant-pathosystem. We place major gene resistance in a geographical context that also defined the evolutionary history of that species. Data suggest that the underlying loci are more complex than previously anticipated, with small-scale gene recombination being possibly responsible for maintaining balanced polymorphisms in the populations that experience pathogen pressure.
Affiliation(s)
- Parvinderdeep S Kahlon
- Chair of Phytopathology, TUM School of Life Sciences, Technical University of Munich, Emil-Ramann-Str. 2, 85354 Freising, Germany
| | - Shallet Mindih Seta
- Chair of Phytopathology, TUM School of Life Sciences, Technical University of Munich, Emil-Ramann-Str. 2, 85354 Freising, Germany
| | - Gesche Zander
- Chair of Phytopathology, TUM School of Life Sciences, Technical University of Munich, Emil-Ramann-Str. 2, 85354 Freising, Germany
| | - Daniela Scheikl
- Section of Population Genetics, TUM School of Life Sciences, Technical University of Munich, Liesel-Beckmann Str. 2, 85354 Freising, Germany
| | - Ralph Hückelhoven
- Chair of Phytopathology, TUM School of Life Sciences, Technical University of Munich, Emil-Ramann-Str. 2, 85354 Freising, Germany
| | - Matthieu H A J Joosten
- Laboratory of Phytopathology, Wageningen University and Research, Droevendaalsesteeg 1, 6708 PB Wageningen, The Netherlands
| | - Remco Stam
- Chair of Phytopathology, TUM School of Life Sciences, Technical University of Munich, Emil-Ramann-Str. 2, 85354 Freising, Germany
| |
|
284
|
Kirov I, Dudnikov M, Merkulov P, Shingaliev A, Omarov M, Kolganova E, Sigaeva A, Karlov G, Soloviev A. Nanopore RNA Sequencing Revealed Long Non-Coding and LTR Retrotransposon-Related RNAs Expressed at Early Stages of Triticale Seed Development. Plants 2020; 9:plants9121794. [PMID: 33348863 PMCID: PMC7765848 DOI: 10.3390/plants9121794] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/24/2020] [Revised: 12/10/2020] [Accepted: 12/15/2020] [Indexed: 01/22/2023]
Abstract
The intergenic space of plant genomes encodes many functionally important yet unexplored RNAs. The genomic loci encoding these RNAs are often considered “junk” DNA, as they are frequently associated with repeat-rich regions of the genome. The latter makes the annotation of these loci and the assembly of the corresponding transcripts using short RNAseq reads particularly challenging. Here, using long-read Nanopore direct RNA sequencing, we aimed to identify these “junk” RNA molecules, including long non-coding RNAs (lncRNAs) and transposon-derived transcripts expressed during early stages (10 days post anthesis) of seed development of triticale (AABBRR, 2n = 6x = 42), an interspecific hybrid between wheat and rye. Altogether, we found 796 lncRNAs and 20 LTR retrotransposon-related transcripts (RTE-RNAs) expressed at this stage, with most of them being previously unannotated and located in the intergenic as well as intronic regions. Sequence analysis of the lncRNAs provides evidence for the frequent exonization of Class I (retrotransposons) and Class II (DNA transposons) transposon sequences and suggests a direct influence of “junk” DNA on the structure and origin of lncRNAs. We show that the expression patterns of lncRNAs and RTE-related transcripts have high stage specificity. In turn, almost half of the lncRNAs located in Genomes A and B have the highest expression levels at 10–30 days post anthesis in wheat. Detailed analysis of the protein-coding potential of the RTE-RNAs showed that 75% of them carry open reading frames (ORFs) for a diverse set of GAG proteins, the main component of virus-like particles of LTR retrotransposons. We further experimentally demonstrated that some RTE-RNAs originate from autonomous LTR retrotransposons with ongoing transposition activity during early stages of triticale seed development. Overall, our results provide a framework for further exploration of the newly discovered lncRNAs and RTE-RNAs in functional and genome-wide association studies in triticale and wheat. Our study also demonstrates that Nanopore direct RNA sequencing is an indispensable tool for the elucidation of lncRNA and retrotransposon transcripts.
Affiliation(s)
- Ilya Kirov
- Laboratory of Marker-Assisted and Genomic Selection of Plants, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya str. 42, 127550 Moscow, Russia
- Kurchatov Genomics Center of ARRIAB, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Street, 42, 127550 Moscow, Russia
- Correspondence:
| | - Maxim Dudnikov
- Laboratory of Marker-Assisted and Genomic Selection of Plants, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya str. 42, 127550 Moscow, Russia
- Kurchatov Genomics Center of ARRIAB, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Street, 42, 127550 Moscow, Russia
| | - Pavel Merkulov
- Laboratory of Marker-Assisted and Genomic Selection of Plants, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya str. 42, 127550 Moscow, Russia
| | - Andrey Shingaliev
- Laboratory of Marker-Assisted and Genomic Selection of Plants, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya str. 42, 127550 Moscow, Russia
| | - Murad Omarov
- Laboratory of Marker-Assisted and Genomic Selection of Plants, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya str. 42, 127550 Moscow, Russia
- Faculty of Computer Science, National Research University Higher School of Economics, Pokrovsky Boulvar, 11, 109028 Moscow, Russia
| | - Elizaveta Kolganova
- Laboratory of Marker-Assisted and Genomic Selection of Plants, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya str. 42, 127550 Moscow, Russia
| | - Alexandra Sigaeva
- Laboratory of Marker-Assisted and Genomic Selection of Plants, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya str. 42, 127550 Moscow, Russia
| | - Gennady Karlov
- Laboratory of Marker-Assisted and Genomic Selection of Plants, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya str. 42, 127550 Moscow, Russia
| | - Alexander Soloviev
- Laboratory of Marker-Assisted and Genomic Selection of Plants, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya str. 42, 127550 Moscow, Russia
| |
|
285
|
Shumate A, Salzberg SL. Liftoff: accurate mapping of gene annotations. Bioinformatics 2020; 37:1639-1643. [PMID: 33320174 PMCID: PMC8289374 DOI: 10.1093/bioinformatics/btaa1016] [Citation(s) in RCA: 242] [Impact Index Per Article: 60.5] [Received: 07/02/2020] [Revised: 10/22/2020] [Accepted: 11/24/2020] [Indexed: 01/08/2023] Open
Abstract
MOTIVATION: Improvements in DNA sequencing technology and computational methods have led to a substantial increase in the creation of high-quality genome assemblies of many species. To understand the biology of these genomes, annotation of gene features and other functional elements is essential; however, for most species, only the reference genome is well-annotated. RESULTS: One strategy to annotate new or improved genome assemblies is to map or 'lift over' the genes from a previously-annotated reference genome. Here we describe Liftoff, a new genome annotation lift-over tool capable of mapping genes between two assemblies of the same or closely-related species. Liftoff aligns genes from a reference genome to a target genome and finds the mapping that maximizes sequence identity while preserving the structure of each exon, transcript, and gene. We show that Liftoff can accurately map 99.9% of genes between two versions of the human reference genome with an average sequence identity >99.9%. We also show that Liftoff can map genes across species by successfully lifting over 98.3% of human protein-coding genes to a chimpanzee genome assembly with 98.2% sequence identity. AVAILABILITY AND IMPLEMENTATION: Liftoff can be installed via bioconda and PyPI. Additionally, the source code for Liftoff is available at https://github.com/agshumate/Liftoff. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Alaina Shumate
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD; Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD
| | - Steven L Salzberg
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD; Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD; Department of Computer Science, Johns Hopkins University, Baltimore, MD; Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD
| |
|
286
|
Yoon SJ, Son HY, Shim JK, Moon JH, Kim EH, Chang JH, Teo WY, Kim SH, Park SW, Huh YM, Kang SG. Co-expression of cancer driver genes: IDH-wildtype glioblastoma-derived tumorspheres. J Transl Med 2020; 18:482. [PMID: 33317554 PMCID: PMC7734785 DOI: 10.1186/s12967-020-02647-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/04/2020] [Accepted: 11/27/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Driver genes may be crucial for the onset of isocitrate dehydrogenase (IDH)-wildtype (WT) glioblastoma (GBM). However, it is still unknown whether these genes are expressed in the same cluster of cells. Here, we examined the gene expression patterns of GBM tissues and patient-derived tumorspheres (TSs) and aimed to find a progression-related gene. METHODS: We retrospectively collected primary IDH-WT GBM tissue samples (n = 58) and tumor-free cortical tissue samples (control, n = 20). TSs were isolated from the IDH-WT GBM tissue with B27 neurobasal medium. Associations among the driver genes were explored using bulk tissue, bulk cell, and single-cell RNA sequencing (scRNAseq) techniques, considering the alteration status of TP53, PTEN, EGFR, and the TERT promoter as well as MGMT promoter methylation. Transcriptomic perturbation by temozolomide (TMZ) was examined in two TSs. RESULTS: We comprehensively compared the gene expression of the known driver genes as well as MGMT, PTPRZ1, and IDH1. Bulk RNAseq databases of primary GBM tissue revealed a significant association between TERT and TP53 (p < 0.001, R = 0.28), and this association increased in the recurrent tumor (p < 0.001, R = 0.86). TSs reflected the tissue-level patterns of association between the two genes (p < 0.01, R = 0.59, n = 20). scRNAseq data from a TS revealed that the TERT- and TP53-expressing cells are in the same single-cell cluster. The driver-enriched cluster dominantly expressed the glioma-associated long noncoding RNAs. Most of the driver-associated genes were downregulated after TMZ except IGFBP5. CONCLUSIONS: GBM tissue-level expression patterns of EGFR, TERT, PTEN, IDH1, PTPRZ1, and MGMT are observed in the GBM TSs. The driver gene-associated cluster of the GBM single cells was enriched with the glioma-associated long noncoding RNAs.
Affiliation(s)
- Seon-Jin Yoon
- Department of Biochemistry and Molecular Biology, College of Medicine, Yonsei University, Seoul, Korea
- Brain Korea 21 PLUS Project for Medical Science, Yonsei University, Seoul, Korea
| | - Hye Young Son
- Severance Biomedical Science Institute, College of Medicine, Yonsei University, Seoul, Korea
| | - Jin-Kyoung Shim
- Department of Neurosurgery, Brain Tumor Center, Severance Hospital, College of Medicine, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Ju Hyung Moon
- Department of Neurosurgery, Brain Tumor Center, Severance Hospital, College of Medicine, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Eui-Hyun Kim
- Department of Neurosurgery, Brain Tumor Center, Severance Hospital, College of Medicine, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Jong Hee Chang
- Department of Neurosurgery, Brain Tumor Center, Severance Hospital, College of Medicine, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea
| | - Wan Yee Teo
- Cancer and Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore
- National Cancer Center, Singapore, Singapore
- KK Women's and Children's Hospital, Singapore, Singapore
- Institute of Molecular and Cell Biology, A*STAR, Singapore, Singapore
| | - Se Hoon Kim
- Department of Pathology, Severance Hospital, College of Medicine, Yonsei University, Seoul, Korea
| | - Sahng Wook Park
- Department of Biochemistry and Molecular Biology, College of Medicine, Yonsei University, Seoul, Korea
- Brain Korea 21 PLUS Project for Medical Science, Yonsei University, Seoul, Korea
| | - Yong-Min Huh
- Department of Biochemistry and Molecular Biology, College of Medicine, Yonsei University, Seoul, Korea.
- Severance Biomedical Science Institute, College of Medicine, Yonsei University, Seoul, Korea.
- Department of Radiology, Severance Hospital, Yonsei University College of Medicine, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea.
- YUHS-KRIBB Medical Convergence Research Institute, Seoul, Republic of Korea.
| | - Seok-Gu Kang
- Department of Neurosurgery, Brain Tumor Center, Severance Hospital, College of Medicine, Yonsei University, 50-1 Yonsei-ro, Seodaemun-gu, Seoul, 03722, Republic of Korea.
- Department of Medical Science, Yonsei University Graduate School, Seoul, Korea.
| |
|
287
|
Kirov I, Omarov M, Merkulov P, Dudnikov M, Gvaramiya S, Kolganova E, Komakhin R, Karlov G, Soloviev A. Genomic and Transcriptomic Survey Provides New Insight into the Organization and Transposition Activity of Highly Expressed LTR Retrotransposons of Sunflower (Helianthus annuus L.). Int J Mol Sci 2020; 21:E9331. [PMID: 33297579 PMCID: PMC7730604 DOI: 10.3390/ijms21239331] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/02/2020] [Revised: 12/01/2020] [Accepted: 12/04/2020] [Indexed: 12/21/2022] Open
Abstract
LTR retrotransposons (RTEs) play a crucial role in plant genome evolution and adaptation. Although RTEs are generally silenced in somatic plant tissues under non-stressed conditions, some expressed RTEs (exRTEs) escape genome defense mechanisms. As our understanding of exRTE organization in plants is rudimentary, we systematically surveyed the genomic and transcriptomic organization and mobilome (transposition) activity of sunflower (Helianthus annuus L.) exRTEs. We identified 44 transcribed RTEs in the sunflower genome and demonstrated their distinct genomic features: more recent insertion time, longer open reading frame (ORF) length, and smaller distance to neighboring genes. We showed that GAG-encoding ORFs are present at significantly higher frequencies in exRTEs than in non-expressed RTEs. Most exRTEs exhibit variation in copy number among sunflower cultivars, and one exRTE, Gagarin, produces extrachromosomal circular DNA in seedlings, demonstrating recent and ongoing transposition activity. Nanopore direct RNA sequencing of full-length RTE RNA revealed complex patterns of alternative splicing in RTE RNAs, resulting in isoforms that carry ORFs for distinct RTE proteins. Together, our study demonstrates that tens of expressed sunflower RTEs with specific genomic organization shape the hidden layer of the transcriptome, pointing to the evolution of specific strategies that circumvent existing genome defense mechanisms.
Affiliation(s)
- Ilya Kirov
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- Kurchatov Genomics Center of ARRIAB, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Street, 42, 127550 Moscow, Russia
| | - Murad Omarov
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- Faculty of Computer Science, National Research University Higher School of Economics, Pokrovsky Boulvar 11, 109028 Moscow, Russia
| | - Pavel Merkulov
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
| | - Maxim Dudnikov
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
- Kurchatov Genomics Center of ARRIAB, All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Street, 42, 127550 Moscow, Russia
| | - Sofya Gvaramiya
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
| | - Elizaveta Kolganova
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
| | - Roman Komakhin
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
| | - Gennady Karlov
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
| | - Alexander Soloviev
- All-Russia Research Institute of Agricultural Biotechnology, Timiryazevskaya Str. 42, 127550 Moscow, Russia
| |
|
288
|
Aid M, Busman-Sahay K, Vidal SJ, Maliga Z, Bondoc S, Starke C, Terry M, Jacobson CA, Wrijil L, Ducat S, Brook OR, Miller AD, Porto M, Pellegrini KL, Pino M, Hoang TN, Chandrashekar A, Patel S, Stephenson K, Bosinger SE, Andersen H, Lewis MG, Hecht JL, Sorger PK, Martinot AJ, Estes JD, Barouch DH. Vascular Disease and Thrombosis in SARS-CoV-2-Infected Rhesus Macaques. Cell 2020; 183:1354-1366.e13. [PMID: 33065030 PMCID: PMC7546181 DOI: 10.1016/j.cell.2020.10.005] [Citation(s) in RCA: 158] [Impact Index Per Article: 39.5] [Received: 08/27/2020] [Revised: 09/30/2020] [Accepted: 10/05/2020] [Indexed: 12/14/2022]
Abstract
The COVID-19 pandemic has led to extensive morbidity and mortality throughout the world. Clinical features that drive SARS-CoV-2 pathogenesis in humans include inflammation and thrombosis, but the mechanistic details underlying these processes remain to be determined. In this study, we demonstrate endothelial disruption and vascular thrombosis in histopathologic sections of lungs from both humans and rhesus macaques infected with SARS-CoV-2. To define key molecular pathways associated with SARS-CoV-2 pathogenesis in macaques, we performed transcriptomic analyses of bronchoalveolar lavage and peripheral blood and proteomic analyses of serum. We observed macrophage infiltrates in lung and upregulation of macrophage, complement, platelet activation, thrombosis, and proinflammatory markers, including C-reactive protein, MX1, IL-6, IL-1, IL-8, TNFα, and NF-κB. These results suggest a model in which critical interactions between inflammatory and thrombosis pathways lead to SARS-CoV-2-induced vascular disease. Our findings suggest potential therapeutic targets for COVID-19.
Affiliation(s)
- Malika Aid
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA
| | | | - Samuel J Vidal
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA
| | - Zoltan Maliga
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - Stephen Bondoc
- Oregon Health & Sciences University, Beaverton, OR 97006, USA
| | - Carly Starke
- Oregon Health & Sciences University, Beaverton, OR 97006, USA
| | - Margaret Terry
- Oregon Health & Sciences University, Beaverton, OR 97006, USA
| | - Connor A Jacobson
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - Linda Wrijil
- Tufts University Cummings School of Veterinary Medicine, North Grafton, MA 01536, USA
| | - Sarah Ducat
- Tufts University Cummings School of Veterinary Medicine, North Grafton, MA 01536, USA
| | - Olga R Brook
- Department of Radiology, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA 02215, USA
| | - Andrew D Miller
- Department of Biomedical Sciences, Section of Anatomic Pathology, Cornell University College of Veterinary Medicine, Ithaca, NY 14853, USA
| | | | - Kathryn L Pellegrini
- Yerkes Genomics Core Laboratory, Yerkes National Primate Research Center, Emory University, Atlanta, GA 30329, USA
| | - Maria Pino
- Division of Microbiology and Immunology, Yerkes National Primate Research Center, Emory University, Atlanta, GA 30329, USA
| | - Timothy N Hoang
- Division of Microbiology and Immunology, Yerkes National Primate Research Center, Emory University, Atlanta, GA 30329, USA
| | - Abishek Chandrashekar
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA
| | - Shivani Patel
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA
| | - Kathryn Stephenson
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA
| | - Steven E Bosinger
- Division of Microbiology and Immunology, Yerkes National Primate Research Center, Emory University, Atlanta, GA 30329, USA; Yerkes Genomics Core Laboratory, Yerkes National Primate Research Center, Emory University, Atlanta, GA 30329, USA; Department of Pathology & Laboratory Medicine, Emory School of Medicine, Emory University, Atlanta, GA 30329, USA
| | | | | | - Jonathan L Hecht
- Department of Pathology, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA
| | - Peter K Sorger
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - Amanda J Martinot
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA; Tufts University Cummings School of Veterinary Medicine, North Grafton, MA 01536, USA
| | - Jacob D Estes
- Oregon Health & Sciences University, Beaverton, OR 97006, USA
| | - Dan H Barouch
- Center for Virology and Vaccine Research, Beth Israel Deaconess Medical Center, Boston, MA 02215, USA; Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA 02139, USA.
| |
|
289
|
An integrative atlas of chicken long non-coding genes and their annotations across 25 tissues. Sci Rep 2020; 10:20457. [PMID: 33235280 PMCID: PMC7686352 DOI: 10.1038/s41598-020-77586-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/23/2020] [Accepted: 11/11/2020] [Indexed: 12/11/2022] Open
Abstract
Long non-coding RNAs (LNC) regulate numerous biological processes. In contrast to human, the identification of LNC in farm species such as chicken remains incomplete. We propose a catalogue of 52,075 chicken genes enriched in LNC (http://www.fragencode.org/), built from the Ensembl reference extended using novel LNC modelled here from 364 RNA-seq and LNC from four public databases. The Ensembl reference grew from 4,643 to 30,084 LNC, of which 59% and 41% had expression ≥ 0.5 and ≥ 1 TPM, respectively. Characterization of these LNC relative to the closest protein-coding genes (PCG) revealed that 79% of LNC are in intergenic regions, as in other species. Expression analysis across 25 tissues revealed an enrichment of co-expressed LNC:PCG pairs, suggesting co-regulation and/or co-function. As expected, LNC were more tissue-specific than PCG (25% vs. 10%). As in human, 16% of chicken LNC hosted one or more miRNA. We highlighted a new chicken LNC, hosting miR155, that is conserved in human, highly expressed in immune tissues like miR155, and correlated with immunity-related PCG in both species. Among LNC:PCG pairs tissue-specific in the same tissue, we revealed an enrichment of divergent pairs in which the PCG encodes a transcription factor, for example LHX5, HXD3 and TBX4, in both human and chicken.
|
290
|
Abstract
The giant sequoia (Sequoiadendron giganteum) of California are massive, long-lived trees that grow along the U.S. Sierra Nevada mountains. Genomic data are limited in giant sequoia and producing a reference genome sequence has been an important goal to allow marker development for restoration and management. Using deep-coverage Illumina and Oxford Nanopore sequencing, combined with Dovetail chromosome conformation capture libraries, the genome was assembled into eleven chromosome-scale scaffolds containing 8.125 Gbp of sequence. Iso-Seq transcripts, assembled from three distinct tissues, were used as evidence to annotate a total of 41,632 protein-coding genes. The genome was found to contain, distributed unevenly across all 11 chromosomes and in 63 orthogroups, over 900 complete or partial predicted NLR genes, of which 375 are supported by annotation derived from protein evidence and gene modeling. This giant sequoia reference genome sequence represents the first genome sequenced in the Cupressaceae family, and lays a foundation for using genomic tools to aid in giant sequoia conservation and management.
|
291
|
Schaarschmidt S, Fischer A, Lawas LMF, Alam R, Septiningsih EM, Bailey-Serres J, Jagadish SVK, Huettel B, Hincha DK, Zuther E. Utilizing PacBio Iso-Seq for Novel Transcript and Gene Discovery of Abiotic Stress Responses in Oryza sativa L. Int J Mol Sci 2020; 21:8148. [PMID: 33142722 PMCID: PMC7663775 DOI: 10.3390/ijms21218148] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 10/20/2020] [Revised: 10/20/2020] [Accepted: 10/30/2020] [Indexed: 01/05/2023] Open
Abstract
The wide natural variation present in rice is an important source of genes to facilitate stress tolerance breeding. However, identification of candidate genes from RNA-Seq studies is hampered by the lack of high-quality genome assemblies for the most stress tolerant cultivars. A more targeted solution is the reconstruction of transcriptomes to provide templates to map RNA-seq reads. Here, we sequenced transcriptomes of ten rice cultivars of three subspecies on the PacBio Sequel platform. RNA was isolated from different organs of plants grown under control and abiotic stress conditions in different environments. Reconstructed de novo reference transcriptomes resulted in 37,500 to 54,600 plant-specific high-quality isoforms per cultivar. Isoforms were collapsed to reduce sequence redundancy and evaluated, e.g., for protein completeness (BUSCO). About 40% of all identified transcripts were novel isoforms compared to the Nipponbare reference transcriptome. For the drought/heat tolerant aus cultivar N22, 56 differentially expressed genes in developing seeds were identified at combined heat and drought in the field. The newly generated rice transcriptomes are useful to identify candidate genes for stress tolerance breeding not present in the reference transcriptomes/genomes. In addition, our approach provides a cost-effective alternative to genome sequencing for identification of candidate genes in highly stress tolerant genotypes.
Affiliation(s)
- Stephanie Schaarschmidt
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam, Germany; (A.F.); (L.M.F.L.); (D.K.H.)
- Correspondence: (S.S.); (E.Z.)
| | - Axel Fischer
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam, Germany; (A.F.); (L.M.F.L.); (D.K.H.)
| | - Lovely Mae F. Lawas
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam, Germany; (A.F.); (L.M.F.L.); (D.K.H.)
- Department of Biological Sciences, Auburn University, Auburn, AL 36849, USA
| | - Rejbana Alam
- Center for Plant Cell Biology, Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA 92521, USA; (R.A.); (J.B.-S.)
| | - Endang M. Septiningsih
- Department of Soil and Crop Sciences, Texas A&M University, College Station, TX 77843, USA;
| | - Julia Bailey-Serres
- Center for Plant Cell Biology, Department of Botany and Plant Sciences, University of California Riverside, Riverside, CA 92521, USA; (R.A.); (J.B.-S.)
| | - S. V. Krishna Jagadish
- International Rice Research Institute, DAPO Box 7777, Metro Manila 1301, Philippines;
- Department of Agronomy, Kansas State University, Manhattan, KS 66506, USA
| | - Bruno Huettel
- Max Planck Genome Centre Cologne, Carl-von-Linné-Weg 10, 50829 Cologne, Germany;
| | - Dirk K. Hincha
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam, Germany; (A.F.); (L.M.F.L.); (D.K.H.)
| | - Ellen Zuther
- Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam, Germany; (A.F.); (L.M.F.L.); (D.K.H.)
- Correspondence: (S.S.); (E.Z.)
| |
|
292
|
Kuo RI, Cheng Y, Zhang R, Brown JWS, Smith J, Archibald AL, Burt DW. Illuminating the dark side of the human transcriptome with long read transcript sequencing. BMC Genomics 2020; 21:751. [PMID: 33126848 PMCID: PMC7596999 DOI: 10.1186/s12864-020-07123-7] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 04/15/2020] [Accepted: 10/06/2020] [Indexed: 12/13/2022] Open
Abstract
Background The human transcriptome annotation is regarded as one of the most complete of any eukaryotic species. However, limitations in sequencing technologies have biased the annotation toward multi-exonic protein coding genes. Accurate high-throughput long read transcript sequencing can now provide additional evidence for rare transcripts and genes such as mono-exonic and non-coding genes that were previously either undetectable or impossible to differentiate from sequencing noise. Results We developed the Transcriptome Annotation by Modular Algorithms (TAMA) software to leverage the power of long read transcript sequencing and address the issues with current data processing pipelines. TAMA achieved high sensitivity and precision for gene and transcript model predictions in both reference guided and unguided approaches in our benchmark tests using simulated Pacific Biosciences (PacBio) and Nanopore sequencing data and real PacBio datasets. By analyzing PacBio Sequel II Iso-Seq sequencing data of the Universal Human Reference RNA (UHRR) using TAMA and other commonly used tools, we found that the convention of using alignment identity to measure error correction performance does not reflect actual gain in accuracy of predicted transcript models. In addition, inter-read error correction can cause major changes to read mapping, resulting in potentially over 6 K erroneous gene model predictions in the Iso-Seq based human genome annotation. Using TAMA’s genome assembly based error correction and gene feature evidence, we predicted 2566 putative novel non-coding genes and 1557 putative novel protein coding gene models. Conclusions Long read transcript sequencing data has the power to identify novel genes within the highly annotated human genome. The use of parameter tuning and extensive output information of the TAMA software package allows for in depth exploration of eukaryotic transcriptomes. 
We have found long-read-based evidence for thousands of unannotated genes within the human genome. Further development in sequencing library preparation and data processing is required to differentiate sequencing noise from real genes in long read RNA sequencing data.
Affiliation(s)
- Richard I Kuo
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK.
| | - Yuanyuan Cheng
- The University of Queensland, St. Lucia, Brisbane, QLD, 4072, Australia; School of Life and Environmental Sciences, University of Sydney, Sydney, New South Wales, Australia
| | - Runxuan Zhang
- Information and Computational Sciences, The James Hutton Institute, Invergowrie, Dundee, Scotland, UK
| | - John W S Brown
- Plant Sciences Division, School of Life Sciences, University of Dundee, Invergowrie, Dundee, Scotland, UK; Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee, Scotland, UK
| | - Jacqueline Smith
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK
| | - Alan L Archibald
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK
| | - David W Burt
- The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Midlothian, EH25 9RG, UK; The University of Queensland, St. Lucia, Brisbane, QLD, 4072, Australia
| |
|
293
|
Alonge M, Shumate A, Puiu D, Zimin AV, Salzberg SL. Chromosome-Scale Assembly of the Bread Wheat Genome Reveals Thousands of Additional Gene Copies. Genetics 2020; 216:599-608. [PMID: 32796007 PMCID: PMC7536849 DOI: 10.1534/genetics.120.303501] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 07/04/2020] [Accepted: 08/10/2020] [Indexed: 11/18/2022] Open
Abstract
Bread wheat (Triticum aestivum) is a major food crop and an important plant system for agricultural genetics research. However, due to the complexity and size of its allohexaploid genome, genomic resources are limited compared to other major crops. The IWGSC recently published a reference genome and associated annotation (IWGSC CS v1.0, Chinese Spring) that has been widely adopted and utilized by the wheat community. Although this reference assembly represents all three wheat subgenomes at chromosome-scale, it was derived from short reads, and thus is missing a substantial portion of the expected 16 Gbp of genomic sequence. We earlier published an independent wheat assembly (Triticum_aestivum_3.1, Chinese Spring) that came much closer in length to the expected genome size, although it was only a contig-level assembly lacking gene annotations. Here, we describe a reference-guided effort to scaffold those contigs into chromosome-length pseudomolecules, add in any missing sequence that was unique to the IWGSC CS v1.0 assembly, and annotate the resulting pseudomolecules with genes. Our updated assembly, Triticum_aestivum_4.0, contains 15.07 Gbp of nongap sequence anchored to chromosomes, which is 1.2 Gbp more than the previous reference assembly. It includes 108,639 genes unambiguously localized to chromosomes, including over 2000 genes that were previously unplaced. We also discovered >5700 additional gene copies, facilitating the accurate annotation of functional gene duplications including at the Ppd-B1 photoperiod response locus.
Affiliation(s)
- Michael Alonge
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218
| | - Alaina Shumate
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218
- Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland 21211
| | - Daniela Puiu
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218
- Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland 21211
| | - Aleksey V Zimin
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218
- Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland 21211
| | - Steven L Salzberg
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218
- Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland 21211
- Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland 21205
| |
|
294
|
Pham GM, Hamilton JP, Wood JC, Burke JT, Zhao H, Vaillancourt B, Ou S, Jiang J, Buell CR. Construction of a chromosome-scale long-read reference genome assembly for potato. Gigascience 2020; 9:giaa100. [PMID: 32964225 PMCID: PMC7509475 DOI: 10.1093/gigascience/giaa100] [Citation(s) in RCA: 131] [Impact Index Per Article: 32.8] [Received: 06/04/2020] [Revised: 08/26/2020] [Accepted: 09/05/2020] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND Worldwide, the cultivated potato, Solanum tuberosum L., is the No. 1 vegetable crop and a critical food security crop. The genome sequence of DM1-3 516 R44, a doubled monoploid clone of S. tuberosum Group Phureja, was published in 2011 using a whole-genome shotgun sequencing approach with short-read sequence data. Current advanced sequencing technologies now permit generation of near-complete, high-quality chromosome-scale genome assemblies at minimal cost. FINDINGS Here, we present an updated version of the DM1-3 516 R44 genome sequence (v6.1) using Oxford Nanopore Technologies long reads coupled with proximity-by-ligation scaffolding (Hi-C), yielding a chromosome-scale assembly. The new (v6.1) assembly represents 741.6 Mb of sequence (87.8%) of the estimated 844 Mb genome, of which 741.5 Mb is non-gapped with 731.2 Mb anchored to the 12 chromosomes. Use of Oxford Nanopore Technologies full-length complementary DNA sequencing enabled annotation of 32,917 high-confidence protein-coding genes encoding 44,851 gene models that had a significantly improved representation of conserved orthologs compared with the previous annotation. The new assembly has improved contiguity with a 595-fold increase in N50 contig size, 99% reduction in the number of contigs, a 44-fold increase in N50 scaffold size, and an LTR Assembly Index score of 13.56, placing it in the category of reference genome quality. The improved assembly also permitted annotation of the centromeres via alignment to sequencing reads derived from CENH3 nucleosomes. CONCLUSIONS Access to advanced sequencing technologies and improved software permitted generation of a high-quality, long-read, chromosome-scale assembly and improved annotation dataset for the reference genotype of potato that will facilitate research aimed at improving agronomic traits and understanding genome evolution.
Affiliation(s)
- Gina M Pham
- Department of Plant Biology, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
| | - John P Hamilton
- Department of Plant Biology, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
| | - Joshua C Wood
- Department of Plant Biology, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
| | - Joseph T Burke
- Department of Plant Biology, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
| | - Hainan Zhao
- Department of Plant Biology, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
| | - Brieanne Vaillancourt
- Department of Plant Biology, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
| | - Shujun Ou
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, 2200 Osborne Dr, Ames, IA 50011, USA
| | - Jiming Jiang
- Department of Plant Biology, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
- Department of Horticulture, Michigan State University, 1066 Bogue St, East Lansing, MI 48824, USA
- MSU AgBioResearch, Michigan State University, 446 W. Circle Drive, East Lansing, MI 48824, USA
| | - C Robin Buell
- Department of Plant Biology, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
- MSU AgBioResearch, Michigan State University, 446 W. Circle Drive, East Lansing, MI 48824, USA
- Plant Resilience Institute, Michigan State University, 612 Wilson Road, East Lansing, MI 48824, USA
| |
|
295
|
Yao J, Wu DC, Nottingham RM, Lambowitz AM. Identification of protein-protected mRNA fragments and structured excised intron RNAs in human plasma by TGIRT-seq peak calling. eLife 2020; 9:e60743. [PMID: 32876046 PMCID: PMC7518892 DOI: 10.7554/elife.60743] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/06/2020] [Accepted: 09/01/2020] [Indexed: 12/18/2022] Open
Abstract
Human plasma contains > 40,000 different coding and non-coding RNAs that are potential biomarkers for human diseases. Here, we used thermostable group II intron reverse transcriptase sequencing (TGIRT-seq) combined with peak calling to simultaneously profile all RNA biotypes in apheresis-prepared human plasma pooled from healthy individuals. Extending previous TGIRT-seq analysis, we found that human plasma contains largely fragmented mRNAs from > 19,000 protein-coding genes, abundant full-length, mature tRNAs and other structured small non-coding RNAs, and less abundant tRNA fragments and mature and pre-miRNAs. Many of the mRNA fragments identified by peak calling correspond to annotated protein-binding sites and/or have stable predicted secondary structures that could afford protection from plasma nucleases. Peak calling also identified novel repeat RNAs, miRNA-sized RNAs, and putatively structured intron RNAs of potential biological, evolutionary, and biomarker significance, including a family of full-length excised intron RNAs, subsets of which correspond to mirtron pre-miRNAs or agotrons.
Affiliation(s)
- Jun Yao
- Institute for Cellular and Molecular Biology and Departments of Molecular Biosciences and Oncology, University of Texas, Austin, United States
| | - Douglas C Wu
- Institute for Cellular and Molecular Biology and Departments of Molecular Biosciences and Oncology, University of Texas, Austin, United States
| | - Ryan M Nottingham
- Institute for Cellular and Molecular Biology and Departments of Molecular Biosciences and Oncology, University of Texas, Austin, United States
| | - Alan M Lambowitz
- Institute for Cellular and Molecular Biology and Departments of Molecular Biosciences and Oncology, University of Texas, Austin, United States
| |
|
296
|
Lawson ND, Li R, Shin M, Grosse A, Yukselen O, Stone OA, Kucukural A, Zhu L. An improved zebrafish transcriptome annotation for sensitive and comprehensive detection of cell type-specific genes. eLife 2020; 9:e55792. [PMID: 32831172 PMCID: PMC7486121 DOI: 10.7554/elife.55792] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 02/06/2020] [Accepted: 08/21/2020] [Indexed: 02/07/2023] Open
Abstract
The zebrafish is ideal for studying embryogenesis and is increasingly applied to model human disease. In these contexts, RNA-sequencing (RNA-seq) provides mechanistic insights by identifying transcriptome changes between experimental conditions. Application of RNA-seq relies on accurate transcript annotation for a genome of interest. Here, we find discrepancies in analysis from RNA-seq datasets quantified using Ensembl and RefSeq zebrafish annotations. These issues were due, in part, to variably annotated 3' untranslated regions and thousands of gene models missing from each annotation. Since these discrepancies could compromise downstream analyses and biological reproducibility, we built a more comprehensive zebrafish transcriptome annotation that addresses these deficiencies. Our annotation improves detection of cell type-specific genes in both bulk and single cell RNA-seq datasets, where it also improves resolution of cell clustering. Thus, we demonstrate that our new transcriptome annotation can outperform existing annotations, providing an important resource for zebrafish researchers.
Affiliation(s)
- Nathan D Lawson
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, United States
| | - Rui Li
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, United States
| | - Masahiro Shin
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, United States
| | - Ann Grosse
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, United States
| | - Onur Yukselen
- Bioinformatics Core, University of Massachusetts Medical School, Worcester, United States
| | - Oliver A Stone
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
| | - Alper Kucukural
- Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, United States; Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, United States
| | - Lihua Zhu
- Department of Molecular, Cell and Cancer Biology, University of Massachusetts Medical School, Worcester, United States; Program in Molecular Medicine, University of Massachusetts Medical School, Worcester, United States; Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, United States
| |
|
297
|
Nip KM, Chiu R, Yang C, Chu J, Mohamadi H, Warren RL, Birol I. RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes. Genome Res 2020; 30:1191-1200. [PMID: 32817073 PMCID: PMC7462077 DOI: 10.1101/gr.260174.119] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 12/11/2019] [Accepted: 07/23/2020] [Indexed: 12/27/2022]
Abstract
Despite the rapid advance in single-cell RNA sequencing (scRNA-seq) technologies within the last decade, single-cell transcriptome analysis workflows have primarily used gene expression data while isoform sequence analysis at the single-cell level still remains fairly limited. Detection and discovery of isoforms in single cells is difficult because of the inherent technical shortcomings of scRNA-seq data, and existing transcriptome assembly methods are mainly designed for bulk RNA samples. To address this challenge, we developed RNA-Bloom, an assembly algorithm that leverages the rich information content aggregated from multiple single-cell transcriptomes to reconstruct cell-specific isoforms. Assembly with RNA-Bloom can be either reference-guided or reference-free, thus enabling unbiased discovery of novel isoforms or foreign transcripts. We compared both assembly strategies of RNA-Bloom against five state-of-the-art reference-free and reference-based transcriptome assembly methods. In our benchmarks on a simulated 384-cell data set, reference-free RNA-Bloom reconstructed 37.9%–38.3% more isoforms than the best reference-free assembler, whereas reference-guided RNA-Bloom reconstructed 4.1%–11.6% more isoforms than reference-based assemblers. When applied to a real 3840-cell data set consisting of more than 4 billion reads, RNA-Bloom reconstructed 9.7%–25.0% more isoforms than the best competing reference-based and reference-free approaches evaluated. We expect RNA-Bloom to boost the utility of scRNA-seq data beyond gene expression analysis, expanding what is informatically accessible now.
Affiliation(s)
- Ka Ming Nip
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
| | - Readman Chiu
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
| | - Chen Yang
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
| | - Justin Chu
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
| | - Hamid Mohamadi
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
| | - René L Warren
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6
| | - Inanc Birol
- Canada's Michael Smith Genome Sciences Centre, BC Cancer, Vancouver, British Columbia, Canada V5Z 4S6; Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada V6H 3N1
| |
|
298
|
Li Y, Liu Y, Yang H, Zhang T, Naruse K, Tu Q. Dynamic transcriptional and chromatin accessibility landscape of medaka embryogenesis. Genome Res 2020; 30:924-937. [PMID: 32591361 PMCID: PMC7370878 DOI: 10.1101/gr.258871.119] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/31/2019] [Accepted: 06/17/2020] [Indexed: 12/13/2022]
Abstract
Medaka (Oryzias latipes) has become an important vertebrate model widely used in genetics, developmental biology, environmental sciences, and many other fields. A high-quality genome sequence and a variety of genetic tools are available for this model organism. However, existing genome annotation is still rudimentary, as it was mainly based on computational prediction and short-read RNA-seq data. Here we report a dynamic transcriptome landscape of medaka embryogenesis profiled by long-read RNA-seq, short-read RNA-seq, and ATAC-seq. By integrating these data sets, we constructed a much-improved gene model set including about 17,000 novel isoforms and identified 1600 transcription factors, 1100 long noncoding RNAs, and 150,000 potential cis-regulatory elements as well. Time-series data sets provided another dimension of information. With the expression dynamics of genes and accessibility dynamics of cis-regulatory elements, we investigated isoform switching, as well as regulatory logic between accessible elements and genes, during embryogenesis. We built a user-friendly medaka omics data portal to present these data sets. This resource provides the first comprehensive omics data sets of medaka embryogenesis. Ultimately, we term these three assays as the minimum ENCODE toolbox and propose the use of it as the initial and essential profiling genomic assays for model organisms that have limited data available. This work will be of great value for the research community using medaka as the model organism and many others as well.
Affiliation(s)
- Yingshu Li
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yongjie Liu
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Hang Yang
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Ting Zhang
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
| | - Kiyoshi Naruse
- Laboratory of Bioresources, National Institute for Basic Biology, Okazaki 444-8585, Aichi, Japan
| | - Qiang Tu
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China; Key Laboratory of Genetic Network Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| |
|