1
Tu M, Zeng J, Zhang J, Fan G, Song G. Unleashing the power within short-read RNA-seq for plant research: Beyond differential expression analysis and toward regulomics. FRONTIERS IN PLANT SCIENCE 2022; 13:1038109. [PMID: 36570898 PMCID: PMC9773216 DOI: 10.3389/fpls.2022.1038109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Accepted: 11/21/2022] [Indexed: 06/17/2023]
Abstract
RNA-seq has become a state-of-the-art technique for transcriptomic studies. Advances in both RNA-seq techniques and the corresponding analysis tools and pipelines have shaped our understanding of almost every aspect of plant sciences to an unprecedented degree. Notably, the integration of huge amounts of RNA-seq data with other omic data sets in the model plants and major crop species has facilitated plant regulomics, while RNA-seq analysis is still used primarily for differential expression analysis in many less-studied plant species. To unleash the analytical power of RNA-seq in plant species, especially less-studied species and biomass crops, we summarize recent achievements of RNA-seq analysis in the major plant species and representative tools in four types of application: (1) transcriptome assembly, (2) construction of expression atlases, (3) network analysis, and (4) structural alteration. We emphasize the importance of expression atlases, coexpression networks and predictions of gene regulatory relationships in moving plant transcriptomes toward regulomics, an omic view of genome-wide transcription regulation. We highlight what can be achieved in plant research with RNA-seq by introducing a list of representative RNA-seq analysis tools and resources that were developed for certain minor species or are suitable for analysis without species limitation. In summary, we provide an updated digest of RNA-seq tools, resources and their diverse applications for plant research, and our perspective on the power and challenges of short-read RNA-seq analysis from a regulomic point of view. Full utilization of these fruitful RNA-seq resources will promote plant omic research to a higher level, especially in less-studied species.
Affiliation(s)
- Min Tu
  - School of Chemical and Environmental Engineering, Wuhan Polytechnic University, Wuhan, China
- Jian Zeng
  - Guangdong Provincial Key Laboratory of Utilization and Conservation of Food and Medicinal Resources in Northern Region, Shaoguan University, Shaoguan, Guangdong, China
- Juntao Zhang
  - School of Chemical and Environmental Engineering, Wuhan Polytechnic University, Wuhan, China
- Guozhi Fan
  - School of Chemical and Environmental Engineering, Wuhan Polytechnic University, Wuhan, China
- Guangsen Song
  - School of Chemical and Environmental Engineering, Wuhan Polytechnic University, Wuhan, China
2
Cantó-Pastor A, Mason GA, Brady SM, Provart NJ. Arabidopsis bioinformatics: tools and strategies. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2021; 108:1585-1596. [PMID: 34695270 DOI: 10.1111/tpj.15547] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/21/2021] [Revised: 10/01/2021] [Accepted: 10/19/2021] [Indexed: 06/13/2023]
Abstract
The sequencing of the Arabidopsis thaliana genome 21 years ago ushered in the genomics era for plant research. Since then, an incredible variety of bioinformatic tools permit easy access to large repositories of genomic, transcriptomic, proteomic, epigenomic and other '-omic' data. In this review, we cover some more recent tools (and highlight the 'classics') for exploring such data in order to help formulate quality, testable hypotheses, often without having to generate new experimental data. We cover tools for examining gene expression and co-expression patterns, undertaking promoter analyses and gene set enrichment analyses, and exploring protein-protein and protein-DNA interactions. We will touch on tools that integrate different data sets at the end of the article.
Affiliation(s)
- Alex Cantó-Pastor
  - Department of Plant Biology and Genome Center, University of California Davis, 1 Shields Avenue, Davis, CA, 95616, USA
- G Alex Mason
  - Department of Plant Biology and Genome Center, University of California Davis, 1 Shields Avenue, Davis, CA, 95616, USA
- Siobhan M Brady
  - Department of Plant Biology and Genome Center, University of California Davis, 1 Shields Avenue, Davis, CA, 95616, USA
- Nicholas J Provart
  - Department of Cell and Systems Biology/Centre for the Analysis of Genome Evolution and Function, University of Toronto, 25 Willcocks Street, Toronto, ON, M5S 3B2, Canada
3
Defoort J, Van de Peer Y, Carretero-Paulet L. The Evolution of Gene Duplicates in Angiosperms and the Impact of Protein-Protein Interactions and the Mechanism of Duplication. Genome Biol Evol 2020; 11:2292-2305. [PMID: 31364708 PMCID: PMC6735927 DOI: 10.1093/gbe/evz156] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Accepted: 07/10/2019] [Indexed: 01/17/2023]
Abstract
Gene duplicates, generated through either whole genome duplication (WGD) or small-scale duplication (SSD), are prominent in angiosperms and are believed to play an important role in adaptation and in generating evolutionary novelty. Previous studies reported contrasting evolutionary and functional dynamics of duplicate genes depending on the mechanism of origin, a behavior that is hypothesized to stem from constraints to maintain the relative dosage balance between the genes concerned and their interaction context. However, the mechanisms ultimately influencing loss and retention of gene duplicates over evolutionary time are not yet fully elucidated. Here, by using a robust classification of gene duplicates in Arabidopsis thaliana, Solanum lycopersicum, and Zea mays, large RNAseq expression compendia and an extensive protein-protein interaction (PPI) network from Arabidopsis, we investigated the impact of PPIs on the differential evolutionary and functional fate of WGD and SSD duplicates. In all three species, retained WGD duplicates show stronger constraints to diverge at the sequence and expression level than SSD ones, a pattern that is also observed for shared PPI partners between Arabidopsis duplicates. PPIs are preferentially distributed among WGD duplicates and specific functional categories. Furthermore, duplicates with PPIs tend to be under stronger constraints to evolve than their counterparts without PPIs regardless of their mechanism of origin. Our results support dosage balance constraint as a specific property of genes involved in biological interactions, including physical PPIs, and suggest that additional factors may be differently influencing the evolution of genes following duplication, depending on the species, time, and mechanism of origin.
Affiliation(s)
- Jonas Defoort
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Belgium
- Yves Van de Peer
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Belgium
  - Department of Biochemistry, Genetics and Microbiology, University of Pretoria, South Africa
- Lorenzo Carretero-Paulet
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Belgium
4
Subba P, Narayana Kotimoole C, Prasad TSK. Plant Proteome Databases and Bioinformatic Tools: An Expert Review and Comparative Insights. OMICS: A JOURNAL OF INTEGRATIVE BIOLOGY 2019; 23:190-206. [DOI: 10.1089/omi.2019.0024] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 02/06/2023]
Affiliation(s)
- Pratigya Subba
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Chinmaya Narayana Kotimoole
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
- Thottethodi Subrahmanya Keshava Prasad
  - Center for Systems Biology and Molecular Medicine, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore, India
  - Institute of Bioinformatics, International Technology Park, Bangalore, India
5
Van Leene J, Han C, Gadeyne A, Eeckhout D, Matthijs C, Cannoot B, De Winne N, Persiau G, Van De Slijke E, Van de Cotte B, Stes E, Van Bel M, Storme V, Impens F, Gevaert K, Vandepoele K, De Smet I, De Jaeger G. Capturing the phosphorylation and protein interaction landscape of the plant TOR kinase. NATURE PLANTS 2019; 5:316-327. [PMID: 30833711 DOI: 10.1038/s41477-019-0378-z] [Citation(s) in RCA: 148] [Impact Index Per Article: 29.6] [Received: 09/27/2018] [Accepted: 01/28/2019] [Indexed: 05/18/2023]
Abstract
The target of rapamycin (TOR) kinase is a conserved regulatory hub that translates environmental and nutritional information into permissive or restrictive growth decisions. Despite the increased appreciation of the essential role of the TOR complex in plants, no large-scale phosphoproteomics or interactomics studies have been performed to map TOR signalling events in plants. To fill this gap, we combined a systematic phosphoproteomics screen with a targeted protein complex analysis in the model plant Arabidopsis thaliana. Integration of the phosphoproteome and protein complex data on the one hand shows that both methods reveal complementary subspaces of the plant TOR signalling network, enabling proteome-wide discovery of both upstream and downstream network components. On the other hand, the overlap between both data sets reveals a set of candidate direct TOR substrates. The integrated network embeds both evolutionarily-conserved and plant-specific TOR signalling components, uncovering an intriguing complex interplay with protein synthesis. Overall, the network provides a rich data set to start addressing fundamental questions about how TOR controls key processes in plants, such as autophagy, auxin signalling, chloroplast development, lipid metabolism, nucleotide biosynthesis, protein translation or senescence.
Affiliation(s)
- Jelle Van Leene
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Chao Han
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - The Key Laboratory of Plant Cell Engineering and Germplasm Innovation, College of Life Sciences, Shandong University, Jinan, China
- Astrid Gadeyne
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Dominique Eeckhout
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Caroline Matthijs
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Bernard Cannoot
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Nancy De Winne
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Geert Persiau
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Eveline Van De Slijke
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Brigitte Van de Cotte
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Elisabeth Stes
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Michiel Van Bel
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Veronique Storme
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Francis Impens
  - Department of Biochemistry, Ghent University, Ghent, Belgium
  - VIB Center for Medical Biotechnology, Ghent, Belgium
  - VIB Proteomics Core, Ghent, Belgium
- Kris Gevaert
  - Department of Biochemistry, Ghent University, Ghent, Belgium
  - VIB Center for Medical Biotechnology, Ghent, Belgium
  - VIB Proteomics Core, Ghent, Belgium
- Klaas Vandepoele
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Ive De Smet
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
- Geert De Jaeger
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
6
Pandey S, Sahoo D. Identification of gene expression logical invariants in Arabidopsis. PLANT DIRECT 2019; 3:e00123. [PMID: 31245766 PMCID: PMC6508763 DOI: 10.1002/pld3.123] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/18/2018] [Revised: 12/28/2018] [Accepted: 02/01/2019] [Indexed: 06/09/2023]
Abstract
Numerous gene expression datasets from diverse tissue samples of the plant species Arabidopsis thaliana have already been deposited in the public domain. There have been several attempts to do large-scale meta-analyses of all of these datasets. Most of these analyses summarize pairwise gene expression relationships using correlation, or identify differentially expressed genes in two conditions. We propose here a new large-scale meta-analysis of the publicly available Arabidopsis datasets to identify Boolean logical relationships between genes. Boolean logic is a branch of mathematics that deals with two possible values. In the context of gene expression datasets we use qualitative high and low expression values. A strong logical relationship between two genes emerges if at least one of the four high/low quadrants is sparsely populated. We point out serious issues in the data normalization steps that were widely accepted and recently published in this context. We put together a web resource where gene expression relationships can be explored online, which helps visualize the logical relationships between genes. We believe that this website will be useful in identifying important genes in different biological contexts. The web link is http://hegemon.ucsd.edu/plant/.
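The quadrant idea in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `quadrant_counts` and `sparse_quadrants`, the median default threshold, and the 5% sparsity cutoff are all assumptions chosen for illustration, and the abstract does not say how thresholds or sparsity are actually determined.

```python
from statistics import median

# Illustrative sketch only: thresholds and the sparsity cutoff are assumptions,
# not the procedure used in the cited paper.
QUADRANTS = [("low", "low"), ("low", "high"), ("high", "low"), ("high", "high")]

def quadrant_counts(x, y, x_thr=None, y_thr=None):
    """Count samples in each (gene A level, gene B level) quadrant."""
    if len(x) != len(y):
        raise ValueError("both genes need one value per sample")
    # Assumed default: split each gene at its own median expression.
    x_thr = median(x) if x_thr is None else x_thr
    y_thr = median(y) if y_thr is None else y_thr
    counts = {q: 0 for q in QUADRANTS}
    for xi, yi in zip(x, y):
        a = "high" if xi > x_thr else "low"
        b = "high" if yi > y_thr else "low"
        counts[(a, b)] += 1
    return counts

def sparse_quadrants(counts, max_fraction=0.05):
    """Return quadrants holding at most max_fraction of all samples."""
    total = sum(counts.values())
    return [q for q in QUADRANTS if counts[q] <= max_fraction * total]

# Toy data: whenever gene A is high, gene B is high as well.
gene_a = [1, 2, 3, 8, 9, 10, 1, 2]
gene_b = [1, 9, 8, 9, 10, 8, 2, 9]
counts = quadrant_counts(gene_a, gene_b, x_thr=5, y_thr=5)
print(sparse_quadrants(counts))  # [('high', 'low')]
```

In the toy data the (A high, B low) quadrant is empty, which corresponds to the implication "A high implies B high". A real analysis would need a statistically justified threshold and sparsity test rather than the fixed cutoffs assumed here.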
7
Zwaenepoel A, Diels T, Amar D, Van Parys T, Shamir R, Van de Peer Y, Tzfadia O. MorphDB: Prioritizing Genes for Specialized Metabolism Pathways and Gene Ontology Categories in Plants. FRONTIERS IN PLANT SCIENCE 2018; 9:352. [PMID: 29616063 PMCID: PMC5867296 DOI: 10.3389/fpls.2018.00352] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/04/2018] [Accepted: 03/02/2018] [Indexed: 05/20/2023]
Abstract
Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene expression based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating efficient gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.
Affiliation(s)
- Arthur Zwaenepoel
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Tim Diels
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- David Amar
  - Stanford Center for Inherited Cardiovascular Disease, Stanford University, Stanford, CA, United States
- Thomas Van Parys
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
- Ron Shamir
  - Blavatnik School of Computer Science, Tel-Aviv University, Tel-Aviv, Israel
- Yves Van de Peer
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
  - Genomics Research Institute, University of Pretoria, Pretoria, South Africa
  - *Correspondence: Yves Van de Peer
- Oren Tzfadia
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - VIB Center for Plant Systems Biology, Ghent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, Belgium
  - Oren Tzfadia
8
Vaneechoutte D, Estrada AR, Lin YC, Loraine AE, Vandepoele K. Genome-wide characterization of differential transcript usage in Arabidopsis thaliana. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2017; 92:1218-1231. [PMID: 29031026 DOI: 10.1111/tpj.13746] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 05/05/2017] [Revised: 09/29/2017] [Accepted: 10/03/2017] [Indexed: 05/21/2023]
Abstract
Alternative splicing and the usage of alternate transcription start or stop sites allow a single gene to produce multiple transcript isoforms. Most plant genes express certain isoforms at a significantly higher level than others, but under specific conditions this expression dominance can change, resulting in a different set of dominant isoforms. These events of differential transcript usage (DTU) have been observed for thousands of Arabidopsis thaliana, Zea mays and Vitis vinifera genes, and have been linked to development and stress response. However, neither the characteristics of these genes, nor the implications of DTU for their protein coding sequences or functions, are currently well understood. Here we present a dataset of isoform dominance and DTU for all genes in the AtRTD2 reference transcriptome based on a protocol that was benchmarked on simulated data and validated through comparison with a published reverse transcriptase-polymerase chain reaction panel. We report DTU events for 8148 genes across 206 public RNA-Seq samples, and find that protein sequences are affected in 22% of the cases. The observed DTU events show high consistency across replicates, and reveal reproducible patterns in response to treatment and development. We also demonstrate that genes with different evolutionary ages, expression breadths and functions show large differences in the frequency at which they undergo DTU, and in the effect that these events have on their protein sequences. Finally, we showcase how the generated dataset can be used to explore DTU events for genes of interest or to find genes with specific DTU in samples of interest.
Affiliation(s)
- Dries Vaneechoutte
  - VIB Center for Plant Systems Biology, VIB, Technologiepark 927, B-9052, Gent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, B-9052, Gent, Belgium
- April R Estrada
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, Kannapolis, NC, 28081, USA
- Ying-Chen Lin
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, Kannapolis, NC, 28081, USA
- Ann E Loraine
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, North Carolina Research Campus, Kannapolis, NC, 28081, USA
- Klaas Vandepoele
  - VIB Center for Plant Systems Biology, VIB, Technologiepark 927, B-9052, Gent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 927, B-9052, Gent, Belgium
  - Bioinformatics Institute Ghent, Ghent University, Technologiepark 927, 9052, Ghent, Belgium