1
|
Quality control and evaluation of plant epigenomics data. THE PLANT CELL 2022; 34:503-513. [PMID: 34648025 PMCID: PMC8773985 DOI: 10.1093/plcell/koab255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/02/2021] [Accepted: 10/08/2021] [Indexed: 05/22/2023]
Abstract
Epigenomics is the study of molecular signatures associated with discrete regions within genomes, many of which are important for a wide range of nuclear processes. The ability to profile the epigenomic landscape associated with genes, repetitive regions, transposons, transcription, differential expression, cis-regulatory elements, and 3D chromatin interactions has vastly improved our understanding of plant genomes. However, many epigenomic and single-cell genomic assays are challenging to perform in plants, leading to a wide range of data quality issues; thus, the data require rigorous evaluation prior to downstream analyses and interpretation. In this commentary, we provide considerations for the evaluation of plant epigenomics and single-cell genomics data quality with the aim of improving the quality and utility of studies using those data across diverse plant species.
|
2
|
Abstract
Cyclophilin A/DIAGEOTROPICA (DGT) has been linked to auxin-regulated development in tomato and appears to affect multiple developmental pathways. Loss of DGT function results in a pleiotropic phenotype that is strongest in the roots, including shortened roots with no lateral branching. Here, we present an RNA-Seq dataset comparing the gene expression profiles of wildtype (‘Ailsa Craig’) and dgt tissues from three spatially separated developmental stages of the tomato root tip, with three replicates for each tissue and genotype. We also identify differentially expressed genes, provide an initial comparison of genes affected in each genotype and tissue, and provide the pipeline used to analyze the data. Further analysis of this dataset can be used to gain insight into the effects of DGT on various root developmental pathways in tomato.
|
3
|
Metabolomics analysis reveals both plant variety and choice of hormone treatment modulate vinca alkaloid production in Catharanthus roseus. PLANT DIRECT 2020; 4:e00267. [PMID: 33005857 PMCID: PMC7520646 DOI: 10.1002/pld3.267] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/14/2020] [Revised: 08/18/2020] [Accepted: 08/24/2020] [Indexed: 05/09/2023]
Abstract
The medicinal plant Catharanthus roseus produces numerous secondary metabolites of interest for the treatment of many diseases, most notably the terpene indole alkaloid (TIA) vinblastine, which is used in the treatment of leukemia and Hodgkin's lymphoma. Historically, methyl jasmonate (MeJA) has been used to induce TIA production, but in the past, this has only been investigated in whole seedlings, cell culture, or hairy root culture. This study examines the effects of the phytohormones MeJA and ethylene on the induction of TIA biosynthesis and accumulation in the shoots and roots of 8-day-old seedlings of two varieties of C. roseus. Using LCMS and RT-qPCR, we demonstrate the importance of variety selection, as we observe markedly different induction patterns of important TIA precursor compounds. Additionally, both phytohormone choice and concentration have significant effects on TIA biosynthesis. Finally, our study suggests that several early-induction pathway steps as well as pathway-specific genes are likely to be transcriptionally regulated. Our findings highlight the need for a complete set of 'omics resources in commonly used C. roseus varieties and the need for caution when extrapolating results from one cultivar to another.
|
4
|
PlantSimLab - a modeling and simulation web tool for plant biologists. BMC Bioinformatics 2019; 20:508. [PMID: 31638901 PMCID: PMC6805577 DOI: 10.1186/s12859-019-3094-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/03/2018] [Accepted: 09/10/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND At the molecular level, nonlinear networks of heterogeneous molecules control many biological processes, so that systems biology provides a valuable approach in this field, building on the integration of experimental biology with mathematical modeling. One of the biggest challenges to making this integration a reality is that many life scientists do not possess the mathematical expertise needed to build and manipulate mathematical models well enough to use them as tools for hypothesis generation. Available modeling software packages often assume some modeling expertise. There is a need for software tools that are easy to use and intuitive for experimentalists. RESULTS This paper introduces PlantSimLab, a web-based application developed to allow plant biologists to construct dynamic mathematical models of molecular networks, interrogate them in a manner similar to what is done in the laboratory, and use them as a tool for biological hypothesis generation. It is designed to be used by experimentalists, without direct assistance from mathematical modelers. CONCLUSIONS Mathematical modeling techniques are a useful tool for analyzing complex biological systems, and there is a need for accessible, efficient analysis tools within the biological community. PlantSimLab enables users to build, validate, and use intuitive qualitative dynamic computer models, with a graphical user interface that does not require mathematical modeling expertise. It makes analysis of complex models accessible to a larger community, as it is platform-independent and does not require extensive mathematical expertise.
|
5
|
IndeCut evaluates performance of network motif discovery algorithms. Bioinformatics 2019; 34:1514-1521. [PMID: 29236975 DOI: 10.1093/bioinformatics/btx798] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/05/2017] [Accepted: 12/08/2017] [Indexed: 12/24/2022] Open
Abstract
Motivation Genomic networks represent a complex map of molecular interactions which are descriptive of the biological processes occurring in living cells. Identifying the small over-represented circuitry patterns in these networks helps generate hypotheses about the functional basis of such complex processes. Network motif discovery is a systematic way of achieving this goal. However, a reliable network motif discovery outcome requires generating random background networks which are the result of a uniform and independent graph sampling method. To date, there has been no method to numerically evaluate whether any network motif discovery algorithm performs as intended on realistically sized datasets; thus, it was not possible to assess the validity of resulting network motifs. Results In this work, we present IndeCut, the first method to date that characterizes network motif finding algorithm performance in terms of uniform sampling on realistically sized networks. We demonstrate that it is critical to use IndeCut prior to running any network motif finder for two reasons. First, IndeCut indicates the number of samples needed for a tool to produce an outcome that is both reproducible and accurate. Second, IndeCut allows users to choose the tool that generates samples in the most independent fashion for their network of interest among many available options. Availability and implementation The open source software package is available at https://github.com/megrawlab/IndeCut. Contact megrawm@science.oregonstate.edu or david.koslicki@math.oregonstate.edu. Supplementary information Supplementary data are available at Bioinformatics online.
|
6
|
Identification of transcription factors from NF-Y, NAC, and SPL families responding to osmotic stress in multiple tomato varieties. PLANT SCIENCE : AN INTERNATIONAL JOURNAL OF EXPERIMENTAL PLANT BIOLOGY 2018; 274:441-450. [PMID: 30080633 DOI: 10.1016/j.plantsci.2018.06.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/27/2018] [Revised: 06/21/2018] [Accepted: 06/24/2018] [Indexed: 06/08/2023]
Abstract
Identifying osmotic stress-responsive transcription factors (TFs) can facilitate discovery of master regulators mediating salt and/or drought tolerance. To date, few RNA-seq datasets for high resolution time course of salt or drought stress treatments are publicly available for certain crop species. However, such datasets may be available for other crops, and in combination with orthology analysis may be used to infer candidate osmotic stress regulators across distantly related species. Here, we demonstrate the utility of this approach for identification and validation of osmotic stress-responsive transcription factors in tomato. First, we developed physiologically calibrated salt and dehydration-responsive systems for tomato cultivars using real time measurements of transpiration rate and photosynthetic efficiency. Next, we identified differentially expressed TFs in rice using raw RNA-seq datasets for a publicly available salt stress time course. Putative salt stress-responsive TFs in tomato were then inferred based on their orthology with the transcription factors upregulated by salt in rice. Finally, using our osmotic stress system, we experimentally validated stress-responsive expression of predicted tomato candidates representing NUCLEAR FACTOR Y, SQUAMOSA PROMOTER BINDING, and NAC domain TF families. Quantification of transcript copy numbers confirmed that mRNAs encoding all three TFs were strongly upregulated not only by salt but also by drought stress. Induction by both salt and dehydration occurred in a temporal manner across diverse tomato cultivars, suggesting that the identified TFs may play important roles in regulating osmotic stress responses.
|
7
|
The Next Generation of Training for Arabidopsis Researchers: Bioinformatics and Quantitative Biology. PLANT PHYSIOLOGY 2017; 175:1499-1509. [PMID: 29208732 PMCID: PMC5717721 DOI: 10.1104/pp.17.01490] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/23/2017] [Accepted: 10/31/2017] [Indexed: 05/20/2023]
Abstract
Training for experimental plant biologists needs to combine bioinformatics, quantitative approaches, computational biology, and training in the art of collaboration, best achieved through fully integrated curriculum development.
|
8
|
Establishment of Expression in the SHORTROOT-SCARECROW Transcriptional Cascade through Opposing Activities of Both Activators and Repressors. Dev Cell 2016; 39:585-596. [PMID: 27923776 DOI: 10.1016/j.devcel.2016.09.031] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 10/22/2015] [Revised: 05/27/2016] [Accepted: 09/29/2016] [Indexed: 12/28/2022]
Abstract
Tissue-specific gene expression is often thought to arise from spatially restricted transcriptional cascades. However, it is unclear how expression is established at the top of these cascades in the absence of pre-existing specificity. We generated a transcriptional network to explore how transcription factor expression is established in the Arabidopsis thaliana root ground tissue. Regulators of the SHORTROOT-SCARECROW transcriptional cascade were validated in planta. At the top of this cascade, we identified both activators and repressors of SHORTROOT. The aggregate spatial expression of these regulators is not sufficient to predict transcriptional specificity. Instead, modeling, transcriptional reporters, and synthetic promoters support a mechanism whereby expression at the top of the SHORTROOT-SCARECROW cascade is established through opposing activities of activators and repressors.
|
9
|
Small Genetic Circuits and MicroRNAs: Big Players in Polymerase II Transcriptional Control in Plants. THE PLANT CELL 2016; 28:286-303. [PMID: 26869700 PMCID: PMC4790873 DOI: 10.1105/tpc.15.00852] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 10/06/2015] [Accepted: 02/10/2016] [Indexed: 05/11/2023]
Abstract
RNA Polymerase II (Pol II) regulatory cascades involving transcription factors (TFs) and their targets orchestrate the genetic circuitry of every eukaryotic organism. In order to understand how these cascades function, they can be dissected into small genetic networks, each containing just a few Pol II transcribed genes, that generate specific signal-processing outcomes. Small RNA regulatory circuits involve direct regulation of a small RNA by a TF and/or direct regulation of a TF by a small RNA and have been shown to play unique roles in many organisms. Here, we will focus on small RNA regulatory circuits containing Pol II transcribed microRNAs (miRNAs). While the role of miRNA-containing regulatory circuits as modular building blocks for the function of complex networks has long been on the forefront of studies in the animal kingdom, plant studies are poised to take a lead role in this area because of their advantages in probing transcriptional and posttranscriptional control of Pol II genes. The relative simplicity of tissue- and cell-type organization, miRNA targeting, and genomic structure make the Arabidopsis thaliana plant model uniquely amenable for small RNA regulatory circuit studies in a multicellular organism. In this Review, we cover analysis, tools, and validation methods for probing the component interactions in miRNA-containing regulatory circuits. We then review the important roles that plant miRNAs are playing in these circuits and summarize methods for the identification of small genetic circuits that strongly influence plant function. We conclude by noting areas of opportunity where new plant studies are imminently needed.
|
10
|
NanoCAGE-XL and CapFilter: an approach to genome wide identification of high confidence transcription start sites. BMC Genomics 2015; 16:597. [PMID: 26268438 PMCID: PMC4534009 DOI: 10.1186/s12864-015-1670-6] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 01/22/2015] [Accepted: 05/29/2015] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND Identifying the transcription start sites (TSS) of genes is essential for characterizing promoter regions. Several protocols have been developed to capture the 5' end of transcripts via Cap Analysis of Gene Expression (CAGE) or linker-ligation strategies such as Paired-End Analysis of Transcription Start Sites (PEAT), but often require large amounts of tissue. More recently, nanoCAGE was developed for sequencing on the Illumina GAIIx to overcome these difficulties. RESULTS Here we present the first publicly available adaptation of nanoCAGE for sequencing on recent ultra-high throughput platforms such as Illumina HiSeq-2000, and CapFilter, a computational pipeline that greatly increases confidence in TSS identification. We report excellent gene coverage, reproducibility, and precision in transcription start site discovery for samples from Arabidopsis thaliana roots. CONCLUSION nanoCAGE-XL together with CapFilter allows for genome wide identification of high confidence transcription start sites in large eukaryotic genomes.
|
11
|
TIPR: transcription initiation pattern recognition on a genome scale. Bioinformatics 2015; 31:3725-32. [PMID: 26254489 DOI: 10.1093/bioinformatics/btv464] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/25/2014] [Accepted: 08/03/2015] [Indexed: 12/15/2022] Open
Abstract
MOTIVATION The computational identification of gene transcription start sites (TSSs) can provide insights into the regulation and function of genes without performing expensive experiments, particularly in organisms with incomplete annotations. High-resolution general-purpose TSS prediction remains a challenging problem, with little recent progress on the identification and differentiation of TSSs which are arranged in different spatial patterns along the chromosome. RESULTS In this work, we present the Transcription Initiation Pattern Recognizer (TIPR), a sequence-based machine learning model that identifies TSSs with high accuracy and resolution for multiple spatial distribution patterns along the genome, including broadly distributed TSS patterns that have previously been difficult to characterize. TIPR predicts not only the locations of TSSs but also the expected spatial initiation pattern each TSS will form along the chromosome, a novel capability for TSS prediction algorithms. As spatial initiation patterns are associated with spatiotemporal expression patterns and gene function, this capability has the potential to improve gene annotations and our understanding of the regulation of transcription initiation. The high nucleotide resolution of this model locates TSSs within 10 nucleotides or less on average. AVAILABILITY AND IMPLEMENTATION Model source code is made available online at http://megraw.cgrb.oregonstate.edu/software/TIPR/. CONTACT megrawm@science.oregonstate.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
|
12
|
Alternative splicing in plants: directing traffic at the crossroads of adaptation and environmental stress. CURRENT OPINION IN PLANT BIOLOGY 2015; 24:125-35. [PMID: 25835141 DOI: 10.1016/j.pbi.2015.02.008] [Citation(s) in RCA: 125] [Impact Index Per Article: 13.9] [Received: 02/13/2015] [Revised: 02/19/2015] [Accepted: 02/20/2015] [Indexed: 05/20/2023]
Abstract
In recent years, high-throughput sequencing-based analysis of plant transcriptomes has suggested that up to ∼60% of plant gene loci encode alternatively spliced mature transcripts. These studies have also revealed that alternative splicing in plants can be regulated by cell type, developmental stage, the environment, and the circadian clock. Alternative splicing is coupled to RNA surveillance and processing mechanisms, including nonsense mediated decay. Recently, non-protein-coding transcripts have also been shown to undergo alternative splicing. These discoveries collectively describe a robust system of post-transcriptional regulatory feedback loops which influence RNA abundance. In this review, we summarize recent studies describing the specific roles alternative splicing and RNA surveillance play in plant adaptation to environmental stresses and the regulation of the circadian clock.
|
13
|
Environmental stresses modulate abundance and timing of alternatively spliced circadian transcripts in Arabidopsis. MOLECULAR PLANT 2015; 8:207-27. [PMID: 25680774 DOI: 10.1016/j.molp.2014.10.011] [Citation(s) in RCA: 56] [Impact Index Per Article: 6.2] [Received: 10/10/2014] [Revised: 10/10/2014] [Accepted: 10/20/2014] [Indexed: 05/21/2023]
Abstract
Environmental stresses profoundly altered accumulation of nonsense mRNAs including intron-retaining (IR) transcripts in Arabidopsis. Temporal patterns of stress-induced IR mRNAs were dissected using both oscillating and non-oscillating transcripts. Broad-range thermal cycles triggered a sharp increase in the long IR CCA1 isoforms and altered their phasing to different times of day. Both abiotic and biotic stresses such as drought or Pseudomonas syringae infection induced a similar increase. Thermal stress induced a time delay in accumulation of CCA1 I4Rb transcripts, whereas functional mRNA showed steady oscillations. Our data favor a hypothesis that stress-induced instabilities of the central oscillator can be in part compensated through fluctuations in abundance and out-of-phase oscillations of CCA1 IR transcripts. Taken together, our results support a concept that mRNA abundance can be modulated through altering ratios between functional and nonsense/IR transcripts. SR45 protein specifically bound to the retained CCA1 intron in vitro, suggesting that this splicing factor could be involved in regulation of intron retention. Transcriptomes of nonsense-mediated mRNA decay (NMD)-impaired and heat-stressed plants shared a set of retained introns associated with stress- and defense-inducible transcripts. Constitutive activation of certain stress response networks in an NMD mutant could be linked to disequilibrium between functional and nonsense mRNAs.
|
14
|
The cyclophilin A DIAGEOTROPICA gene affects auxin transport in both root and shoot to control lateral root formation. Development 2015; 142:712-21. [PMID: 25617431 DOI: 10.1242/dev.113225] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Indexed: 12/24/2022]
Abstract
Cyclophilin A is a conserved peptidyl-prolyl cis-trans isomerase (PPIase) best known as the cellular receptor of the immunosuppressant cyclosporine A. Despite significant effort, evidence of developmental functions of cyclophilin A in non-plant systems has remained obscure. Mutations in a tomato (Solanum lycopersicum) cyclophilin A ortholog, DIAGEOTROPICA (DGT), have been shown to abolish the organogenesis of lateral roots; however, a mechanistic explanation of the phenotype is lacking. Here, we show that the dgt mutant lacks auxin maxima relevant to priming and specification of lateral root founder cells. DGT is expressed in shoot and root, and localizes to both the nucleus and cytoplasm during lateral root organogenesis. Mutation of ENTIRE/IAA9, a member of the auxin-responsive Aux/IAA protein family of transcriptional repressors, partially restores the inability of dgt to initiate lateral root primordia but not the primordia outgrowth. By comparison, grafting of a wild-type scion restores the process of lateral root formation, consistent with participation of a mobile signal. Antibodies do not detect movement of the DGT protein into the dgt rootstock; however, experiments with radiolabeled auxin and an auxin-specific microelectrode demonstrate abnormal auxin fluxes. Functional studies of DGT in heterologous yeast and tobacco-leaf auxin-transport systems demonstrate that DGT negatively regulates PIN-FORMED (PIN) auxin efflux transporters by affecting their plasma membrane localization. Studies in tomato support complex effects of the dgt mutation on PIN expression level, expression domain and plasma membrane localization. Our data demonstrate that DGT regulates auxin transport in lateral root formation.
|
15
|
Improved DNase-seq protocol facilitates high resolution mapping of DNase I hypersensitive sites in roots in Arabidopsis thaliana. PLANT METHODS 2015; 11:42. [PMID: 26339280 PMCID: PMC4558764 DOI: 10.1186/s13007-015-0087-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/09/2015] [Accepted: 08/21/2015] [Indexed: 05/19/2023]
Abstract
BACKGROUND Identifying cis-regulatory elements is critical in understanding the direct and indirect regulatory mechanisms of gene expression. Current approaches include DNase-seq, a technique that combines sensitivity to the nonspecific endonuclease DNase I with high throughput sequencing to identify regions of regulatory DNA on a genome-wide scale. While this method was originally developed for human cell lines, later adaptations made the processing of plant tissues possible. Challenges still remain in processing recalcitrant tissues that have low DNA content. RESULTS By removing steps requiring the use of gel agarose plugs in DNase-seq, we were able to significantly reduce the time required to perform the protocol by at least 2 days, while also making possible the processing of difficult plant tissues. We refer to this simplified protocol as DNase I SIM (for simplified in-nucleus method). We were able to successfully create DNase-seq libraries for both leaf and root tissues in Arabidopsis using DNase I SIM. CONCLUSION This protocol simplifies and facilitates generation of DNase-seq libraries from plant tissues for high resolution mapping of DNase I hypersensitive sites.
|
16
|
Environmental Stresses Modulate Abundance and Timing of Alternatively Spliced Circadian Transcripts in Arabidopsis. MOLECULAR PLANT 2014:ssu130. [PMID: 25366180 DOI: 10.1093/mp/ssu130] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 05/08/2023]
Abstract
Environmental stresses profoundly altered accumulation of nonsense mRNAs including intron-retaining (IR) transcripts in Arabidopsis. Temporal patterns of stress-induced IR mRNAs were dissected using both oscillating and non-oscillating transcripts. Broad-range thermal cycles triggered a sharp increase in the long intron-retaining CCA1 isoforms and altered their phasing to different times of day. Both abiotic and biotic stresses such as drought or P. syringae infection induced a similar increase. Thermal stress induced a time delay in accumulation of CCA1 I4Rb transcripts, whereas functional mRNA showed steady oscillations. Our data favor a hypothesis that stress-induced instabilities of the central oscillator can be in part compensated through fluctuations in abundance and out-of-phase oscillations of CCA1 IR transcripts. Altogether, our results support a concept that mRNA abundance can be modulated through altering ratios between functional and nonsense/IR transcripts. SR45 protein specifically bound to the retained CCA1 intron in vitro, suggesting that this splicing factor could be involved in regulation of intron retention. Transcriptomes of NMD-impaired and heat-stressed plants shared a set of retained introns associated with stress- and defense-inducible transcripts. Constitutive activation of certain stress response networks in an NMD mutant could be linked to disequilibrium between functional and nonsense mRNAs.
|
17
|
A comparative study of ripening among berries of the grape cluster reveals an altered transcriptional programme and enhanced ripening rate in delayed berries. JOURNAL OF EXPERIMENTAL BOTANY 2014; 65:5889-902. [PMID: 25135520 PMCID: PMC4203125 DOI: 10.1093/jxb/eru329] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 05/20/2023]
Abstract
Transcriptional studies in relation to fruit ripening generally aim to identify the transcriptional states associated with physiological ripening stages and the transcriptional changes between stages within the ripening programme. In non-climacteric fruits such as grape, not all ripening-related genes involved in this programme have been identified, mainly due to the lack of mutants for comparative transcriptomic studies. A feature in grape cluster ripening (Vitis vinifera cv. Pinot noir), where not all berries initiate ripening at the same time, was exploited to study their shifted ripening programmes in parallel. Berries that showed marked ripening state differences in a véraison-stage cluster (ripening onset) ultimately reached similar ripeness states toward maturity, indicating the flexibility of the ripening programme. The expression variance between these véraison-stage berry classes, where 11% of the genes were found to be differentially expressed, was reduced significantly toward maturity, resulting in the synchronization of their transcriptional states. Defined quantitative expression changes (transcriptional distances) not only existed between the véraison transitional stages, but also between the véraison to maturity stages, regardless of the berry class. It was observed that lagging berries complete their transcriptional programme in a shorter time through altered gene expressions and ripening-related hormone dynamics, and enhance the rate of physiological ripening progression. Finally, the reduction in expression variance of genes can identify new genes directly associated with ripening and also assess the relevance of gene activity to the phase of the ripening programme.
|
18
|
Paired-end analysis of transcription start sites in Arabidopsis reveals plant-specific promoter signatures. THE PLANT CELL 2014; 26:2746-60. [PMID: 25035402 PMCID: PMC4145111 DOI: 10.1105/tpc.114.125617] [Citation(s) in RCA: 83] [Impact Index Per Article: 8.3] [Received: 03/24/2014] [Revised: 06/03/2014] [Accepted: 06/24/2014] [Indexed: 05/19/2023]
Abstract
Understanding plant gene promoter architecture has long been a challenge due to the lack of relevant large-scale data sets and analysis methods. Here, we present a publicly available, large-scale transcription start site (TSS) data set in plants using a high-resolution method for analysis of 5' ends of mRNA transcripts. Our data set is produced using the paired-end analysis of transcription start sites (PEAT) protocol, providing millions of TSS locations from wild-type Columbia-0 Arabidopsis thaliana whole root samples. Using this data set, we grouped TSS reads into "TSS tag clusters" and categorized clusters into three spatial initiation patterns: narrow peak, broad with peak, and weak peak. We then designed a machine learning model that predicts the presence of TSS tag clusters with outstanding sensitivity and specificity for all three initiation patterns. We used this model to analyze the transcription factor binding site content of promoters exhibiting these initiation patterns. In contrast to the canonical notions of TATA-containing and more broad "TATA-less" promoters, the model shows that, in plants, the vast majority of transcription start sites are TATA free and are defined by a large compendium of known DNA sequence binding elements. We present results on the usage of these elements and provide our Plant PEAT Peaks (3PEAT) model that predicts the presence of TSSs directly from sequence.
|
19
|
Sustained-input switches for transcription factors and microRNAs are central building blocks of eukaryotic gene circuits. Genome Biol 2013; 14:R85. [PMID: 23972209 PMCID: PMC4054853 DOI: 10.1186/gb-2013-14-8-r85] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 07/13/2013] [Accepted: 08/23/2013] [Indexed: 12/02/2022] Open
Abstract
WaRSwap is a randomization algorithm that for the first time provides a practical network motif discovery method for large multi-layer networks, for example those that include transcription factors, microRNAs, and non-regulatory protein coding genes. The algorithm is applicable to systems with tens of thousands of genes, while accounting for critical aspects of biological networks, including self-loops, large hubs, and target rearrangements. We validate WaRSwap on a newly inferred regulatory network from Arabidopsis thaliana, and compare outcomes on published Drosophila and human networks. Specifically, sustained input switches are among the few over-represented circuits across this diverse set of eukaryotes.
|
20
|
Abstract
Because proteins are the major functional components of cells, knowledge of their cellular localization is crucial to gaining an understanding of the biology of multicellular organisms. We have generated a protein expression map of the Arabidopsis root providing the identity and cell type-specific localization of nearly 2,000 proteins. Grouping proteins into functional categories revealed unique cellular functions and identified cell type-specific biomarkers. Cellular colocalization provided support for numerous protein-protein interactions. With a binary comparison, we found that RNA and protein expression profiles are weakly correlated. We then performed peak integration at cell type-specific resolution and found an improved correlation with transcriptome data using continuous values. We performed GeLC-MS/MS (in-gel tryptic digestion followed by liquid chromatography-tandem mass spectrometry) proteomic experiments on mutants with ectopic and no root hairs, providing complementary proteomic data. Finally, among our root hair-specific proteins we identified two unique regulators of root hair development.
|
21
|
Abstract
Tightly controlled gene expression is a hallmark of multicellular development and is accomplished by transcription factors (TFs) and microRNAs (miRNAs). Although many studies have focused on identifying downstream targets of these molecules, less is known about the factors that regulate their differential expression. We used data from high spatial resolution gene expression experiments and yeast one-hybrid (Y1H) and two-hybrid (Y2H) assays to delineate a subset of interactions occurring within a gene regulatory network (GRN) that determines tissue-specific TF and miRNA expression in plants. We find that upstream TFs are expressed in more diverse cell types than their targets and that promoters that are bound by a relatively large number of TFs correspond to key developmental regulators. The regulatory consequence of many TFs for their target was experimentally determined using genetic analysis. Remarkably, molecular phenotypes were identified for 65% of the TFs, but morphological phenotypes were associated with only 16%. This indicates that the GRN is robust, and that gene expression changes may be canalized or buffered.
|
22
|
Editing of Epstein-Barr virus-encoded BART6 microRNAs controls their dicer targeting and consequently affects viral latency. J Biol Chem 2010; 285:33358-33370. [PMID: 20716523 DOI: 10.1074/jbc.m110.138362] [Citation(s) in RCA: 173] [Impact Index Per Article: 12.4] [Indexed: 12/13/2022]
Abstract
Certain primary transcripts of miRNA (pri-microRNAs) undergo RNA editing that converts adenosine to inosine. The Epstein-Barr virus (EBV) genome encodes multiple microRNA genes of its own. Here we report that primary transcripts of ebv-miR-BART6 (pri-miR-BART6) are edited in latently EBV-infected cells. Editing of wild-type pri-miR-BART6 RNAs dramatically reduced loading of miR-BART6-5p RNAs onto the microRNA-induced silencing complex. Editing of a mutation-containing pri-miR-BART6 found in Daudi Burkitt lymphoma and nasopharyngeal carcinoma C666-1 cell lines suppressed processing of miR-BART6 RNAs. Most importantly, miR-BART6-5p RNAs silence Dicer through multiple target sites located in the 3'-UTR of Dicer mRNA. The significance of miR-BART6 was further investigated in cells in various stages of latency. We found that miR-BART6-5p RNAs suppress the EBNA2 viral oncogene required for transition from immunologically less responsive type I and type II latency to the more immunoreactive type III latency as well as Zta and Rta viral proteins essential for lytic replication, revealing the regulatory function of miR-BART6 in EBV infection and latency. Mutation and A-to-I editing appear to be adaptive mechanisms that antagonize miR-BART6 activities.
|
23
|
Abstract
In this chapter, we present a brief overview of current knowledge about the promoters of plant microRNAs (miRNAs), and provide a step-by-step guide for predicting plant miRNA promoter elements using known transcription factor binding motifs. The approach to promoter element prediction is based on a carefully constructed collection of Positional Weight Matrices (PWMs) for known transcription factors (TFs) in Arabidopsis. A key concept of the method is to use scoring thresholds for potential binding sites that are appropriate to each individual transcription factor. While the procedure can be applied to search for Transcription Factor Binding Sites (TFBSs) in any pol-II promoter region, it is particularly practical for the case of plant miRNA promoters where upstream sequence regions and binding sites are not readily available in existing databases. The majority of the material described in this chapter is available for download at http://microrna.gr.
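The idea of scanning a promoter with a PWM and a threshold chosen per transcription factor can be sketched as follows. This is an illustrative sketch, not the chapter's procedure: the count matrix, the TF name `TF_X`, the pseudocount, the uniform background, and the way the threshold is expressed (as a fraction of the matrix's score range) are all assumptions.

```python
import math

# Hypothetical count matrix for a 4-bp motif (consensus ATCG); not from the chapter.
COUNTS = {
    "A": [8, 0, 0, 1],
    "C": [0, 0, 9, 1],
    "G": [1, 0, 0, 7],
    "T": [0, 9, 0, 0],
}

# Assumed per-TF thresholds, expressed as a fraction of each matrix's score range.
THRESHOLDS = {"TF_X": 0.9}

def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert counts to per-position log2 odds against a uniform background."""
    length = len(counts["A"])
    pwm = []
    for i in range(length):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm

def scan(sequence, pwm, threshold_fraction):
    """Return (start, score) for every window scoring at or above the cutoff."""
    lowest = sum(min(col.values()) for col in pwm)
    highest = sum(max(col.values()) for col in pwm)
    cutoff = lowest + threshold_fraction * (highest - lowest)
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing ambiguous bases such as N
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= cutoff:
            hits.append((start, score))
    return hits

pwm = log_odds_matrix(COUNTS)
hits = scan("GGATCGTT", pwm, THRESHOLDS["TF_X"])
```

Expressing the cutoff relative to each matrix's own score range is one simple way to keep thresholds comparable across motifs of different length and information content; the chapter may calibrate its thresholds differently.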
|
24
|
Abstract
MicroRNAs are small, non-protein coding RNA molecules known to regulate the expression of genes by binding to the 3′ UTR of mRNAs. MicroRNAs are produced from longer transcripts, which can code for more than one mature miRNA. miRGen 2.0 is a database that aims to provide comprehensive information about the position of human and mouse microRNA coding transcripts and their regulation by transcription factors, including a unique compilation of both predicted and experimentally supported data. Expression profiles of microRNAs in several tissues and cell lines, single nucleotide polymorphism locations, microRNA target prediction on protein coding genes and mapping of miRNA targets of co-regulated miRNAs on biological pathways are also integrated into the database and user interface. The miRGen database will be continuously maintained and freely available at http://www.microrna.gr/mirgen/.
|
25
|
|
26
|
Abstract
The recent arrival of large-scale cap analysis of gene expression (CAGE) data sets in mammals provides a wealth of quantitative information on coding and noncoding RNA polymerase II transcription start sites (TSS). Genome-wide CAGE studies reveal that a large fraction of TSS exhibit peaks where the vast majority of associated tags map to a particular location (approximately 45%), whereas other active regions contain a broader distribution of initiation events. The presence of a strong single peak suggests that transcription at these locations may be mediated by position-specific sequence features. We therefore propose a new model for single-peaked TSS based solely on known transcription factors (TFs) and their respective regions of positional enrichment. This probabilistic model leads to near-perfect classification results in cross-validation (auROC = 0.98), and performance in genomic scans demonstrates that TSS prediction with both high accuracy and spatial resolution is achievable for a specific but large subgroup of mammalian promoters. The interpretable model structure suggests a DNA code in which canonical sequence features such as TATA-box, Initiator, and GC content do play a significant role, but many additional TFs show distinct spatial biases with respect to TSS location and are important contributors to the accurate prediction of single-peak transcription initiation sites. The model structure also reveals that CAGE tag clusters distal from annotated gene starts have distinct characteristics compared to those close to gene 5'-ends. Using this high-resolution single-peak model, we predict TSS for approximately 70% of mammalian microRNAs based on currently available data.
|
27
|
Abstract
Primary transcripts of certain microRNA (miRNA) genes (pri-miRNAs) are subject to RNA editing that converts adenosine to inosine (A→I RNA editing). However, the frequency of the pri-miRNA editing and the fate of edited pri-miRNAs remain largely to be determined. Examination of already known pri-miRNA editing sites indicated that adenosine residues of the UAG triplet sequence might be edited more frequently. In the present study, therefore, we conducted a large-scale survey of human pri-miRNAs containing the UAG triplet sequence. By direct sequencing of RT–PCR products corresponding to pri-miRNAs, we examined 209 pri-miRNAs and identified 43 UAG and also 43 non-UAG editing sites in 47 pri-miRNAs, which were highly edited in human brain. In vitro miRNA processing assay using recombinant Drosha-DGCR8 and Dicer-TRBP (the human immunodeficiency virus transactivating response RNA-binding protein) complexes revealed that a majority of pri-miRNA editing is likely to interfere with the miRNA processing steps. In addition, four new edited miRNAs with altered seed sequences were identified by targeted cloning and sequencing of the miRNAs that would be processed from edited pri-miRNAs. Our studies predict that ∼16% of human pri-miRNAs are subject to A→I editing and, thus, miRNA editing could have a large impact on the miRNA-mediated gene silencing.
|
28
|
Abstract
miRGen is an integrated database of (i) positional relationships between animal miRNAs and genomic annotation sets and (ii) animal miRNA targets according to combinations of widely used target prediction programs. A major goal of the database is the study of the relationship between miRNA genomic organization and miRNA function. This is made possible by three integrated and user-friendly interfaces. The Genomics interface allows the user to explore where whole-genome collections of miRNAs are located with respect to UCSC genome browser annotation sets such as Known Genes, RefSeq Genes, Genscan predicted genes, CpG islands and pseudogenes. These miRNAs are connected through the Targets interface to their experimentally supported target genes from TarBase, as well as computationally predicted target genes from optimized intersections and unions of several widely used mammalian target prediction programs. Finally, the Clusters interface provides predicted miRNA clusters at any given inter-miRNA distance and provides specific functional information on the targets of miRNAs within each cluster. All of these unique features of miRGen are designed to facilitate investigations into miRNA genomic organization, co-transcription and targeting. miRGen can be freely accessed at http://www.diana.pcbi.upenn.edu/miRGen.
|
29
|
A guide through present computational approaches for the identification of mammalian microRNA targets. Nat Methods 2006; 3:881-6. [PMID: 17060911 DOI: 10.1038/nmeth954] [Citation(s) in RCA: 457] [Impact Index Per Article: 25.4] [Indexed: 12/31/2022]
Abstract
Computational microRNA (miRNA) target prediction is a field in flux. Here we present a guide through five widely used mammalian target prediction programs. We include an analysis of the performance of these individual programs and of various combinations of these programs. For this analysis we compiled several benchmark data sets of experimentally supported miRNA-target gene interactions. Based on the results, we provide a discussion on the status of target prediction and also suggest a stepwise approach toward predicting and selecting miRNA targets for experimental testing.
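Comparing individual programs with their combinations against a benchmark can be sketched with plain set operations. The program names, miRNA-gene pairs, and benchmark below are hypothetical, and the two summary numbers are a simplification: because a benchmark of experimentally supported interactions contains only known positives, the second number is the fraction of predictions with support, not a true precision.

```python
# Hypothetical predictions: program name -> set of (miRNA, target gene) pairs.
PREDICTIONS = {
    "program_a": {("miR-1", "G1"), ("miR-1", "G2"), ("miR-2", "G3")},
    "program_b": {("miR-1", "G1"), ("miR-2", "G3"), ("miR-2", "G4")},
}

# Hypothetical benchmark of experimentally supported interactions.
BENCHMARK = {("miR-1", "G1"), ("miR-2", "G4"), ("miR-3", "G5")}

def evaluate(predicted, benchmark):
    """Return (sensitivity, fraction of predictions found in the benchmark)."""
    recovered = len(predicted & benchmark)
    sensitivity = recovered / len(benchmark) if benchmark else 0.0
    supported = recovered / len(predicted) if predicted else 0.0
    return sensitivity, supported

# A union keeps a pair if any program predicts it; an intersection
# keeps it only if every program does.
union = set.union(*PREDICTIONS.values())
intersection = set.intersection(*PREDICTIONS.values())

results = {name: evaluate(pairs, BENCHMARK) for name, pairs in PREDICTIONS.items()}
results["union"] = evaluate(union, BENCHMARK)
results["intersection"] = evaluate(intersection, BENCHMARK)
```

In this toy data the union recovers more benchmark interactions than the intersection at the cost of a longer candidate list, which is the general trade-off when combining predictors.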
|
30
|
Abstract
In this study we present a method of identifying Arabidopsis miRNA promoter elements using known transcription factor binding motifs. We provide a comparative analysis of the representation of these elements in miRNA promoters, protein-coding gene promoters, and random genomic sequences. We report five transcription factor (TF) binding motifs that show evidence of overrepresentation in miRNA promoter regions relative to the promoter regions of protein-coding genes. This investigation is based on the analysis of 800-nucleotide regions upstream of 63 experimentally verified Transcription Start Sites (TSS) for miRNA primary transcripts in Arabidopsis. While the TATA-box binding motif was also previously reported by Xie and colleagues, the transcription factors AtMYC2, ARF, SORLREP3, and LFY are identified for the first time as overrepresented binding motifs in miRNA promoters.
|