1
Yadav VK, Jalmi SK, Tiwari S, Kerkar S. Deciphering shared attributes of plant long non-coding RNAs through a comparative computational approach. Sci Rep 2023; 13:15101. [PMID: 37699996 PMCID: PMC10497521 DOI: 10.1038/s41598-023-42420-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Accepted: 09/10/2023] [Indexed: 09/14/2023] Open
Abstract
Over the past decade, long non-coding RNA (lncRNA), which lacks protein-coding potential, has emerged as an essential regulator of the genome. The present study examined 13,599 lncRNAs in Arabidopsis thaliana, 11,565 in Oryza sativa, and 32,397 in Zea mays for their characteristic features and explored the associated genomic and epigenomic features. We found that lncRNAs were distributed throughout the chromosomes and that, in lncRNA-transcribing regions, the Helitron family of transposable elements (TEs) was enriched while terminal inverted repeat elements were depleted. Our analyses determined that lncRNA-transcribing regions show rare or weak signals for most epigenetic marks except for H3K9me2 and cytosine methylation in all three plant species. LncRNAs showed preferential localization in the nucleus and cytoplasm; however, the ratio between cytoplasmic and nuclear localization varies among the studied plant species. We identified several conserved endogenous target mimic sites in the lncRNAs among the studied plants. We found 233, 301, and 273 unique miRNAs potentially targeting the lncRNAs of A. thaliana, O. sativa, and Z. mays, respectively. Our study revealed that the miRNAs that interact with lncRNAs target genes involved in a diverse array of biological and molecular processes. The miRNA-targeted lncRNAs displayed a strong affinity for several transcription factors, including ERF and BBR-BPC, present in all three plants, suggesting conserved functions. Overall, the present study showed that plant lncRNAs exhibit conserved genomic and epigenomic characteristics and potentially govern the growth and development of plants.
Affiliation(s)
- Vikash Kumar Yadav
  - School of Biological Sciences and Biotechnology, Goa University, Taleigao Plateau, Goa, 403206, India
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Siddhi Kashinath Jalmi
  - School of Biological Sciences and Biotechnology, Goa University, Taleigao Plateau, Goa, 403206, India
- Shalini Tiwari
  - Department of Biochemistry and Molecular Biology, Oklahoma State University, Stillwater, 74078, OK, USA
- Savita Kerkar
  - School of Biological Sciences and Biotechnology, Goa University, Taleigao Plateau, Goa, 403206, India
2
Matoulek D, Ježek B, Vohnoutová M, Symonová R. Advances in Vertebrate (Cyto)Genomics Shed New Light on Fish Compositional Genome Evolution. Genes (Basel) 2023; 14:genes14020244. [PMID: 36833171 PMCID: PMC9956151 DOI: 10.3390/genes14020244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 01/05/2023] [Indexed: 01/19/2023] Open
Abstract
Cytogenetic and compositional studies considered fish genomes rather poor in guanine-cytosine content (GC%) because of a putative "sharp increase in genic GC% during the evolution of higher vertebrates". However, the available genomic data have not been exploited to confirm this viewpoint. In contrast, further misunderstandings in GC%, mostly of fish genomes, originated from a misapprehension of the current flood of data. Utilizing public databases, we calculated the GC% in animal genomes of three different, technically well-established fractions: DNA (entire genome), cDNA (complementary DNA), and cds (exons). Our results across chordates help set borders of GC% values that are still incorrect in literature and show: (i) fish in their immense diversity possess comparably GC-rich (or even GC-richer) genomes as higher vertebrates, and fish exons are GC-enriched among vertebrates; (ii) animal genomes generally show a GC-enrichment from the DNA, over cDNA, to the cds level (i.e., not only the higher vertebrates); (iii) fish and invertebrates show a broad(er) inter-quartile range in GC%, while avian and mammalian genomes are more constrained in their GC%. These results indicate no sharp increase in the GC% of genes during the transition to higher vertebrates, as stated and numerously repeated before. We present our results in 2D and 3D space to explore the compositional genome landscape and prepared an online platform to explore the AT/GC compositional genome evolution.
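The GC% measure that the abstract compares across the DNA, cDNA, and cds fractions can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `gc_percent`, the choice to ignore ambiguous bases such as N, and the toy sequences are all assumptions made for illustration.

```python
def gc_percent(seq: str) -> float:
    """Return GC content as a percentage of the unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N are excluded from the denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


# Hypothetical toy sequences standing in for the three fractions;
# real input would be whole-genome, cDNA, and cds FASTA records.
fractions = {
    "DNA": "ATATGCGCATNNATTA",
    "cDNA": "ATGGCGCATTGA",
    "cds": "ATGGCGCCGTGA",
}

for name, seq in fractions.items():
    print(f"{name}: {gc_percent(seq):.1f}% GC")
```

Applied per species to each of the three fractions, a function of this kind would yield the three values per genome that the abstract describes plotting in 2D and 3D space.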
Affiliation(s)
- Dominik Matoulek
  - Department of Physics, Faculty of Science, University of Hradec Králové, 500 03 Hradec Králové, Czech Republic
- Bruno Ježek
  - Faculty of Informatics and Management, University of Hradec Králové, Rokitanského 62, 500 02 Hradec Králové, Czech Republic
- Marta Vohnoutová
  - Department of Computer Science, Faculty of Science, University of South Bohemia, Branišovská 1760, 370 05 České Budějovice, Czech Republic
- Radka Symonová
  - Department of Computer Science, Faculty of Science, University of South Bohemia, Branišovská 1760, 370 05 České Budějovice, Czech Republic
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany
  - Institute of Hydrobiology, Biology Centre of the Czech Academy of Sciences, 370 05 České Budějovice, Czech Republic
3
Bacolla A, Temiz NA, Yi M, Ivanic J, Cer RZ, Donohue DE, Ball EV, Mudunuri US, Wang G, Jain A, Volfovsky N, Luke BT, Stephens RM, Cooper DN, Collins JR, Vasquez KM. Guanine holes are prominent targets for mutation in cancer and inherited disease. PLoS Genet 2013; 9:e1003816. [PMID: 24086153 PMCID: PMC3784513 DOI: 10.1371/journal.pgen.1003816] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 03/04/2013] [Accepted: 08/07/2013] [Indexed: 12/27/2022] Open
Abstract
Single base substitutions constitute the most frequent type of human gene mutation and are a leading cause of cancer and inherited disease. These alterations occur non-randomly in DNA, being strongly influenced by the local nucleotide sequence context. However, the molecular mechanisms underlying such sequence context-dependent mutagenesis are not fully understood. Using bioinformatics, computational and molecular modeling analyses, we have determined the frequencies of mutation at G•C bp in the context of all 64 5'-NGNN-3' motifs that contain the mutation at the second position. Twenty-four datasets were employed, comprising >530,000 somatic single base substitutions from 21 cancer genomes, >77,000 germline single-base substitutions causing or associated with human inherited disease and 16.7 million benign germline single-nucleotide variants. In several cancer types, the number of mutated motifs correlated both with the free energies of base stacking and the energies required for abstracting an electron from the target guanines (ionization potentials). Similar correlations were also evident for the pathological missense and nonsense germline mutations, but only when the target guanines were located on the non-transcribed DNA strand. Likewise, pathogenic splicing mutations predominantly affected positions in which a purine was located on the non-transcribed DNA strand. Novel candidate driver mutations and tissue-specific mutational patterns were also identified in the cancer datasets. We conclude that electron transfer reactions within the DNA molecule contribute to sequence context-dependent mutagenesis, involving both somatic driver and passenger mutations in cancer, as well as germline alterations causing or associated with inherited disease.
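To make the motif bookkeeping in the abstract concrete, the following Python sketch enumerates the 64 5'-NGNN-3' motifs and tallies mutated G•C positions by motif. It is an illustrative assumption, not the authors' code: the helper names (`motif_at`, `count_mutated_motifs`) are hypothetical, and the convention of reading a mutated C as a G on the opposite strand (via the reverse complement) is one plausible way to fold both strands into a single G-centred motif set.

```python
from collections import Counter
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# All 64 motifs of the form 5'-NGNN-3' (G fixed at the second position).
NGNN_MOTIFS = ["".join((a, "G", c, d)) for a, c, d in product("ACGT", repeat=3)]


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def motif_at(seq: str, pos: int):
    """Return the 5'-NGNN-3' motif whose G is the G.C pair at 0-based `pos`.

    Returns None if the base is A/T, the window runs off the sequence,
    or the window contains an ambiguous base.
    """
    seq = seq.upper()
    base = seq[pos]
    if base == "G" and pos >= 1:
        window = seq[pos - 1:pos + 3]
    elif base == "C" and pos >= 2:
        # Assumed convention: the G sits on the opposite strand, so read
        # that strand 5'->3' by reverse-complementing the window.
        window = reverse_complement(seq[pos - 2:pos + 2])
    else:
        return None
    if len(window) == 4 and set(window) <= set("ACGT"):
        return window
    return None


def count_mutated_motifs(seq: str, mutated_positions) -> Counter:
    """Tally how many mutated G.C positions fall in each NGNN motif."""
    counts = Counter({m: 0 for m in NGNN_MOTIFS})
    for pos in mutated_positions:
        motif = motif_at(seq, pos)
        if motif is not None:
            counts[motif] += 1
    return counts
```

This sketch only produces raw counts. Turning them into mutation frequencies would additionally require normalizing by how often each motif occurs in the sequence surveyed, a step not shown here.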
Affiliation(s)
- Albino Bacolla
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas, United States of America
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Nuri A. Temiz
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Ming Yi
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Joseph Ivanic
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Regina Z. Cer
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Duncan E. Donohue
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Edward V. Ball
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
- Uma S. Mudunuri
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Guliang Wang
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas, United States of America
- Aklank Jain
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas, United States of America
- Natalia Volfovsky
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Brian T. Luke
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Robert M. Stephens
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- David N. Cooper
  - Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom
- Jack R. Collins
  - Advanced Biomedical Computing Center, SAIC-Frederick, Inc., Frederick National Laboratory for Cancer Research, Frederick, Maryland, United States of America
- Karen M. Vasquez
  - Division of Pharmacology and Toxicology, The University of Texas at Austin, Dell Pediatric Research Institute, Austin, Texas, United States of America
4
Jacobs E, Mills JD, Janitz M. The role of RNA structure in posttranscriptional regulation of gene expression. J Genet Genomics 2012; 39:535-43. [PMID: 23089363 DOI: 10.1016/j.jgg.2012.08.002] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 06/27/2012] [Revised: 08/16/2012] [Accepted: 08/17/2012] [Indexed: 01/18/2023]
Abstract
As more information is gathered on the mechanisms of transcription and translation, it is becoming apparent that these processes are highly regulated. The formation of mRNA secondary and tertiary structures is one such regulatory process that, until recently, had not been analysed in depth. Formation of these mRNA structures has the potential to enhance and inhibit alternative splicing of transcripts, and to regulate the rate and amount of translation. As this regulatory mechanism potentially acts at both the transcriptional and translational level, while also potentially utilising the vast array of non-coding RNAs, it warrants further investigation. Currently, a variety of high-throughput sequencing techniques, including parallel analysis of RNA structure (PARS), fragmentation sequencing (FragSeq) and selective 2'-hydroxyl acylation analysed by primer extension (SHAPE), lead the way in the genome-wide identification and analysis of mRNA structure formation. These new sequencing techniques highlight the diversity and complexity of the transcriptome, and demonstrate another regulatory mechanism that could become a target for new therapeutic approaches.
Affiliation(s)
- Elina Jacobs
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney NSW 2052, Australia