1
Tieng FYF, Abdullah-Zawawi MR, Md Shahri NAA, Mohamed-Hussein ZA, Lee LH, Mutalib NSA. A Hitchhiker's guide to RNA-RNA structure and interaction prediction tools. Brief Bioinform 2023; 25:bbad421. [PMID: 38040490 PMCID: PMC10753535 DOI: 10.1093/bib/bbad421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/31/2023] [Revised: 10/16/2023] [Accepted: 10/26/2023] [Indexed: 12/03/2023]
Abstract
RNA biology has risen to prominence after the remarkable discovery of the diverse functions of noncoding RNA (ncRNA). Most untranslated transcripts exert their regulatory functions by forming RNA-RNA complexes via base pairing with complementary sequences in other RNAs. The interplay between RNAs is essential, as it has various functional roles in human cells, including genetic translation, RNA splicing, editing, ribosomal RNA maturation, RNA degradation and the regulation of metabolic pathways/riboswitches. Moreover, the pervasive transcription of the human genome allows for the discovery of novel genomic functions via RNA interactome investigation. The advancement of experimental procedures has resulted in an explosion of documented data, necessitating the development of efficient and precise computational tools and algorithms. This review provides an extensive update on RNA-RNA interaction (RRI) analysis via thermodynamic- and comparative-based RNA secondary structure prediction (RSP) and RNA-RNA interaction prediction (RIP) tools and their general functions. We also highlight the current knowledge of RRIs and the limitations of RNA interactome mapping via experimental data. Then, the gap between RSP and RIP, the importance of RNA homologues, and the relationship between pseudoknots and RNA folding thermodynamics are discussed. It is hoped that these emerging prediction tools will deepen the understanding of RNA-associated interactions in human diseases and hasten treatment processes.
Affiliation(s)
- Francis Yew Fu Tieng
  - UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Nur Alyaa Afifah Md Shahri
  - UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Zeti-Azura Mohamed-Hussein
  - Institute of Systems Biology (INBIOSIS), UKM, Selangor 43600, Malaysia
  - Department of Applied Physics, Faculty of Science and Technology, UKM, Selangor 43600, Malaysia
- Learn-Han Lee
  - Sunway Microbiomics Centre, School of Medical and Life Sciences, Sunway University, Sunway City 47500, Malaysia
  - Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University of Malaysia, Selangor 47500, Malaysia
- Nurul-Syakima Ab Mutalib
  - UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
  - Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University of Malaysia, Selangor 47500, Malaysia
  - Faculty of Health Sciences, UKM, Kuala Lumpur 50300, Malaysia
2
Abdullah-Zawawi MR, Govender N, Karim MB, Altaf-Ul-Amin M, Kanaya S, Mohamed-Hussein ZA. Chemoinformatics-driven classification of Angiosperms using sulfur-containing compounds and machine learning algorithm. Plant Methods 2022; 18:118. [PMID: 36335358 PMCID: PMC9636760 DOI: 10.1186/s13007-022-00951-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/16/2022] [Accepted: 10/14/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND Phytochemicals or secondary metabolites are low molecular weight organic compounds with little function in plant growth and development. Nevertheless, metabolite diversity not only governs the phenetics of an organism but may also inform the evolutionary pattern and adaptation of green plants to the changing environment. Plant chemoinformatics analyzes the chemical system of natural products using computational tools and robust mathematical algorithms. It has been a powerful approach for species-level differentiation and is widely employed for species classifications and reinforcement of previous classifications.
RESULTS This study attempts to classify Angiosperms using plant sulfur-containing compound (SCC) or sulphated compound information. The SCC dataset of 692 plant species was collected from the comprehensive species-metabolite relationship family (KNApSAcK) database. The structural similarity scores of metabolite pairs under all possible combinations (plant species-metabolite) were determined, and metabolite pairs with a Tanimoto coefficient value > 0.85 were selected for clustering using a machine learning algorithm. Metabolite clustering showed an association between the structurally similar metabolite clusters and metabolite content among the plant species. Phylogenetic tree construction of Angiosperms displayed three major clades, of which clade 1 and clade 2 represented the eudicots only, and clade 3 a mixture of both eudicots and monocots. The SCC-based construction of Angiosperm phylogeny is a subset of the existing monocot-dicot classification. The majority of eudicots present in clades 1 and 2 were represented by glucosinolate compounds. These clades with SCC may have been a mixture of ancestral species, whilst the combinatorial presence of monocot-dicot in clade 3 suggests sulphated-chemical structure diversification in the event of adaptation during evolutionary change.
CONCLUSIONS Sulphated chemoinformatics informs classification of Angiosperms via machine learning technique.
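The pair-selection step described in the abstract above (keep metabolite pairs whose Tanimoto coefficient exceeds 0.85) can be illustrated with a minimal sketch. This is not the authors' pipeline: the fingerprint representation (a set of "on" bit positions per compound), the compound names and the helper functions `tanimoto` and `similar_pairs` are all assumptions made for illustration, and the abstract does not say which fingerprint or toolkit was used.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared 'on' bits divided by total distinct 'on' bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similar_pairs(fingerprints: dict, threshold: float = 0.85) -> list:
    """Return (name_a, name_b, score) for every pair scoring above the threshold."""
    pairs = []
    for name_a, name_b in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[name_a], fingerprints[name_b])
        if score > threshold:
            pairs.append((name_a, name_b, score))
    return pairs


# Hypothetical toy fingerprints (sets of 'on' bit positions), not real SCC data.
toy_fingerprints = {
    "compound_A": frozenset(range(0, 20)),
    "compound_B": frozenset(range(1, 20)),   # shares 19 of 20 distinct bits with A
    "compound_C": frozenset(range(15, 40)),  # overlaps A and B in only 5 bits
}

selected = similar_pairs(toy_fingerprints)
print(selected)
```

In this toy example only the A/B pair passes the 0.85 cut-off; the selected pairs would then be handed to whatever clustering method is chosen downstream.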
Affiliation(s)
- Muhammad-Redha Abdullah-Zawawi
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600, UKM Bangi, Malaysia
  - UKM Medical Molecular Biology Institute (UMBI), Jalan Yaacob Latif, Bandar Tun Razak, 56000 Cheras, Kuala Lumpur, Malaysia
- Nisha Govender
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600, UKM Bangi, Malaysia
- Mohammad Bozlul Karim
  - Graduate School Information Science, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Md Altaf-Ul-Amin
  - Graduate School Information Science, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Shigehiko Kanaya
  - Graduate School Information Science, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Zeti-Azura Mohamed-Hussein
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600, UKM Bangi, Malaysia
  - Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600, UKM Bangi, Malaysia
3
Abdullah-Zawawi MR, Govender N, Harun S, Muhammad NAN, Zainal Z, Mohamed-Hussein ZA. Multi-Omics Approaches and Resources for Systems-Level Gene Function Prediction in the Plant Kingdom. Plants (Basel) 2022; 11:2614. [PMID: 36235479 PMCID: PMC9573505 DOI: 10.3390/plants11192614] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/29/2022] [Revised: 09/05/2022] [Accepted: 09/13/2022] [Indexed: 06/16/2023]
Abstract
In higher plants, the complexity of a system and the components within and among species are rapidly dissected by omics technologies. Multi-omics datasets are integrated to infer and enable a comprehensive understanding of the life processes of organisms of interest. Further, growing open-source datasets coupled with the emergence of high-performance computing and development of computational tools for biological sciences have assisted in silico functional prediction of unknown genes, proteins and metabolites, otherwise known as uncharacterized. The systems biology approach includes data collection and filtration, system modelling, experimentation and the establishment of new hypotheses for experimental validation. Informatics technologies add meaningful sense to the output generated by complex bioinformatics algorithms, which are now freely available in a user-friendly graphical user interface. These resources accentuate gene function prediction at a relatively minimal cost and effort. Herein, we present a comprehensive view of relevant approaches available for system-level gene function prediction in the plant kingdom. Together, the most recent applications and sought-after principles for gene mining are discussed to benefit the plant research community. A realistic tabulation of plant genomic resources is included for a less laborious and accurate candidate gene discovery in basic plant research and improvement strategies.
Affiliation(s)
- Muhammad-Redha Abdullah-Zawawi
  - UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia, Kuala Lumpur 56000, Malaysia
  - Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
- Nisha Govender
  - Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
- Sarahani Harun
  - Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
- Nor Azlan Nor Muhammad
  - Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
- Zamri Zainal
  - Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
  - Faculty of Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
- Zeti-Azura Mohamed-Hussein
  - Institute of System Biology (INBIOSIS), Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
  - Faculty of Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
4
Zainal-Abidin RA, Afiqah-Aleng N, Abdullah-Zawawi MR, Harun S, Mohamed-Hussein ZA. Protein–Protein Interaction (PPI) Network of Zebrafish Oestrogen Receptors: A Bioinformatics Workflow. Life (Basel) 2022; 12:650. [PMID: 35629318 PMCID: PMC9143887 DOI: 10.3390/life12050650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Revised: 04/24/2022] [Accepted: 04/25/2022] [Indexed: 12/04/2022]
Abstract
Protein–protein interaction (PPI) is involved in every biological process that occurs within an organism. Understanding PPI is essential for deciphering the cellular behaviours of a particular organism. Experimental data from PPI methods have been used to construct PPI networks. The PPI network has been widely applied in biomedical research to understand the pathobiology of human diseases. It has also been used to understand the plant physiology that relates to crop improvement. However, the application of the PPI network in aquaculture is limited compared to humans and plants. This review aims to demonstrate the workflow and step-by-step instructions for constructing a PPI network using bioinformatics tools and PPI databases that can help to predict potential interactions between proteins. We used zebrafish proteins, the oestrogen receptors (ERs), to build and analyse the PPI network. The workflow thus serves as a guide for future steps in exploring potential mechanisms in the physiology of an organism of interest, ultimately benefiting aquaculture research.
Affiliation(s)
- Nor Afiqah-Aleng
  - Institute of Marine Biotechnology, Universiti Malaysia Terengganu, Kuala Nerus 21030, Malaysia
  - Correspondence: (N.A.-A.); (Z.-A.M.-H.)
- Sarahani Harun
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
- Zeti-Azura Mohamed-Hussein
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
  - Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, Bangi 43600, Malaysia
  - Correspondence: (N.A.-A.); (Z.-A.M.-H.)
5
Abdullah-Zawawi MR, Ahmad-Nizammuddin NF, Govender N, Harun S, Mohd-Assaad N, Mohamed-Hussein ZA. Comparative genome-wide analysis of WRKY, MADS-box and MYB transcription factor families in Arabidopsis and rice. Sci Rep 2021; 11:19678. [PMID: 34608238 PMCID: PMC8490385 DOI: 10.1038/s41598-021-99206-y] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 06/23/2021] [Accepted: 09/21/2021] [Indexed: 01/25/2023]
Abstract
Transcription factors (TFs) form the major class of regulatory genes and play key roles in multiple plant stress responses. In most eukaryotic plants, the WRKY, MADS-box and MYB TF families activate unique cellular-level abiotic and biotic stress-responsive strategies, which are considered key determinants of defense and developmental processes. Arabidopsis and rice are two important representative model systems for dicot and monocot plants, respectively. A comprehensive comparative study of 101 OsWRKY, 34 OsMADS-box and 122 OsMYB genes (rice genome) and 71 AtWRKY, 66 AtMADS-box and 144 AtMYB genes (Arabidopsis genome) showed various relationships among TFs across species. The phylogenetic analysis clustered WRKY, MADS-box and MYB TF family members into 10, 7 and 14 clades, respectively. All clades in the WRKY and MYB TF families and almost half of the clades in the MADS-box TF family are shared between both species. Chromosomal and gene structure analysis showed that the Arabidopsis-rice orthologous TF gene pairs were unevenly localized within their chromosomes, whilst the distribution of exon–intron gene structure and motif conservation indicated plausible functional similarity in both species. The type and distribution patterns of abiotic and biotic stress-responsive cis-regulatory elements in the promoter regions of Arabidopsis and rice WRKY, MADS-box and MYB orthologous gene pairs provide better knowledge of their role as conserved regulators in both species. Co-expression network analysis showed the correlation between WRKY, MADS-box and MYB genes in each independent rice and Arabidopsis network, indicating their role in stress responsiveness and developmental processes.
Affiliation(s)
- Nur-Farhana Ahmad-Nizammuddin
  - Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM, Bangi, Selangor, Malaysia
- Nisha Govender
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600 UKM, Bangi, Selangor, Malaysia
- Sarahani Harun
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600 UKM, Bangi, Selangor, Malaysia
- Norfarhan Mohd-Assaad
  - Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM, Bangi, Selangor, Malaysia
- Zeti-Azura Mohamed-Hussein
  - Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600 UKM, Bangi, Selangor, Malaysia
  - Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 UKM, Bangi, Selangor, Malaysia
6
Harun S, Abdullah-Zawawi MR, Goh HH, Mohamed-Hussein ZA. A Comprehensive Gene Inventory for Glucosinolate Biosynthetic Pathway in Arabidopsis thaliana. J Agric Food Chem 2020; 68:7281-7297. [PMID: 32551569 DOI: 10.1021/acs.jafc.0c01916] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Indexed: 05/13/2023]
Abstract
Glucosinolates (GSLs) are plant secondary metabolites containing sulfur and nitrogen, mainly found in plants of the order Brassicales, such as broccoli, cabbage, and Arabidopsis thaliana. The activated forms of GSL play important roles in fighting against pathogens and have health benefits for humans. The increasing amount of data on A. thaliana generated from various omics technologies can be investigated more deeply in search of new genes or compounds involved in GSL biosynthesis and metabolism. This review describes a comprehensive inventory of A. thaliana GSLs identified from published literature and databases such as KNApSAcK, KEGG, and AraCyc. A total of 113 GSL genes encoding 23 transcription components, 85 enzymes, and five protein transporters were experimentally characterized in the past two decades. Continuous efforts are still ongoing to identify all molecules related to the production of GSLs. A manually curated database known as SuCComBase (http://plant-scc.org) was developed to serve as a comprehensive GSL inventory. Given the lack of information on the regulation of GSL biosynthesis and degradation mechanisms, this review also includes relevant information and their connections with crosstalk among various factors, such as light, sulfur metabolism, and nitrogen metabolism, not only in A. thaliana but also in other crucifers.
Affiliation(s)
- Sarahani Harun
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
- Muhammad-Redha Abdullah-Zawawi
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
- Hoe-Han Goh
  - Centre for Plant Biotechnology, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
- Zeti-Azura Mohamed-Hussein
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
  - Department of Applied Physics, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
7
Harun S, Abdullah-Zawawi MR, A-Rahman MRA, Muhammad NAN, Mohamed-Hussein ZA. SuCComBase: a manually curated repository of plant sulfur-containing compounds. Database (Oxford) 2019; 2019:5353919. [PMID: 30793170 PMCID: PMC6384505 DOI: 10.1093/database/baz021] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/19/2018] [Revised: 01/28/2019] [Accepted: 01/28/2019] [Indexed: 12/30/2022]
Abstract
Plants produce a wide range of secondary metabolites that play important roles in plant defense and immunity, their interaction with the environment and symbiotic associations. Sulfur-containing compounds (SCCs) are a group of important secondary metabolites produced in members of the Brassicales order. SCCs constitute various groups of phytochemicals, but not much is known about them. Findings from previous studies on SCCs were scattered across the published literature, hence SuCComBase was developed to store all molecular information related to the biosynthesis of SCCs. Information on the genes, proteins and compounds involved in the SCC biosynthetic pathway was manually identified from databases and the published scientific literature. Sets of co-expression data were analyzed to search for other possible (previously unknown) genes that might be involved in the biosynthesis of SCCs. These genes were named potential SCC-related encoding genes. A total of 147 known and 92 putative Arabidopsis thaliana SCC-related genes from the literature were used to identify other potential SCC-related encoding genes. We identified 778 potential SCC-related encoding genes, 4026 homologs to the SCC-related encoding genes and 116 SCCs, as shown on the SuCComBase homepage. Data entries are searchable from the Main page, Search, Browse and Datasets tabs. Users can easily download all data stored in SuCComBase. All publications related to SCCs are also indexed in SuCComBase, which is currently the first and only database dedicated to plant SCCs. SuCComBase aims to become a manually curated and au fait knowledge-based repository for plant SCCs.
Affiliation(s)
- Sarahani Harun
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, UKM Bangi, Selangor, Malaysia
- Muhammad-Redha Abdullah-Zawawi
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, UKM Bangi, Selangor, Malaysia
- Mohd Rusman Arief A-Rahman
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, UKM Bangi, Selangor, Malaysia
- Nor Azlan Nor Muhammad
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, UKM Bangi, Selangor, Malaysia
- Zeti-Azura Mohamed-Hussein
  - Centre for Bioinformatics Research, Institute of Systems Biology (INBIOSIS), Universiti Kebangsaan Malaysia, UKM Bangi, Selangor, Malaysia
  - Centre for Frontier Sciences, Faculty of Science and Technology, Universiti Kebangsaan Malaysia, UKM Bangi, Selangor, Malaysia
8
Ashari KS, Abdullah-Zawawi MR, Harun S, Mohamed-Hussein ZA. Reconstruction of the Transcriptional Regulatory Network in Arabidopsis thaliana Aliphatic Glucosinolate Biosynthetic Pathway. Sains Malays 2018. [DOI: 10.17576/jsm-2018-4712-08] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]