1
Castanho EN, Aidos H, Madeira SC. Biclustering data analysis: a comprehensive survey. Brief Bioinform 2024; 25:bbae342. [PMID: 39007596 PMCID: PMC11247412 DOI: 10.1093/bib/bbae342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 05/16/2024] [Accepted: 07/01/2024] [Indexed: 07/16/2024] Open
Abstract
Biclustering, the simultaneous clustering of rows and columns of a data matrix, has proved its effectiveness in bioinformatics due to its capacity to produce local instead of global models, evolving from a key technique used in gene expression data analysis into one of the most used approaches for pattern discovery and identification of biological modules, used in both descriptive and predictive learning tasks. This survey presents a comprehensive overview of biclustering. It proposes an updated taxonomy for its fundamental components (bicluster, biclustering solution, biclustering algorithms, and evaluation measures) and applications. We unify scattered concepts in the literature with new definitions to accommodate the diversity of data types (such as tabular, network, and time series data) and the specificities of biological and biomedical data domains. We further propose a pipeline for biclustering data analysis and discuss practical aspects of incorporating biclustering in real-world applications. We highlight prominent application domains, particularly in bioinformatics, and identify typical biclusters to illustrate the analysis output. Moreover, we discuss important aspects to consider when choosing, applying, and evaluating a biclustering algorithm. We also relate biclustering with other data mining tasks (clustering, pattern mining, classification, triclustering, N-way clustering, and graph mining). Thus, it provides theoretical and practical guidance on biclustering data analysis, demonstrating its potential to uncover actionable insights from complex datasets.
Affiliation(s)
- Eduardo N Castanho
- LASIGE, Faculdade de Ciências, Universidade de Lisboa, Campo Grande 16, P-1749-016 Lisbon, Portugal
- Helena Aidos
- LASIGE, Faculdade de Ciências, Universidade de Lisboa, Campo Grande 16, P-1749-016 Lisbon, Portugal
- Sara C Madeira
- LASIGE, Faculdade de Ciências, Universidade de Lisboa, Campo Grande 16, P-1749-016 Lisbon, Portugal
2
Zhang H, Jiao J, Zhao T, Zhao E, Li L, Li G, Zhang B, Qin QM. GERWR: Identifying the Key Pathogenicity-Associated sRNAs of Magnaporthe Oryzae Infection in Rice Based on Graph Embedding and Random Walk With Restart. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:227-239. [PMID: 38153818 DOI: 10.1109/tcbb.2023.3348080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/30/2023]
Abstract
Rice blast, caused by Magnaporthe oryzae (M. oryzae), is a destructive rice disease that reduces rice yield by 10% to 30% annually. It also affects other cereal crops such as barley, wheat, rye, millet, sorghum, and maize. Small RNAs (sRNAs) play an essential regulatory role in fungus-plant interaction during fungal invasion, but studies that use multi-omics data integration to examine pathogenic sRNAs during the fungal invasion of plants are rare. This paper proposes a novel approach called Graph Embedding combined with Random Walk with Restart (GERWR) to identify pathogenic sRNAs based on multi-omics data integration during M. oryzae invasion. By constructing a multi-omics network (MRMO), we identified 29 pathogenic sRNAs of rice blast fungus. Further analysis revealed that these sRNAs regulate rice genes in a many-to-many relationship, playing a significant regulatory role in the pathogenesis of rice blast disease. This paper explores the pathogenic factors of rice blast disease from the perspective of multi-omics data analysis, revealing the inherent connection between pathogenic factors of different omics. It has essential scientific significance for studying the pathogenic mechanism of rice blast fungus, the rice blast fungus-rice model system, and the pathogen-host interaction in related fields.
3
Zhao E, Dong L, Zhao H, Zhang H, Zhang T, Yuan S, Jiao J, Chen K, Sheng J, Yang H, Wang P, Li G, Qin Q. A Relationship Prediction Method for Magnaporthe oryzae-Rice Multi-Omics Data Based on WGCNA and Graph Autoencoder. J Fungi (Basel) 2023; 9:1007. [PMID: 37888263 PMCID: PMC10607591 DOI: 10.3390/jof9101007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 10/02/2023] [Accepted: 10/07/2023] [Indexed: 10/28/2023] Open
Abstract
Magnaporthe oryzae Oryzae (MoO) pathotype is a devastating fungal pathogen of rice; however, its pathogenic mechanism remains poorly understood. The current research is primarily focused on single-omics data, which is insufficient to capture the complex cross-kingdom regulatory interactions between MoO and rice. To address this limitation, we proposed a novel method called Weighted Gene Autoencoder Multi-Omics Relationship Prediction (WGAEMRP), which combines weighted gene co-expression network analysis (WGCNA) and graph autoencoder to predict the relationship between MoO-rice multi-omics data. We applied WGAEMRP to construct a MoO-rice multi-omics heterogeneous interaction network, which identified 18 MoO small RNAs (sRNAs), 17 rice genes, 26 rice mRNAs, and 28 rice proteins among the key biomolecules. Most of the mined functional modules and enriched pathways were related to gene expression, protein composition, transportation, and metabolic processes, reflecting the infection mechanism of MoO. Compared to previous studies, WGAEMRP significantly improves the efficiency and accuracy of multi-omics data integration and analysis. This approach lays out a solid data foundation for studying the biological process of MoO infecting rice, refining the regulatory network of pathogenic markers, and providing new insights for developing disease-resistant rice varieties.
Affiliation(s)
- Enshuang Zhao
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Liyan Dong
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
- Hengyi Zhao
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Hao Zhang
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
- College of Software, Jilin University, Changchun 130012, China
- Tianyue Zhang
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Shuai Yuan
- College of Software, Jilin University, Changchun 130012, China
- Jiao Jiao
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Kang Chen
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Jianhua Sheng
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Hongbo Yang
- College of Software, Jilin University, Changchun 130012, China
- Pengyu Wang
- College of Software, Jilin University, Changchun 130012, China
- Guihua Li
- College of Plant Science, Key Laboratory of Zoonosis Research, Ministry of Education, Jilin University, Changchun 130012, China
- Qingming Qin
- Department of Molecular Microbiology and Immunology, School of Medicine, University of Missouri, Columbia, MO 65211-7310, USA
4
Chi J, Zhang H, Zhang T, Zhao E, Zhao T, Zhao H, Yuan S. Exploring the Common Mechanism of Fungal sRNA Transboundary Regulation of Plants Based on Ensemble Learning Methods. Front Genet 2022; 13:816478. [PMID: 35222537 PMCID: PMC8873571 DOI: 10.3389/fgene.2022.816478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2021] [Accepted: 01/17/2022] [Indexed: 11/13/2022] Open
Abstract
Studies have found that pathogenic fungi and plants have sRNA transboundary regulation mechanisms. However, no researchers have used computational methods to study comprehensively whether pathogenic fungi share common features in their transboundary regulation of plants. To this end, high-throughput non-coding sRNA data were obtained for three types of fungi and for plants infected by these fungi for 72 h. These include Magnaporthe, Magnaporthe oryzae infecting Oryza sativa, Botrytis cinerea, Botrytis cinerea infecting Solanum lycopersicum, Phytophthora infestans, and Phytophthora infestans infecting Solanum tuberosum. These data were analyzed to explore the commonality of fungal sRNA transboundary regulation of plants. First, using statistical analysis of the big data, sRNAs whose expression levels increased significantly after infection were identified as the key pathogenic sRNAs: 355 sRNA species from Magnaporthe oryzae, 399 from Botrytis cinerea, and 426 from Phytophthora infestans. Second, target prediction was performed on the key sRNAs of the three fungi, and 96, 197, and 112 core nodes were screened out, respectively. Functional enrichment analysis yielded multiple GO terms and KEGG pathways, several of which were identical across the fungi and participate in plant gene expression regulation, metabolism, and other life processes, thereby affecting plant growth, development, reproduction, and response to the external environment. Finally, the characteristics of key pathogenic sRNAs and some non-pathogenic sRNAs were mined and extracted. Five ensemble learning algorithms (Gradient Boosting Decision Tree, Random Forest, AdaBoost, XGBoost, and Light Gradient Boosting Machine) were used to construct binary classification prediction models on the data set. Five indicators (accuracy, recall, precision, F1 score, and AUC) were used to compare the models trained with their best parameters, and each model performed well. Among the five models, XGBoost performed very well, with validation-set AUCs of 0.86, 0.93, and 0.90. Therefore, this model has reference value for predicting the key sRNAs of other fungi that are involved in transboundary regulation of plants.
Affiliation(s)
- Junxia Chi
- College of Software, Jilin University, Changchun, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Hao Zhang
- College of Software, Jilin University, Changchun, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- College of Computer Science and Technology, Jilin University, Changchun, China
- *Correspondence: Hao Zhang,
- Tianyue Zhang
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- College of Computer Science and Technology, Jilin University, Changchun, China
- Enshuang Zhao
- College of Software, Jilin University, Changchun, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- Tianheng Zhao
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- College of Computer Science and Technology, Jilin University, Changchun, China
- Hengyi Zhao
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
- College of Computer Science and Technology, Jilin University, Changchun, China
- Shuai Yuan
- College of Software, Jilin University, Changchun, China
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
5
Chen YJ, Ma KY, Du SS, Zhang ZJ, Wu TL, Sun Y, Liu YQ, Yin XD, Zhou R, Yan YF, Wang RX, He YH, Chu QR, Tang C. Antifungal Exploration of Quinoline Derivatives against Phytopathogenic Fungi Inspired by Quinine Alkaloids. J Agric Food Chem 2021; 69:12156-12170. [PMID: 34623798 DOI: 10.1021/acs.jafc.1c05677] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/13/2023]
Abstract
Inspired by our previous work on the structural simplification of quinine and the innovative application of natural products against phytopathogenic fungi, the lead structure 2,8-bis(trifluoromethyl)-4-quinolinol (3) was selected as a candidate, and its diversified design, synthesis, and antifungal evaluation were carried out. All of the synthesized compounds Aa1-Db1 were evaluated for their antifungal activity against four agriculturally important fungi: Botrytis cinerea, Fusarium graminearum, Rhizoctonia solani, and Sclerotinia sclerotiorum. Results showed that compounds Ac3, Ac4, Ac7, Ac9, Ac12, Bb1, Bb10, Bb11, Bb13, Cb1, and Cb3 exhibited a good antifungal effect. In particular, Ac12 had the most potent activity, with EC50 values of 0.52 and 0.50 μg/mL against S. sclerotiorum and B. cinerea, respectively, making it more potent than the lead compound 3 (1.72 and 1.89 μg/mL) and the commercial fungicides azoxystrobin (both >30 μg/mL) and 8-hydroxyquinoline (2.12 and 5.28 μg/mL). Moreover, compound Ac12 displayed excellent in vivo antifungal activity, comparable to that of the commercial fungicide boscalid. Preliminary mechanism studies revealed that compound Ac12 might cause an abnormal morphology of cell membranes, an increase in membrane permeability, and release of cellular contents. These results indicated that compound Ac12 displayed superior in vitro and in vivo fungicidal activities and could be a potential fungicidal candidate against plant fungal diseases.
Affiliation(s)
- Yong-Jia Chen
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Kun-Yuan Ma
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Sha-Sha Du
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Zhi-Jun Zhang
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Tian-Lin Wu
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Yu Sun
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Ying-Qian Liu
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Xiao-Dan Yin
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Rui Zhou
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Yin-Fang Yan
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Ren-Xuan Wang
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Ying-Hui He
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Qing-Ru Chu
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
- Chen Tang
- School of Pharmacy, Lanzhou University, Lanzhou 730000, People's Republic of China
6
Wu Y, Zhu D, Wang X, Zhang S. An ensemble learning framework for potential miRNA-disease association prediction with positive-unlabeled data. Comput Biol Chem 2021; 95:107566. [PMID: 34534906 DOI: 10.1016/j.compbiolchem.2021.107566] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/08/2021] [Revised: 08/13/2021] [Accepted: 08/18/2021] [Indexed: 11/17/2022]
Abstract
To explore the pathogenic mechanisms of microRNA (miRNA) in diverse diseases, many researchers have concentrated on discovering potential associations between miRNAs and diseases using machine learning methods. However, the prediction accuracy of supervised machine learning methods is limited by the lack of experimentally validated uncorrelated miRNA-disease pairs. Without these negative samples, training a highly accurate model is much more difficult. Unlike traditional miRNA-disease prediction models, which use randomly selected unknown samples as negative training samples, we propose an ensemble learning framework to solve this positive-unlabeled (PU) learning problem. The framework incorporates two steps: a novel semi-supervised Kmeans (SS-Kmeans) to extract reliable negative samples from unknown miRNA-disease pairs, and a subagging method to generate diverse training sample sets that make full use of those reliable negative samples for ensemble learning. Combined with an effective random vector functional link (RVFL) network as the prediction model, the proposed framework showed superior prediction accuracy compared with other popular approaches. A case study on lung and gastric neoplasms further confirms the framework's efficacy at identifying miRNA-disease associations.
Affiliation(s)
- Yao Wu
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Donghua Zhu
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Xuefeng Wang
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Shuo Zhang
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
7
Computational biology and chemistry Special section editorial: Computational analyses for miRNA. Comput Biol Chem 2021; 91:107448. [PMID: 33579616 DOI: 10.1016/j.compbiolchem.2021.107448] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/22/2022]