1
Li G, Li Y, Liang C, Luo J. DeepWalk-aware graph attention networks with CNN for circRNA-drug sensitivity association identification. Brief Funct Genomics 2024; 23:418-428. [PMID: 38061910 DOI: 10.1093/bfgp/elad053] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/21/2023] [Revised: 09/26/2023] [Accepted: 11/20/2023] [Indexed: 07/22/2024] Open
Abstract
Circular RNAs (circRNAs) are a class of noncoding RNA molecules that are widely found in cells. Recent studies have revealed the significant role played by circRNAs in human health and disease treatment. Identifying potential circRNA-drug sensitivity associations through biological experiments faces several restrictions: it is time-consuming, expensive, and inefficient. Consequently, a novel computational method that improves both the efficiency and the accuracy of predicting associations between circRNAs and drug sensitivities is urgently needed. Here, we present DGATCCDA, a deep learning-based computational method for circRNA-drug sensitivity association identification. In DGATCCDA, we first construct multimodal networks from the original feature information of circRNAs and drugs. We then adopt DeepWalk-aware graph attention networks to extract feature information from the multimodal networks and obtain embedding representations of the nodes. Specifically, we combine DeepWalk and a graph attention network to form DeepWalk-aware graph attention networks, which can effectively capture both the global and the local information of graph structures. The features extracted from the multimodal networks are fused by layer attention, and the inner product is then used to construct the association matrix of circRNAs and drugs for prediction. Experimental results under 5-fold cross-validation show that the average area under the receiver operating characteristic curve of DGATCCDA reaches 91.18%, which is better than those of five current state-of-the-art computational methods. We further conduct a case study, and its results also show that DGATCCDA is an effective computational method for exploring latent circRNA-drug sensitivity associations.
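The final step described in this abstract, building an association matrix from node embeddings by inner product, can be illustrated with a minimal sketch. This is not the authors' implementation: the embedding values, the function names, and the use of a sigmoid to map scores into (0, 1) are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw inner-product score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def inner_product_scores(circ_emb, drug_emb):
    """Score every (circRNA, drug) pair as sigmoid(<c_i, d_j>).

    circ_emb and drug_emb are lists of equal-length embedding vectors.
    Both are hypothetical inputs; in the paper they would come from the
    learned network, which is not reproduced here.
    """
    dim = len(circ_emb[0])
    if any(len(v) != dim for v in circ_emb + drug_emb):
        raise ValueError("all embeddings must share one dimension")
    return [
        [sigmoid(sum(a * b for a, b in zip(c, d))) for d in drug_emb]
        for c in circ_emb
    ]


# Toy embeddings (assumed values): 2 circRNAs, 3 drugs, dimension 2.
circ_emb = [[0.9, 0.1], [-0.4, 0.8]]
drug_emb = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
scores = inner_product_scores(circ_emb, drug_emb)
```

Each row of `scores` corresponds to one circRNA and each column to one drug; pairs whose embeddings point in similar directions receive scores closer to 1.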
Affiliation(s)
- Guanghui Li
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Youjun Li
  - School of Information Engineering, East China Jiaotong University, Nanchang, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
2
Salido-Guadarrama I, Romero-Cordoba SL, Rueda-Zarazua B. Multi-Omics Mining of lncRNAs with Biological and Clinical Relevance in Cancer. Int J Mol Sci 2023; 24:16600. [PMID: 38068923 PMCID: PMC10706612 DOI: 10.3390/ijms242316600] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 11/14/2023] [Accepted: 11/15/2023] [Indexed: 12/18/2023] Open
Abstract
In this review, we provide a general overview of the current panorama of mining strategies for multi-omics data to investigate lncRNAs with an actual or potential role as biological markers in cancer. Several multi-omics studies focusing on lncRNAs have been performed in the past with varying scopes. Nevertheless, many questions remain regarding the pragmatic application of different molecular technologies and bioinformatics algorithms for mining multi-omics data. Here, we attempt to address some of the less discussed aspects of the practical applications using different study designs for incorporating bioinformatics and statistical analyses of multi-omics data. Finally, we discuss the potential improvements and new paradigms aimed at unraveling the role and utility of lncRNAs in cancer and their potential use as molecular markers for cancer diagnosis and outcome prediction.
Affiliation(s)
- Ivan Salido-Guadarrama
  - Departamento de Bioinformática y Análisis Estadísticos, Instituto Nacional de Perinatología Isidro Espinosa de los Reyes, Mexico City 11000, Mexico
- Sandra L. Romero-Cordoba
  - Departamento de Medicina Genómica y Toxicología Ambiental, Instituto de Investigaciones Biomédicas, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
  - Biochemistry Department, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City 14080, Mexico
- Bertha Rueda-Zarazua
  - Posgrado en Ciencias Biológicas, Facultad de Medicina, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
3
Cui S, Gao Y, Huang Y, Shen L, Zhao Q, Pan Y, Zhuang S. Advances and applications of machine learning and deep learning in environmental ecology and health. Environmental Pollution (Barking, Essex: 1987) 2023; 335:122358. [PMID: 37567408 DOI: 10.1016/j.envpol.2023.122358] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/11/2023] [Revised: 08/02/2023] [Accepted: 08/08/2023] [Indexed: 08/13/2023]
Abstract
Machine learning (ML) and deep learning (DL) offer clear advantages in data analysis (e.g., feature extraction, clustering, classification, regression, image recognition, and prediction) and in risk assessment and management in environmental ecology and health (EEH). Given the rapid growth and increasing complexity of data in EEH, it is important to summarize recent advances and applications of ML and DL in EEH. This review summarizes the basic processes and fundamental algorithms of ML and DL modeling and indicates the urgent need for ML and DL in EEH. Recent research hotspots, such as environmental ecology and restoration, the environmental fate of new pollutants, chemical exposures and risks, and chemical hazard identification and control, are highlighted. Various applications of ML and DL in EEH demonstrate their versatility and technological revolution, while also presenting some challenges. Perspectives on ML and DL in EEH are further outlined to promote innovative analysis and the cultivation of an ML-driven research paradigm.
Affiliation(s)
- Shixuan Cui
  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China; Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, 310006, China
- Yuchen Gao
  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Yizhou Huang
  - Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, 310006, China
- Lilai Shen
  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Qiming Zhao
  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Yaru Pan
  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China
- Shulin Zhuang
  - Key Laboratory of Environment Remediation and Ecological Health, Ministry of Education, College of Environmental and Resource Sciences, Zhejiang University, Hangzhou, 310058, China; Women's Hospital, School of Medicine, Zhejiang University, Hangzhou, 310006, China
4
Helleckes LM, Hemmerich J, Wiechert W, von Lieres E, Grünberger A. Machine learning in bioprocess development: from promise to practice. Trends Biotechnol 2023; 41:817-835. [PMID: 36456404 DOI: 10.1016/j.tibtech.2022.10.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 08/30/2022] [Revised: 10/20/2022] [Accepted: 10/27/2022] [Indexed: 11/30/2022]
Abstract
Fostered by novel analytical techniques, digitalization, and automation, modern bioprocess development provides large amounts of heterogeneous experimental data containing valuable process information. In this context, data-driven methods such as machine learning (ML) approaches have great potential to rationally explore large design spaces while exploiting experimental facilities most efficiently. Herein we demonstrate how ML methods have been applied so far in bioprocess development, especially in strain engineering and selection, bioprocess optimization, scale-up, monitoring, and control of bioprocesses. For each topic, we highlight successful application cases and current challenges, and point out domains that can potentially benefit from technology transfer and further progress in the field of ML.
Affiliation(s)
- Laura M Helleckes
  - Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany; RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
- Johannes Hemmerich
  - Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany
- Wolfgang Wiechert
  - Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany; RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
- Eric von Lieres
  - Institute for Bio- and Geosciences (IBG-1), Forschungszentrum Jülich GmbH, 52428 Jülich, Germany; RWTH Aachen University, Templergraben 55, 52062 Aachen, Germany
- Alexander Grünberger
  - Multiscale Bioengineering, Technical Faculty, Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany; Center for Biotechnology (CeBiTec), Bielefeld University, Universitätsstr. 25, 33615 Bielefeld, Germany; Institute of Process Engineering in Life Sciences, Section III: Microsystems in Bioprocess Engineering, Karlsruhe Institute of Technology, Fritz-Haber-Weg 2, 76131 Karlsruhe, Germany
5
Feng H, Jin D, Li J, Li Y, Zou Q, Liu T. Matrix reconstruction with reliable neighbors for predicting potential MiRNA-disease associations. Brief Bioinform 2023; 24:6960615. [PMID: 36567252 DOI: 10.1093/bib/bbac571] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/12/2022] [Revised: 10/16/2022] [Accepted: 11/23/2022] [Indexed: 12/27/2022] Open
Abstract
Numerous experimental studies have indicated that alteration and dysregulation in microRNAs (miRNAs) are associated with serious diseases. Identifying disease-related miRNAs is therefore an essential and challenging task in bioinformatics research. Computational methods are an efficient and economical alternative to conventional biomedical studies and can reveal underlying miRNA-disease associations for subsequent experimental confirmation with reasonable confidence. Despite the success of existing computational approaches, most of them rely only on the known miRNA-disease associations to predict associations, without adding other data to increase the prediction accuracy, and they are affected by data sparsity. In this paper, we present MRRN, a model that combines matrix reconstruction with node reliability to predict probable miRNA-disease associations. In MRRN, the most reliable neighbors of each miRNA and disease are used to update the original miRNA-disease association matrix, which significantly reduces data sparsity. Unknown miRNA-disease associations are reconstructed by aggregating the most reliable first-order neighbors to increase prediction accuracy by representing the local and global structure of the heterogeneous network. Five-fold cross-validation of MRRN produced an area under the curve (AUC) of 0.9355 and an area under the precision-recall curve (AUPR) of 0.2646, values that were greater than those produced by comparable models. Two different types of case studies using three diseases were conducted to demonstrate the accuracy of MRRN, and all top 30 predicted miRNAs were verified.
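The idea of reducing sparsity by updating the association matrix from a node's neighbors can be sketched as follows. This is a generic similarity-weighted neighbor fill, not the MRRN algorithm: the abstract does not define how neighbor reliability is measured, so plain similarity is used here as a stand-in, and the function name, the parameter `k`, and all data values are assumptions.

```python
def fill_from_neighbors(assoc, sim, k=2):
    """Return a copy of assoc in which each zero entry of row i is replaced
    by the similarity-weighted average of the same column over the k rows
    most similar to row i. Known (non-zero) entries are left unchanged.

    assoc: binary association matrix (rows = miRNAs, columns = diseases).
    sim:   square row-to-row similarity matrix (assumed to be given).
    """
    n = len(assoc)
    updated = [row[:] for row in assoc]
    for i in range(n):
        # Pick the k most similar other rows as neighbors of row i.
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sim[i][j],
            reverse=True,
        )[:k]
        total = sum(sim[i][j] for j in neighbors)
        if total == 0:
            continue  # no informative neighbors; leave the row as it is
        for col in range(len(assoc[i])):
            if assoc[i][col] == 0:
                weighted = sum(sim[i][j] * assoc[j][col] for j in neighbors)
                updated[i][col] = weighted / total
    return updated


# Toy data (assumed): 3 miRNAs x 3 diseases, plus miRNA-miRNA similarity.
assoc = [[1, 0, 0], [1, 1, 0], [0, 0, 0]]
sim = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.5], [0.2, 0.5, 1.0]]
dense = fill_from_neighbors(assoc, sim, k=2)
```

In this toy case the third miRNA has no known associations, yet after the update its row carries non-zero scores inherited from its neighbors, which is the sparsity reduction the abstract refers to.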
Affiliation(s)
- Hailin Feng
  - School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Dongdong Jin
  - School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Jian Li
  - School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Yane Li
  - School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No. 2006, Xiyuan Avenue, West District, High-Tech Zone, 611731, Chengdu, China
- Tongcun Liu
  - School of Mathematics and Computer Science, Zhejiang A&F University, No. 666 Wusu Street, Lin'an District, 311300, Hangzhou, China
6
Zhang B, Fan T. Knowledge structure and emerging trends in the application of deep learning in genetics research: A bibliometric analysis [2000–2021]. Front Genet 2022; 13:951939. [PMID: 36081985 PMCID: PMC9445221 DOI: 10.3389/fgene.2022.951939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 07/13/2022] [Indexed: 11/13/2022] Open
Abstract
Introduction: Deep learning technology has been widely used in genetic research because of its characteristics of computability, statistical analysis, and predictability. Herein, we aimed to summarize standardized knowledge and potentially innovative approaches for deep learning applications in genetics by evaluating publications, in order to encourage more research.
Methods: The Science Citation Index Expanded™ (SCIE) database was searched for publications on deep learning applications in genomics. Original articles and reviews were considered. We derived a clustered network from 69,806 references that were cited by the 1,754 related manuscripts identified. We used CiteSpace and VOSviewer to identify countries, institutions, journals, co-cited references, keywords, subject evolution, path, current characteristics, and emerging topics.
Results: We assessed the rapidly increasing number of publications on deep learning applications in genomics and identified 1,754 articles focusing on this subject. A total of 101 countries and 2,487 institutes contributed publications. The United States of America had the most publications (728/1,754) and the highest h-index, and the US has collaborated closely with China and Germany. The references of SCI articles were clustered into seven categories: deep learning, logic regression, variant prioritization, random forests, scRNA-seq (single-cell RNA-seq), genomic regulation, and recombination. The keywords representing the research frontiers by year were prediction (2016–2021), sequence (2017–2021), mutation (2017–2021), and cancer (2019–2021).
Conclusion: Here, we summarized the current literature on the status of deep learning for genetics applications and analyzed the current research characteristics and future trajectories in this field. This work aims to provide resources for further intensive exploration and to encourage more researchers to pursue research on deep learning applications in genetics.
Affiliation(s)
- Bijun Zhang
  - Department of Clinical Genetics, Shengjing Hospital of China Medical University, Shenyang, China
- Ting Fan
  - Department of Computer, School of Intelligent Medicine, China Medical University, Shenyang, China
  - *Correspondence: Ting Fan,
7
Prediction Model of Hemorrhage Transformation in Patient with Acute Ischemic Stroke Based on Multiparametric MRI Radiomics and Machine Learning. Brain Sci 2022; 12:brainsci12070858. [PMID: 35884664 PMCID: PMC9313447 DOI: 10.3390/brainsci12070858] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/03/2022] [Revised: 06/23/2022] [Accepted: 06/27/2022] [Indexed: 12/13/2022] Open
Abstract
Intravenous thrombolysis is the most commonly used drug therapy for patients with acute ischemic stroke, and it is often accompanied by the complication of intracerebral hemorrhage transformation (HT). This study aimed to build a reliable model for pretreatment prediction of HT. Specifically, 5400 radiomics features were extracted from 20 regions of interest (ROIs) of multiparametric MRI images of 71 patients. A minimal set of all-relevant features was then selected by LASSO from all ROIs and used to build a radiomics model with a random forest (RF). To explore the significance of normal ROIs, we built a model based only on abnormal ROIs. In addition, a model combining clinical factors and radiomics features was built. Finally, the models were tested on an independent validation cohort. The radiomics model with 14 All-ROIs features achieved pretreatment prediction of HT (AUC = 0.871, accuracy = 0.848), which significantly outperformed the model with only 14 Abnormal-ROIs features (AUC = 0.831, accuracy = 0.818). Moreover, combining clinical factors with radiomics features further improved the prediction performance (AUC = 0.911, accuracy = 0.894). We therefore think that the combined model can greatly assist doctors in diagnosis. Furthermore, we find that even when there are no lesions in the normal ROIs, they still provide characteristic information for the prediction of HT.
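The two metrics reported in this abstract, AUC and accuracy, can both be computed from a model's predicted scores. The sketch below shows the standard pairwise (rank-based) definition of AUC and a thresholded accuracy; the labels, scores, and the 0.5 threshold are made-up illustration values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly when scores at or above
    the threshold are predicted as positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy validation cohort (assumed values): 1 = HT, 0 = no HT.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]
auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
```

AUC is threshold-free and reflects how well the scores rank positives above negatives, whereas accuracy depends on the chosen cut-off, which is why the two values in the abstract differ.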
8
Cui F, Cheng L, Zou Q. Briefings in functional genomics special section editorial: analysis of integrated multiple omics data. Brief Funct Genomics 2021; 20:196-197. [PMID: 34279568 DOI: 10.1093/bfgp/elab033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Affiliation(s)
- Feifei Cui
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
- Liang Cheng
  - NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, Heilongjiang, 150028, China; College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China