1
|
Cai C, Li J, Xia Y, Li W. FluPMT: Prediction of Predominant Strains of Influenza A Viruses via Multi-Task Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1254-1263. [PMID: 38498763 DOI: 10.1109/tcbb.2024.3378468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/20/2024]
Abstract
Seasonal influenza vaccines play a crucial role in saving numerous lives annually. However, the constant evolution of the influenza A virus necessitates frequent vaccine updates to ensure its ongoing effectiveness. The decision to develop a new vaccine strain is generally based on the assessment of the current predominant strains. Nevertheless, the process of vaccine production and distribution is very time-consuming, leaving a window for the emergence of new variants that could decrease vaccine effectiveness, so predictions of influenza A virus evolution can inform vaccine evaluation and selection. Hence, we present FluPMT, a novel sequence prediction model that applies an encoder-decoder architecture to predict the hemagglutinin (HA) protein sequence of the upcoming season's predominant strain by capturing the patterns of evolution of influenza A viruses. Specifically, we employ time series to model the evolution of influenza A viruses, and utilize attention mechanisms to explore dependencies among residues of sequences. Additionally, antigenic distance prediction based on graph network representation learning is incorporated into the sequence prediction as an auxiliary task through a multi-task learning framework. Experimental results on two influenza datasets highlight the exceptional predictive performance of FluPMT, offering valuable insights into virus evolutionary dynamics, as well as vaccine evaluation and production.
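The multi-task idea in this abstract can be pictured as a weighted sum of a primary sequence-prediction loss and an auxiliary antigenic-distance loss. The sketch below is only an illustration under assumptions: the function names, the choice of cross-entropy and squared error, and the weight `aux_weight` are hypothetical and are not taken from the FluPMT paper.

```python
import math

def sequence_cross_entropy(pred_probs, target_indices):
    """Mean negative log-likelihood of the true residue at each position.

    pred_probs: one probability list over residue classes per position.
    target_indices: index of the true residue class at each position.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for probs, target in zip(pred_probs, target_indices):
        total -= math.log(probs[target] + eps)
    return total / len(target_indices)

def distance_squared_error(pred_distance, true_distance):
    """Squared error on a single predicted antigenic distance."""
    return (pred_distance - true_distance) ** 2

def multi_task_loss(pred_probs, target_indices, pred_distance, true_distance,
                    aux_weight=0.5):
    """Primary sequence loss plus a weighted auxiliary distance loss.

    aux_weight is an assumed hyperparameter for this sketch.
    """
    main = sequence_cross_entropy(pred_probs, target_indices)
    aux = distance_squared_error(pred_distance, true_distance)
    return main + aux_weight * aux

# Toy example: two sequence positions, three residue classes.
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
targets = [0, 1]
loss = multi_task_loss(probs, targets, pred_distance=1.5, true_distance=2.0)
print(round(loss, 4))
```

With a shared encoder, the gradient of the auxiliary term also shapes the representation used by the main task, which is the usual motivation for this kind of combined objective.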
|
2
|
Yin R, Ye B, Bian J. CLCAP: Contrastive learning improves antigenicity prediction for influenza A virus using convolutional neural networks. Methods 2023; 220:S1046-2023(23)00180-9. [PMID: 39491098 DOI: 10.1016/j.ymeth.2023.10.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/26/2023] [Revised: 10/05/2023] [Accepted: 10/23/2023] [Indexed: 11/05/2024] Open
Abstract
Influenza viruses are detected year-round over the world and the viruses will usually circulate during fall and winter, causing the seasonal flu. The growing novel variants of influenza viruses pose a significant concern to public health annually. However, the rapid mutation of the influenza viruses makes it challenging to timely track their evolution. Therefore, a fast, low-cost, and precise method to predict the antigenic variant of influenza viruses could help vaccine development and prevent viral transmission. In this study, we propose a multi-channel convolutional neural network using contrastive learning to predict the antigenicity of influenza A viruses. An integrated dataset containing antigenic data and protein sequences was collected from various public resources and literature. The experimental results on three different influenza subtypes indicate our proposed model outperforms other traditional machine learning classifiers for antigenicity prediction. In addition, it also demonstrates superior performance over several state-of-the-art approaches, with 5.18 %, 7.03 % and 7.82 % increase in accuracy compared to the best results for H1N1, H3N2 and H5N1, respectively. The proposed framework is timely and effective in influenza antigenicity prediction and can be adapted to the study of other viruses.
Affiliation(s)
- Rui Yin
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Biao Ye
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
- Jiang Bian
  - Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL, USA
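As a rough illustration of how contrastive learning can separate antigenically similar and dissimilar strain pairs, the sketch below implements the classic margin-based pairwise contrastive loss. The CLCAP abstract does not say which contrastive objective is used, so this form, the `margin` value, and the toy embedding vectors are all assumptions.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def contrastive_loss(u, v, same_class, margin=1.0):
    """Margin-based pairwise contrastive loss.

    Pairs labelled as the same class are pulled together (loss d^2);
    other pairs are pushed apart until they are at least `margin` away.
    """
    d = euclidean(u, v)
    if same_class:
        return d ** 2
    return max(0.0, margin - d) ** 2

# Hypothetical 2-D embeddings of strain pairs.
similar = contrastive_loss([0.0, 0.0], [0.3, 0.4], same_class=True)
dissimilar = contrastive_loss([0.0, 0.0], [0.3, 0.4], same_class=False)
far_apart = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False)
print(similar, dissimilar, far_apart)
```

A dissimilar pair that is already farther apart than the margin contributes no loss, so training effort concentrates on pairs that are still confusable.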
|
3
|
Garjani A, Chegini AM, Salehi M, Tabibzadeh A, Yousefi P, Razizadeh MH, Esghaei M, Esghaei M, Rohban MH. Forecasting influenza hemagglutinin mutations through the lens of anomaly detection. Sci Rep 2023; 13:14944. [PMID: 37696867 PMCID: PMC10495359 DOI: 10.1038/s41598-023-42089-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2022] [Accepted: 09/05/2023] [Indexed: 09/13/2023] Open
Abstract
The influenza virus hemagglutinin plays an important role in virus attachment to host cells. Hemagglutinin is among the regions of the virus with a high potential for mutation. Because predicting mutations is important for producing effective and low-cost vaccines, solutions to this problem have recently gained significant attention. Such solutions train predictive models on a historical record of mutations. However, the imbalance between mutated and preserved proteins is a major challenge for the development of such models and needs to be addressed. Here, we propose to tackle this challenge through anomaly detection (AD). AD is a well-established field in machine learning (ML) that tries to distinguish unseen anomalies from normal patterns using only normal training samples. By treating mutations as anomalous behavior, we can benefit from the rich set of solutions that have recently emerged in this field. Such methods also fit the problem setup of extreme imbalance between the number of unmutated and mutated training samples. Motivated by this formulation, our method tries to find a compact representation for unmutated samples while forcing anomalies to be separated from the normal ones. This helps the model learn, as far as possible, a shared representation among normal training samples, which improves the detectability of mutated samples among unmutated ones at test time. We conduct a large number of experiments on four publicly available datasets, consisting of three different hemagglutinin protein datasets and one SARS-CoV-2 dataset, and show the effectiveness of our method using several standard criteria.
Affiliation(s)
- Ali Garjani
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Mohammadreza Salehi
  - Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Alireza Tabibzadeh
  - Department of Virology, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Parastoo Yousefi
  - Department of Virology, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
- Moein Esghaei
  - Cognitive Neuroscience Laboratory, German Primate Center, Leibniz Institute for Primate Research, Goettingen, Germany
- Maryam Esghaei
  - Department of Virology, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
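The anomaly-detection framing in the abstract above (fit a compact description of unmutated samples, then flag whatever falls far from it) can be sketched with a distance-to-centre score. This is a deliberately simplified stand-in, not the authors' method: the 2-D vectors are hypothetical embeddings, and the threshold rule (the largest score seen among normal training samples) is an assumption.

```python
def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

def anomaly_score(vector, center):
    """Squared Euclidean distance from the centre of the normal class."""
    return sum((x - c) ** 2 for x, c in zip(vector, center))

# Only normal (unmutated) samples are used to fit the centre and threshold.
normal_train = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]]
center = centroid(normal_train)
threshold = max(anomaly_score(v, center) for v in normal_train)

def is_anomalous(vector):
    """Flag a sample whose score exceeds every normal training score."""
    return anomaly_score(vector, center) > threshold

print(is_anomalous([1.1, 0.9]))
print(is_anomalous([3.0, 3.0]))
```

Because no mutated examples are needed to fit the centre, the approach is unaffected by how rare mutations are in the training record.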
|
4
|
Yin R, Luo Z, Zhuang P, Zeng M, Li M, Lin Z, Kwoh CK. ViPal: A framework for virulence prediction of influenza viruses with prior viral knowledge using genomic sequences. J Biomed Inform 2023; 142:104388. [PMID: 37178781 PMCID: PMC10602211 DOI: 10.1016/j.jbi.2023.104388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2023] [Revised: 04/30/2023] [Accepted: 05/07/2023] [Indexed: 05/15/2023]
Abstract
Influenza viruses pose great threats to public health and cause enormous economic losses every year. Previous work has revealed the viral factors associated with the virulence of influenza viruses in mammals. However, taking prior viral knowledge represented by heterogeneous categorical and discrete information into account to explore virus virulence is scarce in the existing work. How to make full use of the preceding domain knowledge in virulence study is challenging but beneficial. This paper proposes a general framework named ViPal for virulence prediction in mice that incorporates discrete prior viral mutation and reassortment information based on all eight influenza segments. The posterior regularization technique is leveraged to transform prior viral knowledge into constraint features and integrated into the machine learning models. Experimental results on influenza genomic datasets validate that our proposed framework can improve virulence prediction performance over baselines. The comparison between ViPal and other existing methods shows the computational efficiency of our framework with comparable or superior performance. Moreover, the interpretable analysis through SHAP (SHapley Additive exPlanations) identifies the scores of constraint features contributing to the prediction. We hope this framework could provide assistance for the accurate detection of influenza virulence and facilitate flu surveillance.
Affiliation(s)
- Rui Yin
  - Department of Health Outcomes and Biomedical Informatics, University of Florida, Gainesville, USA; School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Zihan Luo
  - School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
- Pei Zhuang
  - Brigham and Women's Hospital, Harvard Medical School, Boston, USA
- Min Zeng
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha, China
- Zhuoyi Lin
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Chee Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore 639798, Singapore
|
5
|
Yin R, Thwin NN, Zhuang P, Lin Z, Kwoh CK. IAV-CNN: A 2D Convolutional Neural Network Model to Predict Antigenic Variants of Influenza A Virus. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3497-3506. [PMID: 34469306 DOI: 10.1109/tcbb.2021.3108971] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
The rapid evolution of influenza viruses constantly leads to the emergence of novel influenza strains that are capable of escaping from population immunity. The timely determination of antigenic variants is critical to vaccine design. Empirical experimental methods like hemagglutination inhibition (HI) assays are time-consuming and labor-intensive, requiring live viruses. Recently, many computational models have been developed to predict antigenic variants, but without explicitly modeling the interdependencies between the channels of feature maps. Moreover, influenza sequences with similar residue distributions have a high degree of similarity, which affects the prediction outcome. Consequently, it is challenging but vital to determine the importance of different residue sites and enhance the predictive performance of influenza antigenicity. We have proposed a 2D convolutional neural network (CNN) model to infer influenza antigenic variants (IAV-CNN). Specifically, we apply a new distributed representation of amino acids, named ProtVec, that can be applied to a variety of downstream proteomic machine learning tasks. After splitting and embedding the influenza strains, a 2D squeeze-and-excitation CNN architecture is constructed that enables networks to focus on informative residue features by fusing both spatial and channel-wise information with local receptive fields at each layer. Experimental results on three influenza datasets show that IAV-CNN achieves state-of-the-art performance by combining the new distributed representation with our proposed architecture. It outperforms both traditional machine learning algorithms with the same feature representations and the majority of existing models on the independent test data. Therefore, we believe that our model can serve as a reliable and robust tool for the prediction of antigenic variants.
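The splitting step mentioned in this abstract can be illustrated with the 3-gram scheme commonly associated with ProtVec, where a sequence is read in three shifted frames of non-overlapping 3-mers. The sketch below assumes that scheme; the function name and the toy residue string are hypothetical, and the embedding lookup that would follow is omitted because it depends on a pretrained table.

```python
def split_into_kmers(sequence, k=3):
    """Split a protein sequence into k shifted lists of non-overlapping k-mers.

    Frame `offset` starts reading at position `offset`; trailing residues
    that do not fill a complete k-mer are dropped.
    """
    frames = []
    for offset in range(k):
        frame = [sequence[i:i + k]
                 for i in range(offset, len(sequence) - k + 1, k)]
        frames.append(frame)
    return frames

# Short toy residue string used only for illustration.
frames = split_into_kmers("MKTIIALSYIF")
for frame in frames:
    print(frame)
# ['MKT', 'IIA', 'LSY']
# ['KTI', 'IAL', 'SYI']
# ['TII', 'ALS', 'YIF']
```

Each 3-mer would then be mapped to its embedding vector, giving a 2D array per strain that a convolutional network can consume.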
|
6
|
Yin R, Zhu X, Zeng M, Wu P, Li M, Kwoh CK. A framework for predicting variable-length epitopes of human-adapted viruses using machine learning methods. Brief Bioinform 2022; 23:6645487. [PMID: 35849093 DOI: 10.1093/bib/bbac281] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 06/16/2022] [Accepted: 06/17/2022] [Indexed: 11/14/2022] Open
Abstract
The coronavirus disease 2019 pandemic has alerted people of the threat caused by viruses. Vaccine is the most effective way to prevent the disease from spreading. The interaction between antibodies and antigens will clear the infectious organisms from the host. Identifying B-cell epitopes is critical in vaccine design, development of disease diagnostics and antibody production. However, traditional experimental methods to determine epitopes are time-consuming and expensive, and the predictive performance using the existing in silico methods is not satisfactory. This paper develops a general framework to predict variable-length linear B-cell epitopes specific for human-adapted viruses with machine learning approaches based on Protvec representation of peptides and physicochemical properties of amino acids. QR decomposition is incorporated during the embedding process that enables our models to handle variable-length sequences. Experimental results on large immune epitope datasets validate that our proposed model's performance is superior to the state-of-the-art methods in terms of AUROC (0.827) and AUPR (0.831) on the testing set. Moreover, sequence analysis also provides the results of the viral category for the corresponding predicted epitopes with high precision. Therefore, this framework is shown to reliably identify linear B-cell epitopes of human-adapted viruses given protein sequences and could provide assistance for potential future pandemics and epidemics.
Affiliation(s)
- Rui Yin
  - Department of Biomedical Informatics, Harvard Medical School, Boston, USA
- Xianghe Zhu
  - Department of Statistics, University of Oxford, Oxford, UK
- Min Zeng
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Pengfei Wu
  - Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, China
- Min Li
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, China
- Chee Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, Singapore
|
7
|
Abbas ME, Chengzhang Z, Fathalla A, Xiao Y. End-to-end antigenic variant generation for H1N1 influenza HA protein using sequence to sequence models. PLoS One 2022; 17:e0266198. [PMID: 35344562 PMCID: PMC8959165 DOI: 10.1371/journal.pone.0266198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/31/2021] [Accepted: 03/16/2022] [Indexed: 11/23/2022] Open
Abstract
The growing risk of new variants of the influenza A virus is a major concern for public health. The risk posed by new variants can be lethal, as witnessed in 2009. Even though the prediction of influenza virus antigenicity has progressed rapidly, few studies have employed deep learning methodologies. The most recent literature has mostly relied on classification techniques, while a model that generates the HA protein of the antigenic variant has not been developed. Although the antigenic pair of an influenza A virus can be determined in a laboratory setup, the process needs a tremendous amount of time and labor. Antigenic shift and drift, which are caused by changes in the surface proteins, help the influenza A virus evade immunity. The high frequency of minor changes in the surface protein poses a challenge to identifying the antigenic variant of an emerging virus. These changes slow down vaccine selection and the manufacturing process. In this vein, the proposed model could help save the time and effort required to identify the antigenic pair of the influenza virus. The proposed model uses an end-to-end learning methodology relying on a deep sequence-to-sequence architecture to generate the antigenic variant of a given influenza A virus from its surface protein. Employing the BLEU score to evaluate the generated HA protein of the antigenic variant of influenza A virus against the actual variant, the proposed model achieved a mean accuracy of 97.57%.
Affiliation(s)
- Mohamed Elsayed Abbas
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - Mobile Health Ministry of Education-China Mobile Joint Laboratory, Changsha, China
- Zhu Chengzhang
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - The College of Literature and Journalism, Central South University, Changsha, China
  - Mobile Health Ministry of Education-China Mobile Joint Laboratory, Changsha, China
- Ahmed Fathalla
  - Department of Mathematics, Faculty of Science, Suez Canal University, Ismailia, Egypt
- Yalong Xiao
  - School of Computer Science and Engineering, Central South University, Changsha, China
  - The College of Literature and Journalism, Central South University, Changsha, China
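To make the BLEU-based evaluation in the abstract above concrete, the sketch below scores a generated protein string against a reference using clipped n-gram precision and a brevity penalty. The abstract does not state the tokenization or BLEU variant, so treating each residue as a token, using n-grams up to length 4, and returning 0 when any n-gram order has no overlap are all assumptions of this sketch.

```python
import math
from collections import Counter

def ngrams(seq, n):
    """All contiguous substrings of length n."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Single-reference, residue-level BLEU with uniform n-gram weights."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if total == 0 or overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_precisions.append(math.log(overlap / total))
    # Penalise candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)

identical = bleu("MKTIIALSYIF", "MKTIIALSYIF")
one_change = bleu("MKTIIALSYIF", "MKTIVALSYIF")
unrelated = bleu("AAAA", "CCCC")
print(identical, one_change, unrelated)
```

Because a single substitution disturbs every n-gram that spans it, the score drops faster than per-residue identity would, which is worth keeping in mind when reading a BLEU value as an accuracy.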
|
8
|
Classification of COVID-19 and Influenza Patients Using Deep Learning. Contrast Media & Molecular Imaging 2022; 2022:8549707. [PMID: 35280712 PMCID: PMC8884121 DOI: 10.1155/2022/8549707] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/17/2021] [Accepted: 01/26/2022] [Indexed: 12/13/2022]
Abstract
Coronavirus (COVID-19) is a deadly virus that initially starts with flu-like symptoms. COVID-19 emerged in China and quickly spread around the globe, resulting in the coronavirus epidemic of 2019–22. As this virus is very similar to influenza in its early stages, its accurate detection is challenging. Several techniques for detecting the virus in its early stages are being developed. Deep learning techniques are a handy tool for detecting various diseases. For the classification of COVID-19 and influenza, we proposed tailored deep learning models. A publicly available dataset of X-ray images was used to develop proposed models. According to test results, deep learning models can accurately diagnose normal, influenza, and COVID-19 cases. Our proposed long short-term memory (LSTM) technique outperformed the CNN model in the evaluation phase on chest X-ray images, achieving 98% accuracy.
|
9
|
Yin R, Luo Z, Kwoh CK. Exploring the Lethality of Human-Adapted Coronavirus Through Alignment-Free Machine Learning Approaches Using Genomic Sequences. Curr Genomics 2021; 22:583-595. [PMID: 35386190 PMCID: PMC8922323 DOI: 10.2174/1389202923666211221110857] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/26/2021] [Revised: 12/02/2021] [Accepted: 12/14/2021] [Indexed: 11/29/2022] Open
Abstract
Background: A novel coronavirus emerged and rapidly spread worldwide, and the World Health Organization declared a pandemic on March 11, 2020. The roles and characteristics of coronaviruses have captured much attention because of their ability to cause a wide variety of infectious diseases in humans, from mild to severe. Detecting the lethality of human coronaviruses is key to estimating viral toxicity and providing perspectives for treatment. Methods: We developed an alignment-free framework that utilizes machine learning approaches for ultra-fast and highly accurate prediction of the lethality of human-adapted coronaviruses using genomic sequences. We performed extensive experiments with six different feature transformations and machine learning algorithms, combined with digital signal processing, to identify the lethality of possible future novel coronaviruses using existing strains. Results: The results tested on SARS-CoV, MERS-CoV and SARS-CoV-2 datasets show an average prediction accuracy of 96.7%. We also provide a preliminary analysis validating the effectiveness of our models on other human coronaviruses. Our framework achieves high levels of prediction performance while being alignment-free and based on RNA sequences alone, without genome annotations or specialized biological knowledge. Conclusion: The results demonstrate that, for any novel human coronavirus strain, this study can offer a reliable real-time estimate of its viral lethality.
Affiliation(s)
- Rui Yin
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore
  - Department of Biomedical Informatics, Harvard University, Boston, MA 02138, USA
- Zihan Luo
  - School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, 430074, China
- Chee Keong Kwoh
  - School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore
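One common way to combine an alignment-free representation with digital signal processing, as the abstract above describes in general terms, is to map each nucleotide to a number and take the magnitude of the discrete Fourier transform. The sketch below assumes that approach; the integer mapping in `NUCLEOTIDE_VALUES` is a hypothetical choice and is not one of the paper's six feature transformations as far as the abstract reveals.

```python
import cmath

# Hypothetical integer encoding; other numeric encodings are equally possible.
NUCLEOTIDE_VALUES = {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}

def to_signal(sequence):
    """Convert a nucleotide string into a numeric signal.

    Raises KeyError for symbols outside A/C/G/T (for example N or U).
    """
    return [NUCLEOTIDE_VALUES[base] for base in sequence.upper()]

def magnitude_spectrum(signal):
    """Magnitudes of the discrete Fourier transform, computed directly.

    This O(n^2) loop is fine for a short demo; a real pipeline would use an FFT.
    """
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(signal))
        spectrum.append(abs(coeff))
    return spectrum

spectrum = magnitude_spectrum(to_signal("ACGTACGT"))
print([round(value, 3) for value in spectrum])
```

The spectrum depends on the sequence content rather than on any alignment to a reference, so two genomes can be compared through their spectra once they are brought to a common length.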
|
10
|
Dong J, Wu H, Zhou D, Li K, Zhang Y, Ji H, Tong Z, Lou S, Liu Z. Application of Big Data and Artificial Intelligence in COVID-19 Prevention, Diagnosis, Treatment and Management Decisions in China. J Med Syst 2021; 45:84. [PMID: 34302549 PMCID: PMC8308073 DOI: 10.1007/s10916-021-01757-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/25/2021] [Accepted: 07/12/2021] [Indexed: 01/08/2023]
Abstract
COVID-19, caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), spread rapidly and affected most of the world since its outbreak in Wuhan, China, which presents a major challenge to the emergency response mechanism for sudden public health events and epidemic prevention and control in all countries. In the face of the severe situation of epidemic prevention and control and the arduous task of social management, the tremendous power of science and technology in prevention and control has emerged. The new generation of information technology, represented by big data and artificial intelligence (AI) technology, has been widely used in the prevention, diagnosis, treatment and management of COVID-19 as an important basic support. Although the technology has developed, there are still challenges with respect to epidemic surveillance, accurate prevention and control, effective diagnosis and treatment, and timely judgement. The prevention and control of sudden infectious diseases usually depend on the control of infection sources, interruption of transmission channels and vaccine development. Big data and AI are effective technologies to identify the source of infection and have an irreplaceable role in distinguishing close contacts and suspicious populations. Advanced computational analysis is beneficial to accelerate the speed of vaccine research and development and to improve the quality of vaccines. AI provides support in automatically processing relevant data from medical images and clinical features, tests and examination findings; predicting disease progression and prognosis; and even recommending treatment plans and strategies. This paper reviews the application of big data and AI in the COVID-19 prevention, diagnosis, treatment and management decisions in China to explain how to apply big data and AI technology to address the common problems in the COVID-19 pandemic. 
Although the findings regarding the application of big data and AI technologies in sudden public health events lack validation of repeatability and universality, current studies in China have shown that the application of big data and AI is feasible in response to the COVID-19 pandemic. These studies concluded that the application of big data and AI technology can contribute to prevention, diagnosis, treatment and management decision making regarding sudden public health events in the future.
Affiliation(s)
- Jiancheng Dong
  - Medical Big Data Research Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
  - Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Huiqun Wu
  - Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Dong Zhou
  - Department of Medical Informatics, Medical School of Nantong University, Nantong, China
- Kaixiang Li
  - Medical Big Data Research Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Yuanpeng Zhang
  - Department of Medical Informatics, Medical School of Nantong University, Nantong, China
  - Department of Health Technology and Informatics, The Hong Kong Polytechnical University, Hong Kong, China
- Hanzhen Ji
  - The Third Affiliated Hospital of Nantong University, Nantong, China
- Zhuang Tong
  - Medical Big Data Research Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
- Shuai Lou
  - Jiangsu Zhongkang Software Co, Ltd, Nantong, China
- Zhangsuo Liu
  - Medical Big Data Research Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, China
|
11
|
Long Y, Luo J. Association Mining to Identify Microbe Drug Interactions Based on Heterogeneous Network Embedding Representation. IEEE J Biomed Health Inform 2021; 25:266-275. [PMID: 32750918 DOI: 10.1109/jbhi.2020.2998906] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 11/07/2022]
Abstract
Accurately identifying microbe-drug associations plays a critical role in drug development and precision medicine. Considering that the conventional wet-lab method is time-consuming, labor-intensive and expensive, computational approach is an alternative choice. The increasing availability of numerous biological data provides a great opportunity to systematically understand complex interaction mechanisms between microbes and drugs. However, few computational methods have been developed for microbe drug prediction. In this work, we leverage multiple sources of biomedical data to construct a heterogeneous network for microbes and drugs, including drug-drug interactions, microbe-microbe interactions and microbe-drug associations. And then we propose a novel Heterogeneous Network Embedding Representation framework for Microbe-Drug Association prediction, named (HNERMDA), by combining metapath2vec with bipartite network recommendation. In this framework, we introduce metapath2vec, a heterogeneous network representation learning method, to learn low-dimensional embedding representations for microbes and drugs. Following that, we further design a bias bipartite network projection recommendation algorithm to improve prediction accuracy. Comprehensive experiments on two datasets, named MDAD and aBiofilm, demonstrated that our model consistently outperformed five baseline methods in three types of cross-validations. Case study on two popular drugs (i.e., Ciprofloxacin and Pefloxacin) further validated the effectiveness of our HNERMDA model in inferring potential target microbes for drugs.
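The core of metapath2vec, which this abstract builds on, is a random walk over a heterogeneous network that is constrained to follow a fixed pattern of node types. The sketch below shows only that walk-generation step on a toy microbe-drug graph. The node names, the graph, and the `metapath_walk` helper are hypothetical, and the skip-gram training that would turn walks into embeddings is omitted.

```python
import random

def metapath_walk(graph, node_types, start, metapath, walk_length, rng):
    """Random walk whose i-th node has type metapath[i % len(metapath)].

    graph: dict mapping each node to a list of neighbours.
    node_types: dict mapping each node to a type label.
    metapath: cyclic type pattern, e.g. ["microbe", "drug"].
    """
    if node_types[start] != metapath[0]:
        raise ValueError("start node must match the first metapath type")
    walk = [start]
    while len(walk) < walk_length:
        wanted = metapath[len(walk) % len(metapath)]
        candidates = [n for n in graph[walk[-1]] if node_types[n] == wanted]
        if not candidates:
            break  # dead end: no neighbour of the required type
        walk.append(rng.choice(candidates))
    return walk

# Toy heterogeneous network with microbe-drug, microbe-microbe and drug-drug edges.
graph = {
    "m1": ["d1", "d2", "m2"],
    "m2": ["d1", "m1"],
    "d1": ["m1", "m2", "d2"],
    "d2": ["m1", "d1"],
}
node_types = {"m1": "microbe", "m2": "microbe", "d1": "drug", "d2": "drug"}

rng = random.Random(0)  # seeded so that runs are repeatable
walk = metapath_walk(graph, node_types, "m1", ["microbe", "drug"], 5, rng)
print(walk)
```

Restricting each step to the type the metapath asks for keeps the walks from being dominated by whichever node type is most densely connected, so the resulting embeddings reflect microbe-drug structure rather than degree alone.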
|