1
Dai W, Pang S, He Z, Fu X, Liu L, Liu L, Yu N. Prediction of miRNA-disease association based on heterogeneous hypergraph convolution and heterogeneous graph multi-scale convolution. Health Inf Sci Syst 2025; 13:4. [PMID: 39659869 PMCID: PMC11625705 DOI: 10.1007/s13755-024-00319-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2024] [Accepted: 11/19/2024] [Indexed: 12/12/2024] Open
Abstract
Accurate prediction of miRNA-disease associations is essential for medical interventions. Current computational models often fail to capture the complexity of miRNA-disease associations. This study proposes HHMDA, a method based on heterogeneous hypergraph convolution and heterogeneous graph multi-scale convolution, to predict associations between miRNAs and diseases. First, HHMDA constructs a heterogeneous graph of miRNA-disease relationships. Then, graph convolution is run on the heterogeneous graph to capture multi-scale feature representations of miRNAs and diseases, and miRNA-disease associations are reconstructed from these features. Meanwhile, HHMDA constructs a heterogeneous hypergraph with miRNAs and diseases as nodes, in which the hyperedges consist of miRNAs and diseases linked to the same genes. HHMDA performs hypergraph convolution on the heterogeneous hypergraph to extract high-order features of miRNAs and diseases. Finally, these features are used to calculate a Laplacian regularization loss, which is combined with the miRNA-disease association matrix reconstruction loss to optimize the model. The experimental results show that HHMDA has advantages over existing state-of-the-art methods under different experimental settings.
Affiliation(s)
- Wei Dai
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650050 China
- Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
| | - Sifan Pang
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650050 China
| | - Zhichen He
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650050 China
| | - Xiaodong Fu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650050 China
- Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
| | - Li Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650050 China
- Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
| | - Lijun Liu
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650050 China
- Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, 650050 China
| | - Ning Yu
- Department of Computing Sciences, The College at Brockport, State University of New York, 350 New Campus Drive, Brockport, NY 14422 USA
2
Chen H, Li R, Cleveland A, Ding J. Enhancing data quality in medical concept normalization through large language models. J Biomed Inform 2025; 165:104812. [PMID: 40180205 DOI: 10.1016/j.jbi.2025.104812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2024] [Revised: 01/26/2025] [Accepted: 03/13/2025] [Indexed: 04/05/2025]
Abstract
OBJECTIVE Medical concept normalization (MCN) aims to map informal medical terms to formal medical concepts, a critical task in building machine learning systems for medical applications. However, most existing studies on MCN primarily focus on models and algorithms, often overlooking the vital role of data quality. This research evaluates MCN performance across varying data quality scenarios and investigates how to leverage these evaluation results to enhance data quality, ultimately improving MCN performance through the use of large language models (LLMs). The effectiveness of the proposed approach is demonstrated through a case study. METHODS We begin by conducting a data quality evaluation of a dataset used for MCN. Based on these findings, we employ ChatGPT-based zero-shot prompting for data augmentation. The quality of the generated data is then assessed across the dimensions of correctness and comprehensiveness. A series of experiments is performed to analyze the impact of data quality on MCN model performance. These results guide us in implementing LLM-based few-shot prompting to further enhance data quality and improve model performance. RESULTS Duplication of data items within a dataset can lead to inaccurate evaluation results. Data augmentation techniques such as zero-shot and few-shot learning with ChatGPT can introduce duplicated data items, particularly those in the mean region of a dataset's distribution. As such, data augmentation strategies must be carefully designed, incorporating context information and training data to avoid these issues. Additionally, we found that including augmented data in the testing set is necessary to fairly evaluate the effectiveness of data augmentation strategies. CONCLUSION While LLMs can generate high-quality data for MCN, the success of data augmentation depends heavily on the strategy employed. 
Our study found that few-shot learning, with prompts that incorporate appropriate context and a small, representative set of original data, is an effective approach. The methods developed in this research, including the data quality evaluation framework, LLM-based data augmentation strategies, and procedures for data quality enhancement, provide valuable insights for data augmentation and evaluation in similar deep learning applications. AVAILABILITY https://github.com/RichardLRC/mcn-data-quality-llm/tree/main/evaluation.
Affiliation(s)
- Haihua Chen
- The Anuradha & Vikas Sinha Department of Data Science, University of North Texas, Denton, 76203, TX, USA.
| | - Ruochi Li
- Department of Computer Science, North Carolina State University, Raleigh, 27695, NC, USA.
| | - Ana Cleveland
- Department of Information Science, University of North Texas, Denton, 76203, TX, USA.
| | - Junhua Ding
- The Anuradha & Vikas Sinha Department of Data Science, University of North Texas, Denton, 76203, TX, USA.
3
Brown GS, Wengler J, Fabelico AJS, Muir A, Tubbs A, Warren A, Millett AN, Yu XX, Pavlidis P, Rogic S, Piccolo SR. Using semantic search to find publicly available gene-expression datasets. bioRxiv 2025:2025.03.13.643153. [PMID: 40161731 PMCID: PMC11952526 DOI: 10.1101/2025.03.13.643153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
Millions of high-throughput molecular datasets have been shared in public repositories. Researchers can reuse such data to validate their own findings and explore novel questions. A frequent goal is to find multiple datasets that address similar research topics and to either combine them directly or integrate inferences from them. However, a major challenge is finding relevant datasets due to the vast number of candidates, inconsistencies in their descriptions, and a lack of semantic annotations. This challenge is first among the FAIR principles for scientific data. Here we focus on dataset discovery within Gene Expression Omnibus (GEO), a repository containing hundreds of thousands of data series. GEO supports queries based on keywords, ontology terms, and other annotations. However, reviewing these results is time-consuming and tedious, and it often misses relevant datasets. We hypothesized that language models could address this problem by summarizing dataset descriptions as numeric representations (embeddings). Assuming a researcher has previously found some relevant datasets, we evaluated the potential to find additional relevant datasets. For six human medical conditions, we used 30 models to generate embeddings for datasets that human curators had previously associated with the conditions and identified other datasets with the most similar descriptions. This approach was often, but not always, more effective than GEO's search engine. Our top-performing models were trained on general corpora, used contrastive-learning strategies, and used relatively large embeddings. Our findings suggest that language models have the potential to improve dataset discovery, perhaps in combination with existing search tools.
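Editorial illustration: the retrieval idea in this abstract (embed dataset descriptions, then look for descriptions most similar to ones already known to be relevant) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' pipeline: the vectors are toy values standing in for language-model embeddings, the series names are invented, and averaging the seed embeddings into one centroid is an assumption the abstract does not confirm.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(seed_embeddings, candidates):
    """Rank candidates by similarity to the mean of the seed embeddings.

    Using the mean (centroid) of the seeds is an illustrative assumption.
    """
    dim = len(seed_embeddings[0])
    centroid = [
        sum(e[i] for e in seed_embeddings) / len(seed_embeddings)
        for i in range(dim)
    ]
    scored = [(name, cosine(centroid, emb)) for name, emb in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy 3-dimensional "embeddings"; real models produce hundreds of dimensions.
seeds = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.1]]
candidates = {
    "series_A": [0.8, 0.1, 0.2],  # hypothetical dataset names
    "series_B": [0.0, 1.0, 0.0],
    "series_C": [0.5, 0.5, 0.1],
}
ranking = rank_candidates(seeds, candidates)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the candidate whose vector points in nearly the same direction as the seeds ranks first; in practice the ranking quality depends on the embedding model, which is what the study compares.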
Affiliation(s)
- Grace S. Brown
- Department of Biology, Brigham Young University, Provo, Utah, USA
| | - James Wengler
- Department of Biology, Brigham Young University, Provo, Utah, USA
- Institute of Biosciences and Technology, Texas A&M Health Science Center, Houston, TX, USA
| | | | - Abigail Muir
- Department of Biology, Brigham Young University, Provo, Utah, USA
| | - Anna Tubbs
- Department of Biology, Brigham Young University, Provo, Utah, USA
| | - Amanda Warren
- Department of Biology, Brigham Young University, Provo, Utah, USA
| | - Alexandra N. Millett
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Xinrui Xiang Yu
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Paul Pavlidis
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Sanja Rogic
- Department of Psychiatry, University of British Columbia, Vancouver, British Columbia, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
4
Lee MM, Lin X, Lee ES, Smith HE, Tudor Car L. Effectiveness of educational interventions for improving healthcare professionals' information literacy: A systematic review. Health Info Libr J 2025. [PMID: 39894960 DOI: 10.1111/hir.12562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 10/07/2024] [Accepted: 12/04/2024] [Indexed: 02/04/2025]
Abstract
BACKGROUND It is unclear which educational interventions effectively improve healthcare professionals' information literacy. OBJECTIVES We aimed to evaluate the effectiveness of educational interventions for improving the formulation of answerable clinical questions and the search skills of healthcare professionals. METHODS We followed the Cochrane methodology and reported according to the PRISMA statement. We searched MEDLINE, Cochrane CENTRAL, EMBASE, Web of Science, CINAHL, and the Google Scholar search engine from inception to November 2022. Randomised controlled trials and crossover trials on any educational interventions were included. Studies on obsolete search tools were excluded. RESULTS Ten studies were included, which mainly compared the effectiveness of lectures and bedside education with that of lectures or no intervention for searching PubMed and/or MEDLINE. There was evidence for improved attitude towards the intervention favouring lecture with self-directed learning over lecture, bedside education, and computer-assisted self-directed learning (RR: 1.14; 95% CI 1.06-1.23; N = 2 studies; 1064 participants; I² = 0%; moderate certainty evidence). There were limited findings on the knowledge, skills, satisfaction, and behaviour outcomes. CONCLUSION Future research should include a wider set of outcomes, be reported better, and explore the use of digital technology for delivery of educational interventions. Further research should entail well-designed trials with relevant outcomes evaluating novel digital-based educational interventions.
Affiliation(s)
- Mauricette Moling Lee
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore and Health and Social Sciences, Singapore Institute of Technology, Singapore, Singapore
| | - Xiaowen Lin
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
| | - Eng Sing Lee
- Family Medicine and Primary Care, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore and National Healthcare Group Polyclinics, Singapore, Singapore
| | - Helen Elizabeth Smith
- Family Medicine and Primary Care, Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore, Singapore
| | - Lorainne Tudor Car
- Lee Kong Chian School of Medicine, Nanyang Technological University, Singapore and Imperial College London, London, UK
5
Wu S, Wang D, Gu X, Xiao R, Gao H, Yang B, Kang Y. Identifying Research Hotspots and Trends in Psoriasis Literature: Autotuned Topic Modeling with Agent. J Invest Dermatol 2025:S0022-202X(25)00086-7. [PMID: 39894203 DOI: 10.1016/j.jid.2024.11.029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2024] [Revised: 10/30/2024] [Accepted: 11/25/2024] [Indexed: 02/04/2025]
Abstract
The rapid expansion of psoriasis research presents challenges in efficient analysis and trend identification, necessitating advanced approaches. We propose AgenTopic, an interactive topic modeling framework that integrates Bidirectional Encoder Representations from Transformers embeddings, dimensionality reduction, clustering, and a language model feedback loop to analyze the psoriasis research literature from 2000 to 2023. Applied to PubMed articles, AgenTopic extracted 158 psoriasis-related topics across 8 categories, outperforming traditional methods in handling complex medical literature. Further trend analysis using multiple modeling techniques, including a support vector regression-Linear model, revealed nonlinear patterns in research growth across categories (R² values = 0.75-0.97). Key trends identified include focus on nail psoriasis and spondyloarthritis, shift from TNF-α to IL-17 in pathogenesis understanding, rapid development of biologics and small-molecule inhibitors, and increased attention to comorbidities. We developed an interactive web tool to facilitate literature retrieval and identification of trends. To the best of our knowledge, this application of an agent-based interactive topic modeling framework to dermatological literature has not been previously reported. Using only topic-modeled data, our framework achieved a performance comparable with that of expert manual reviews in identifying research trends. AgenTopic performed better than several state-of-the-art topic modeling methods and demonstrated the potential of artificial intelligence for advancing medical literature analyses.
Affiliation(s)
- Sunsi Wu
- Department of Dermatology, Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Dan Wang
- Department of Rehabilitation, Zhongshan Hospital, Fudan University, Shanghai, China; Shanghai Institute of Rehabilitation with Integrated Western and Chinese Traditional Medicine, Shanghai, China
| | - Xinpei Gu
- School of Pharmaceutical Sciences, Guangdong Provincial Key Laboratory of New Drug Screening, Southern Medical University, Guangzhou, China
| | - Ruiheng Xiao
- School of Literature, Shandong University, Jinan, China
| | - Hongzhi Gao
- Department of Nephrology, Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China
| | - Bo Yang
- Department of Dermatology, Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai, China.
| | - Yanlan Kang
- Institute of AI and Robotics, Academy for Engineering & Technology, Fudan University, Shanghai, China.
6
Mishra A, Lee H, Jeoung S, Torvik VI, Diesner J. Patterns of diversity in biomedical coauthorships: An analysis across authors' ethnicity, gender, age, and expertise. PLoS One 2025; 20:e0316890. [PMID: 39888948 PMCID: PMC11785319 DOI: 10.1371/journal.pone.0316890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 12/16/2024] [Indexed: 02/02/2025] Open
Abstract
Multiple studies have linked diversity in scientific collaborations to innovative and impactful research. Here, we explore how different diversity indices (ethnicity, gender, academic age, and topical expertise) interact and thereby influence scientific impact. Leveraging nearly 900,000 biomedical journal articles from PubMed, published in major journals between 1991 and 2014, we investigate the nuanced relationships among these diversity indices and their collective influence on research outcomes. By systematically varying model parametrizations, we assess the robustness of the observed relationships and examine multiple methodological choices. Our findings reveal a consistent pattern of demographic homophily, where scientists tend to collaborate with others who share similar ethnic and gender backgrounds. While each diversity index correlates significantly with impact when considered individually, gender diversity and topical expertise emerge as the strongest positive predictors of impact after accounting for key covariates. However, the association between diversity and impact is moderated by the number of collaborating authors, with larger teams sometimes showing opposite trends due to interactions between the computed diversity indices and team size. Despite this complexity, the practical drivers of scientific impact for an article remain the journal of publication, authors' prior citation rate, and the number of co-authors. On further examining expertise diversity through three separate dimensions (variety, balance, and disparity), our findings indicate that impactful teams balance a wide range of subject matter expertise while maintaining a focused connection on closely related topics. These findings highlight the importance of strategic team composition and underline the significance of team diversity in scientific research.
Affiliation(s)
- Apratim Mishra
- School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, Illinois, United States of America
| | - Haejin Lee
- School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, Illinois, United States of America
| | - Sullam Jeoung
- School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, Illinois, United States of America
| | - Vetle I. Torvik
- School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, Illinois, United States of America
| | - Jana Diesner
- School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, Illinois, United States of America
- School of Social Sciences and Technology, Technical University of Munich, Munich, Germany
7
Campbell EA, Holl F, Marwah HK, Fraser HS, Craig SS. The impact of climate change on vulnerable populations in pediatrics: opportunities for AI, digital health, and beyond-a scoping review and selected case studies. Pediatr Res 2025:10.1038/s41390-024-03719-x. [PMID: 39881182 DOI: 10.1038/s41390-024-03719-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 05/30/2024] [Accepted: 08/07/2024] [Indexed: 01/31/2025]
Abstract
Climate change critically impacts global pediatric health, presenting unique and escalating challenges due to children's inherent vulnerabilities and ongoing physiological development. This scoping review intricately intertwines the spheres of climate change, pediatric health, and Artificial Intelligence (AI), with a goal to elucidate the potential of AI and digital health in mitigating the adverse child health outcomes induced by environmental alterations, especially in Low- and Middle-Income Countries (LMICs). A notable gap is uncovered: literature directly correlating AI interventions with climate change-impacted pediatric health is scant, even though substantial research exists at the confluence of AI and health, and health and climate change respectively. We present three case studies about AI's promise in addressing pediatric health issues exacerbated by climate change. The review spotlights substantial obstacles, including technical, ethical, equitable, privacy, and data security challenges in AI applications for pediatric health, necessitating in-depth, future-focused research. Engaging with the intricate nexus of climate change, pediatric health, and AI, this work underpins future explorations into leveraging AI to navigate and neutralize the burgeoning impact of climate change on pediatric health outcomes. IMPACT: Our scoping review highlights the scarcity of literature directly correlating AI interventions with climate change-impacted pediatric health that disproportionately affects vulnerable populations, even though substantial research exists at the confluence of AI and health, and health and climate change respectively. We present three case studies about AI's promise in addressing pediatric health issues exacerbated by climate change. The review spotlights substantial obstacles, including technical, ethical, equitable, privacy, and data security challenges in AI applications for pediatric health, necessitating in-depth, future-focused research.
Affiliation(s)
- Elizabeth A Campbell
- Department of Environmental Health and Engineering, Johns Hopkins Bloomberg School of Public Health, 615 N. Wolfe St., Baltimore, MD, 21205, USA.
- Center for Outbreak Response Innovation, Johns Hopkins Bloomberg School of Public Health, 700 E. Pratt Street, Suite 900, Baltimore, MD, 21202, USA.
- Department of Biomedical Informatics, Columbia University Irving Medical Center, 622 W 168th St PH20 3720, New York, NY, 10032, USA.
| | - Felix Holl
- DigiHealth Institute, Neu-Ulm University of Applied Sciences, Neu-Ulm, Germany
| | - Harleen K Marwah
- Division of General Pediatrics, The Children's Hospital of Philadelphia, Philadelphia, PA, USA
| | - Hamish S Fraser
- Brown Center for Biomedical Informatics, The Warren Alpert Medical School of Brown University, Providence, RI, USA
| | - Sansanee S Craig
- Division of General Pediatrics, Department of Pediatrics, The Children's Hospital of Philadelphia, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA.
8
Bushiri Pwesombo D, Beese C, Schmied C, Sun H. Semisupervised Contrastive Learning for Bioactivity Prediction Using Cell Painting Image Data. J Chem Inf Model 2025; 65:528-543. [PMID: 39761993 PMCID: PMC11776044 DOI: 10.1021/acs.jcim.4c00835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2024] [Revised: 12/17/2024] [Accepted: 12/26/2024] [Indexed: 01/28/2025]
Abstract
Morphological profiling has recently demonstrated remarkable potential for identifying the biological activities of small molecules. Alongside the fully supervised and self-supervised machine learning methods recently proposed for bioactivity prediction from Cell Painting image data, we introduce here a semisupervised contrastive (SemiSupCon) learning approach. This approach combines the strengths of using biological annotations in supervised contrastive learning and leveraging large unannotated image data sets with self-supervised contrastive learning. SemiSupCon enhances downstream prediction performance of classifying MeSH pharmacological classifications from PubChem, as well as mode of action and biological target annotations from the Drug Repurposing Hub across two publicly available Cell Painting data sets. Notably, our approach has effectively predicted the biological activities of several unannotated compounds, and these findings were validated through literature searches. This demonstrates that our approach can potentially expedite the exploration of biological activity based on Cell Painting image data with minimal human intervention.
Affiliation(s)
- David Bushiri Pwesombo
- Research Unit Structural Chemistry and Computational Biophysics, Leibniz-Forschungsinstitut für Molekulare Pharmakologie, Berlin 13125, Germany
- Institute of Chemistry, Technische Universität Berlin, 10623 Berlin, Germany
| | - Carsten Beese
- Research Unit Structural Chemistry and Computational Biophysics, Leibniz-Forschungsinstitut für Molekulare Pharmakologie, Berlin 13125, Germany
| | - Christopher Schmied
- Research Unit Structural Chemistry and Computational Biophysics, Leibniz-Forschungsinstitut für Molekulare Pharmakologie, Berlin 13125, Germany
- EU-OPENSCREEN, Berlin 13125, Germany
| | - Han Sun
- Research Unit Structural Chemistry and Computational Biophysics, Leibniz-Forschungsinstitut für Molekulare Pharmakologie, Berlin 13125, Germany
- Institute of Chemistry, Technische Universität Berlin, 10623 Berlin, Germany
9
Peng L, Ren M, Huang L, Chen M. GEnDDn: An lncRNA-Disease Association Identification Framework Based on Dual-Net Neural Architecture and Deep Neural Network. Interdiscip Sci 2024; 16:418-438. [PMID: 38733474 DOI: 10.1007/s12539-024-00619-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 02/02/2024] [Accepted: 02/03/2024] [Indexed: 05/13/2024]
Abstract
Accumulating studies have demonstrated close relationships between long non-coding RNAs (lncRNAs) and diseases. Identification of new lncRNA-disease associations (LDAs) enables us to better understand disease mechanisms and further provides promising insights into cancer targeted therapy and anti-cancer drug design. Here, we present an LDA prediction framework called GEnDDn based on deep learning. GEnDDn mainly comprises two steps. First, features of both lncRNAs and diseases are extracted by combining similarity computation, non-negative matrix factorization, and a graph attention auto-encoder, and each lncRNA-disease pair (LDP) is represented as a vector by concatenating the extracted features. Subsequently, unknown LDPs are classified by aggregating a dual-net neural architecture and a deep neural network. Using six different evaluation metrics, we found that GEnDDn surpassed four competing LDA identification methods (SDLDA, LDNFSGB, IPCARF, LDASR) on the lncRNADisease and MNDR databases under fivefold cross-validation experiments on lncRNAs, diseases, LDPs, and independent lncRNAs and independent diseases, respectively. Ablation experiments further validated the powerful LDA prediction performance of GEnDDn. Furthermore, we utilized GEnDDn to find underlying lncRNAs for lung cancer and breast cancer. The results suggest that there may be dense linkages between IFNG-AS1 and lung cancer as well as between HIF1A-AS1 and breast cancer. The results require further biomedical experimental verification. GEnDDn is publicly available at https://github.com/plhhnu/GEnDDn.
Affiliation(s)
- Lihong Peng
- College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
| | - Mengnan Ren
- College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
| | - Liangliang Huang
- College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, China
| | - Min Chen
- School of Computer Science, Hunan Institute of Technology, Hengyang, 421002, China.
10
Novoa J, Fernandez-Dumont A, Mills ENC, Moreno FJ, Pazos F. Advancing the allergenicity assessment of new proteins using a text mining resource. Food Chem Toxicol 2024; 187:114638. [PMID: 38582341 DOI: 10.1016/j.fct.2024.114638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Revised: 03/11/2024] [Accepted: 03/31/2024] [Indexed: 04/08/2024]
Abstract
With a society increasingly demanding alternative protein food sources, new strategies for evaluating protein safety issues, such as allergenic potential, are needed. Large-scale and systemic studies on allergenic proteins are hindered by the limited and non-harmonized clinical information available for these substances in dedicated databases. A key piece of missing information is the symptomatology of the allergens, especially expressed in terms of standard vocabularies, which would allow connecting with other biomedical resources to carry out different studies related to human health. In this work, we have generated the first resource with a comprehensive annotation of allergens' symptomatology, using a text-mining approach that extracts significant co-mentions between these entities from the scientific literature (PubMed, ∼36 million abstracts). The method identifies statistically significant co-mentions between the textual descriptions of the two types of entities in the literature as an indication of a relationship. A total of 1,180 clinical signs, extracted from the Human Phenotype Ontology and the Medical Subject Heading terms of PubMed together with other allergen-specific symptoms, were linked to 1,036 unique allergens annotated in two main allergen-related public databases via 14,009 relationships. This novel resource, publicly available through an interactive web interface, could serve as a starting point for future manually curated compilation of allergen symptomatology.
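Editorial illustration: the abstract does not say which statistic is used to call a co-mention "significant". One common choice for this kind of question is a one-sided hypergeometric test on abstract counts, and the minimal sketch below uses it purely as an assumption. The function name and the counts are hypothetical and are not taken from the resource.

```python
from math import comb


def comention_p_value(n_total, n_a, n_b, n_both):
    """One-sided hypergeometric tail probability P(X >= n_both).

    n_total: abstracts in the corpus
    n_a:     abstracts mentioning term A (e.g. an allergen)
    n_b:     abstracts mentioning term B (e.g. a clinical sign)
    n_both:  abstracts mentioning both terms
    """
    upper = min(n_a, n_b)
    denom = comb(n_total, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return tail / denom


# Invented counts: under independence about 2 shared abstracts are expected,
# so observing 10 yields a small p-value.
p = comention_p_value(n_total=1000, n_a=50, n_b=40, n_both=10)
print(f"p = {p:.2e}")
```

When many term pairs are screened this way, a multiple-testing correction would normally be applied on top; whether and how the authors do so is not stated in the abstract.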
Affiliation(s)
- Jorge Novoa
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), 28049, Madrid, Spain
| | | | - E N Clare Mills
- School of Biosciences and Medicine, The University of Surrey, Guildford, GU2 7XH, UK
| | - F Javier Moreno
- Instituto de Investigación en Ciencias de La Alimentación (CIAL), CSIC-UAM, CEI (UAM+CSIC), 28049, Madrid, Spain.
| | - Florencio Pazos
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), 28049, Madrid, Spain.
11
Chen M, Deng Y, Li Z, Ye Y, Zeng L, He Z, Peng G. SCPLPA: An miRNA-disease association prediction model based on spatial consistency projection and label propagation algorithm. J Cell Mol Med 2024; 28:e18345. [PMID: 38693850 PMCID: PMC11063733 DOI: 10.1111/jcmm.18345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Revised: 04/01/2024] [Accepted: 04/08/2024] [Indexed: 05/03/2024] Open
Abstract
Identifying associations between miRNAs and diseases is helpful for disease prevention, diagnosis and treatment. It is of great significance to use computational methods to predict potential human miRNA-disease associations. Considering the shortcomings of existing computational methods, such as low prediction accuracy and weak generalization, we propose a new method called SCPLPA to predict miRNA-disease associations. First, a heterogeneous disease similarity network was constructed using the disease semantic similarity network and the disease Gaussian interaction spectrum kernel similarity network, while a heterogeneous miRNA similarity network was constructed using the miRNA functional similarity network and the miRNA Gaussian interaction spectrum kernel similarity network. Then, miRNA-disease association scores were estimated by integrating the outcomes of label propagation algorithms run on the heterogeneous disease similarity network and the heterogeneous miRNA similarity network. Finally, the spatial consistency projection algorithm of the network was used to extract miRNA-disease association features to predict unverified associations between miRNAs and diseases. SCPLPA was compared with four classical methods (MDHGI, NSEMDA, RFMDA and SNMFMDA), and the results of multiple evaluation metrics showed that SCPLPA exhibited the most outstanding predictive performance. Case studies have shown that SCPLPA can effectively identify miRNAs associated with colon neoplasms and kidney neoplasms. In summary, our proposed SCPLPA algorithm is easy to implement and can effectively predict miRNA-disease associations, making it a reliable auxiliary tool for biomedical research.
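Editorial illustration: the label propagation step named in this abstract can be sketched with the generic textbook update F <- alpha * S * F + (1 - alpha) * Y, where S is a row-normalized similarity matrix and Y holds the known associations. This is a minimal sketch and not the SCPLPA implementation: the update rule, the value of alpha, and the toy matrices are assumptions, and the combination of two networks and the spatial consistency projection described by the authors are not shown.

```python
def label_propagation(S, Y, alpha=0.5, iters=50):
    """Iterate F <- alpha * S @ F + (1 - alpha) * Y on plain nested lists.

    S: n x n row-normalized similarity matrix (here: miRNA similarity)
    Y: n x m known association matrix (rows: miRNAs, columns: diseases)
    """
    n, m = len(Y), len(Y[0])
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [
            [
                alpha * sum(S[i][k] * F[k][j] for k in range(n))
                + (1 - alpha) * Y[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return F


# Toy data (hypothetical): miRNA 1 has no known association but is far more
# similar to miRNA 0 (linked to disease 0) than to miRNA 2 (linked to disease 1).
S = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.5, 0.5, 0.0],
]
Y = [
    [1.0, 0.0],
    [0.0, 0.0],
    [0.0, 1.0],
]
scores = label_propagation(S, Y)
print([round(x, 3) for x in scores[1]])
```

For the unlabeled miRNA, the propagated score for disease 0 ends up higher than for disease 1, which is the behaviour the technique is meant to provide: known labels spread preferentially along strong similarity edges.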
Affiliation(s)
- Min Chen
- Hunan Institute of Technology, School of Computer Science and Engineering, Hengyang 421002, China
| | - Yingwei Deng
- Hunan Institute of Technology, School of Computer Science and Engineering, Hengyang 421002, China
| | - Zejun Li
- Hunan Institute of Technology, School of Computer Science and Engineering, Hengyang 421002, China
| | - Yifan Ye
- Hunan Institute of Technology, School of Computer Science and Engineering, Hengyang 421002, China
| | - Lijun Zeng
- Hunan Institute of Technology, School of Computer Science and Engineering, Hengyang 421002, China
| | - Ziyi He
- Hunan Institute of Technology, School of Computer Science and Engineering, Hengyang 421002, China
| | - Guofang Peng
- Hunan Institute of Technology, School of Computer Science and Engineering, Hengyang 421002, China
| |
|
12
|
Novoa J, López-Ibáñez J, Chagoyen M, Ranea JAG, Pazos F. CoMentG: comprehensive retrieval of generic relationships between biomedical concepts from the scientific literature. Database (Oxford) 2024; 2024:baae025. [PMID: 38564426 PMCID: PMC10986793 DOI: 10.1093/database/baae025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 03/01/2024] [Accepted: 03/15/2024] [Indexed: 04/04/2024]
Abstract
The CoMentG resource contains millions of relationships between terms of biomedical interest obtained from the scientific literature. At the core of the system is a methodology for detecting significant co-mentions of concepts in the entire PubMed corpus. That method was applied to nine sets of terms covering the most important classes of biomedical concepts: diseases, symptoms/clinical signs, molecular functions, biological processes, cellular compartments, anatomic parts, cell types, bacteria and chemical compounds. We obtained more than 7 million relationships between more than 74 000 terms, and many types of relationships were not available in any other resource. As the terms were obtained from widely used resources and ontologies, the relationships are given using the standard identifiers provided by them and hence can be linked to other data. A web interface allows users to browse these associations, searching for relationships for a set of terms of interest provided as input, such as between a disease and its associated symptoms, underlying molecular processes or affected tissues. The results are presented in an interactive interface where the user can explore the reported relationships in different ways and follow links to other resources. Database URL: https://csbg.cnb.csic.es/CoMentG/.
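To make the idea of a "significant co-mention" concrete, here is a minimal sketch of one way such significance could be scored. The abstract does not name the statistical test CoMentG uses, so the one-sided hypergeometric test below is an assumption chosen for illustration, and the function name and article counts are hypothetical.

```python
from math import comb

def comention_pvalue(n_total, n_a, n_b, n_both):
    """One-sided hypergeometric p-value for a co-mention count.

    Probability of observing at least n_both articles mentioning both
    terms when n_b articles are drawn at random from a corpus of n_total
    articles, n_a of which mention term A.
    """
    upper = min(n_a, n_b)
    favourable = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_both, upper + 1)
    )
    return favourable / comb(n_total, n_b)

# Hypothetical counts: 1000 articles, 50 mention term A, 40 mention term B,
# and 10 mention both (about 2 would be expected by chance).
p = comention_pvalue(1000, 50, 40, 10)
```

A small `p` suggests the two terms appear together more often than chance would predict. Applied across millions of term pairs, a scheme like this would also need a multiple-testing correction, which is not shown.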
Affiliation(s)
- Jorge Novoa
- Computational Systems Biology, National Center for Biotechnology (CNB-CSIC), c/ Darwin, 3., Madrid 28049 , Spain
| | - Javier López-Ibáñez
- Computational Systems Biology, National Center for Biotechnology (CNB-CSIC), c/ Darwin, 3., Madrid 28049 , Spain
| | - Mónica Chagoyen
- Computational Systems Biology, National Center for Biotechnology (CNB-CSIC), c/ Darwin, 3., Madrid 28049 , Spain
| | - Juan A G Ranea
- Department of Molecular Biology and Biochemistry, University of Málaga, Avda. Cervantes, 2., Málaga 29071, Spain
- CIBER de Enfermedades Raras (CIBERER), Instituto de Salud Carlos III, Madrid, Spain
- Institute of Biomedical Research in Malaga and platform of nanomedicine (IBIMA platform BIONAND), Malaga 29071, Spain
- Spanish National Bioinformatics Institute (INB/ELIXIR-ES), Barcelona 08034, Spain
| | - Florencio Pazos
- Computational Systems Biology, National Center for Biotechnology (CNB-CSIC), c/ Darwin, 3., Madrid 28049 , Spain
| |
|
13
|
Xie G, Xie W, Gu G, Lin Z, Chen R, Liu S, Yu J. A vector projection similarity-based method for miRNA-disease association prediction. Anal Biochem 2024; 687:115431. [PMID: 38123111 DOI: 10.1016/j.ab.2023.115431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Revised: 12/06/2023] [Accepted: 12/15/2023] [Indexed: 12/23/2023]
Abstract
Many miRNA-disease association prediction models incorporate Gaussian interaction profile kernel similarity (GIPS). However, GIPS fails to consider the specificity of the miRNA-disease association matrix, where matrix elements with a value of 0 represent miRNA-disease relationships that have not yet been discovered. To address this issue and better account for the impact of known and unknown miRNA-disease associations on similarity, we propose a vector projection similarity-based method for miRNA-disease association prediction (VPSMDA). In VPSMDA, we introduce three projection rules, combined with logistic functions, for the miRNA-disease association matrix and propose a vector projection similarity measure for miRNAs and diseases. By integrating the vector projection similarity matrix with the original one, we obtain improved miRNA and disease similarity matrices. Additionally, we construct a weight matrix using different numbers of neighbors to reduce the noise in the similarity matrix. In performance evaluation, both LOOCV and 5-fold CV experiments demonstrate that VPSMDA outperforms seven other state-of-the-art methods in AUC. Furthermore, in a case study, VPSMDA successfully predicted 10, 9, and 10 out of the top 10 associations for three important human diseases, respectively, and these predictions were confirmed by recent biomedical resources.
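The neighbor-based weight matrix mentioned in this abstract can be illustrated with a small sketch. The abstract does not say how the weight matrix is built, so the k-nearest-neighbor mask below is a common denoising choice assumed for illustration, not the VPSMDA construction; the function name, the toy similarity matrix and the value of `k` are hypothetical.

```python
import numpy as np

def knn_weight_matrix(S, k):
    """Binary mask keeping, for each row, itself and its k most similar other nodes."""
    n = S.shape[0]
    W = np.zeros((n, n), dtype=float)
    for i in range(n):
        order = np.argsort(-S[i], kind="stable")  # most similar first
        neighbours = order[order != i][:k]        # drop the node itself
        W[i, neighbours] = 1.0
        W[i, i] = 1.0
    return W

# Toy symmetric similarity matrix (hypothetical values).
S = np.array([[1.0, 0.8, 0.1, 0.3],
              [0.8, 1.0, 0.2, 0.6],
              [0.1, 0.2, 1.0, 0.7],
              [0.3, 0.6, 0.7, 1.0]])

W = knn_weight_matrix(S, k=1)
S_denoised = S * W   # weak similarities outside the neighbourhood are zeroed
```

Note that a mask built row by row is not symmetric in general; a symmetric variant could take the element-wise maximum of `W` and `W.T`.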
Affiliation(s)
- Guobo Xie
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Weijie Xie
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Guosheng Gu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Zhiyi Lin
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China.
| | - Ruibin Chen
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Shigang Liu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| | - Junrui Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
| |
|
14
|
Peng W, He Z, Dai W, Lan W. MHCLMDA: multihypergraph contrastive learning for miRNA-disease association prediction. Brief Bioinform 2023; 25:bbad524. [PMID: 38243694 PMCID: PMC10796254 DOI: 10.1093/bib/bbad524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 12/13/2023] [Accepted: 12/18/2023] [Indexed: 01/21/2024] Open
Abstract
The correct prediction of disease-associated miRNAs plays an essential role in disease prevention and treatment. Current computational methods to predict disease-associated miRNAs construct different miRNA views and disease views based on various miRNA properties and disease properties and then integrate the multiviews to predict the relationship between miRNAs and diseases. However, most existing methods ignore the information interaction among the views and the consistency of miRNA features (disease features) across multiple views. This study proposes a computational method based on multiple hypergraph contrastive learning (MHCLMDA) to predict miRNA-disease associations. MHCLMDA first constructs multiple miRNA hypergraphs and disease hypergraphs based on various miRNA similarities and disease similarities and performs hypergraph convolution on each hypergraph to capture higher order interactions between nodes, followed by hypergraph contrastive learning to learn the consistent miRNA feature representation and disease feature representation under different views. Then, a variational auto-encoder is employed to extract the miRNA and disease features in known miRNA-disease association relationships. Finally, MHCLMDA fuses the miRNA and disease features from different views to predict miRNA-disease associations. The parameters of the model are optimized in an end-to-end way. We applied MHCLMDA to the prediction of human miRNA-disease association. The experimental results show that our method performs better than several other state-of-the-art methods in terms of the area under the receiver operating characteristic curve and the area under the precision-recall curve.
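As an illustration of the hypergraph convolution step described above, the following minimal sketch applies one layer of a widely used formulation, X' = ReLU(Dv^{-1/2} H W De^{-1} Hᵀ Dv^{-1/2} X Θ), where H is the node-by-hyperedge incidence matrix. The abstract does not state which formulation MHCLMDA uses, so this choice is an assumption, and the incidence matrix, features, weights and function names are all hypothetical.

```python
import numpy as np

def hypergraph_propagation(H, edge_w=None):
    """Normalised propagation matrix Dv^-1/2 H W De^-1 H^T Dv^-1/2."""
    n_edges = H.shape[1]
    w = np.ones(n_edges) if edge_w is None else np.asarray(edge_w, dtype=float)
    dv = H @ w            # weighted node degrees (every node needs >= 1 hyperedge)
    de = H.sum(axis=0)    # hyperedge degrees (number of nodes per hyperedge)
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    return Dv_inv_sqrt @ H @ np.diag(w / de) @ H.T @ Dv_inv_sqrt

def hypergraph_conv(X, H, Theta, edge_w=None):
    """One hypergraph convolution layer with a ReLU activation."""
    G = hypergraph_propagation(H, edge_w)
    return np.maximum(G @ X @ Theta, 0.0)

# Toy hypergraph (hypothetical): 4 nodes, 2 hyperedges.
H = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [0, 1]], dtype=float)

rng = np.random.default_rng(0)
X = np.eye(4)                     # one-hot input features
Theta = rng.normal(size=(4, 3))   # untrained projection to 3 dimensions

G = hypergraph_propagation(H)
Z = hypergraph_conv(X, H, Theta)  # node embeddings, shape (4, 3)
```

Because a hyperedge can join more than two nodes, each layer mixes features among all members of a hyperedge at once, which is what lets the model capture higher-order interactions.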
Affiliation(s)
- Wei Peng
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China and Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China
| | - Zhichen He
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China
| | - Wei Dai
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China and Computer Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China
| | - Wei Lan
- Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, China
| |
|
15
|
Ning Z, Wu J, Ding Y, Wang Y, Peng Q, Fu L. BertNDA: A Model Based on Graph-Bert and Multi-Scale Information Fusion for ncRNA-Disease Association Prediction. IEEE J Biomed Health Inform 2023; 27:5655-5664. [PMID: 37669210 DOI: 10.1109/jbhi.2023.3311808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023]
Abstract
Non-coding RNAs (ncRNAs) are a class of RNA molecules that lack the ability to encode proteins in human cells, but play crucial roles in various biological processes. Understanding the interactions between different ncRNAs and their impact on diseases can significantly contribute to the diagnosis, prevention, and treatment of diseases. However, predicting tertiary interactions between ncRNAs and diseases based on structural information at multiple scales remains a challenging task. To address this challenge, we propose a method called BertNDA, aiming to predict potential relationships between miRNAs, lncRNAs, and diseases. The framework identifies local information through a connectionless subgraph, which aggregates the features of neighbor nodes. Global information is extracted by leveraging the Laplace transform of graph structures and WL (Weisfeiler-Lehman) absolute role coding. Additionally, an EMLP (Element-wise MLP) structure is designed to fuse pairwise global information. The transformer-encoder is employed as the backbone of our approach, followed by a prediction layer to output the final correlation score. Extensive experiments demonstrate that BertNDA outperforms state-of-the-art methods in prediction assignment and exhibits significant potential for various biological applications. Moreover, we develop an online prediction platform that incorporates the prediction model, providing users with an intuitive and interactive experience. Overall, our model offers an efficient, accurate, and comprehensive tool for predicting tertiary associations between ncRNAs and diseases.
|
16
|
Hu X, Liu D, Zhang J, Fan Y, Ouyang T, Luo Y, Zhang Y, Deng L. A comprehensive review and evaluation of graph neural networks for non-coding RNA and complex disease associations. Brief Bioinform 2023; 24:bbad410. [PMID: 37985451 DOI: 10.1093/bib/bbad410] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/01/2023] [Revised: 10/07/2023] [Accepted: 10/25/2023] [Indexed: 11/22/2023] Open
Abstract
Non-coding RNAs (ncRNAs) play a critical role in the occurrence and development of numerous human diseases. Consequently, studying the associations between ncRNAs and diseases has garnered significant attention from researchers in recent years. Various computational methods have been proposed to explore ncRNA-disease relationships, with Graph Neural Network (GNN) emerging as a state-of-the-art approach for ncRNA-disease association prediction. In this survey, we present a comprehensive review of GNN-based models for ncRNA-disease associations. Firstly, we provide a detailed introduction to ncRNAs and GNNs. Next, we delve into the motivations behind adopting GNNs for predicting ncRNA-disease associations, focusing on data structure, high-order connectivity in graphs and sparse supervision signals. Subsequently, we analyze the challenges associated with using GNNs in predicting ncRNA-disease associations, covering graph construction, feature propagation and aggregation, and model optimization. We then present a detailed summary and performance evaluation of existing GNN-based models in the context of ncRNA-disease associations. Lastly, we explore potential future research directions in this rapidly evolving field. This survey serves as a valuable resource for researchers interested in leveraging GNNs to uncover the complex relationships between ncRNAs and diseases.
Affiliation(s)
- Xiaowen Hu
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Dayun Liu
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Jiaxuan Zhang
- Department of Electrical and Computer Engineering, University of California, San Diego, 92093 CA, USA
| | - Yanhao Fan
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Tianxiang Ouyang
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Yue Luo
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| | - Yuanpeng Zhang
- School of Software, Xinjiang University, 830046 Urumqi, China
| | - Lei Deng
- School of Computer Science and Engineering, Central South University, 410075 Changsha, China
| |
|
17
|
Ohniwa RL, Takeyasu K, Hibino A. The effectiveness of Japanese public funding to generate emerging topics in life science and medicine. PLoS One 2023; 18:e0290077. [PMID: 37590186 PMCID: PMC10434904 DOI: 10.1371/journal.pone.0290077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2023] [Accepted: 08/01/2023] [Indexed: 08/19/2023] Open
Abstract
Understanding the effectiveness of public funds in generating emerging topics will assist policy makers in promoting innovation. In the present study, we aim to clarify the effectiveness of grants in generating emerging topics in life sciences and medicine since 1991 with regard to Japanese researcher productivity and grants from the Japan Society for the Promotion of Science. To clarify which grant amounts and which categories are more effective in generating emerging topics from both the PI and investment perspectives, we analyzed awarded PI publications containing emerging keywords (EKs; the elements of emerging topics) before and after funding. Our results demonstrated that, in terms of grant amounts, while PIs tended to generate more EKs with larger grants, the most effective investment from the investor's perspective was found in the smallest amount range for each PI (less than 5 million JPY/year). Second, in terms of grant categories, we found that grant categories providing smaller amounts for diverse researchers without excellent past performance records were more effective in generating EKs from the investment perspective. Our results suggest that offering smaller, widely dispersed grants rather than large, concentrated grants is more effective in promoting the generation of emerging topics in life science and medicine.
Affiliation(s)
- Ryosuke L. Ohniwa
- Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
- College of Medicine, National Taiwan University, Taipei, Taiwan
| | - Kunio Takeyasu
- Center for Biotechnology, National Taiwan University, Taipei, Taiwan
- Graduate School of Biostudies, Kyoto University, Kyoto, Japan
| | - Aiko Hibino
- Faculty of Humanities and Social Sciences, Hirosaki University, Hirosaki, Japan
| |
|
18
|
Lin M, Hou B, Mishra S, Yao T, Huo Y, Yang Q, Wang F, Shih G, Peng Y. Enhancing thoracic disease detection using chest X-rays from PubMed Central Open Access. Comput Biol Med 2023; 159:106962. [PMID: 37094464 PMCID: PMC10349296 DOI: 10.1016/j.compbiomed.2023.106962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 03/26/2023] [Accepted: 04/18/2023] [Indexed: 04/26/2023]
Abstract
Large chest X-ray (CXR) datasets have been collected to train deep learning models to detect thorax pathology on CXR. However, most CXR datasets are from single-center studies and the collected pathologies are often imbalanced. The aim of this study was to automatically construct a public, weakly-labeled CXR database from articles in PubMed Central Open Access (PMC-OA) and to assess model performance on CXR pathology classification by using this database as additional training data. Our framework includes text extraction, CXR pathology verification, subfigure separation, and image modality classification. We have extensively validated the utility of the automatically generated image database on thoracic disease detection tasks, including Hernia, Lung Lesion, Pneumonia, and Pneumothorax. We pick these diseases due to their historically poor performance in existing datasets: the NIH-CXR dataset (112,120 CXR) and the MIMIC-CXR dataset (243,324 CXR). We find that classifiers fine-tuned with additional PMC-CXR extracted by the proposed framework consistently and significantly achieved better performance than those without (e.g., Hernia: 0.9335 vs. 0.9154; Lung Lesion: 0.7394 vs. 0.7207; Pneumonia: 0.7074 vs. 0.6709; Pneumothorax: 0.8185 vs. 0.7517, all in AUC with p < 0.0001) for CXR pathology detection. In contrast to previous approaches that manually submit the medical images to the repository, our framework can automatically collect figures and their accompanying figure legends. Compared to previous studies, the proposed framework improves subfigure segmentation and incorporates our advanced self-developed NLP technique for CXR pathology verification. We hope it complements existing resources and improves our ability to make biomedical image data findable, accessible, interoperable, and reusable.
Affiliation(s)
- Mingquan Lin
- Department of Population Health Sciences, Weill Cornell Medicine, New York, USA
| | - Bojian Hou
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, USA
| | - Swati Mishra
- Department of Information Science, Cornell University, New York, USA
| | - Tianyuan Yao
- Department of Computer Science, Vanderbilt University, Nashville, TN, USA
| | - Yuankai Huo
- Department of Computer Science, Vanderbilt University, Nashville, TN, USA
| | - Qian Yang
- Department of Information Science, Cornell University, New York, USA
| | - Fei Wang
- Department of Population Health Sciences, Weill Cornell Medicine, New York, USA
| | - George Shih
- Department of Radiology, Weill Cornell Medicine, New York, USA
| | - Yifan Peng
- Department of Population Health Sciences, Weill Cornell Medicine, New York, USA.
| |
|
19
|
Lopez-Ibañez J, Pazos F, Chagoyen M. MBROLE3: improved functional enrichment of chemical compounds for metabolomics data analysis. Nucleic Acids Res 2023:7161529. [PMID: 37178003 DOI: 10.1093/nar/gkad405] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/16/2023] [Revised: 04/17/2023] [Accepted: 05/03/2023] [Indexed: 05/15/2023] Open
Abstract
MBROLE (Metabolites Biological Role) facilitates the biological interpretation of metabolomics experiments. It performs enrichment analysis of a set of chemical compounds through statistical analysis of annotations from several databases. The original MBROLE server was released in 2011 and, since then, different groups worldwide have used it to analyze metabolomics experiments from a variety of organisms. Here we present the latest version of the system, MBROLE3, accessible at http://csbg.cnb.csic.es/mbrole3. This new version contains updated annotations from previously included databases as well as a wide variety of new functional annotations, such as additional pathway databases and Gene Ontology terms. Of special relevance is the inclusion of a new category of annotations, 'indirect annotations', extracted from the scientific literature and from curated chemical-protein associations. The latter allows the analysis of enriched annotations of the proteins known to interact with the set of chemical compounds of interest. Results are provided in the form of interactive tables, formatted data to download, and graphical plots.
Affiliation(s)
- Javier Lopez-Ibañez
- Computational Systems Biology Group, National Center for Biotechnology (CNB-CSIC), 28049 Madrid, Spain
| | - Florencio Pazos
- Computational Systems Biology Group, National Center for Biotechnology (CNB-CSIC), 28049 Madrid, Spain
| | - Monica Chagoyen
- Computational Systems Biology Group, National Center for Biotechnology (CNB-CSIC), 28049 Madrid, Spain
| |
|
20
|
Novoa J, Chagoyen M, Benito C, Moreno FJ, Pazos F. PMIDigest: Interactive Review of Large Collections of PubMed Entries to Distill Relevant Information. Genes (Basel) 2023; 14:genes14040942. [PMID: 37107700 PMCID: PMC10137743 DOI: 10.3390/genes14040942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Revised: 03/14/2023] [Accepted: 04/18/2023] [Indexed: 04/29/2023] Open
Abstract
Scientific knowledge is being accumulated in the biomedical literature at an unprecedented pace. The most widely used database with biomedicine-related article abstracts, PubMed, currently contains more than 36 million entries. Users performing searches in this database for a subject of interest face thousands of entries (articles) that are difficult to process manually. In this work, we present an interactive tool for automatically digesting large sets of PubMed articles: PMIDigest (PubMed IDs digester). The system allows for classification/sorting of articles according to different criteria, including the type of article and different citation-related figures. It also calculates the distribution of MeSH (medical subject headings) terms for categories of interest, providing a picture of the themes addressed in the set. These MeSH terms are highlighted in the article abstracts in different colors depending on the category. An interactive representation of the interarticle citation network is also presented in order to easily locate article "clusters" related to particular subjects, as well as their corresponding "hub" articles. In addition to PubMed articles, the system can also process a set of Scopus or Web of Science entries. In summary, with this system, the user can have a "bird's eye view" of a large set of articles and their main thematic tendencies and obtain additional information not evident in a plain list of abstracts.
Affiliation(s)
- Jorge Novoa
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), Darwin, 3, 28049 Madrid, Spain
| | - Mónica Chagoyen
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), Darwin, 3, 28049 Madrid, Spain
| | - Carlos Benito
- Instituto de Gestión de la Innovación y del Conocimiento, INGENIO (CSIC and U. Politécnica de Valencia), Edificio 8E, Cam. de Vera, 46022 Valencia, Spain
| | - F Javier Moreno
- Instituto de Investigación en Ciencias de la Alimentación (CIAL), CSIC-UAM, CEI (UAM+CSIC), Nicolás Cabrera, 9, 28049 Madrid, Spain
| | - Florencio Pazos
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), Darwin, 3, 28049 Madrid, Spain
| |
|
21
|
Shokri Garjan H, Omidi Y, Poursheikhali Asghari M, Ferdousi R. In-silico computational approaches to study microbiota impacts on diseases and pharmacotherapy. Gut Pathog 2023; 15:10. [PMID: 36882861 PMCID: PMC9990230 DOI: 10.1186/s13099-023-00535-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/09/2023] [Accepted: 02/21/2023] [Indexed: 03/09/2023] Open
Abstract
Microorganisms have been linked to a variety of critical human diseases, thanks to advances in sequencing technology and microbiology. The growing recognition of human microbe-disease relationships provides crucial insights into the underlying disease process from the perspective of pathogens, which is extremely useful for pathogenesis research, early diagnosis, and precision medicine and therapy. Microbe-based analysis in terms of diseases and related drug discovery can predict new connections/mechanisms and provide new concepts. These phenomena have been studied via various in-silico computational approaches. This review aims to elaborate on the computational works conducted on the microbe-disease and microbe-drug topics, discuss the computational model approaches used for predicting associations and provide comprehensive information on the related databases. Finally, we discuss potential prospects and obstacles in this field of study, while also outlining some recommendations for further enhancing predictive capabilities.
Affiliation(s)
- Hassan Shokri Garjan
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Yadollah Omidi
- Department of Pharmaceutical Sciences, Nova Southeastern University, College of Pharmacy, Fort Lauderdale, FL, USA
| | | | - Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran.
| |
|
22
|
Metabolomic Profiling of the Responses of Planktonic and Biofilm Vibrio cholerae to Silver Nanoparticles. Antibiotics (Basel) 2022; 11:antibiotics11111534. [DOI: 10.3390/antibiotics11111534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2022] [Revised: 10/27/2022] [Accepted: 10/28/2022] [Indexed: 11/06/2022] Open
Abstract
Vibrio cholerae causes cholera and can switch between planktonic and biofilm lifeforms, where biofilm formation enhances transmission, virulence, and antibiotic resistance. Due to antibiotic microbial resistance, new antimicrobials including silver nanoparticles (AgNPs) are being studied. Nevertheless, little is known about the metabolic changes exerted by AgNPs on both microbial lifeforms. Our objective was to evaluate the changes in the metabolomic profile of V. cholerae planktonic and biofilm cells in response to sublethal concentrations of AgNPs using MS2 untargeted metabolomics and chemoinformatics. A total of 690 metabolites were quantified among all groups. More metabolites were significantly modulated in planktonic cells (n = 71) compared to biofilm (n = 37) by the treatment. The chemical class profiles were distinct for both planktonic and biofilm, suggesting a phenotype-dependent metabolic response to the nanoparticles. Chemical enrichment analysis showed altered abundances of oxidized fatty acids (FA), saturated FA, phosphatidic acids, and saturated stearic acid in planktonic cells treated with AgNPs, which hints at a turnover of the membrane. In contrast, no chemical classes were enriched in the biofilm. In conclusion, this study suggests that the response of V. cholerae to silver nanoparticles is phenotype-dependent and that planktonic cells experience a lipid remodeling process, possibly related to an adaptive mechanism involving the cell membrane.
|
23
|
Yang N, Zhao Y, Bai Z, Chen H, Ning H, Zou M, Cheng G. The association of non-vitamin K antagonist oral anticoagulants vs. warfarin and the risk of fractures for patients with atrial fibrillation: a systematic review and meta-analysis. Acta Cardiol 2022; 78:298-310. [PMID: 36063197 DOI: 10.1080/00015385.2022.2030555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2022]
Abstract
BACKGROUND The fracture risks of non-vitamin K antagonist oral anticoagulants (NOACs) vs. warfarin in patients with atrial fibrillation (AF) remain controversial. METHODS PubMed, Cochrane Library, EMBASE, and ClinicalTrials.gov databases were systematically searched for RCTs and cohort studies from inception to 10 June 2021. RESULTS Twelve-two studies met the inclusion criteria and 477,821 patients were included. Warfarin increased the risk of fracture in AF patients compared with NOACs for any fracture overall (RR = 0.79; 95% CI = 0.70-10.88; p = 0.00) and for osteoporotic fracture (RR = 0.746; 95% CI = 0.630-0.883; p = 0.001). No significant difference was observed in hip or pelvic fracture, vertebral fracture, extremity fracture, wrist fracture, femoral neck fracture, or ankle fracture. In subgroup analyses based on several aspects, NOACs were associated with a significant reduction in any fracture (standard dosage NOACs, cohort studies, elderly patients, rivaroxaban in RCTs, dabigatran, rivaroxaban, and apixaban in cohort studies), in hip/pelvic fracture (follow-up time ≤1 year, cohort studies), and in osteoporotic fracture (cohort studies). CONCLUSION NOACs were associated with a significantly lower risk of any fracture and osteoporotic fracture compared to warfarin. This benefit was also observed for the specific NOACs dabigatran, rivaroxaban, and apixaban. However, whether NOACs carried a lower risk than warfarin for the other fracture types remained uncertain.
Affiliation(s)
- Nana Yang
- School of Life Sciences and Biopharmaceutics, Shenyang Pharmaceutical University, Shenyang, Liaoning, China
| | - Ying Zhao
- School of Life Sciences and Biopharmaceutics, Shenyang Pharmaceutical University, Shenyang, Liaoning, China
| | - Zhaohui Bai
- School of Life Sciences and Biopharmaceutics, Shenyang Pharmaceutical University, Shenyang, Liaoning, China
| | - Haokun Chen
- School of Life Sciences and Biopharmaceutics, Shenyang Pharmaceutical University, Shenyang, Liaoning, China
| | - Haoyu Ning
- School of Life Sciences and Biopharmaceutics, Shenyang Pharmaceutical University, Shenyang, Liaoning, China
| | - Meijuan Zou
- Pharmaceutical College, Shenyang Pharmaceutical University, Shenyang, Liaoning, China
| | - Gang Cheng
- Pharmaceutical College, Shenyang Pharmaceutical University, Shenyang, Liaoning, China; NMPA Key Laboratory for Research and Evaluation of Drug Regulatory Technology, Shenyang, Liaoning, China
| |
|
24
|
Network-Based Methods for Approaching Human Pathologies from a Phenotypic Point of View. Genes (Basel) 2022; 13:genes13061081. [PMID: 35741843 PMCID: PMC9222217 DOI: 10.3390/genes13061081] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/13/2022] [Revised: 06/10/2022] [Accepted: 06/14/2022] [Indexed: 01/27/2023] Open
Abstract
Network and systemic approaches to studying human pathologies are helping us to gain insight into the molecular mechanisms of and potential therapeutic interventions for human diseases, especially for complex diseases where large numbers of genes are involved. The complex human pathological landscape is traditionally partitioned into discrete “diseases”; however, that partition is sometimes problematic, as diseases are highly heterogeneous and can differ greatly from one patient to another. Moreover, for many pathological states, the set of symptoms (phenotypes) manifested by the patient is not enough to diagnose a particular disease. In contrast, phenotypes, by definition, are directly observable and can be closer to the molecular basis of the pathology. These clinical phenotypes are also important for personalised medicine, as they can help stratify patients and design personalised interventions. For these reasons, network and systemic approaches to pathologies are gradually incorporating phenotypic information. This review covers the current landscape of phenotype-centred network approaches to studying different aspects of human diseases.
|
25
|
Lee S, Jeon S, Kim HS. A Study on Methodologies of Drug Repositioning Using Biomedical Big Data: A Focus on Diabetes Mellitus. Endocrinol Metab (Seoul) 2022; 37:195-207. [PMID: 35413782 PMCID: PMC9081315 DOI: 10.3803/enm.2022.1404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Accepted: 03/21/2022] [Indexed: 11/11/2022]
Abstract
Drug repositioning is a strategy for identifying new applications of an existing drug that has been previously proven to be safe. Based on several examples of drug repositioning, we aimed to determine the methodologies and relevant steps associated with drug repositioning that should be pursued in the future. Reports on drug repositioning, retrieved from PubMed from January 2011 to December 2020, were classified based on an analysis of the methodology and reviewed by experts. Among various drug repositioning methods, the network-based approach was the most common (38.0%, 186/490 cases), followed by machine learning/deep learning-based (34.3%, 168/490 cases), text mining-based (7.1%, 35/490 cases), semantic-based (5.3%, 26/490 cases), and others (15.3%, 75/490 cases). Although drug repositioning offers several advantages, its implementation is curtailed by the need for prior, conclusive clinical proof. This approach requires the construction of various databases, and a deep understanding of the process underlying repositioning is essential. An in-depth understanding of drug repositioning could reduce the time, cost, and risks inherent to early drug development, providing reliable scientific evidence. Furthermore, regarding patient safety, drug repurposing might allow the discovery of new relationships between drugs and diseases.
Affiliation(s)
- Suehyun Lee
  - Department of Biomedical Informatics, Konyang University College of Medicine, Daejeon, Korea
  - Health Care Data Science Center, Konyang University Hospital, Daejeon, Korea
- Seongwoo Jeon
  - Health Care Data Science Center, Konyang University Hospital, Daejeon, Korea
- Hun-Sung Kim
  - Department of Medical Informatics, College of Medicine, The Catholic University of Korea, Seoul, Korea
  - Division of Endocrinology and Metabolism, Department of Internal Medicine, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, Seoul, Korea
  - Corresponding author: Hun-Sung Kim Department of Medical Informatics, College of Medicine, The Catholic University of Korea, 222 Banpo-daero, Seocho-gu, Seoul 06591, Korea Tel: +82-2-2258-8262, Fax: +82-2-2258-8297, E-mail:
|
26
|
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022]
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in predicting potential correlations between microbes, drugs and diseases, from biological data to computational models. First, we introduce in detail the widely used datasets relevant to identifying potential relationships between microbes, drugs and diseases. We then divide a series of representative computational models into five major categories (network, matrix factorization, matrix completion, regularization and artificial neural network) for in-depth discussion and comparison. Finally, we analyse possible challenges and opportunities in this research area and outline some suggestions for further improving predictive performance.
Affiliation(s)
- Lei Wang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Yaqin Tan
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Xiaoyu Yang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Linai Kuang
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
- Pengyao Ping
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
|
27
|
Dhivagaran T, Abbas U, Butt F, Arunasalam L, Chang O. Critical appraisal of clinical practice guidelines for the management of COVID-19: protocol for a systematic review. Syst Rev 2021; 10:317. [PMID: 34937576 PMCID: PMC8694758 DOI: 10.1186/s13643-021-01871-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2020] [Accepted: 12/09/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND In December 2019, a novel coronavirus, severe acute respiratory syndrome coronavirus 2, was identified as the cause of an acute respiratory disease, coronavirus disease 2019 (COVID-19). Given the lack of validated treatments, there is an urgent need for high-quality management of COVID-19. Clinical practice guidelines (CPGs) are one tool that healthcare providers may use to enhance patient care. As such, it is necessary that they have access to high-quality evidence-based CPGs upon which they may base decisions regarding the management and use of therapeutic interventions (TI) for COVID-19. The purpose of the proposed study is to assess the quality of CPGs that make management or TI recommendations for COVID-19 using the AGREE II instrument. METHODS The proposed systematic review will identify CPGs for TI use and/or the management of COVID-19. The MEDLINE, EMBASE, CINAHL, and Web of Science databases, as well as the Guidelines International Network, National Institute for Health and Clinical Excellence, Scottish Intercollegiate Guidelines Network, and the World Health Organization websites, will be searched from December 2019 onwards. The primary outcome of this study is the assessed quality of the CPGs. The quality of eligible CPGs will be assessed using the Appraisal of Guidelines, Research and Evaluation II (AGREE II) instrument. Descriptive statistics will be used to quantify the quality of the CPGs. The secondary outcomes of this study are the types of management and/or TI recommendations made. Inconsistent and duplicate TI and/or management recommendations made between CPGs will be compared across guidelines. To summarize and explain the findings related to the included CPGs, a narrative synthesis will also be provided. DISCUSSION The results of this study will be of utmost importance to enhancing clinical decision-making among healthcare providers caring for patients with COVID-19.
Moreover, the results of this study will be relevant to guideline developers in the creation of CPGs or improvement of existing ones, researchers who want to identify gaps in knowledge, and policy-makers looking to encourage and endorse the adoption of CPGs into clinical practice. The results of this review will be published in a peer-reviewed journal and presented at conferences. SYSTEMATIC REVIEW REGISTRATION International Prospective Register for Systematic Reviews (PROSPERO)- CRD42020219944.
Affiliation(s)
- Umaima Abbas
  - Faculty of Science, McMaster University, Hamilton, ON, Canada
- Fahad Butt
  - Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
- Oswin Chang
  - Faculty of Health Sciences, McMaster University, Hamilton, ON, Canada
|
28
|
GCAEMDA: Predicting miRNA-disease associations via graph convolutional autoencoder. PLoS Comput Biol 2021; 17:e1009655. [PMID: 34890410 PMCID: PMC8694430 DOI: 10.1371/journal.pcbi.1009655] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/13/2021] [Revised: 12/22/2021] [Accepted: 11/17/2021] [Indexed: 01/02/2023]
Abstract
microRNAs (miRNAs) are small non-coding RNAs related to a number of complicated biological processes. A growing body of studies has suggested that miRNAs are closely associated with many human diseases. It is meaningful to consider disease-related miRNAs as potential biomarkers, which could greatly contribute to understanding the mechanisms of complex diseases and benefit the prevention, detection, diagnosis and treatment of extraordinary diseases. In this study, we present a novel model named Graph Convolutional Autoencoder for miRNA-Disease Association Prediction (GCAEMDA). In the proposed model, we utilized miRNA-miRNA similarities, disease-disease similarities and verified miRNA-disease associations to construct a heterogeneous network, which is used to learn the embeddings of miRNAs and diseases. In addition, we separately constructed miRNA-based and disease-based sub-networks. Combining the embeddings of miRNAs and diseases, a graph convolutional autoencoder (GCAE) was utilized to calculate miRNA-disease association scores on the two sub-networks, respectively. Furthermore, we obtained the final prediction scores between miRNAs and diseases by averaging the prediction scores from the two types of sub-networks. To assess the accuracy of GCAEMDA, we applied different cross-validation methods, under which our model performed better than the state-of-the-art models. Case studies on common human diseases were also implemented to demonstrate the effectiveness of GCAEMDA. The results demonstrated that GCAEMDA is useful for inferring potential miRNA-disease associations. Numerous studies have demonstrated that miRNAs are closely related to several common human diseases, so identifying unverified associations between miRNAs and diseases is conducive to the diagnosis and treatment of complex diseases. 
Many models proposed to infer potential miRNA-disease associations have made prediction more effective and productive. We constructed the GCAEMDA model to obtain more accurate prediction results by integrating a graph convolutional network and an autoencoder to make predictions based on multi-source miRNA and disease information. Five-fold cross validation and global leave-one-out cross validation were implemented to evaluate the performance of our model. GCAEMDA reached AUCs of 0.9415 and 0.9505, respectively, which were distinctly higher than the AUCs of the other models compared. Furthermore, we carried out case studies on lung neoplasms and breast neoplasms to demonstrate the practical application of the model; 47 and 47 of the top-50 candidate miRNAs, respectively, were confirmed by experimental reports. In summary, GCAEMDA can be considered an effective and accurate model for revealing relationships between miRNAs and diseases.
|
29
|
RFLMDA: A Novel Reinforcement Learning-Based Computational Model for Human MicroRNA-Disease Association Prediction. Biomolecules 2021; 11:biom11121835. [PMID: 34944479 PMCID: PMC8699433 DOI: 10.3390/biom11121835] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/19/2021] [Revised: 12/01/2021] [Accepted: 12/02/2021] [Indexed: 11/23/2022]
Abstract
Numerous studies have confirmed that microRNAs play a crucial role in the research of complex human diseases. Identifying the relationship between miRNAs and diseases is important for improving the treatment of complex diseases. However, traditional biological experiments have limitations, so computational methods for predicting unknown miRNA-disease associations are urgently needed. In this work, we use the Q-learning algorithm from reinforcement learning to propose the RFLMDA model, in which three submodels, CMF, NRLMF, and LapRLS, are fused via Q-learning to obtain the optimal weight S. The performance of RFLMDA was evaluated through five-fold cross-validation and local validation. As a result, the optimal weight obtained is S (0.1735, 0.2913, 0.5352), and the AUC is 0.9416. Comparison experiments with other methods show that the RFLMDA model performs better. To better validate the predictive performance of RFLMDA, we use eight diseases for local verification and carry out case studies on three common human diseases. Consequently, all of the top 50 miRNAs related to Colorectal Neoplasms and Breast Neoplasms have been confirmed. Among the top 50 miRNAs related to Colon Neoplasms, Gastric Neoplasms, Pancreatic Neoplasms, Kidney Neoplasms, Esophageal Neoplasms, and Lymphoma, we confirm 47, 41, 49, 46, 46 and 48 miRNAs, respectively.
|
30
|
Affiliation(s)
- Steven Woloshin
  - The Center for Medicine and the Media, The Dartmouth Institute for Health Policy and Clinical Practice, Geisel School of Medicine at Dartmouth
  - The Lisa Schwartz Foundation for Truth in Medicine
|
31
|
Sadatmoosavi A, Tajedini O, Esmaeili O, Abolhasani Zadeh F, Khazaneha M. Emerging Trends and Thematic Evolution of Breast Cancer: Knowledge Mapping and Co-Word Analysis. JMIR Cancer 2021; 7:e26691. [PMID: 34709188 PMCID: PMC8587182 DOI: 10.2196/26691] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/22/2020] [Revised: 04/08/2021] [Accepted: 05/30/2021] [Indexed: 01/22/2023]
Abstract
BACKGROUND One of the requirements for scientists and researchers to enter any field of science is to have a comprehensive and accurate understanding of that discipline. OBJECTIVE This study aims to draw a science map, provide structural analysis, explore the evolution, and determine new trends in research articles published in the field of breast cancer. METHODS This study comprised a descriptive survey with a scientometric approach. Data were collected from MEDLINE using a search strategy based on Medical Subject Heading (MeSH) terms. This study used science mapping, which provides a visual representation and a longitudinal evolution of possible interrelations between scientific areas, documents, or authors, thus reflecting the cognitive architecture of science mapping. For this scientometric evaluation of the topic of breast cancer research, a very long period was considered for data collection. Moreover, due to the availability of numerous publications in the database, the assessment was divided into three different periods ranging from 1988 to 2020. RESULTS A total of 12,577 records related to scientometric studies were extracted. The field of breast cancer research demonstrated three diagrams containing the most relevant themes for the three chronological periods evaluated. Each diagram was plotted based on the centrality and density linked to each research topic. The research output in the field was observed to revolve around 8 areas or themes: radiation injury, cardiovascular disease, fibroadenoma, antineoplastic agent, estrogen antagonistic, immunohistochemistry, soybean, and epitopes, each represented with different colors. CONCLUSIONS In the strategic diagrams, the themes were both well developed and important for the structuring of a research field. 
The first quadrant comprised motor themes of the specialty, which present strong centrality and high density (eg, corticosteroid antineoplastic age, stem cell, T-lymphocyte, protein tyrosine kinase, dietary, and phosphatidyl inositol-3-kinase). In the second quadrant of the diagram, themes have well-developed internal ties but unimportant external ties, as they are of only marginal importance for the field. These themes are very specialized and peripheral (eg, DNA-binding). In the third quadrant, themes are both weakly developed and marginal. The themes in this quadrant have low density and centrality and mainly represent either emerging or declining themes (eg, ovarian neoplasm). Themes in the fourth quadrant of the strategic diagram are considered important for a research field but are not fully developed. This quadrant contains transversal and general, basic themes (eg, immunohistochemistry). Scientometric analysis of breast cancer research can be regarded as a roadmap for future research and policymaking for this important field.
Affiliation(s)
- Ali Sadatmoosavi
  - Department of Medical Library & Information Sciences, Faculty of Management and Medical Information Sciences, Kerman University of Medical Sciences, Kerman, Iran
- Oranus Tajedini
  - Department of Knowledge and Information Science, Shahid Bahonar University of Kerman, Kerman, Iran
- Omid Esmaeili
  - School of Medicine, Shahid Mohammadi Hospital, Hormozgan University of Medical Sciences, Hormozgan, Iran
- Firouzeh Abolhasani Zadeh
  - Department of General Surgery, School of Medicine, Afzalipour Hospital, Kerman University of Medical Sciences, Kerman, Iran
- Mahdiyeh Khazaneha
  - Department of Information Sciences and Medical Informatics, Kerman University of Medical Sciences, Kerman, Iran
|
32
|
Wilson MP, Kaur J, Blake L, Oliveto AH, Thompson RG, Pyne JM, Wolf L, Walker AP, Waliski AD, Nordstrom K. Adherence to guideline creation recommendations for suicide prevention in the emergency department: A systematic review. Am J Emerg Med 2021; 50:553-560. [PMID: 34547697 DOI: 10.1016/j.ajem.2021.07.042] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/05/2021] [Revised: 06/21/2021] [Accepted: 07/20/2021] [Indexed: 12/12/2022]
Abstract
OBJECTIVES Suicide rates in the United States rose 35.2% from 1999-2018. As emergency department (ED) providers often have limited training in management of suicidal patients and minimal access to mental health experts, clinical practice guidelines (CPGs) may improve care for these patients. However, clinical practice guidelines that do not adhere to quality standards for development may be harmful both to patients, if they promote practices based on flawed evidence, and to ED providers, if used in malpractice claims. In 2011, the Institute of Medicine created standards to determine the trustworthiness of CPGs. This review assessed the adherence of suicide prevention CPGs, intended for the ED, to these standards. Secondary objectives were to assess the association of adherence both with first author/organization specialty (ED vs non-ED) and with inclusion of recommendations on substance use, a potent risk factor for suicide. METHODS This is a systematic review of available suicide-prevention CPGs for the ED in both peer-reviewed and gray literature. This review followed the PRISMA standards for reporting systematic reviews. RESULTS Of 22 included CPGs, the 7 ED-sponsored CPGs had higher adherence to quality standards (3.1 vs 2.4) and included the highest-rated CPG (ICAR2E) identified by this review. Regardless of specialty, nearly all CPGs included some mention of identifying or managing substance use. CONCLUSIONS Most suicide prevention CPGs intended for the ED are written by non-ED first authors or organizations and have low adherence to quality standards. Future CPGs should be developed with more scientific rigor, include a multidisciplinary writing group, and be created by authors working in the practice environment to which the CPG applies.
Affiliation(s)
- Michael P Wilson
  - Division of Research and Evidence-Based Medicine, Department of Emergency Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America; Department of Emergency Medicine Behavioral Emergencies Research Lab, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Jaskiran Kaur
  - Department of Emergency Medicine Behavioral Emergencies Research Lab, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America; College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Lindsay Blake
  - Academic Affairs, UAMS Library, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Alison H Oliveto
  - Department of Psychiatry, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Ronald G Thompson
  - Department of Psychiatry, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Jeffrey M Pyne
  - Department of Psychiatry, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Lisa Wolf
  - Emergency Nurses Association, Schaumburg, Illinois
- A Paige Walker
  - Department of Emergency Medicine Behavioral Emergencies Research Lab, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America; College of Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Angela D Waliski
  - Department of Health Services Research and Development, Central Arkansas Veteran's Healthcare System, Little Rock, AR, United States of America
- Kimberly Nordstrom
  - Department of Emergency Medicine Behavioral Emergencies Research Lab, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America; Department of Psychiatry, University of Colorado School of Medicine, Denver, CO, United States of America
|
33
|
Yang H, Ding Y, Tang J, Guo F. Identifying potential association on gene-disease network via dual hypergraph regularized least squares. BMC Genomics 2021; 22:605. [PMID: 34372777 PMCID: PMC8351363 DOI: 10.1186/s12864-021-07864-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/09/2021] [Accepted: 06/29/2021] [Indexed: 12/27/2022]
Abstract
BACKGROUND Identifying potential associations between genes and diseases via biomedical experiments is time-consuming and expensive. Computational technologies based on machine learning models have been widely utilized to explore genetic information related to complex diseases. Importantly, gene-disease association detection can be defined as a link prediction problem in a bipartite network. However, many existing methods do not utilize multiple sources of biological information; additionally, they do not extract higher-order relationships among genes and diseases. RESULTS In this study, we propose a novel method called Dual Hypergraph Regularized Least Squares (DHRLS) with Centered Kernel Alignment-based Multiple Kernel Learning (CKA-MKL) to detect potential gene-disease associations. First, we construct multiple kernels based on various biological data sources in the gene and disease spaces, respectively. After that, we use CKA-MKL to obtain the optimal kernel in each of the two spaces. Specifically, hypergraphs are employed to establish higher-order relationships. Finally, our DHRLS model is solved by the Alternating Least Squares algorithm (ALSA) to predict gene-disease associations. CONCLUSION Compared with many outstanding prediction tools, DHRLS achieves the best performance on the gene-disease association network under two types of cross validation. To verify robustness, we also evaluated the proposed approach on six real-world networks, where it shows excellent prediction performance. Our research work can effectively discover potential disease-associated genes and provide guidance for the follow-up verification methods of complex diseases.
Affiliation(s)
- Hongpeng Yang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Yijie Ding
  - Yangtze Delta Region Institute, University of Electronic Science and Technology of China, Quzhou, China
- Jijun Tang
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Fei Guo
  - School of Computer Science and Engineering, Central South University, Changsha, China
|
34
|
Pan Y, Lei X, Zhang Y. Association predictions of genomics, proteinomics, transcriptomics, microbiome, metabolomics, pathomics, radiomics, drug, symptoms, environment factor, and disease networks: A comprehensive approach. Med Res Rev 2021; 42:441-461. [PMID: 34346083 DOI: 10.1002/med.21847] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 12/26/2020] [Revised: 05/22/2021] [Accepted: 07/07/2021] [Indexed: 12/12/2022]
Abstract
Currently, multi-omics research, such as genomics, proteinomics, transcriptomics, microbiome, metabolomics, pathomics, and radiomics, is a hot spot. The relationship between multi-omics data, drugs, and diseases has received extensive attention from researchers. At the same time, multi-omics can effectively predict the diagnosis, prognosis, and treatment of diseases. In essence, these research entities, such as genes, RNAs, proteins, microbes, metabolites, pathways, and pathological and medical imaging data, can all be represented by networks at different levels. Some computer science and biology scholars have tried to use computational methods to explore the potential relationships between biological entities. We summarize a comprehensive research strategy: build a multi-omics heterogeneous network covering multimodal data, and use currently popular computational methods to make predictions. In this study, we first introduce methods for calculating the similarity of biological entities at the data level, and then discuss multimodal data fusion and feature extraction methods. Some scholars have already used such a framework for calculation and prediction, and we summarize their work as well. Finally, the challenges and opportunities at this stage are discussed. We hope that our review can help scholars who are interested in the fields of bioinformatics, biomedical imaging, and computer science.
Affiliation(s)
- Yi Pan
  - Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
- Yuchen Zhang
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
|
35
|
Kim SS, Hudgins AD, Gonzalez B, Milman S, Barzilai N, Vijg J, Tu Z, Suh Y. A Compendium of Age-Related PheWAS and GWAS Traits for Human Genetic Association Studies, Their Networks and Genetic Correlations. Front Genet 2021; 12:680560. [PMID: 34140970 PMCID: PMC8204079 DOI: 10.3389/fgene.2021.680560] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/14/2021] [Accepted: 04/29/2021] [Indexed: 11/18/2022]
Abstract
The rich data from the genome-wide association studies (GWAS) and phenome-wide association studies (PheWAS) offer an unprecedented opportunity to identify the biological underpinnings of age-related disease (ARD) risk and multimorbidity. Surprisingly, however, a comprehensive list of ARDs remains unavailable due to the lack of a clear definition and selection criteria. We developed a method to identify ARDs and to provide a compendium of ARDs for genetic association studies. Querying 1,358 electronic medical record-derived traits, we first defined ARDs and age-related traits (ARTs) based on their prevalence profiles, requiring a unimodal distribution that shows an increasing prevalence after the age of 40 years, and which reaches a maximum peak at 60 years of age or later. As a result, we identified a list of 463 ARDs and ARTs in the GWAS and PheWAS catalogs. We next translated the ARDs and ARTs to their respective 276 Medical Subject Headings diseases and 45 anatomy terms. The most abundant disease categories are neoplasms (48 terms), cardiovascular diseases (44 terms), and nervous system diseases (27 terms). Employing data from a human symptoms-disease network, we found 6 symptom-shared disease groups, representing cancers, heart diseases, brain diseases, joint diseases, eye diseases, and mixed diseases. Lastly, by overlaying our ARD and ART list with genetic correlation data from the UK Biobank, we found 54 phenotypes in 2 clusters with high genetic correlations. Our compendium of ARD and ART is a highly useful resource, with broad applicability for studies of the genetics of aging, ARD, and multimorbidity.
Collapse
Affiliation(s)
- Seung-Soo Kim
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, United States
| | - Adam D. Hudgins
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, United States
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, United States
| | - Brenda Gonzalez
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, United States
| | - Sofiya Milman
- Institute for Aging Research, Department of Medicine and Genetics, Albert Einstein College of Medicine, Bronx, NY, United States
| | - Nir Barzilai
- Institute for Aging Research, Department of Medicine and Genetics, Albert Einstein College of Medicine, Bronx, NY, United States
| | - Jan Vijg
- Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, United States
| | - Zhidong Tu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Yousin Suh
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, United States
- Department of Genetics and Development, Columbia University Irving Medical Center, New York, NY, United States
|
36
|
|
37
|
Zhou JR, You ZH, Cheng L, Ji BY. Prediction of lncRNA-disease associations via an embedding learning HOPE in heterogeneous information networks. Mol Ther Nucleic Acids 2021; 23:277-285. [PMID: 33425486 PMCID: PMC7773765 DOI: 10.1016/j.omtn.2020.10.040] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/04/2020] [Accepted: 10/28/2020] [Indexed: 12/30/2022]
Abstract
Uncovering additional long non-coding RNA (lncRNA)-disease associations has become increasingly important for developing treatments for complex human diseases. Identification of lncRNA biomarkers and lncRNA-disease associations is central to diagnosis and treatment. However, traditional experimental methods are expensive and time-consuming. The enormous amounts of data in public biological databases are available to computational methods for predicting lncRNA-disease associations. In this study, we propose a novel computational method to predict lncRNA-disease associations. More specifically, a heterogeneous network is first constructed by integrating the associations among microRNA (miRNA), lncRNA, protein, drug, and disease. Second, high-order proximity preserved embedding (HOPE) is used to embed the nodes of the network. Finally, a rotation forest classifier is adopted to train the prediction model. In the 5-fold cross-validation experiment, our method achieved an area under the curve (AUC) of 0.8328 ± 0.0236. We compared it with four other classifiers, and the proposed method remarkably outperformed them. In addition, we conducted case studies for three cancers with excess death rates. The results show that 9 (lung cancer, gastric cancer, and hepatocellular carcinomas) out of the top 15 predicted disease-related lncRNAs were confirmed by our method. In conclusion, our method could predict unknown lncRNA-disease associations effectively.
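To make the idea of "high-order proximity" in this abstract concrete, the following Python sketch computes a truncated Katz index on a small graph. This is an assumption-laden illustration: the Katz index is one of the proximity measures that HOPE can preserve, but the abstract does not say which measure was used, and the function names, the decay factor `beta`, and the toy graph are hypothetical. The factorization step that turns the proximity matrix into low-dimensional embeddings, and the rotation forest classifier, are omitted.

```python
def mat_mul(a, b):
    """Multiply two dense matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def katz_proximity(adj, beta=0.1, max_order=10):
    """Truncated Katz index: sum over k = 1..max_order of beta**k * adj**k.

    Each term counts walks of length k between two nodes, damped by beta**k,
    so nodes with no direct edge still receive a non-zero score when they are
    connected through intermediate nodes.
    """
    n = len(adj)
    total = [[0.0] * n for _ in range(n)]
    power = [[float(i == j) for j in range(n)] for i in range(n)]  # adj**0
    weight = 1.0
    for _ in range(max_order):
        power = mat_mul(power, adj)  # adj**k
        weight *= beta               # beta**k
        for i in range(n):
            for j in range(n):
                total[i][j] += weight * power[i][j]
    return total


# Toy path graph 0 - 1 - 2 (hypothetical; e.g. lncRNA - miRNA - disease).
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
scores = katz_proximity(path)
```

Here `scores[0][2]` is positive even though nodes 0 and 2 share no edge, which is the kind of indirect signal an embedding built on this proximity can pass to a downstream classifier. The full (untruncated) series converges only when `beta` is smaller than the reciprocal of the adjacency matrix's spectral radius.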
Affiliation(s)
- Ji-Ren Zhou
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Zhu-Hong You
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Li Cheng
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Bo-Ya Ji
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- University of Chinese Academy of Sciences, Beijing 100049, China
|
38
|
Ding Y, Jiang L, Tang J, Guo F. Identification of human microRNA-disease association via hypergraph embedded bipartite local model. Comput Biol Chem 2020; 89:107369. [DOI: 10.1016/j.compbiolchem.2020.107369] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 07/07/2020] [Revised: 08/03/2020] [Accepted: 08/31/2020] [Indexed: 12/16/2022]
|
39
|
Abstract
Machine learning has been heavily researched and widely used in many disciplines. However, achieving high accuracy requires a large amount of data that is sometimes difficult, expensive, or impractical to obtain. Integrating human knowledge into machine learning can significantly reduce data requirement, increase reliability and robustness of machine learning, and build explainable machine learning systems. This allows leveraging the vast amount of human knowledge and capability of machine learning to achieve functions and performance not available before and will facilitate the interaction between human beings and machine learning systems, making machine learning decisions understandable to humans. This paper gives an overview of the knowledge and its representations that can be integrated into machine learning and the methodology. We cover the fundamentals, current status, and recent progress of the methods, with a focus on popular and new topics. The perspectives on future directions are also discussed.
Affiliation(s)
- Changyu Deng
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - Xunbi Ji
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - Colton Rainey
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - Jianyu Zhang
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
| | - Wei Lu
- Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Materials Science & Engineering, University of Michigan, Ann Arbor, MI 48109, USA
|
40
|
Benefit-Risk Assessment of Vaccines. Part II: Proposal Towards Consolidated Standards of Reporting Quantitative Benefit-Risk Models Applied to Vaccines (BRIVAC). Drug Saf 2020; 43:1105-1120. [PMID: 32918682 PMCID: PMC7486804 DOI: 10.1007/s40264-020-00982-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/19/2023]
Abstract
INTRODUCTION Quantitative benefit-risk models (qBRm) applied to vaccines are increasingly used by public health authorities and pharmaceutical companies as an important tool to help decision makers with supporting benefit-risk assessment (BRA). However, many publications on vaccine qBRm provide insufficient details on the methodological approaches used. Incomplete and/or inadequate qBRm reporting may affect result interpretation and confidence in BRA, highlighting a need for the development of standard reporting guidance. OBJECTIVES Our objective was to provide an operational checklist for improved reporting of vaccine qBRm. METHODS The consolidated standards of reporting quantitative Benefit-RIsk models applied to VACcines (BRIVAC) were designed as a checklist of key information to report in qBRm scientific publications regarding the assessed vaccines, the methodological considerations and the results and their interpretation. RESULTS In total, 22 items and accompanying definitions, recommendations, explanations and examples were provided and divided into six main sections corresponding to the classic subdivisions of a scientific publication: title and abstract (items 1-2), introduction (items 3-4), methods (items 5-15), results (items 16-17), discussion (items 18-20) and other (items 21-22). CONCLUSIONS The BRIVAC checklist is the first initiative providing an operational checklist for improved reporting of qBRm applied to vaccines in scientific articles. It is intended to assist authors, peer-reviewers, editors and readers in their critical appraisal. Future initiatives are needed to provide methodological guidance to perform qBRm while taking into account the vaccine specificities.
|
41
|
Lei X, Tie J, Fujita H. Relational completion based non-negative matrix factorization for predicting metabolite-disease associations. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106238] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/28/2022]
|
42
|
Wen Z, Yan C, Duan G, Li S, Wu FX, Wang J. A survey on predicting microbe-disease associations: biological data and computational methods. Brief Bioinform 2020; 22:5881365. [PMID: 34020541 DOI: 10.1093/bib/bbaa157] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 05/28/2020] [Revised: 06/18/2020] [Accepted: 06/22/2020] [Indexed: 02/06/2023] Open
Abstract
Various microbes have proved to be closely related to the pathogenesis of human diseases. While many computational methods for predicting human microbe-disease associations (MDAs) have been developed, few systematic reviews of these methods have been reported. In this study, we provide a comprehensive overview of the existing methods. First, we introduce the data used in existing MDA prediction methods. Second, we classify those methods into different categories by their nature and describe their algorithms and strategies in detail. Next, experimental evaluations are conducted on representative methods using different similarity data and calculation methods to compare their prediction performance. Based on the principles of the computational methods and the experimental results, we discuss the advantages and disadvantages of those methods and propose suggestions for improving prediction performance. Finally, considering the problems of MDA prediction at the present stage, we discuss future work from three perspectives: data, methods, and formulations.
Affiliation(s)
- Zhongqi Wen
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
| | - Cheng Yan
- School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Guihua Duan
- School of Computer Science and Engineering, Central South University
| | - Suning Li
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan, China
| | - Fang-Xiang Wu
- College of Engineering and the Department of Computer Sciences, University of Saskatchewan, Saskatoon, Canada
| | - Jianxin Wang
- Hunan Provincial Key Lab of Bioinformatics, School of Computer Science and Engineering at Central South University, Hunan, China
|
43
|
Spitale G. Making sense in the flood. How to cope with the massive flow of digital information in medical ethics. Heliyon 2020; 6:e04426. [PMID: 32743090 PMCID: PMC7385457 DOI: 10.1016/j.heliyon.2020.e04426] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/08/2019] [Revised: 01/20/2020] [Accepted: 07/08/2020] [Indexed: 01/08/2023] Open
Abstract
Scientific publications have become the currency of Academia, hence the concept of 'publish or perish'. But there are consequences: the amount of existing literature and its proliferation rate have reached the point where keeping pace is just impossible. If this is true in general, it becomes a huge issue in interdisciplinary fields such as bioethics where knowing the state of the art in more than one single discipline is a concrete necessity. If we accept the idea of building new science on an exhaustive comprehension of existing knowledge, a radical change is needed. Smart iterative search strategies, frequency analysis and text mining, techniques described in this paper, can't be a long run solution. But they might serve as a useful coping strategy.
Affiliation(s)
- Giovanni Spitale
- Institute of Biomedical Ethics and History of Medicine, University of Zurich, Winterthurerstrasse 30, 8006, Zurich, Switzerland
|
44
|
Lu K, Yang K, Niyongabo E, Shu Z, Wang J, Chang K, Zou Q, Jiang J, Jia C, Liu B, Zhou X. Integrated network analysis of symptom clusters across disease conditions. J Biomed Inform 2020; 107:103482. [PMID: 32535270 DOI: 10.1016/j.jbi.2020.103482] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 11/24/2019] [Revised: 05/18/2020] [Accepted: 06/08/2020] [Indexed: 10/24/2022]
Abstract
Identifying symptom clusters (two or more related symptoms) with shared underlying molecular mechanisms is a vital analysis task for advancing symptom science and precision health. Related studies have applied clustering algorithms (e.g., k-means, latent class models) to detect symptom clusters, mostly from various kinds of clinical data. In addition, they focused on identifying the symptom clusters (SCs) for a specific disease and were mainly concerned with the clinical regularities for symptom management. Here, we utilized a network-based clustering algorithm (i.e., BigCLAM) to obtain 208 typical SCs across disease conditions on a large-scale symptom network derived from integrated high-quality disease-symptom associations. Furthermore, we evaluated the underlying shared molecular mechanisms for SCs, i.e., shared genes, protein-protein interactions (PPIs), and gene functional annotations, using integrated networks and similarity measures. We found that the symptoms in the same SC tend to share more genes and PPIs and to have higher functional homogeneity. In addition, we found that most SCs have related symptoms with shared underlying molecular mechanisms (e.g., enriched pathways) across different disease conditions. Our work demonstrated that the integrated network analysis method could be used for identifying robust SCs and investigating the molecular mechanisms of these SCs, which would be valuable for symptom science and precision health.
Affiliation(s)
- Kezhi Lu
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
| | - Kuo Yang
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
| | - Edouard Niyongabo
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
| | - Zixin Shu
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
| | - Jingjing Wang
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
| | - Kai Chang
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
| | - Qunsheng Zou
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
| | - Jiyue Jiang
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
| | - Caiyan Jia
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China.
| | - Baoyan Liu
- Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China.
| | - Xuezhong Zhou
- Institute of Medical Intelligence, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China; Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China.
|
45
|
Zhang Y, Chen M, Cheng X, Wei H. MSFSP: A Novel miRNA-Disease Association Prediction Model by Federating Multiple-Similarities Fusion and Space Projection. Front Genet 2020; 11:389. [PMID: 32425980 PMCID: PMC7204399 DOI: 10.3389/fgene.2020.00389] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/28/2020] [Accepted: 03/27/2020] [Indexed: 12/11/2022] Open
Abstract
Growing evidence indicates that microRNAs (miRNAs) play a significant role in many important bioprocesses; their mutations and disorders cause various complex diseases. Predicting miRNAs associated with underlying diseases via computational approaches helps to identify biomarkers and discover specific medicines, which can greatly reduce the cost of diagnosis, cure, prognosis, and prevention of human diseases. However, achieving a more reliable prediction of potential miRNA-disease associations through effective integration of different biological data remains a challenge for researchers. In this study, we proposed a computational model that combines multiple-similarities fusion and space projection (MSFSP). First, MSFSP fused the integrated disease similarity (composed of disease semantic similarity, disease functional similarity, and disease Hamming similarity) with the integrated miRNA similarity (composed of miRNA functional similarity, miRNA sequence similarity, and miRNA Hamming similarity). Second, it constructed the weighted network of miRNA-disease associations from the experimentally verified Boolean network of miRNA-disease associations by using similarity networks. Finally, it calculated the prediction results by weighting the miRNA space projection scores and the disease space projection scores. Leave-one-out cross-validation demonstrated that MSFSP achieved an area under the receiver operating characteristic curve (AUC) of 0.9613, better than that of five other existing models. In case studies, the predictive ability of MSFSP was further confirmed: 96% and 98% of the top 50 predictions for prostatic neoplasms and lung neoplasms, respectively, were validated by experimental evidence, and supporting experimental evidence was also found for 100% of the top 50 predictions for isolated diseases.
Affiliation(s)
- Yi Zhang
- School of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Xiaohui Cheng
- School of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Hanyan Wei
- School of Pharmacy, Guilin Medical University, Guilin, China
|
46
|
Ma S, Yang K, Wang N, Zhu Q, Gao Z, Zhang R, Liu B, Zhou X. Disease phenotype synonymous prediction through network representation learning from PubMed database. Artif Intell Med 2020; 102:101745. [DOI: 10.1016/j.artmed.2019.101745] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/25/2018] [Revised: 03/04/2019] [Accepted: 10/23/2019] [Indexed: 11/15/2022]
|
47
|
Lei X, Tie J. Prediction of disease-related metabolites using bi-random walks. PLoS One 2019; 14:e0225380. [PMID: 31730648 PMCID: PMC6857945 DOI: 10.1371/journal.pone.0225380] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 07/07/2019] [Accepted: 11/04/2019] [Indexed: 12/25/2022] Open
Abstract
Metabolites play a significant role in various complex human diseases. Exploring the relationship between metabolites and diseases can help us better understand the underlying pathogenesis. Several network-based methods have been used to predict associations between metabolites and diseases. However, some methods ignore hierarchical differences in the disease network and fail to work in the absence of known metabolite-disease associations. This paper presents a bi-random walk based method for predicting disease-related metabolites, called MDBIRW. First, we reconstruct the disease similarity network and the metabolite functional similarity network by integrating the Gaussian Interaction Profile (GIP) kernel similarity of diseases and the GIP kernel similarity of metabolites, respectively. Then, the bi-random walk algorithm is executed on the reconstructed disease similarity network and metabolite functional similarity network to predict potential disease-metabolite associations. MDBIRW achieves reliable performance in leave-one-out cross validation (AUC of 0.910) and 5-fold cross validation (AUC of 0.924). The experimental results show that our method outperforms other existing methods for predicting disease-related metabolites.
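The Gaussian Interaction Profile (GIP) kernel similarity mentioned in this abstract can be sketched in a few lines of Python. This follows the commonly used definition, K(i, j) = exp(-γ ‖a_i − a_j‖²) with γ = γ′ divided by the mean squared norm of the interaction profiles. The function name, the choice γ′ = 1, and the toy association matrix are assumptions for illustration; the abstract does not give MDBIRW's exact settings, and the bi-random walk step itself is not shown.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """GIP kernel similarity between rows of a binary association matrix.

    profiles: list of rows, where row i is the interaction profile of entity i
    (for example, which metabolites a disease is known to be associated with).
    gamma_prime: bandwidth parameter; 1.0 is an assumed default.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all profiles are empty; the kernel is undefined")
    gamma = gamma_prime / mean_sq_norm  # normalise by average profile size

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return [[math.exp(-gamma * sq_dist(profiles[i], profiles[j]))
             for j in range(n)]
            for i in range(n)]


# Toy disease-by-metabolite matrix (invented values, not data from the paper).
associations = [[1, 1, 0],
                [1, 1, 0],
                [0, 0, 1]]
similarity = gip_kernel(associations)
```

Entities with identical profiles get a similarity of 1, and the similarity decays as profiles diverge. Because the kernel depends only on known associations, an entity with no known associations has an all-zero profile, which is one reason methods of this kind combine GIP similarity with other similarity sources.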
Affiliation(s)
- Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi’an, China
| | - Jiaojiao Tie
- School of Computer Science, Shaanxi Normal University, Xi’an, China
|
48
|
Meyer JG, Liu S, Miller IJ, Coon JJ, Gitter A. Learning Drug Functions from Chemical Structures with Convolutional Neural Networks and Random Forests. J Chem Inf Model 2019; 59:4438-4449. [PMID: 31518132 PMCID: PMC6819987 DOI: 10.1021/acs.jcim.9b00236] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 03/18/2019] [Indexed: 02/08/2023]
Abstract
Empirical testing of chemicals for drug efficacy costs many billions of dollars every year. The ability to predict the action of molecules in silico would greatly increase the speed and decrease the cost of prioritizing drug leads. Here, we asked whether drug function, defined as MeSH "therapeutic use" classes, can be predicted from only a chemical structure. We evaluated two chemical-structure-derived drug classification methods, chemical images with convolutional neural networks and molecular fingerprints with random forests, both of which outperformed previous predictions that used drug-induced transcriptomic changes as chemical representations. This suggests that the structure of a chemical contains at least as much information about its therapeutic use as the transcriptional cellular response to that chemical. Furthermore, because training data based on chemical structure is not limited to a small set of molecules for which transcriptomic measurements are available, our strategy can leverage more training data to significantly improve predictive accuracy to 83-88%. Finally, we explore use of these models for prediction of side effects and drug-repurposing opportunities and demonstrate the effectiveness of this modeling strategy for multilabel classification.
Affiliation(s)
- Jesse G. Meyer
- Department of Chemistry, Department of Biomolecular Chemistry, National Center for Quantitative Biology of Complex Systems, Department of Computer Sciences, Morgridge Institute for Research, DOE Great Lakes Bioenergy Research Center, and Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, Wisconsin 53706, United States
| | - Shengchao Liu
- Department of Chemistry, Department of Biomolecular Chemistry, National Center for Quantitative Biology of Complex Systems, Department of Computer Sciences, Morgridge Institute for Research, DOE Great Lakes Bioenergy Research Center, and Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, Wisconsin 53706, United States
| | - Ian J. Miller
- Department of Chemistry, Department of Biomolecular Chemistry, National Center for Quantitative Biology of Complex Systems, Department of Computer Sciences, Morgridge Institute for Research, DOE Great Lakes Bioenergy Research Center, and Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, Wisconsin 53706, United States
| | - Joshua J. Coon
- Department of Chemistry, Department of Biomolecular Chemistry, National Center for Quantitative Biology of Complex Systems, Department of Computer Sciences, Morgridge Institute for Research, DOE Great Lakes Bioenergy Research Center, and Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, Wisconsin 53706, United States
| | - Anthony Gitter
- Department of Chemistry, Department of Biomolecular Chemistry, National Center for Quantitative Biology of Complex Systems, Department of Computer Sciences, Morgridge Institute for Research, DOE Great Lakes Bioenergy Research Center, and Department of Biostatistics and Medical Informatics, University of Wisconsin–Madison, Madison, Wisconsin 53706, United States
|
49
|
Balaneshinkordan S, Kotov A. Bayesian approach to incorporating different types of biomedical knowledge bases into information retrieval systems for clinical decision support in precision medicine. J Biomed Inform 2019; 98:103238. [DOI: 10.1016/j.jbi.2019.103238] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/11/2018] [Revised: 06/15/2019] [Accepted: 06/21/2019] [Indexed: 10/26/2022]
|
50
|
Cheng L, Zhao H, Wang P, Zhou W, Luo M, Li T, Han J, Liu S, Jiang Q. Computational Methods for Identifying Similar Diseases. Mol Ther Nucleic Acids 2019; 18:590-604. [PMID: 31678735 PMCID: PMC6838934 DOI: 10.1016/j.omtn.2019.09.019] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 04/25/2019] [Revised: 09/11/2019] [Accepted: 09/12/2019] [Indexed: 02/01/2023]
Abstract
Although our knowledge of human diseases has increased dramatically, the molecular basis, phenotypic traits, and therapeutic targets of most diseases still remain unclear. An increasing number of studies have observed that similar diseases often are caused by similar molecules, can be diagnosed by similar markers or phenotypes, or can be cured by similar drugs. Thus, the identification of diseases similar to known ones has attracted considerable attention worldwide. To this end, the associations between diseases at the molecular, phenotypic, and taxonomic levels were used to measure the pairwise similarity in diseases. The corresponding performance assessment strategies for these methods involving the terms “category-based,” “simulated-patient-based,” and “benchmark-data-based” were thus further emphasized. Then, frequently used methods were evaluated using a benchmark-data-based strategy. To facilitate the assessment of disease similarity scores, researchers have designed dozens of tools that implement these methods for calculating disease similarity. Currently, disease similarity has been advantageous in predicting noncoding RNA (ncRNA) function and therapeutic drugs for diseases. In this article, we review disease similarity methods, evaluation strategies, tools, and their applications in the biomedical community. We further evaluate the performance of these methods and discuss the current limitations and future trends for calculating disease similarity.
Affiliation(s)
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Hengqiang Zhao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Pingping Wang
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Wenyang Zhou
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Meng Luo
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Tianxin Li
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Junwei Han
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China.
| | - Shulin Liu
- Systemomics Center, College of Pharmacy, and Genomics Research Center (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Harbin Medical University, Harbin, Heilongjiang, China; Department of Microbiology, Immunology and Infectious Diseases, University of Calgary, Calgary, AB, Canada.
| | - Qinghua Jiang
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China.
|