Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Callahan TJ, Tripodi IJ, Pielke-Lombardo H, Hunter LE. Knowledge-Based Biomedical Data Science. Annu Rev Biomed Data Sci 2020;3:23-41. [PMID: 33954284 PMCID: PMC8095730 DOI: 10.1146/annurev-biodatasci-010820-091627] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/23/2022]

Cited by Other Article(s)

Callahan TJ, Tripodi IJ, Stefanski AL, Cappelletti L, Taneja SB, Wyrwa JM, Casiraghi E, Matentzoglu NA, Reese J, Silverstein JC, Hoyt CT, Boyce RD, Malec SA, Unni DR, Joachimiak MP, Robinson PN, Mungall CJ, Cavalleri E, Fontana T, Valentini G, Mesiti M, Gillenwater LA, Santangelo B, Vasilevsky NA, Hoehndorf R, Bennett TD, Ryan PB, Hripcsak G, Kahn MG, Bada M, Baumgartner WA, Hunter LE. An open source knowledge graph ecosystem for the life sciences. Sci Data 2024;11:363. [PMID: 38605048 PMCID: PMC11009265 DOI: 10.1038/s41597-024-03171-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 03/21/2024] [Indexed: 04/13/2024] [Open Access]

Affiliation(s)

Tiffany J Callahan Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA. Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA.
Ignacio J Tripodi Computer Science Department, Interdisciplinary Quantitative Biology, University of Colorado Boulder, Boulder, CO, 80301, USA
Adrianne L Stefanski Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
Luca Cappelletti AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
Sanya B Taneja Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15260, USA
Jordan M Wyrwa Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
Elena Casiraghi AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy. Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Nicolas A Matentzoglu Semanticly, Athens, Greece
Justin Reese Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Jonathan C Silverstein Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
Charles Tapley Hoyt Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, 02115, USA
Richard D Boyce Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
Scott A Malec Division of Translational Informatics, University of New Mexico School of Medicine, Albuquerque, NM, 87131, USA
Deepak R Unni SIB Swiss Institute of Bioinformatics, Basel, Switzerland
Marcin P Joachimiak Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Peter N Robinson Berlin Institute of Health at Charité-Universitätsmedizin, 10117, Berlin, Germany
Christopher J Mungall Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
Emanuele Cavalleri AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
Tommaso Fontana AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
Giorgio Valentini AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy. ELLIS, European Laboratory for Learning and Intelligent Systems, Milan Unit, Italy
Marco Mesiti AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
Lucas A Gillenwater Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA. Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
Brook Santangelo Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA. Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
Nicole A Vasilevsky Data Collaboration Center, Critical Path Institute, 1840 E River Rd. Suite 100, Tucson, AZ, 85718, USA
Robert Hoehndorf Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
Tellen D Bennett Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA. Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
Patrick B Ryan Janssen Research and Development, Raritan, NJ, 08869, USA
George Hripcsak Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
Michael G Kahn Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
Michael Bada Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA
William A Baumgartner Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
Lawrence E Hunter Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA. Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA.

Verma G, Rebholz-Schuhmann D, Madden MG. Enabling personalised disease diagnosis by combining a patient's time-specific gene expression profile with a biomedical knowledge base. BMC Bioinformatics 2024;25:62. [PMID: 38326757 PMCID: PMC10848462 DOI: 10.1186/s12859-024-05674-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Accepted: 01/25/2024] [Indexed: 02/09/2024] [Open Access]

Abstract

BACKGROUND

Recent developments in the domain of biomedical knowledge bases (KBs) open up new ways to exploit biomedical knowledge that is available in the form of KBs. Significant work has been done in the direction of biomedical KB creation and KB completion, specifically, those having gene-disease associations and other related entities. However, the use of such biomedical KBs in combination with patients' temporal clinical data still largely remains unexplored, but has the potential to immensely benefit medical diagnostic decision support systems.

RESULTS

We propose two new algorithms, LOADDx and SCADDx, to combine a patient's gene expression data with gene-disease association and other related information available in the form of a KB, to assist personalized disease diagnosis. We have tested both algorithms on two KBs and on four real-world gene expression datasets of respiratory viral infection caused by Influenza-like viruses of 19 subtypes. We also compare the performance of the proposed algorithms with that of five existing state-of-the-art machine learning algorithms (k-NN, Random Forest, XGBoost, Linear SVM, and SVM with RBF Kernel) using two validation approaches: LOOCV and a single internal validation set. Both SCADDx and LOADDx outperform the existing algorithms when evaluated with both validation approaches. SCADDx is able to detect infections with up to 100% accuracy in the cases of Datasets 2 and 3. Overall, SCADDx and LOADDx are able to detect an infection within 72 h of infection with 91.38% and 92.66% average accuracy, respectively, across all four datasets, whereas XGBoost, which performed best among the existing machine learning algorithms, detects the infection with only 86.43% accuracy on average.

CONCLUSIONS

We demonstrate how our novel idea of using the most and least differentially expressed genes in combination with a KB can enable identification of the diseases that a patient is most likely to have at a particular time, from a KB with thousands of diseases. Moreover, the proposed algorithms can provide a short ranked list of the most likely diseases for each patient along with their most affected genes, and other entities linked with them in the KB, which can support health care professionals in their decision-making.

Arsène S, Parès Y, Tixier E, Granjeon-Noriot S, Martin B, Bruezière L, Couty C, Courcelles E, Kahoul R, Pitrat J, Go N, Monteiro C, Kleine-Schultjann J, Jemai S, Pham E, Boissel JP, Kulesza A. In Silico Clinical Trials: Is It Possible? Methods Mol Biol 2024;2716:51-99. [PMID: 37702936 DOI: 10.1007/978-1-0716-3449-3_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/14/2023]

Abstract

Modeling and simulation (M&S), including in silico (clinical) trials, help accelerate drug research and development and reduce costs, and have given rise to the term "model-informed drug development (MIDD)." Data-driven, inferential approaches are increasingly complemented by emerging complex physiologically based and knowledge-based disease (and drug) models, but the approaches differ in setup, bottlenecks, data requirements, and applications (also reminiscent of the different scientific communities they arose from). At the same time, and within the MIDD landscape, regulators and drug developers are starting to embrace in silico trials as a potential tool to refine, reduce, and ultimately replace clinical trials. Effectively, silos between the historically distinct modeling approaches are starting to break down. Widespread adoption of in silico trials still needs more collaboration between different stakeholders and established precedent use cases in key applications, which is currently impeded by a fragmented collection of tools and practices. To address these key challenges, efforts to establish best practice workflows need to be undertaken and new collaborative M&S tools devised; this chapter attempts to provide a coherent set of solutions. First, a dedicated workflow for the in silico clinical trial (development) life cycle is provided, which takes up general ideas from the systems biology and quantitative systems pharmacology space and which implements specific steps toward regulatory qualification. Then, key characteristics of an in silico trial software platform implementation are illustrated using the example of jinkō.ai (nova's end-to-end in silico clinical trial platform). Considering these enabling scientific and technological advances, future applications of in silico trials to refine, reduce, and replace clinical research are indicated, ranging from synthetic control strategies to digital twins, which overall show promise to begin a new era of more efficient drug development.

Lotz JC, Ropella G, Anderson P, Yang Q, Hedderich MA, Bailey J, Hunt CA. An exploration of knowledge-organizing technologies to advance transdisciplinary back pain research. JOR Spine 2023;6:e1300. [PMID: 38156063 PMCID: PMC10751978 DOI: 10.1002/jsp2.1300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2023] [Revised: 10/02/2023] [Accepted: 10/29/2023] [Indexed: 12/30/2023] [Open Access]

Abstract

Chronic low back pain (LBP) is influenced by a broad spectrum of patient-specific factors as codified in domains of the biopsychosocial model (BSM). Operationalizing the BSM into research and clinical care is challenging because most investigators work in silos that concentrate on only one or two BSM domains. Furthermore, the expanding, multidisciplinary nature of BSM research creates practical limitations as to how individual investigators integrate current data into their processes of generating impactful hypotheses. The rapidly advancing field of artificial intelligence (AI) is providing new tools for organizing knowledge, but the practical aspects of how AI may advance LBP research and clinical care are beginning to be explored. The goals of the work presented here are to: (1) explore the current capabilities of knowledge integration technologies (large language models (LLMs), similarity graphs (SGs), and knowledge graphs (KGs)) to synthesize biomedical literature and depict multimodal relationships reflected in the BSM; and (2) highlight limitations, implementation details, and future areas of research to improve performance. We demonstrate preliminary evidence that LLMs, like GPT-3, may be useful in helping scientists analyze and distinguish cLBP publications across multiple BSM domains and determine the degree to which the literature supports or contradicts emergent hypotheses. We show that SG representations and KGs enable exploring LBP's literature in novel ways, possibly providing trans-disciplinary perspectives or insights that are currently difficult, if not infeasible, to achieve. The SG approach is automated, simple, and inexpensive to execute, and thereby may be useful for early-phase literature and narrative explorations beyond one's areas of expertise. Likewise, we show that KGs can be constructed using automated pipelines, queried to provide semantic information, and analyzed to explore trans-domain linkages. The examples presented support the feasibility of LBP-tailored AI protocols for organizing knowledge and for developing and refining trans-domain hypotheses.

Sun Z, Lin M, Zhu Q, Xie Q, Wang F, Lu Z, Peng Y. A scoping review on multimodal deep learning in biomedical images and texts. J Biomed Inform 2023;146:104482. [PMID: 37652343 PMCID: PMC10591890 DOI: 10.1016/j.jbi.2023.104482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 07/18/2023] [Accepted: 08/28/2023] [Indexed: 09/02/2023]

Lobentanzer S, Aloy P, Baumbach J, Bohar B, Carey VJ, Charoentong P, Danhauser K, Doğan T, Dreo J, Dunham I, Farr E, Fernandez-Torras A, Gyori BM, Hartung M, Hoyt CT, Klein C, Korcsmaros T, Maier A, Mann M, Ochoa D, Pareja-Lorente E, Popp F, Preusse M, Probul N, Schwikowski B, Sen B, Strauss MT, Turei D, Ulusoy E, Waltemath D, Wodke JAH, Saez-Rodriguez J. Democratizing knowledge representation with BioCypher. Nat Biotechnol 2023;41:1056-1059. [PMID: 37337100 DOI: 10.1038/s41587-023-01848-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2023]

Affiliation(s)

Sebastian Lobentanzer Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany.
Patrick Aloy Institute for Research in Biomedicine (IRB Barcelona), the Barcelona Institute of Science and Technology, Barcelona, Spain. Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
Jan Baumbach Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
Balazs Bohar Earlham Institute, Norwich, UK. Biological Research Centre, Szeged, Hungary
Vincent J Carey Channing Division of Network Medicine, Mass General Brigham, Harvard Medical School, Boston, MA, USA
Pornpimol Charoentong Centre for Quantitative Analysis of Molecular and Cellular Biosystems (Bioquant), Heidelberg University, Heidelberg, Germany. Department of Medical Oncology, National Centre for Tumour Diseases (NCT), Heidelberg University Hospital (UKHD), Heidelberg, Germany
Katharina Danhauser Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, LMU Munich, Munich, Germany
Tunca Doğan Biological Data Science Lab, Department of Computer Engineering, Hacettepe University, Ankara, Turkey. Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
Johann Dreo Computational Systems Biomedicine Lab, Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France. Bioinformatics and Biostatistics Hub, Institut Pasteur, Université Paris Cité, Paris, France
Ian Dunham European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK. Open Targets, Wellcome Genome Campus, Hinxton, UK
Elias Farr Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
Adrià Fernandez-Torras Institute for Research in Biomedicine (IRB Barcelona), the Barcelona Institute of Science and Technology, Barcelona, Spain
Benjamin M Gyori Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
Michael Hartung Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
Charles Tapley Hoyt Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, USA
Christoph Klein Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, LMU Munich, Munich, Germany
Tamas Korcsmaros Earlham Institute, Norwich, UK. Imperial College London, London, UK. Quadram Institute Bioscience, Norwich, UK
Andreas Maier Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
Matthias Mann Proteomics Program, Novo Nordisk Foundation Centre for Protein Research, University of Copenhagen, Copenhagen, Denmark. Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Martinsried, Germany
David Ochoa European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK. Open Targets, Wellcome Genome Campus, Hinxton, UK
Elena Pareja-Lorente Institute for Research in Biomedicine (IRB Barcelona), the Barcelona Institute of Science and Technology, Barcelona, Spain
Ferdinand Popp Applied Tumour Immunity Clinical Cooperation Unit, National Centre for Tumour Diseases (NCT), German Cancer Research Centre (DKFZ), Heidelberg, Germany
Martin Preusse German Centre for Diabetes Research (DZD), Neuherberg, Germany
Niklas Probul Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
Benno Schwikowski Computational Systems Biomedicine Lab, Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris, France
Bünyamin Sen Biological Data Science Lab, Department of Computer Engineering, Hacettepe University, Ankara, Turkey. Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
Maximilian T Strauss Proteomics Program, Novo Nordisk Foundation Centre for Protein Research, University of Copenhagen, Copenhagen, Denmark
Denes Turei Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
Erva Ulusoy Biological Data Science Lab, Department of Computer Engineering, Hacettepe University, Ankara, Turkey. Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
Dagmar Waltemath Medical Informatics Laboratory, University Medicine Greifswald, Greifswald, Germany
Judith A H Wodke Medical Informatics Laboratory, University Medicine Greifswald, Greifswald, Germany
Julio Saez-Rodriguez Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany.

Morgan JP, Paiement A, Klinke C. Domain-informed graph neural networks: A quantum chemistry case study. Neural Netw 2023;165:938-952. [PMID: 37453397 DOI: 10.1016/j.neunet.2023.06.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Revised: 05/05/2023] [Accepted: 06/24/2023] [Indexed: 07/18/2023]

Hou Y, Yeung J, Xu H, Su C, Wang F, Zhang R. From Answers to Insights: Unveiling the Strengths and Limitations of ChatGPT and Biomedical Knowledge Graphs. Research Square 2023:rs.3.rs-3185632. [PMID: 37577545 PMCID: PMC10418534 DOI: 10.21203/rs.3.rs-3185632/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]

Abstract

Purpose

Large Language Models (LLMs) have shown exceptional performance in various natural language processing tasks, benefiting from their language generation capabilities and ability to acquire knowledge from unstructured text. However, in the biomedical domain, LLMs face limitations that lead to inaccurate and inconsistent answers. Knowledge Graphs (KGs) have emerged as valuable resources for organizing structured information. Biomedical Knowledge Graphs (BKGs) have gained significant attention for managing diverse and large-scale biomedical knowledge. The objective of this study is to assess and compare the capabilities of ChatGPT and existing BKGs in question-answering, biomedical knowledge discovery, and reasoning tasks within the biomedical domain.

Methods

We conducted a series of experiments to assess the performance of ChatGPT and the BKGs in various aspects of querying existing biomedical knowledge, knowledge discovery, and knowledge reasoning. Firstly, we tasked ChatGPT with answering questions sourced from the "Alternative Medicine" sub-category of Yahoo! Answers and recorded the responses. Additionally, we queried BKG to retrieve the relevant knowledge records corresponding to the questions and assessed them manually. In another experiment, we formulated a prediction scenario to assess ChatGPT's ability to suggest potential drug/dietary supplement repurposing candidates. Simultaneously, we utilized BKG to perform link prediction for the same task. The outcomes of ChatGPT and BKG were compared and analyzed. Furthermore, we evaluated ChatGPT and BKG's capabilities in establishing associations between pairs of proposed entities. This evaluation aimed to assess their reasoning abilities and the extent to which they can infer connections within the knowledge domain.

Results

The results indicate that ChatGPT with GPT-4.0 outperforms both GPT-3.5 and BKGs in providing existing information. However, BKGs demonstrate higher reliability in terms of information accuracy. ChatGPT exhibits limitations in performing novel discoveries and reasoning, particularly in establishing structured links between entities compared to BKGs.

Conclusions

To address the limitations observed, future research should focus on integrating LLMs and BKGs to leverage the strengths of both approaches. Such integration would optimize task performance and mitigate potential risks, leading to advancements in knowledge within the biomedical field and contributing to the overall well-being of individuals.
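
The Methods above mention link prediction on a biomedical knowledge graph for drug repurposing. As a rough illustration only, the sketch below ranks unlinked drug-disease pairs by how many neighbors they share (a common-neighbors heuristic). This is not the method used in the study, which the abstract does not specify; the graph, the entity names, and the function names are all hypothetical.

```python
# Common-neighbors link prediction on a toy graph (illustrative assumption,
# not the study's method). All entity names below are made up.
EDGES = [
    ("drug_A", "gene_1"),
    ("drug_A", "gene_2"),
    ("drug_B", "gene_2"),
    ("drug_B", "gene_3"),
    ("disease_X", "gene_1"),
    ("disease_X", "gene_2"),
    ("disease_Y", "gene_3"),
]
DRUGS = ["drug_A", "drug_B"]
DISEASES = ["disease_X", "disease_Y"]


def build_adjacency(edges):
    """Return a dict mapping each node to the set of its neighbors."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def rank_drug_disease_links(edges, drugs, diseases):
    """Score each unlinked drug-disease pair by its number of shared neighbors."""
    adjacency = build_adjacency(edges)
    scored = []
    for drug in drugs:
        for disease in diseases:
            if disease in adjacency.get(drug, set()):
                continue  # already linked, nothing to predict
            shared = adjacency.get(drug, set()) & adjacency.get(disease, set())
            if shared:
                scored.append((drug, disease, len(shared)))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[2], item[0], item[1]))
    return scored


ranked = rank_drug_disease_links(EDGES, DRUGS, DISEASES)
for drug, disease, score in ranked:
    print(drug, disease, score)
```

In this toy graph the top candidate is the pair that shares the most gene neighbors; real systems typically use learned embeddings or richer path features rather than a raw neighbor count.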

Boguslav MR, Salem NM, White EK, Sullivan KJ, Bada M, Hernandez TL, Leach SM, Hunter LE. Creating an ignorance-base: Exploring known unknowns in the scientific literature. J Biomed Inform 2023;143:104405. [PMID: 37270143 PMCID: PMC10528083 DOI: 10.1016/j.jbi.2023.104405] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/08/2022] [Revised: 05/18/2023] [Accepted: 05/21/2023] [Indexed: 06/05/2023]

Abstract

BACKGROUND

Scientific discovery progresses by exploring new and uncharted territory. More specifically, it advances by a process of transforming unknown unknowns first into known unknowns, and then into knowns. Over the last few decades, researchers have developed many knowledge bases to capture and connect the knowns, which has enabled topic exploration and contextualization of experimental results. But recognizing the unknowns is also critical for finding the most pertinent questions and their answers. Prior work on known unknowns has sought to understand them, annotate them, and automate their identification. However, no knowledge-bases yet exist to capture these unknowns, and little work has focused on how scientists might use them to trace a given topic or experimental result in search of open questions and new avenues for exploration. We show here that a knowledge base of unknowns can be connected to ontologically grounded biomedical knowledge to accelerate research in the field of prenatal nutrition.

RESULTS

We present the first ignorance-base, a knowledge-base created by combining classifiers to recognize ignorance statements (statements of missing or incomplete knowledge that imply a goal for knowledge) and biomedical concepts over the prenatal nutrition literature. This knowledge-base places biomedical concepts mentioned in the literature in context with the ignorance statements authors have made about them. Using our system, researchers interested in the topic of vitamin D and prenatal health were able to uncover three new avenues for exploration (immune system, respiratory system, and brain development) by searching for concepts enriched in ignorance statements. These were buried among the many standard enriched concepts. Additionally, we used the ignorance-base to enrich concepts connected to a gene list associated with vitamin D and spontaneous preterm birth and found an emerging topic of study (brain development) in an implied field (neuroscience). The researchers could look to the field of neuroscience for potential answers to the ignorance statements.

CONCLUSION

Our goal is to help students, researchers, funders, and publishers better understand the state of our collective scientific ignorance (known unknowns) in order to help accelerate research through the continued illumination of and focus on the known unknowns and their respective goals for scientific knowledge.

Hou Y, Yeung J, Xu H, Su C, Wang F, Zhang R. From Answers to Insights: Unveiling the Strengths and Limitations of ChatGPT and Biomedical Knowledge Graphs. medRxiv 2023:2023.06.09.23291208. [PMID: 37398259 PMCID: PMC10312889 DOI: 10.1101/2023.06.09.23291208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]

Cenikj G, Strojnik L, Angelski R, Ogrinc N, Koroušić Seljak B, Eftimov T. From language models to large-scale food and biomedical knowledge graphs. Sci Rep 2023;13:7815. [PMID: 37188766 DOI: 10.1038/s41598-023-34981-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Accepted: 05/10/2023] [Indexed: 05/17/2023] [Open Access]

Su C, Hou Y, Zhou M, Rajendran S, Maasch JRA, Abedi Z, Zhang H, Bai Z, Cuturrufo A, Guo W, Chaudhry FF, Ghahramani G, Tang J, Cheng F, Li Y, Zhang R, DeKosky ST, Bian J, Wang F. Biomedical discovery through the integrative biomedical knowledge hub (iBKH). iScience 2023;26:106460. [PMID: 37020958 PMCID: PMC10068563 DOI: 10.1016/j.isci.2023.106460] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Revised: 09/20/2022] [Accepted: 03/16/2023] [Indexed: 04/01/2023] [Open Access]

Affiliation(s)

Chang Su Department of Health Service Administration and Policy, College of Public Health, Temple University, Philadelphia, PA 19122, USA
Yu Hou Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA. Department of Surgery, University of Minnesota, Minneapolis, MN 55455, USA
Manqi Zhou Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
Suraj Rajendran Tri-Institutional Computational Biology & Medicine Program, Cornell University, New York, NY 10065, USA
Jacqueline R. M. A. Maasch Department of Computer Science, Cornell Tech, New York, NY 10044, USA
Zehra Abedi Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA
Haotan Zhang Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA
Zilong Bai Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA
Anthony Cuturrufo Computer Science, Cornell University, Ithaca, NY 14850, USA
Winston Guo Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
Fayzan F. Chaudhry Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA
Gregory Ghahramani Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA
Jian Tang Mila-Quebec AI Institute and HEC Montreal, Montreal, QC H2S 3H1, Canada
Feixiong Cheng Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA. Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA. Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
Yue Li School of Computer Science, McGill University, Montreal, QC H3A 0C6, Canada
Rui Zhang Department of Surgery, University of Minnesota, Minneapolis, MN 55455, USA
Steven T. DeKosky Department of Neurology, College of Medicine, University of Florida, Gainesville, FL 32610, USA
Jiang Bian Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32610, USA
Fei Wang Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA

Moris D, Henao R, Hensman H, Stempora L, Chasse S, Schobel S, Dente CJ, Kirk AD, Elster E. Multidimensional machine learning models predicting outcomes after trauma. Surgery 2022;172:1851-1859. [PMID: 36116976 DOI: 10.1016/j.surg.2022.08.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 08/01/2022] [Accepted: 08/04/2022] [Indexed: 01/07/2023]

Abstract

BACKGROUND

An emerging body of literature supports the role of individualized prognostic tools to guide the management of patients after trauma. The aim of this study was to develop advanced modeling tools from multidimensional data sources, including immunological analytes and clinical and administrative data, to predict outcomes in trauma patients.

METHODS

This was a prospective study of trauma patients at Level 1 centers from 2015 to 2019. Clinical, flow cytometry, and serum cytokine data were collected within 48 hours of admission. Sparse logistic regression models were developed, jointly selecting predictors and estimating the risk of ventilator-associated pneumonia, acute kidney injury, complicated disposition (death, rehabilitation, or nursing facility), and return to the operating room. Model parameters (regularization controlling model sparsity) and performance estimation were obtained via nested leave-one-out cross-validation.

RESULTS

A total of 179 patients were included. The incidences of ventilator-associated pneumonia, acute kidney injury, complicated disposition, and return to the operating room were 17.7%, 28.8%, 22.5%, and 12.3%, respectively. Regarding extensive resource use, 30.7% of patients had prolonged intensive care unit stay, 73.2% had prolonged length of stay, and 23.5% had need for prolonged ventilatory support. The models were developed and cross-validated for ventilator-associated pneumonia, acute kidney injury, complicated dispositions, and return to the operating room, yielding predictive areas under the curve from 0.70 to 0.91. Each model derived its optimal predictive value by combining clinical, administrative, and immunological analyte data.

CONCLUSION

Clinical, immunological, and administrative data can be combined to predict post-traumatic outcomes and resource use. Multidimensional machine learning modeling can identify trauma patients with complicated clinical trajectories and high resource needs.
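The modeling approach described under METHODS (sparse logistic regression tuned and evaluated by nested leave-one-out cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the solver (proximal gradient descent with an L1 penalty), the inner selection criterion (log loss), the candidate penalty values, the synthetic data, and all function names are assumptions made for illustration only.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(w, t):
    # Proximal operator of the L1 penalty: shrink w toward zero by t.
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0

def fit_sparse_logreg(X, y, lam, lr=0.2, n_iter=100):
    # L1-penalized logistic regression via proximal gradient descent.
    # The intercept b is not penalized.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b

def predict_proba(model, x):
    w, b = model
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def log_loss(p, y, eps=1e-12):
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def loo_loss(X, y, lam):
    # Inner loop: mean held-out log loss over leave-one-out splits.
    total = 0.0
    for i in range(len(X)):
        model = fit_sparse_logreg(X[:i] + X[i + 1:], y[:i] + y[i + 1:], lam)
        total += log_loss(predict_proba(model, X[i]), y[i])
    return total / len(X)

def nested_loo(X, y, lambdas):
    # Outer loop estimates performance; the penalty is chosen by an inner
    # leave-one-out loop that never sees the outer held-out sample.
    preds, chosen = [], []
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        best = min(lambdas, key=lambda lam: loo_loss(X_tr, y_tr, lam))
        model = fit_sparse_logreg(X_tr, y_tr, best)
        preds.append(predict_proba(model, X[i]))
        chosen.append(best)
    return preds, chosen

def auc(scores, labels):
    # Probability that a random positive outranks a random negative
    # (ties count as one half).
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic data (hypothetical): one informative feature, two noise features.
rng = random.Random(0)
X, y = [], []
for i in range(12):
    label = i % 2
    X.append([rng.gauss(2.0 * label - 1.0, 1.0), rng.gauss(0, 1), rng.gauss(0, 1)])
    y.append(label)

LAMBDAS = [0.01, 0.1, 0.5]
preds, chosen = nested_loo(X, y, LAMBDAS)
print("outer-loop AUC:", auc(preds, y))
print("penalty chosen per outer fold:", chosen)
```

Because the penalty is selected inside each outer fold, the outer-loop AUC is not biased by the tuning step, which is the point of nesting the two loops.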

Li MM, Huang K, Zitnik M. Graph representation learning in biomedicine and healthcare. Nat Biomed Eng 2022;6:1353-1369. [PMID: 36316368 PMCID: PMC10699434 DOI: 10.1038/s41551-022-00942-x] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 10/30/2021] [Accepted: 08/09/2022] [Indexed: 11/11/2022]

Turki H, Rasberry L, Ali Hadj Taieb M, Mietchen D, Ben Aouicha M, Pouris A, Bousrih Y. Letter to the Editor: FHIR RDF - Why the world needs structured electronic health records. J Biomed Inform 2022;136:104253. [DOI: 10.1016/j.jbi.2022.104253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2022] [Revised: 11/15/2022] [Accepted: 11/16/2022] [Indexed: 11/21/2022]

DeBellis M, Dutta B. From ontology to knowledge graph with agile methods: the case of COVID-19 CODO knowledge graph. International Journal of Web Information Systems 2022. [DOI: 10.1108/ijwis-03-2022-0047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]

Fry M. Question-driven stepwise experimental discoveries in biochemistry: two case studies. History and Philosophy of the Life Sciences 2022;44:12. [PMID: 35320436 DOI: 10.1007/s40656-022-00491-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2021] [Accepted: 02/09/2022] [Indexed: 06/14/2023]

Dedié A, Bleimehl T, Täger J, Preusse M, Hrabě de Angelis M, Jarasch A. DZDconnect: mit vernetzten Daten gegen Diabetes [DZDconnect: fighting diabetes with connected data]. Diabetologe 2021. [DOI: 10.1007/s11428-021-00807-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022]

Mann M, Kumar C, Zeng WF, Strauss MT. Artificial intelligence for proteomics and biomarker discovery. Cell Syst 2021;12:759-770. [PMID: 34411543 DOI: 10.1016/j.cels.2021.06.006] [Citation(s) in RCA: 71] [Impact Index Per Article: 23.7] [Received: 03/04/2021] [Revised: 05/07/2021] [Accepted: 06/28/2021] [Indexed: 12/14/2022]

Kulmanov M, Smaili FZ, Gao X, Hoehndorf R. Semantic similarity and machine learning with ontologies. Brief Bioinform 2021;22:bbaa199. [PMID: 33049044 PMCID: PMC8293838 DOI: 10.1093/bib/bbaa199] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 05/07/2020] [Revised: 08/03/2020] [Accepted: 08/04/2020] [Indexed: 12/13/2022] Open