1. Soman K, Rose PW, Morris JH, Akbas RE, Smith B, Peetoom B, Villouta-Reyes C, Cerono G, Shi Y, Rizk-Jackson A, Israni S, Nelson CA, Huang S, Baranzini SE. Biomedical knowledge graph-optimized prompt generation for large language models. Bioinformatics (Oxford, England) 2024; 40:btae560. [PMID: 39288310 PMCID: PMC11441322 DOI: 10.1093/bioinformatics/btae560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Revised: 08/29/2024] [Accepted: 09/15/2024] [Indexed: 09/19/2024]
Abstract
MOTIVATION: Large language models (LLMs) are being adopted at an unprecedented rate, yet still face challenges in knowledge-intensive domains such as biomedicine. Solutions such as pretraining and domain-specific fine-tuning add substantial computational overhead and require further domain expertise. Here, we introduce a token-optimized and robust Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework by leveraging a massive biomedical KG (SPOKE) with LLMs such as Llama-2-13b, GPT-3.5-Turbo, and GPT-4, to generate meaningful biomedical text rooted in established knowledge.

RESULTS: Compared to the existing RAG technique for Knowledge Graphs, the proposed method utilizes minimal graph schema for context extraction and uses embedding methods for context pruning. This optimization in context extraction results in more than 50% reduction in token consumption without compromising the accuracy, making a cost-effective and robust RAG implementation on proprietary LLMs. KG-RAG consistently enhanced the performance of LLMs across diverse biomedical prompts by generating responses rooted in established knowledge, accompanied by accurate provenance and statistical evidence (if available) to substantiate the claims. Further benchmarking on human-curated datasets, such as biomedical true/false and multiple-choice questions (MCQ), showed a remarkable 71% boost in the performance of the Llama-2 model on the challenging MCQ dataset, demonstrating the framework's capacity to empower open-source models with fewer parameters for domain-specific questions. Furthermore, KG-RAG enhanced the performance of proprietary GPT models, such as GPT-3.5 and GPT-4. In summary, the proposed framework combines explicit and implicit knowledge of KG and LLM in a token-optimized fashion, thus enhancing the adaptability of general-purpose LLMs to tackle domain-specific questions in a cost-effective fashion.

AVAILABILITY AND IMPLEMENTATION: SPOKE KG can be accessed at https://spoke.rbvi.ucsf.edu/neighborhood.html. It can also be accessed using REST-API (https://spoke.rbvi.ucsf.edu/swagger/). KG-RAG code is made available at https://github.com/BaranziniLab/KG_RAG. Biomedical benchmark datasets used in this study are made available to the research community in the same GitHub repository.
Affiliation(s)
- Karthik Soman
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
- Peter W Rose
  - San Diego Supercomputer Center, University of California, San Diego, CA 92093, United States
- John H Morris
  - Department of Pharmaceutical Chemistry, School of Pharmacy, University of California, San Francisco, CA 94158, United States
- Rabia E Akbas
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
- Brett Smith
  - Institute for Systems Biology, Seattle, WA 98109, United States
- Braian Peetoom
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
- Catalina Villouta-Reyes
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
- Gabriel Cerono
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
- Yongmei Shi
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158, United States
- Angela Rizk-Jackson
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158, United States
- Sharat Israni
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA 94158, United States
- Charlotte A Nelson
  - Mate Bioservices, Inc. Swallowtail Ct., Brisbane, CA 94005, United States
- Sui Huang
  - Institute for Systems Biology, Seattle, WA 98109, United States
- Sergio E Baranzini
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, United States
2. Johnson R, Li MM, Noori A, Queen O, Zitnik M. Graph Artificial Intelligence in Medicine. Annu Rev Biomed Data Sci 2024; 7:345-368. [PMID: 38749465 PMCID: PMC11344018 DOI: 10.1146/annurev-biodatasci-110723-024625] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/23/2024]
Abstract
In clinical artificial intelligence (AI), graph representation learning, mainly through graph neural networks and graph transformer architectures, stands out for its capability to capture intricate relationships and structures within clinical datasets. With diverse data, from patient records to imaging, graph AI models process data holistically by viewing modalities and entities within them as nodes interconnected by their relationships. Graph AI facilitates model transfer across clinical tasks, enabling models to generalize across patient populations without additional parameters and with minimal to no retraining. However, the importance of human-centered design and model interpretability in clinical decision-making cannot be overstated. Since graph AI models capture information through localized neural transformations defined on relational datasets, they offer both an opportunity and a challenge in elucidating model rationale. Knowledge graphs can enhance interpretability by aligning model-driven insights with medical knowledge. Emerging graph AI models integrate diverse data modalities through pretraining, facilitate interactive feedback loops, and foster human-AI collaboration, paving the way toward clinically meaningful predictions.
Affiliation(s)
- Ruth Johnson
  - Berkowitz Family Living Laboratory, Harvard Medical School, Boston, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Michelle M Li
  - Bioinformatics and Integrative Genomics Program, Harvard Medical School, Boston, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Ayush Noori
  - Department of Computer Science, Harvard John A. Paulson School of Engineering and Applied Sciences, Allston, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Owen Queen
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Marinka Zitnik
  - Harvard Data Science Initiative, Cambridge, Massachusetts, USA
  - Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Allston, Massachusetts, USA
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
3. Sirocchi C, Bogliolo A, Montagna S. Medical-informed machine learning: integrating prior knowledge into medical decision systems. BMC Med Inform Decis Mak 2024; 24:186. [PMID: 38943085 PMCID: PMC11212227 DOI: 10.1186/s12911-024-02582-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 06/20/2024] [Indexed: 07/01/2024] Open
Abstract
BACKGROUND: Clinical medicine offers a promising arena for applying Machine Learning (ML) models. However, despite numerous studies employing ML in medical data analysis, only a fraction have impacted clinical care. This article underscores the importance of utilising ML in medical data analysis, recognising that ML alone may not adequately capture the full complexity of clinical data, thereby advocating for the integration of medical domain knowledge in ML.

METHODS: The study conducts a comprehensive review of prior efforts in integrating medical knowledge into ML and maps these integration strategies onto the phases of the ML pipeline, encompassing data pre-processing, feature engineering, model training, and output evaluation. The study further explores the significance and impact of such integration through a case study on diabetes prediction. Here, clinical knowledge, encompassing rules, causal networks, intervals, and formulas, is integrated at each stage of the ML pipeline, resulting in a spectrum of integrated models.

RESULTS: The findings highlight the benefits of integration in terms of accuracy, interpretability, data efficiency, and adherence to clinical guidelines. In several cases, integrated models outperformed purely data-driven approaches, underscoring the potential for domain knowledge to enhance ML models through improved generalisation. In other cases, the integration was instrumental in enhancing model interpretability and ensuring conformity with established clinical guidelines. Notably, knowledge integration also proved effective in maintaining performance under limited data scenarios.

CONCLUSIONS: By illustrating various integration strategies through a clinical case study, this work provides guidance to inspire and facilitate future integration efforts. Furthermore, the study identifies the need to refine domain knowledge representation and fine-tune its contribution to the ML model as the two main challenges to integration and aims to stimulate further research in this direction.
Affiliation(s)
- Christel Sirocchi
  - Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy
- Alessandro Bogliolo
  - Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy
- Sara Montagna
  - Department of Pure and Applied Sciences, University of Urbino, Piazza della Repubblica, 13, Urbino, 61029, Italy
4. Callahan TJ, Tripodi IJ, Stefanski AL, Cappelletti L, Taneja SB, Wyrwa JM, Casiraghi E, Matentzoglu NA, Reese J, Silverstein JC, Hoyt CT, Boyce RD, Malec SA, Unni DR, Joachimiak MP, Robinson PN, Mungall CJ, Cavalleri E, Fontana T, Valentini G, Mesiti M, Gillenwater LA, Santangelo B, Vasilevsky NA, Hoehndorf R, Bennett TD, Ryan PB, Hripcsak G, Kahn MG, Bada M, Baumgartner WA, Hunter LE. An open source knowledge graph ecosystem for the life sciences. Sci Data 2024; 11:363. [PMID: 38605048 PMCID: PMC11009265 DOI: 10.1038/s41597-024-03171-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 03/21/2024] [Indexed: 04/13/2024] Open
Abstract
Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
Affiliation(s)
- Tiffany J Callahan
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Ignacio J Tripodi
  - Computer Science Department, Interdisciplinary Quantitative Biology, University of Colorado Boulder, Boulder, CO, 80301, USA
- Adrianne L Stefanski
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Luca Cappelletti
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Sanya B Taneja
  - Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Jordan M Wyrwa
  - Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Elena Casiraghi
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Justin Reese
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Jonathan C Silverstein
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
- Charles Tapley Hoyt
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, 02115, USA
- Richard D Boyce
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
- Scott A Malec
  - Division of Translational Informatics, University of New Mexico School of Medicine, Albuquerque, NM, 87131, USA
- Deepak R Unni
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Marcin P Joachimiak
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Peter N Robinson
  - Berlin Institute of Health at Charité-Universitätsmedizin, 10117, Berlin, Germany
- Christopher J Mungall
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Emanuele Cavalleri
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Tommaso Fontana
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Giorgio Valentini
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
  - ELLIS, European Laboratory for Learning and Intelligent Systems, Milan Unit, Italy
- Marco Mesiti
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Lucas A Gillenwater
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Brook Santangelo
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Nicole A Vasilevsky
  - Data Collaboration Center, Critical Path Institute, 1840 E River Rd. Suite 100, Tucson, AZ, 85718, USA
- Robert Hoehndorf
  - Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Tellen D Bennett
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
  - Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Patrick B Ryan
  - Janssen Research and Development, Raritan, NJ, 08869, USA
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Michael G Kahn
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Michael Bada
  - Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- William A Baumgartner
  - Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Lawrence E Hunter
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
5. Yurkovich JT, Evans SJ, Rappaport N, Boore JL, Lovejoy JC, Price ND, Hood LE. The transition from genomics to phenomics in personalized population health. Nat Rev Genet 2024; 25:286-302. [PMID: 38093095 DOI: 10.1038/s41576-023-00674-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/03/2023] [Indexed: 03/21/2024]
Abstract
Modern health care faces several serious challenges, including an ageing population and its inherent burden of chronic diseases, rising costs and marginal quality metrics. By assessing and optimizing the health trajectory of each individual using a data-driven personalized approach that reflects their genetics, behaviour and environment, we can start to address these challenges. This assessment includes longitudinal phenome measures, such as the blood proteome and metabolome, gut microbiome composition and function, and lifestyle and behaviour through wearables and questionnaires. Here, we review ongoing large-scale genomics and longitudinal phenomics efforts and the powerful insights they provide into wellness. We describe our vision for the transformation of the current health care from disease-oriented to data-driven, wellness-oriented and personalized population health.
Affiliation(s)
- James T Yurkovich
  - Phenome Health, Seattle, WA, USA
  - Center for Phenomic Health, The Buck Institute for Research on Aging, Novato, CA, USA
  - Department of Bioengineering, University of Texas at Dallas, Richardson, TX, USA
- Simon J Evans
  - Phenome Health, Seattle, WA, USA
  - Center for Phenomic Health, The Buck Institute for Research on Aging, Novato, CA, USA
- Noa Rappaport
  - Center for Phenomic Health, The Buck Institute for Research on Aging, Novato, CA, USA
  - Institute for Systems Biology, Seattle, WA, USA
- Jeffrey L Boore
  - Phenome Health, Seattle, WA, USA
  - Center for Phenomic Health, The Buck Institute for Research on Aging, Novato, CA, USA
- Jennifer C Lovejoy
  - Phenome Health, Seattle, WA, USA
  - Center for Phenomic Health, The Buck Institute for Research on Aging, Novato, CA, USA
  - Institute for Systems Biology, Seattle, WA, USA
- Nathan D Price
  - Institute for Systems Biology, Seattle, WA, USA
  - Thorne HealthTech, New York, NY, USA
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
- Leroy E Hood
  - Phenome Health, Seattle, WA, USA
  - Center for Phenomic Health, The Buck Institute for Research on Aging, Novato, CA, USA
  - Institute for Systems Biology, Seattle, WA, USA
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
  - Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, WA, USA
  - Department of Immunology, University of Washington, Seattle, WA, USA
6. Ma C, Liu S, Koslicki D. MetagenomicKG: a knowledge graph for metagenomic applications. bioRxiv: the preprint server for biology 2024:2024.03.14.585056. [PMID: 38559251 PMCID: PMC10980061 DOI: 10.1101/2024.03.14.585056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Motivation: The sheer volume and variety of genomic content within microbial communities makes metagenomics a field rich in biomedical knowledge. To traverse these complex communities and their vast unknowns, metagenomic studies often depend on distinct reference databases, such as the Genome Taxonomy Database (GTDB), the Kyoto Encyclopedia of Genes and Genomes (KEGG), and the Bacterial and Viral Bioinformatics Resource Center (BV-BRC), for various analytical purposes. These databases are crucial for genetic and functional annotation of microbial communities. Nevertheless, the inconsistent nomenclature or identifiers of these databases present challenges for effective integration, representation, and utilization. Knowledge graphs (KGs) offer an appropriate solution by organizing biological entities and their interrelations into a cohesive network. The graph structure not only facilitates the unveiling of hidden patterns but also enriches our biological understanding with deeper insights. Despite KGs having shown potential in various biomedical fields, their application in metagenomics remains underexplored.

Results: We present MetagenomicKG, a novel knowledge graph specifically tailored for metagenomic analysis. MetagenomicKG integrates taxonomic, functional, and pathogenesis-related information from widely used databases, and further links these with established biomedical knowledge graphs to expand biological connections. Through several use cases, we demonstrate its utility in enabling hypothesis generation regarding the relationships between microbes and diseases, generating sample-specific graph embeddings, and providing robust pathogen prediction.

Availability and Implementation: The source code and technical details for constructing the MetagenomicKG and reproducing all analyses are available at GitHub: https://github.com/KoslickiLab/MetagenomicKG. We also host a Neo4j instance: http://mkg.cse.psu.edu:7474 for accessing and querying this graph.
Affiliation(s)
- Chunyu Ma
  - Huck Institutes of the Life Sciences, Pennsylvania State University, State College, Pennsylvania, USA
- Shaopeng Liu
  - Huck Institutes of the Life Sciences, Pennsylvania State University, State College, Pennsylvania, USA
- David Koslicki
  - Huck Institutes of the Life Sciences, Pennsylvania State University, State College, Pennsylvania, USA
  - Department of Computer Science and Engineering, Pennsylvania State University, State College, Pennsylvania, USA
  - Department of Biology, Pennsylvania State University, State College, Pennsylvania, USA
  - The One Health Microbiome Center, Huck Institutes of the Life Sciences, Pennsylvania State University, State College, Pennsylvania, USA
7. Tang AS, Rankin KP, Cerono G, Miramontes S, Mills H, Roger J, Zeng B, Nelson C, Soman K, Woldemariam S, Li Y, Lee A, Bove R, Glymour M, Aghaeepour N, Oskotsky TT, Miller Z, Allen IE, Sanders SJ, Baranzini S, Sirota M. Leveraging electronic health records and knowledge networks for Alzheimer's disease prediction and sex-specific biological insights. Nature Aging 2024; 4:379-395. [PMID: 38383858 PMCID: PMC10950787 DOI: 10.1038/s43587-024-00573-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Accepted: 01/19/2024] [Indexed: 02/23/2024]
Abstract
Identification of Alzheimer's disease (AD) onset risk can facilitate interventions before irreversible disease progression. We demonstrate that electronic health records from the University of California, San Francisco, followed by knowledge networks (for example, SPOKE) allow for (1) prediction of AD onset, (2) prioritization of biological hypotheses and (3) contextualization of sex dimorphism. We trained random forest models and predicted AD onset on a cohort of 749 individuals with AD and 250,545 controls with a mean area under the receiver operating characteristic of 0.72 (7 years prior) to 0.81 (1 day prior). We further harnessed matched cohort models to identify conditions with predictive power before AD onset. Knowledge networks highlight shared genes between multiple top predictors and AD (for example, APOE, ACTB, IL6 and INS). Genetic colocalization analysis supports AD association with hyperlipidemia at the APOE locus, as well as a stronger female AD association with osteoporosis at a locus near MS4A6A. We therefore show how clinical data can be utilized for early AD prediction and identification of personalized biological hypotheses.
Affiliation(s)
- Alice S Tang
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Graduate Program in Bioengineering, University of California, San Francisco and University of California, Berkeley, San Francisco and Berkeley, CA, USA
- Katherine P Rankin
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Memory and Aging Center, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Gabriel Cerono
  - Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Silvia Miramontes
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Hunter Mills
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Jacquelyn Roger
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Billy Zeng
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Charlotte Nelson
  - Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Karthik Soman
  - Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Sarah Woldemariam
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Yaqiao Li
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Albert Lee
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Riley Bove
  - Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Maria Glymour
  - Department of Anesthesiology, Pain, and Perioperative Medicine, Stanford University, Palo Alto, CA, USA
- Nima Aghaeepour
  - Department of Anesthesiology, Pain, and Perioperative Medicine, Stanford University, Palo Alto, CA, USA
  - Department of Pediatrics, Stanford University, Palo Alto, CA, USA
  - Department of Biomedical Data Science, Stanford University, Palo Alto, CA, USA
- Tomiko T Oskotsky
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Zachary Miller
  - Memory and Aging Center, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Isabel E Allen
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA
- Stephan J Sanders
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Institute of Developmental and Regenerative Medicine, Department of Paediatrics, University of Oxford, Oxford, UK
  - Department of Psychiatry and Behavioral Sciences, Weill Institute for Neurosciences, University of California, San Francisco, CA, USA
- Sergio Baranzini
  - Weill Institute for Neurosciences, Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
- Marina Sirota
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, CA, USA
8. Woodman RJ, Koczwara B, Mangoni AA. Applying precision medicine principles to the management of multimorbidity: the utility of comorbidity networks, graph machine learning, and knowledge graphs. Front Med (Lausanne) 2024; 10:1302844. [PMID: 38404463 PMCID: PMC10885565 DOI: 10.3389/fmed.2023.1302844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Accepted: 12/22/2023] [Indexed: 02/27/2024] Open
Abstract
The current management of patients with multimorbidity is suboptimal, with either a single-disease approach to care or treatment guideline adaptations that result in poor adherence due to their complexity. Although this has resulted in calls for more holistic and personalized approaches to prescribing, progress toward these goals has remained slow. With the rapid advancement of machine learning (ML) methods, promising approaches now also exist to accelerate the advance of precision medicine in multimorbidity. These include analyzing disease comorbidity networks, using knowledge graphs that integrate knowledge from different medical domains, and applying network analysis and graph ML. Multimorbidity disease networks have been used to improve disease diagnosis, treatment recommendations, and patient prognosis. Knowledge graphs that combine different medical entities connected by multiple relationship types integrate data from different sources, allowing for complex interactions and creating a continuous flow of information. Network analysis and graph ML can then extract the topology and structure of networks and reveal hidden properties, including disease phenotypes, network hubs, and pathways; predict drugs for repurposing; and determine safe and more holistic treatments. In this article, we describe the basic concepts of creating bipartite and unipartite disease and patient networks and review the use of knowledge graphs, graph algorithms, graph embedding methods, and graph ML within the context of multimorbidity. Specifically, we provide an overview of the application of graph theory for studying multimorbidity, the methods employed to extract knowledge from graphs, and examples of the application of disease networks for determining the structure and pathways of multimorbidity, identifying disease phenotypes, predicting health outcomes, and selecting safe and effective treatments. In today's modern data-hungry, ML-focused world, such network-based techniques are likely to be at the forefront of developing robust clinical decision support tools for safer and more holistic approaches to treating older patients with multimorbidity.
Affiliation(s)
- Richard John Woodman
  - Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
- Bogda Koczwara
  - Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
  - Department of Medical Oncology, Flinders Medical Centre, Southern Adelaide Local Health Network, Adelaide, SA, Australia
- Arduino Aleksander Mangoni
  - Flinders Health and Medical Research Institute, College of Medicine and Public Health, Flinders University, Adelaide, SA, Australia
  - Department of Clinical Pharmacology, Flinders Medical Centre, Southern Adelaide Local Health Network, Adelaide, SA, Australia

9
Abstract
Knowledge graphs represent information in the form of entities and relationships between those entities. Such a representation has multiple potential applications in drug discovery, including democratizing access to biomedical data, contextualizing or visualizing that data, and generating novel insights through the application of machine learning approaches. Knowledge graphs put data into context and therefore offer the opportunity to generate explainable predictions, which is a key topic in contemporary artificial intelligence. In this chapter, we outline some of the factors that need to be considered when constructing biomedical knowledge graphs, examine recent advances in mining such systems to gain insights for drug discovery, and identify potential future areas for further development.
Affiliation(s)
- Tim James
  - Evotec (UK) Ltd., Abingdon, Oxfordshire, UK.

10
Sánchez-Valle J, Valencia A. Molecular bases of comorbidities: present and future perspectives. Trends Genet 2023; 39:773-786. [PMID: 37482451 DOI: 10.1016/j.tig.2023.06.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/13/2023] [Revised: 06/12/2023] [Accepted: 06/12/2023] [Indexed: 07/25/2023]
Abstract
Co-occurrence of diseases decreases patient quality of life, complicates treatment choices, and increases mortality. Analyses of electronic health records present a complex scenario of comorbidity relationships that vary by age, sex, and cohort under study. The study of similarities between diseases using 'omics data, such as genes altered in diseases, gene expression, proteome, and microbiome, are fundamental to uncovering the origin of, and potential treatment for, comorbidities. Recent studies have produced a first generation of genetic interpretations for as much as 46% of the comorbidities described in large cohorts. Integrating different sources of molecular information and using artificial intelligence (AI) methods are promising approaches for the study of comorbidities. They may help to improve the treatment of comorbidities, including the potential repositioning of drugs.
Affiliation(s)
- Jon Sánchez-Valle
  - Life Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain.
- Alfonso Valencia
  - Life Sciences Department, Barcelona Supercomputing Center, Barcelona, 08034, Spain; ICREA, Barcelona, 08010, Spain.

11
Hou Y, Yeung J, Xu H, Su C, Wang F, Zhang R. From Answers to Insights: Unveiling the Strengths and Limitations of ChatGPT and Biomedical Knowledge Graphs. Research Square 2023:rs.3.rs-3185632. [PMID: 37577545 PMCID: PMC10418534 DOI: 10.21203/rs.3.rs-3185632/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Purpose: Large Language Models (LLMs) have shown exceptional performance in various natural language processing tasks, benefiting from their language generation capabilities and ability to acquire knowledge from unstructured text. However, in the biomedical domain, LLMs face limitations that lead to inaccurate and inconsistent answers. Knowledge Graphs (KGs) have emerged as valuable resources for organizing structured information. Biomedical Knowledge Graphs (BKGs) have gained significant attention for managing diverse and large-scale biomedical knowledge. The objective of this study is to assess and compare the capabilities of ChatGPT and existing BKGs in question-answering, biomedical knowledge discovery, and reasoning tasks within the biomedical domain. Methods: We conducted a series of experiments to assess the performance of ChatGPT and the BKGs in various aspects of querying existing biomedical knowledge, knowledge discovery, and knowledge reasoning. Firstly, we tasked ChatGPT with answering questions sourced from the "Alternative Medicine" sub-category of Yahoo! Answers and recorded the responses. Additionally, we queried BKG to retrieve the relevant knowledge records corresponding to the questions and assessed them manually. In another experiment, we formulated a prediction scenario to assess ChatGPT's ability to suggest potential drug/dietary supplement repurposing candidates. Simultaneously, we utilized BKG to perform link prediction for the same task. The outcomes of ChatGPT and BKG were compared and analyzed. Furthermore, we evaluated ChatGPT and BKG's capabilities in establishing associations between pairs of proposed entities. This evaluation aimed to assess their reasoning abilities and the extent to which they can infer connections within the knowledge domain. Results: The results indicate that ChatGPT with GPT-4.0 outperforms both GPT-3.5 and BKGs in providing existing information. However, BKGs demonstrate higher reliability in terms of information accuracy. ChatGPT exhibits limitations in performing novel discoveries and reasoning, particularly in establishing structured links between entities compared to BKGs. Conclusions: To address the limitations observed, future research should focus on integrating LLMs and BKGs to leverage the strengths of both approaches. Such integration would optimize task performance and mitigate potential risks, leading to advancements in knowledge within the biomedical field and contributing to the overall well-being of individuals.

12
Caufield JH, Putman T, Schaper K, Unni DR, Hegde H, Callahan TJ, Cappelletti L, Moxon SAT, Ravanmehr V, Carbon S, Chan LE, Cortes K, Shefchek KA, Elsarboukh G, Balhoff J, Fontana T, Matentzoglu N, Bruskiewich RM, Thessen AE, Harris NL, Munoz-Torres MC, Haendel MA, Robinson PN, Joachimiak MP, Mungall CJ, Reese JT. KG-Hub-building and exchanging biological knowledge graphs. Bioinformatics 2023; 39:btad418. [PMID: 37389415 PMCID: PMC10336030 DOI: 10.1093/bioinformatics/btad418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 05/09/2023] [Accepted: 06/29/2023] [Indexed: 07/01/2023]
Abstract
MOTIVATION: Knowledge graphs (KGs) are a powerful approach for integrating heterogeneous data and making inferences in biology and many other domains, but a coherent solution for constructing, exchanging, and facilitating the downstream use of KGs is lacking. RESULTS: Here we present KG-Hub, a platform that enables standardized construction, exchange, and reuse of KGs. Features include a simple, modular extract-transform-load pattern for producing graphs compliant with Biolink Model (a high-level data model for standardizing biological data), easy integration of any OBO (Open Biological and Biomedical Ontologies) ontology, cached downloads of upstream data sources, versioned and automatically updated builds with stable URLs, web-browsable storage of KG artifacts on cloud infrastructure, and easy reuse of transformed subgraphs across projects. Current KG-Hub projects span use cases including COVID-19 research, drug repurposing, microbial-environmental interactions, and rare disease research. KG-Hub is equipped with tooling to easily analyze and manipulate KGs. KG-Hub is also tightly integrated with graph machine learning (ML) tools which allow automated graph ML, including node embeddings and training of models for link prediction and node classification. AVAILABILITY AND IMPLEMENTATION: https://kghub.org.
Affiliation(s)
- J Harry Caufield
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States
- Tim Putman
  - Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, United States
- Kevin Schaper
  - Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, United States
- Deepak R Unni
  - SIB Swiss Institute of Bioinformatics, Basel 1015, Switzerland
- Harshad Hegde
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States
- Tiffany J Callahan
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY 10032, United States
- Luca Cappelletti
  - Department of Computer Science, University of Milano, Milan 20126, Italy
- Sierra A T Moxon
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States
- Vida Ravanmehr
  - Department of Lymphoma-Myeloma, MD Anderson Cancer Center, Houston, TX 77030, United States
- Seth Carbon
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States
- Lauren E Chan
  - College of Public Health and Human Sciences, Oregon State University, Corvallis, OR 97331, United States
- Katherina Cortes
  - Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, United States
- Kent A Shefchek
  - Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, United States
- Glass Elsarboukh
  - Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, United States
- Jim Balhoff
  - Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC 27517, United States
- Tommaso Fontana
  - Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milan 20133, Italy
- Anne E Thessen
  - Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, United States
- Nomi L Harris
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States
- Melissa A Haendel
  - Anschutz Medical Campus, University of Colorado, Aurora, CO 80045, United States
- Peter N Robinson
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, United States
- Marcin P Joachimiak
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States
- Christopher J Mungall
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States
- Justin T Reese
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, United States

13
Fernández Ó, Montalban X, Agüera E, Aladro Y, Alonso A, Arroyo R, Brieva L, Calles C, Costa-Frossard L, Eichau S, García-Domínguez JM, Hernández MÁ, Landete L, Llaneza M, Llufriu S, Meca-Lallana JE, Meca-Lallana V, Mongay-Ochoa N, Moral E, Oreja-Guevara C, Ramió-Torrentà L, Téllez N, Romero-Pinel L, Rodríguez-Antigüedad A. [15th Post-ECTRIMS Meeting: a review of the latest developments presented at the 2022 ECTRIMS Congress (Part I)]. Rev Neurol 2023; 77:19-30. [PMID: 37365721 PMCID: PMC10663806 DOI: 10.33588/rn.7701.2023167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Indexed: 06/28/2023]
Abstract
INTRODUCTION: On 4 and 5 November 2022, Madrid hosted the 15th edition of the Post-ECTRIMS Meeting, where neurologists specialised in multiple sclerosis (MS) outlined the most relevant novelties presented at the 2022 ECTRIMS Congress, held in Amsterdam from 26 to 28 October. AIM: To synthesise the content presented at the 15th edition of the Post-ECTRIMS Meeting, in an article broken down into two parts. DEVELOPMENT: In this first part, the initial events involved in the onset of MS, the role played by lymphocytes and the migration of immune system cells into the central nervous system are presented. It describes emerging biomarkers in body fluids and imaging findings that are predictive of disease progression and useful in the differential diagnosis of MS. It also discusses advances in imaging techniques which, together with a better understanding of the agents involved in demyelination and remyelination processes, provide a basis for dealing with remyelination in the clinical setting. Finally, the mechanisms triggering the inflammatory reaction and neurodegeneration involved in MS pathology are reviewed.
Affiliation(s)
- Óscar Fernández
  - Hospital Regional Universitario de Málaga, Málaga, Spain
- Xavier Montalban
  - Hospital Universitari Vall d’Hebron-CEMCAT, Barcelona, Spain
- Eduardo Agüera
  - Hospital Universitario Reina Sofía, Barcelona, Spain
- Yolanda Aladro
  - Hospital Universitario de Getafe, Getafe, Madrid, Spain
- Ana Alonso
  - Hospital Regional Universitario de Málaga, Málaga, Spain
- Rafael Arroyo
  - Hospital Universitario Quirónsalud, Barcelona, Spain
- Luis Brieva
  - Hospital Universitari Arnau de Vilanova-Universitat de Lleida, Lleida, Spain
- Carmen Calles
  - Hospital Universitario Son Espases, Palma de Mallorca, Spain
- Lucienne Costa-Frossard
  - Hospital Universitario Ramón y Cajal, Barcelona, Spain
- Sara Eichau
  - Hospital Universitario Virgen Macarena, Sevilla, Spain
- José M. García-Domínguez
  - Hospital Universitario Gregorio Marañón, Barcelona, Spain
- Miguel Á. Hernández
  - Hospital Nuestra Señora de Candelaria, Santa Cruz de Tenerife, Spain
- Lamberto Landete
  - Hospital Universitario Doctor Peset, Valencia, Spain
- Miguel Llaneza
  - Complejo Hospitalario Universitario de Ferrol, El Ferrol, La Coruña, Spain
- Sara Llufriu
  - Hospital Clínic de Barcelona e IDIBAPS, Barcelona, Spain
- José E. Meca-Lallana
  - Hospital Regional Universitario de Málaga, Málaga, Spain
- Virginia Meca-Lallana
  - Hospital Clínico Universitario Virgen de la Arrixaca, Murcia, Spain
- Neus Mongay-Ochoa
  - Hospital Universitari Vall d’Hebron-CEMCAT, Barcelona, Spain
- Ester Moral
  - Hospital Sant Joan Despí Moisès Broggi, Sant Joan Despí, Barcelona, Spain
- Celia Oreja-Guevara
  - Hospital Clínico San Carlos-IdISSC-UCM, Madrid, Spain
- Lluís Ramió-Torrentà
  - Departamento de Cièncias Médicas, Universitat de Girona, Girona, Spain
- Nieves Téllez
  - Hospital Clínico Universitario de Valladolid, Valladolid, Spain
- Lucía Romero-Pinel
  - Hospital Universitari de Bellvitge-IDIBELL, L’Hospitalet de Llobregat, Barcelona, Spain

14
Hou Y, Yeung J, Xu H, Su C, Wang F, Zhang R. From Answers to Insights: Unveiling the Strengths and Limitations of ChatGPT and Biomedical Knowledge Graphs. medRxiv 2023:2023.06.09.23291208. [PMID: 37398259 PMCID: PMC10312889 DOI: 10.1101/2023.06.09.23291208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Large Language Models (LLMs) have demonstrated exceptional performance in various natural language processing tasks, utilizing their language generation capabilities and knowledge acquisition potential from unstructured text. However, when applied to the biomedical domain, LLMs encounter limitations, resulting in erroneous and inconsistent answers. Knowledge Graphs (KGs) have emerged as valuable resources for structured information representation and organization. Specifically, Biomedical Knowledge Graphs (BKGs) have attracted significant interest in managing large-scale and heterogeneous biomedical knowledge. This study evaluates the capabilities of ChatGPT and existing BKGs in question answering, knowledge discovery, and reasoning. Results indicate that while ChatGPT with GPT-4.0 surpasses both GPT-3.5 and BKGs in providing existing information, BKGs demonstrate superior information reliability. Additionally, ChatGPT exhibits limitations in performing novel discoveries and reasoning, particularly in establishing structured links between entities compared to BKGs. To overcome these limitations, future research should focus on integrating LLMs and BKGs to leverage their respective strengths. Such an integrated approach would optimize task performance and mitigate potential risks, thereby advancing knowledge in the biomedical field and contributing to overall well-being.
Affiliation(s)
- Yu Hou
  - Department of Surgery, University of Minnesota, Minneapolis, MN, USA
- Jeremy Yeung
  - Department of Surgery, University of Minnesota, Minneapolis, MN, USA
- Hua Xu
  - Section of Biomedical Informatics and Data Science, Yale University, New Haven, Connecticut, USA
- Chang Su
  - Department of Health Service Administration and Policy, Temple University, Philadelphia, PA, USA
- Fei Wang
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Rui Zhang
  - Department of Surgery, University of Minnesota, Minneapolis, MN, USA

15
Hänsel K, Dudgeon SN, Cheung KH, Durant TJS, Schulz WL. From Data to Wisdom: Biomedical Knowledge Graphs for Real-World Data Insights. J Med Syst 2023; 47:65. [PMID: 37195430 PMCID: PMC10191934 DOI: 10.1007/s10916-023-01951-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 04/15/2023] [Indexed: 05/18/2023]
Abstract
Graph data models are an emerging approach to structure clinical and biomedical information. These models offer intriguing opportunities for novel approaches in healthcare, such as disease phenotyping, risk prediction, and personalized precision care. The combination of data and information in a graph model to create knowledge graphs has rapidly expanded in biomedical research, but the integration of real-world data from the electronic health record has been limited. To broadly apply knowledge graphs to EHR and other real-world data, a deeper understanding of how to represent these data in a standardized graph model is needed. We provide an overview of the state-of-the-art research for clinical and biomedical data integration and summarize the potential to accelerate healthcare and precision medicine research through insight generation from integrated knowledge graphs.
Affiliation(s)
- Katrin Hänsel
  - Department of Laboratory Medicine, Yale School of Medicine, New Haven, CT, USA
- Sarah N Dudgeon
  - Department of Laboratory Medicine, Yale School of Medicine, New Haven, CT, USA
- Kei-Hoi Cheung
  - Section of Biomedical Informatics, Department of Emergency Medicine, Yale School of Medicine, 55 Park Street, PS 210, New Haven, CT, 06510, USA
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Thomas J S Durant
  - Department of Laboratory Medicine, Yale School of Medicine, New Haven, CT, USA
- Wade L Schulz
  - Department of Laboratory Medicine, Yale School of Medicine, New Haven, CT, USA.
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.

16
Soman K, Nelson CA, Cerono G, Goldman SM, Baranzini SE, Brown EG. Early detection of Parkinson's disease through enriching the electronic health record using a biomedical knowledge graph. Front Med (Lausanne) 2023; 10:1081087. [PMID: 37250641 PMCID: PMC10217780 DOI: 10.3389/fmed.2023.1081087] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/26/2022] [Accepted: 04/18/2023] [Indexed: 05/31/2023]
Abstract
Introduction: Early diagnosis of Parkinson's disease (PD) is important to identify treatments to slow neurodegeneration. People who develop PD often have symptoms before the disease manifests and may be coded as diagnoses in the electronic health record (EHR). Methods: To predict PD diagnosis, we embedded EHR data of patients onto a biomedical knowledge graph called Scalable Precision medicine Open Knowledge Engine (SPOKE) and created patient embedding vectors. We trained and validated a classifier using these vectors from 3,004 PD patients, restricting records to 1, 3, and 5 years before diagnosis, and 457,197 non-PD group. Results: The classifier predicted PD diagnosis with moderate accuracy (AUC = 0.77 ± 0.06, 0.74 ± 0.05, 0.72 ± 0.05 at 1, 3, and 5 years) and performed better than other benchmark methods. Nodes in the SPOKE graph, among cases, revealed novel associations, while SPOKE patient vectors revealed the basis for individual risk classification. Discussion: The proposed method was able to explain the clinical predictions using the knowledge graph, thereby making the predictions clinically interpretable. Through enriching EHR data with biomedical associations, SPOKE may be a cost-efficient and personalized way to predict PD diagnosis years before its occurrence.
Affiliation(s)
- Karthik Soman
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States
- Charlotte A. Nelson
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States
- Gabriel Cerono
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States
- Samuel M. Goldman
  - Division of Occupational and Environmental Medicine, University of California, San Francisco, San Francisco, CA, United States
- Sergio E. Baranzini
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States
- Ethan G. Brown
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, United States

17
Su C, Hou Y, Zhou M, Rajendran S, Maasch JRA, Abedi Z, Zhang H, Bai Z, Cuturrufo A, Guo W, Chaudhry FF, Ghahramani G, Tang J, Cheng F, Li Y, Zhang R, DeKosky ST, Bian J, Wang F. Biomedical discovery through the integrative biomedical knowledge hub (iBKH). iScience 2023; 26:106460. [PMID: 37020958 PMCID: PMC10068563 DOI: 10.1016/j.isci.2023.106460] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 06/16/2022] [Revised: 09/20/2022] [Accepted: 03/16/2023] [Indexed: 04/01/2023]
Abstract
The abundance of biomedical knowledge gained from biological experiments and clinical practices is an invaluable resource for biomedicine. The emerging biomedical knowledge graphs (BKGs) provide an efficient and effective way to manage the abundant knowledge in biomedical and life science. In this study, we created a comprehensive BKG called the integrative Biomedical Knowledge Hub (iBKH) by harmonizing and integrating information from diverse biomedical resources. To make iBKH easily accessible for biomedical research, we developed a web-based, user-friendly graphical portal that allows fast and interactive knowledge retrieval. Additionally, we also implemented an efficient and scalable graph learning pipeline for discovering novel biomedical knowledge in iBKH. As a proof of concept, we performed our iBKH-based method for computational in-silico drug repurposing for Alzheimer's disease. The iBKH is publicly available.
Affiliation(s)
- Chang Su
  - Department of Health Service Administration and Policy, College of Public Health, Temple University, Philadelphia, PA 19122, USA
- Yu Hou
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA
  - Department of Surgery, University of Minnesota, Minneapolis, MN 55455, USA
- Manqi Zhou
  - Department of Computational Biology, Cornell University, Ithaca, NY 14850, USA
- Suraj Rajendran
  - Tri-Institutional Computational Biology & Medicine Program, Cornell University, New York, NY 10065, USA
- Zehra Abedi
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA
- Haotan Zhang
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA
- Zilong Bai
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA
- Winston Guo
  - Department of Medicine, Weill Cornell Medicine, New York, NY 10021, USA
- Fayzan F. Chaudhry
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA
- Gregory Ghahramani
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10065, USA
- Jian Tang
  - Mila-Quebec AI Institute and HEC Montreal, Montreal, QC H2S 3H1, Canada
- Feixiong Cheng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Yue Li
  - School of Computer Science, McGill University, Montreal, QC H3A 0C6, Canada
- Rui Zhang
  - Department of Surgery, University of Minnesota, Minneapolis, MN 55455, USA
- Steven T. DeKosky
  - Department of Neurology, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- Jiang Bian
  - Department of Health Outcomes & Biomedical Informatics, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- Fei Wang
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY 10065, USA

18
Sanders LM, Scott RT, Yang JH, Qutub AA, Garcia Martin H, Berrios DC, Hastings JJA, Rask J, Mackintosh G, Hoarfrost AL, Chalk S, Kalantari J, Khezeli K, Antonsen EL, Babdor J, Barker R, Baranzini SE, Beheshti A, Delgado-Aparicio GM, Glicksberg BS, Greene CS, Haendel M, Hamid AA, Heller P, Jamieson D, Jarvis KJ, Komarova SV, Komorowski M, Kothiyal P, Mahabal A, Manor U, Mason CE, Matar M, Mias GI, Miller J, Myers JG, Nelson C, Oribello J, Park SM, Parsons-Wingerter P, Prabhu RK, Reynolds RJ, Saravia-Butler A, Saria S, Sawyer A, Singh NK, Snyder M, Soboczenski F, Soman K, Theriot CA, Van Valen D, Venkateswaran K, Warren L, Worthey L, Zitnik M, Costes SV. Biological research and self-driving labs in deep space supported by artificial intelligence. Nat Mach Intell 2023. [DOI: 10.1038/s42256-023-00618-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2023]

19
Glen AK, Ma C, Mendoza L, Womack F, Wood EC, Sinha M, Acevedo L, Kvarfordt LG, Peene RC, Liu S, Hoffman AS, Roach JC, Deutsch EW, Ramsey SA, Koslicki D. ARAX: a graph-based modular reasoning tool for translational biomedicine. Bioinformatics 2023; 39:btad082. [PMID: 36752514 PMCID: PMC10027432 DOI: 10.1093/bioinformatics/btad082] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/15/2022] [Revised: 12/17/2022] [Accepted: 02/07/2023] [Indexed: 04/12/2023]
Abstract
MOTIVATION: With the rapidly growing volume of knowledge and data in biomedical databases, improved methods for knowledge-graph-based computational reasoning are needed in order to answer translational questions. Previous efforts to solve such challenging computational reasoning problems have contributed tools and approaches, but progress has been hindered by the lack of an expressive analysis workflow language for translational reasoning and by the lack of a reasoning engine (supporting that language) that federates semantically integrated knowledge-bases. RESULTS: We introduce ARAX, a new reasoning system for translational biomedicine that provides a web browser user interface and an application programming interface (API). ARAX enables users to encode translational biomedical questions and to integrate knowledge across sources to answer the user's query and facilitate exploration of results. For ARAX, we developed new approaches to query planning, knowledge-gathering, reasoning and result ranking and dynamically integrate knowledge providers for answering biomedical questions. To illustrate ARAX's application and utility in specific disease contexts, we present several use-case examples. AVAILABILITY AND IMPLEMENTATION: The source code and technical documentation for building the ARAX server-side software and its built-in knowledge database are freely available online (https://github.com/RTXteam/RTX). We provide a hosted ARAX service with a web browser interface at arax.rtx.ai and a web API endpoint at arax.rtx.ai/api/arax/v1.3/ui/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Amy K Glen
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Chunyu Ma
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
| | - Luis Mendoza
- Institute for Systems Biology, Seattle, WA 98109, USA
| | - Finn Womack
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
| | - E C Wood
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Meghamala Sinha
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Liliana Acevedo
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Lindsey G Kvarfordt
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Ross C Peene
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Shaopeng Liu
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
| | - Andrew S Hoffman
- Interdisciplinary Hub for Digitalization and Society, Radboud University, Nijmegen 6500GL, The Netherlands
| | - Jared C Roach
- Institute for Systems Biology, Seattle, WA 98109, USA
| | | | - Stephen A Ramsey
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
- Department of Biomedical Sciences, Oregon State University, Corvallis, OR 97331, USA
| | - David Koslicki
- Huck Institutes of the Life Sciences, Pennsylvania State University, State College, PA 16802, USA
- Department of Biology, Pennsylvania State University, State College, PA 16801, USA
- Department of Computer Science and Engineering, Pennsylvania State University, State College, PA 16802, USA
| |
|
20
|
Morris JH, Soman K, Akbas RE, Zhou X, Smith B, Meng EC, Huang CC, Cerono G, Schenk G, Rizk-Jackson A, Harroud A, Sanders L, Costes SV, Bharat K, Chakraborty A, Pico AR, Mardirossian T, Keiser M, Tang A, Hardi J, Shi Y, Musen M, Israni S, Huang S, Rose PW, Nelson CA, Baranzini SE. The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information. Bioinformatics 2023; 39:btad080. [PMID: 36759942 PMCID: PMC9940622 DOI: 10.1093/bioinformatics/btad080] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 07/26/2022] [Revised: 01/17/2023] [Accepted: 02/08/2023] [Indexed: 02/11/2023]
Abstract
MOTIVATION Knowledge graphs (KGs) are being adopted in industry, commerce and academia. Biomedical KGs present a challenge due to the complexity, size and heterogeneity of the underlying information. RESULTS In this work, we present the Scalable Precision Medicine Open Knowledge Engine (SPOKE), a biomedical KG connecting millions of concepts via semantically meaningful relationships. SPOKE contains 27 million nodes of 21 different types and 53 million edges of 55 types downloaded from 41 databases. The graph is built on the framework of 11 ontologies that maintain its structure, enable mappings and facilitate navigation. SPOKE is built weekly by Python scripts which download each resource, check for integrity and completeness, and then create a 'parent table' of nodes and edges. Graph queries are translated by a REST API and users can submit searches directly via an API or a graphical user interface. Conclusions/Significance: SPOKE enables the integration of seemingly disparate information to support precision medicine efforts. AVAILABILITY AND IMPLEMENTATION The SPOKE neighborhood explorer is available at https://spoke.rbvi.ucsf.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- John H Morris
- Department of Pharmaceutical Chemistry, School of Pharmacy, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Karthik Soman
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Rabia E Akbas
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Xiaoyuan Zhou
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Brett Smith
- Institute for Systems Biology, Seattle, WA 98109, USA
| | - Elaine C Meng
- Department of Pharmaceutical Chemistry, School of Pharmacy, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Conrad C Huang
- Department of Pharmaceutical Chemistry, School of Pharmacy, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Gabriel Cerono
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Gundolf Schenk
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Angela Rizk-Jackson
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Adil Harroud
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Lauren Sanders
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
| | - Sylvain V Costes
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
| | - Krish Bharat
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Arjun Chakraborty
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Alexander R Pico
- Data Science and Biotechnology, Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Taline Mardirossian
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143-2550, USA
| | - Michael Keiser
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143-2550, USA
| | - Alice Tang
- UCSF-UC Berkeley Bioengineering Graduate Program, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Josef Hardi
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305-5479, USA
| | - Yongmei Shi
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Mark Musen
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305-5479, USA
| | - Sharat Israni
- Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Sui Huang
- Institute for Systems Biology, Seattle, WA 98109, USA
| | - Peter W Rose
- San Diego Supercomputer Center, University of California, San Diego, La Jolla, CA 92093, USA
| | - Charlotte A Nelson
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Sergio E Baranzini
- Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| |
|
21
|
Santangelo BE, Gillenwater LA, Salem NM, Hunter LE. Molecular cartooning with knowledge graphs. Front Bioinform 2022; 2:1054578. [PMID: 36568701 PMCID: PMC9772836 DOI: 10.3389/fbinf.2022.1054578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2022] [Accepted: 11/23/2022] [Indexed: 12/13/2022]
Abstract
Molecular "cartoons," such as pathway diagrams, provide a visual summary of biomedical research results and hypotheses. Their ubiquitous appearance within the literature indicates their universal application in mechanistic communication. A recent survey of pathway diagrams identified 64,643 pathway figures published between 1995 and 2019 with 1,112,551 mentions of 13,464 unique human genes participating in a wide variety of biological processes. Researchers generally create these diagrams using generic diagram editing software that does not itself embody any biomedical knowledge. Biomedical knowledge graphs (KGs) integrate and represent knowledge in a semantically consistent way, systematically capturing biomedical knowledge similar to that in molecular cartoons. KGs have the potential to provide context and precise details useful in drawing such figures. However, KGs cannot generally be translated directly into figures. They include substantial material irrelevant to the scientific point of a given figure and are often more detailed than is appropriate. How could KGs be used to facilitate the creation of molecular diagrams? Here we present a new approach towards cartoon image creation that utilizes the semantic structure of knowledge graphs to aid the production of molecular diagrams. We introduce a set of "semantic graphical actions" that select and transform the relational information between heterogeneous entities (e.g., genes, proteins, pathways, diseases) in a KG to produce diagram schematics that meet the scientific communication needs of the user. These semantic actions search, select, filter, transform, group, arrange, connect and extract relevant subgraphs from KGs based on meaning in biological terms, e.g., a protein upstream of a target in a pathway. 
To demonstrate the utility of this approach, we show how semantic graphical actions on KGs could have been used to produce three existing pathway diagrams in diverse biomedical domains: Down Syndrome, COVID-19, and neuroinflammation. Our focus is on recapitulating the semantic content of the figures, not the layout, glyphs, or other aesthetic aspects. Our results suggest that the use of KGs and semantic graphical actions to produce biomedical diagrams will reduce the effort required and improve the quality of this visual form of scientific communication.
|
22
|
Li MM, Huang K, Zitnik M. Graph representation learning in biomedicine and healthcare. Nat Biomed Eng 2022; 6:1353-1369. [PMID: 36316368 PMCID: PMC10699434 DOI: 10.1038/s41551-022-00942-x] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Received: 10/30/2021] [Accepted: 08/09/2022] [Indexed: 11/11/2022]
Abstract
Networks, or graphs, are universal descriptors of systems of interacting elements. In biomedicine and healthcare, they can represent, for example, molecular interactions, signalling pathways, disease co-morbidities or healthcare systems. In this Perspective, we posit that representation learning can realize principles of network medicine, discuss successes and current limitations of the use of representation learning on graphs in biomedicine and healthcare, and outline algorithmic strategies that leverage the topology of graphs to embed them into compact vectorial spaces. We argue that graph representation learning will keep pushing forward machine learning for biomedicine and healthcare applications, including the identification of genetic variants underlying complex traits, the disentanglement of single-cell behaviours and their effects on health, the assistance of patients in diagnosis and treatment, and the development of safe and effective medicines.
Affiliation(s)
- Michelle M Li
- Bioinformatics and Integrative Genomics Program, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Kexin Huang
- Health Data Science Program, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Marinka Zitnik
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Harvard Data Science Initiative, Cambridge, MA, USA
| |
|
23
|
Bonner S, Barrett IP, Ye C, Swiers R, Engkvist O, Bender A, Hoyt CT, Hamilton WL. A review of biomedical datasets relating to drug discovery: a knowledge graph perspective. Brief Bioinform 2022; 23:6712301. [PMID: 36151740 DOI: 10.1093/bib/bbac404] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 05/12/2022] [Revised: 07/14/2022] [Accepted: 08/20/2022] [Indexed: 12/14/2022]
Abstract
Drug discovery and development is a complex and costly process. Machine learning approaches are being investigated to help improve the effectiveness and speed of multiple stages of the drug discovery pipeline. Of these, those that use Knowledge Graphs (KG) have promise in many tasks, including drug repurposing, drug toxicity prediction and target gene-disease prioritization. In a drug discovery KG, crucial elements including genes, diseases and drugs are represented as entities, while relationships between them indicate an interaction. However, to construct high-quality KGs, suitable data are required. In this review, we detail publicly available sources suitable for use in constructing drug discovery focused KGs. We aim to help guide machine learning and KG practitioners who are interested in applying new techniques to the drug discovery field, but who may be unfamiliar with the relevant data sources. The datasets are selected via strict criteria, categorized according to the primary type of information contained within and are considered based upon what information could be extracted to build a KG. We then present a comparative analysis of existing public drug discovery KGs and an evaluation of selected motivating case studies from the literature. Additionally, we raise numerous and unique challenges and issues associated with the domain and its datasets, while also highlighting key future research directions. We hope this review will motivate the use of KGs in solving key and emerging questions in the drug discovery domain.
Affiliation(s)
- Stephen Bonner
- Data Sciences and Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Ian P Barrett
- Data Sciences and Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Cheng Ye
- Data Sciences and Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Rowan Swiers
- Data Sciences and Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, UK
| | | | - William L Hamilton
- School of Computer Science, McGill University, Canada; Mila-Quebec AI Institute, Montreal, Canada
| |
|
24
|
Davidson J, Vashisht R, Butte AJ. From Genes to Geography, from Cells to Community, from Biomolecules to Behaviors: The Importance of Social Determinants of Health. Biomolecules 2022; 12:biom12101449. [PMID: 36291658 PMCID: PMC9599320 DOI: 10.3390/biom12101449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Revised: 10/04/2022] [Accepted: 10/06/2022] [Indexed: 12/05/2022]
Abstract
Much scientific work over the past few decades has linked health outcomes and disease risk to genomics, to derive a better understanding of disease mechanisms at the genetic and molecular level. However, genomics alone does not quite capture the full picture of one’s overall health. Modern computational biomedical research is moving in the direction of including social/environmental factors that ultimately affect quality of life and health outcomes at both the population and individual level. The future of studying disease now lies in the hands of the social determinants of health (SDOH) to answer pressing clinical questions and address healthcare disparities across population groups through their integration into electronic health records (EHRs). In this perspective article, we argue that the SDOH are the future of disease risk and health outcomes studies due to their vast coverage of a patient’s overall health. SDOH data availability in EHRs has improved tremendously over the years with EHR toolkits, diagnosis codes, wearable devices, and census tract information to study disease risk. We discuss the availability of SDOH data, challenges in SDOH implementation, its future in real-world evidence studies, and the next steps to report study outcomes in an equitable and actionable way.
Affiliation(s)
- Jaysón Davidson
- Pharmaceutical Science and Pharmacogenomics Graduate Program, University of California San Francisco, San Francisco, CA 94143, USA
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA 94143, USA
- Correspondence: jayso’
| | - Rohit Vashisht
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA 94143, USA
| | - Atul J. Butte
- Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA 94143, USA
| |
|
25
|
Luo ZH, Zhu LD, Wang YM, Hu Qian S, Li M, Zhang W, Chen ZX. DSEATM: drug set enrichment analysis uncovering disease mechanisms by biomedical text mining. Brief Bioinform 2022; 23:6605028. [PMID: 35679594 DOI: 10.1093/bib/bbac228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 05/09/2022] [Accepted: 05/17/2022] [Indexed: 11/13/2022]
Abstract
Disease pathogenesis is always a major topic in biomedical research. With the exponential growth of biomedical information, drug effect analysis for specific phenotypes has shown great promise in uncovering disease-associated pathways. However, this method has only been applied to a limited number of drugs. Here, we extracted the data of 4634 diseases, 3671 drugs, 112 809 disease-drug associations and 81 527 drug-gene associations by text mining of 29 168 919 publications. On this basis, we proposed a 'Drug Set Enrichment Analysis by Text Mining (DSEATM)' pipeline and applied it to 3250 diseases, which outperformed the state-of-the-art method. Furthermore, disease pathways enriched by DSEATM were similar to those obtained using the TCGA cancer RNA-seq differentially expressed genes. In addition, the drug number, which showed a remarkable positive correlation of 0.73 with the AUC, plays a determining role in the performance of DSEATM. Taken together, DSEATM is an auspicious and accurate disease research tool that offers fresh insights.
Affiliation(s)
- Zhi-Hui Luo
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
| | - Li-Da Zhu
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
| | - Ya-Min Wang
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
| | - Sheng Hu Qian
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
| | - Menglu Li
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
| | - Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
| | - Zhen-Xia Chen
- Hubei Hongshan Laboratory, College of Biomedicine and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Hubei Key Laboratory of Agricultural Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Interdisciplinary Sciences Institute, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Institute of Nutrition and Health, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China; Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Huazhong Agricultural University, Wuhan, 430070, Hubei, PR China
| |
|
26
|
Rintala TJ, Ghosh A, Fortino V. Network approaches for modeling the effect of drugs and diseases. Brief Bioinform 2022; 23:6608969. [PMID: 35704883 PMCID: PMC9294412 DOI: 10.1093/bib/bbac229] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/22/2022] [Revised: 04/29/2022] [Accepted: 05/17/2021] [Indexed: 12/12/2022]
Abstract
The network approach is quickly becoming a fundamental building block of computational methods aiming at elucidating the mechanism of action (MoA) and therapeutic effect of drugs. By modeling the effect of drugs and diseases on different biological networks, it is possible to better explain the interplay between disease perturbations and drug targets as well as how drug compounds induce favorable biological responses and/or adverse effects. Omics technologies have been extensively used to generate the data needed to study the mechanisms of action of drugs and diseases. These data are often exploited to define condition-specific networks and to study whether drugs can reverse disease perturbations. In this review, we describe network data mining algorithms that are commonly used to study drugs' MoA and to improve our understanding of the basis of chronic diseases. These methods can support fundamental stages of the drug development process, including the identification of putative drug targets and the in silico screening of drug compounds and drug combinations for the treatment of diseases. We also discuss recent studies using biological and omics-driven networks to search for possible repurposed FDA-approved drug treatments for SARS-CoV-2 infections (COVID-19).
Affiliation(s)
- T J Rintala
- Institute of Biomedicine, University of Eastern Finland, 70210 Kuopio, Finland
| | - Arindam Ghosh
- Institute of Biomedicine, University of Eastern Finland, 70210 Kuopio, Finland
| | - V Fortino
- Institute of Biomedicine, University of Eastern Finland, 70210 Kuopio, Finland
| |
|
27
|
Ricciuto A, Rauter I, McGovern DPB, Mader RM, Reinisch W. Precision Medicine in Inflammatory Bowel Diseases: Challenges and Considerations for the Path Forward. Gastroenterology 2022; 162:1815-1821. [PMID: 35278416 DOI: 10.1053/j.gastro.2022.02.049] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/10/2021] [Revised: 02/01/2022] [Accepted: 02/18/2022] [Indexed: 12/29/2022]
Affiliation(s)
- Amanda Ricciuto
- Division of Gastroenterology, Hepatology and Nutrition, The Hospital for Sick Children, Toronto, Ontario, Canada; Department of Paediatrics, University of Toronto, Toronto, Ontario, Canada
| | | | - Dermot P B McGovern
- F. Widjaja Foundation Inflammatory Bowel and Immunobiology Research Institute, Cedars-Sinai Medical Center, Los Angeles, California
| | - Robert M Mader
- Division of Oncology, Department of Medicine I, Medical University of Vienna, Vienna, Austria
| | | |
|
28
|
Baranzini SE, Börner K, Morris J, Nelson CA, Soman K, Schleimer E, Keiser M, Musen M, Pearce R, Reza T, Smith B, Herr BW, Oskotsky B, Rizk‐Jackson A, Rankin KP, Sanders SJ, Bove R, Rose PW, Israni S, Huang S. A biomedical open knowledge network harnesses the power of AI to understand deep human biology. AI Mag 2022; 43:46-58. [PMID: 36093122 PMCID: PMC9456356 DOI: 10.1002/aaai.12037] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/15/2022]
Abstract
Knowledge representation and reasoning (KR&R) has been successfully implemented in many fields to enable computers to solve complex problems with AI methods. However, its application to biomedicine has been lagging in part due to the daunting complexity of molecular and cellular pathways that govern human physiology and pathology. In this article we describe concrete uses of SPOKE, an open knowledge network that connects curated information from 37 specialized and human-curated databases into a single property graph, with 3 million nodes and 15 million edges to date. Applications discussed in this article include drug discovery, COVID-19 research and chronic disease diagnosis and management.
Affiliation(s)
- Sergio E. Baranzini
- Weill Institute for Neurosciences Department of Neurology University of California San Francisco San Francisco California USA
- Bakar Institute for Computational Health Sciences University of California San Francisco San Francisco California USA
| | - Katy Börner
- Department of Intelligent Systems Engineering Indiana University Bloomington Indiana USA
| | - John Morris
- Department of Pharmaceutical Chemistry University of California San Francisco San Francisco California USA
| | - Charlotte A. Nelson
- Weill Institute for Neurosciences Department of Neurology University of California San Francisco San Francisco California USA
| | - Karthik Soman
- Weill Institute for Neurosciences Department of Neurology University of California San Francisco San Francisco California USA
| | - Erica Schleimer
- Weill Institute for Neurosciences Department of Neurology University of California San Francisco San Francisco California USA
| | - Michael Keiser
- Department of Pharmaceutical Chemistry University of California San Francisco San Francisco California USA
- Institute for Neurodegenerative Diseases University of California San Francisco San Francisco California USA
| | - Mark Musen
- Department of Medicine (Biomedical Informatics) and of Biomedical Data Science Stanford University School of Medicine Stanford California USA
| | - Roger Pearce
- Center for Applied Scientific Computing (CASC) Lawrence Livermore National Laboratory Livermore California USA
| | - Tahsin Reza
- Center for Applied Scientific Computing (CASC) Lawrence Livermore National Laboratory Livermore California USA
| | - Brett Smith
- Institute for Systems Biology Seattle Washington USA
| | - Bruce W. Herr
- Department of Intelligent Systems Engineering Indiana University Bloomington Indiana USA
| | - Boris Oskotsky
- Bakar Institute for Computational Health Sciences University of California San Francisco San Francisco California USA
| | - Angela Rizk‐Jackson
- Bakar Institute for Computational Health Sciences University of California San Francisco San Francisco California USA
| | - Katherine P. Rankin
- Weill Institute for Neurosciences Department of Neurology University of California San Francisco San Francisco California USA
- Bakar Institute for Computational Health Sciences University of California San Francisco San Francisco California USA
| | - Stephan J. Sanders
- Bakar Institute for Computational Health Sciences University of California San Francisco San Francisco California USA
- Weill Institute for Neurosciences Department of Psychiatry and Behavioral Sciences University of California San Francisco San Francisco California USA
| | - Riley Bove
- Weill Institute for Neurosciences Department of Neurology University of California San Francisco San Francisco California USA
- Bakar Institute for Computational Health Sciences University of California San Francisco San Francisco California USA
| | - Peter W. Rose
- San Diego Supercomputer Center University of California San Diego La Jolla California USA
| | - Sharat Israni
- Bakar Institute for Computational Health Sciences University of California San Francisco San Francisco California USA
| | - Sui Huang
- Institute for Systems Biology Seattle Washington USA
| |
|
29
|
Chen WC, Boreta L, Braunstein SE, Rabow MW, Kaplan LE, Tenenbaum JD, Morin O, Park CC, Hong JC. Association of mental health diagnosis with race and all-cause mortality after a cancer diagnosis: Large-scale analysis of electronic health record data. Cancer 2022; 128:344-352. [PMID: 34550601 PMCID: PMC8738115 DOI: 10.1002/cncr.33903] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/18/2021] [Revised: 05/19/2021] [Accepted: 05/20/2021] [Indexed: 01/17/2023]
Abstract
BACKGROUND Disparity in mental health care among cancer patients remains understudied. METHODS A large, retrospective, single tertiary-care institution cohort study was conducted based on deidentified electronic health record data of 54,852 adult cancer patients without prior mental health diagnosis (MHD) diagnosed at the University of California, San Francisco between January 2012 and September 2019. The exposure of interest was early-onset MHD with or without psychotropic medication (PM) within 12 months of cancer diagnosis and primary outcome was all-cause mortality. RESULTS There were 8.2% of patients who received a new MHD at a median of 197 days (interquartile range, 61-553) after incident cancer diagnosis; 31.0% received a PM prescription; and 3.7% a mental health-related visit (MHRV). There were 62.6% of patients who were non-Hispanic White (NHW), 10.8% were Asian, 9.8% were Hispanic, and 3.8% were Black. Compared with NHWs, minority cancer patients had reduced adjusted odds of MHDs, PM prescriptions, and MHRVs, particularly for generalized anxiety (Asian odds ratio [OR], 0.66, 95% CI, 0.55-0.78; Black OR, 0.60, 95% CI, 0.45-0.79; Hispanic OR, 0.72, 95% CI, 0.61-0.85) and selective serotonin-reuptake inhibitors (Asian OR, 0.43, 95% CI, 0.37-0.50; Black OR, 0.51, 95% CI, 0.40-0.61; Hispanic OR, 0.79, 95% CI, 0.70-0.89). New early MHD with PM was associated with elevated all-cause mortality (12-24 months: hazard ratio [HR], 1.43, 95% CI, 1.25-1.64) that waned by 24 to 36 months (HR, 1.18, 95% CI, 0.95-1.45). CONCLUSIONS New mental health diagnosis with PM was a marker of early mortality among cancer patients. Minority cancer patients were less likely to receive documentation of MHDs or treatment, which may represent missed opportunities to identify and treat cancer-related mental health conditions.
Affiliation(s)
- William C Chen
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, CA
- Lauren Boreta
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, CA
- Steve E Braunstein
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, CA
- Michael W Rabow
  - Department of Internal Medicine, Division of Palliative Medicine, and Department of Urology, University of California San Francisco, California
- Lawrence E Kaplan
  - Department of Psychiatry, University of California San Francisco, California
- Olivier Morin
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, CA
- Catherine C Park
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, CA
- Julian C Hong
  - Department of Radiation Oncology, University of California, San Francisco, San Francisco, CA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA
|
30
|
Yin Z, Wong STC. Artificial intelligence unifies knowledge and actions in drug repositioning. Emerg Top Life Sci 2021; 5:803-813. [PMID: 34881780 PMCID: PMC8923082 DOI: 10.1042/etls20210223] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/11/2021] [Revised: 11/08/2021] [Accepted: 11/09/2021] [Indexed: 11/17/2022]
Abstract
Drug repositioning aims to reuse existing drugs, shelved drugs, or drug candidates that failed clinical trials for other medical indications. Its appeal stems from the reduction in both the risk associated with safety testing of new medications and the time needed to get a known drug into the clinic. Artificial Intelligence (AI) has recently been pursued to speed up drug repositioning and discovery. The essence of AI in drug repositioning is to unify knowledge and actions, i.e. incorporating real-world and experimental data to map out the best way forward to identify effective therapeutics against a disease. In this review, we share positive expectations for the evolution of AI and drug repositioning and summarize the role of AI in several methods of drug repositioning.
Affiliation(s)
- Zheng Yin
  - Department of Systems Medicine and Bioengineering, Houston Methodist Cancer Center and Ting Tsung & Wei Fong Chao Center for BRAIN, Houston Methodist Research Institute, Weill Cornell Medicine, Houston, TX 77030, U.S.A
- Stephen T C Wong
  - Department of Systems Medicine and Bioengineering, Houston Methodist Cancer Center and Ting Tsung & Wei Fong Chao Center for BRAIN, Houston Methodist Research Institute, Weill Cornell Medicine, Houston, TX 77030, U.S.A
|
31
|
Nelson CA, Bove R, Butte AJ, Baranzini SE. Embedding electronic health records onto a knowledge network recognizes prodromal features of multiple sclerosis and predicts diagnosis. J Am Med Inform Assoc 2021; 29:424-434. [PMID: 34915552 PMCID: PMC8800523 DOI: 10.1093/jamia/ocab270] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/15/2021] [Revised: 10/22/2021] [Accepted: 11/26/2021] [Indexed: 11/28/2022] Open
Abstract
OBJECTIVE Early identification of chronic diseases is a pillar of precision medicine as it can lead to improved outcomes, reduction of disease burden, and lower healthcare costs. Predictions of a patient's health trajectory have been improved through the application of machine learning approaches to electronic health records (EHRs). However, these methods have traditionally relied on "black box" algorithms that can process large amounts of data but are unable to incorporate domain knowledge, thus limiting their predictive and explanatory power. Here, we present a method for incorporating domain knowledge into clinical classifications by embedding individual patient data into a biomedical knowledge graph. MATERIALS AND METHODS A modified version of the PageRank algorithm was implemented to embed millions of deidentified EHRs into a biomedical knowledge graph (SPOKE). This resulted in high-dimensional, knowledge-guided patient health signatures (ie, SPOKEsigs) that were subsequently used as features in a random forest environment to classify patients at risk of developing a chronic disease. RESULTS Our model predicted disease status of 5752 subjects 3 years before being diagnosed with multiple sclerosis (MS) (AUC = 0.83). SPOKEsigs outperformed predictions using EHRs alone, and the biological drivers of the classifiers provided insight into the underpinnings of prodromal MS. CONCLUSION Using data from EHR as input, SPOKEsigs describe patients at both the clinical and biological levels. We provide a clinical use case for detecting MS up to 5 years prior to their documented diagnosis in the clinic and illustrate the biological features that distinguish the prodromal MS state.
Affiliation(s)
- Charlotte A Nelson
  - Integrated Program in Quantitative Biology, University of California San Francisco, San Francisco, California, USA
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
- Riley Bove
  - Department of Neurology, UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California, USA
- Atul J Butte
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California, USA
  - Department of Pediatrics, University of California San Francisco, San Francisco, California, USA
- Sergio E Baranzini
  - Corresponding Author: Sergio E. Baranzini, PhD, Department of Neurology, UCSF Weill Institute for Neurosciences, University of California San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94143, USA
|
32
|
Hügle T, Kalweit M. [Artificial intelligence-supported treatment in rheumatology : Principles, current situation and perspectives]. Z Rheumatol 2021; 80:914-927. [PMID: 34618208 PMCID: PMC8651581 DOI: 10.1007/s00393-021-01096-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/21/2021] [Indexed: 11/02/2022]
Abstract
Computer-guided clinical decision support systems have been finding their way into practice for some time, mostly integrated into electronic medical records. The primary goals are to improve the quality of treatment, save time and avoid errors. These are mostly rule-based algorithms that recognize drug interactions or provide reminder functions. Through artificial intelligence (AI), clinical decision support systems can be disruptively further developed. New knowledge is constantly being created from data through machine learning in order to predict the individual course of a patient's disease, identify phenotypes or support treatment decisions. Such algorithms already exist for rheumatological diseases. Automated image recognition and disease prediction in rheumatoid arthritis are the most advanced; however, these have not yet been sufficiently tested or integrated into existing decision support systems. Rather than dictating the AI-assisted choice of treatment to the doctor, future clinical decision systems are seen as hybrid decision support, always involving both the expert and the patient. There is also a great need for security through comprehensible and auditable algorithms to sustainably guarantee the quality and transparency of AI-assisted treatment recommendations in the long term.
Affiliation(s)
- Thomas Hügle
  - Abteilung Rheumatologie, Universitätsspital Lausanne (CHUV) und Universität Lausanne, Avenue Pierre-Decker 4, 1011 Lausanne, Schweiz
- Maria Kalweit
  - Institut für Informatik, Albert-Ludwigs-Universität Freiburg, Universität Freiburg im Breisgau, Georges-Koehler-Allee 80, 79110 Freiburg im Breisgau, Deutschland
|
33
|
Piekos SN, Gaddam S, Bhardwaj P, Radhakrishnan P, Guha RV, Oro AE. Biomedical Data Commons (BMDC) prioritizes B-lymphocyte non-coding genetic variants in Type 1 Diabetes. PLoS Comput Biol 2021; 17:e1009382. [PMID: 34543288 PMCID: PMC8483327 DOI: 10.1371/journal.pcbi.1009382] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/25/2020] [Revised: 09/30/2021] [Accepted: 08/25/2021] [Indexed: 11/18/2022] Open
Abstract
The repurposing of biomedical data is inhibited by its fragmented and multi-formatted nature that requires redundant investment of time and resources by data scientists. This is particularly true for Type 1 Diabetes (T1D), one of the most intensely studied common childhood diseases. Intense investigation of the contribution of pancreatic β-islet and T-lymphocytes in T1D has been made. However, genetic contributions from B-lymphocytes, which are known to play a role in a subset of T1D patients, remain relatively understudied. We have addressed this issue through the creation of Biomedical Data Commons (BMDC), a knowledge graph that integrates data from multiple sources into a single queryable format. This increases the speed of analysis by multiple orders of magnitude. We develop a pipeline using B-lymphocyte multi-dimensional epigenome and connectome data and deploy BMDC to assess genetic variants in the context of T1D. Pipeline-identified variants are primarily common, non-coding, poorly conserved, and are of unknown clinical significance. While variants and their chromatin connectivity are cell-type specific, they are associated with well-studied disease genes in T-lymphocytes. Candidates include established variants in the HLA-DQB1, HLA-DRB1, and IL2RA loci that have previously been demonstrated to protect against T1D in humans and mice, providing validation for this method. Others are included in the well-established T1D GRS2 genetic risk scoring method. More intriguingly, other prioritized variants are completely novel and form the basis for future mechanistic and clinical validation studies. The BMDC community-based platform can be expanded and repurposed to increase the accessibility, reproducibility, and productivity of biomedical information for diverse applications including the prioritization of cell type-specific disease alleles from complex phenotypes.
Affiliation(s)
- Samantha N. Piekos
  - Program in Epithelial Biology, Stanford University, Stanford, California, United States of America
  - Google Data Commons, Mountain View, California, United States of America
- Sadhana Gaddam
  - Program in Epithelial Biology, Stanford University, Stanford, California, United States of America
- Pranav Bhardwaj
  - Department of Statistics, Stanford University, Stanford, California, United States of America
- Ramanathan V. Guha
  - Google Data Commons, Mountain View, California, United States of America
- Anthony E. Oro
  - Program in Epithelial Biology, Stanford University, Stanford, California, United States of America
|
34
|
Deane KD, Holers VM. Rheumatoid Arthritis Pathogenesis, Prediction, and Prevention: An Emerging Paradigm Shift. Arthritis Rheumatol 2021; 73:181-193. [PMID: 32602263 PMCID: PMC7772259 DOI: 10.1002/art.41417] [Citation(s) in RCA: 119] [Impact Index Per Article: 39.7] [Received: 01/29/2020] [Accepted: 06/08/2020] [Indexed: 12/12/2022]
Abstract
Rheumatoid arthritis (RA) is currently diagnosed and treated when an individual presents with signs and symptoms of inflammatory arthritis (IA) as well as other features, such as autoantibodies and/or imaging findings, that provide sufficient confidence that the individual has RA-like IA (e.g., meeting established classification criteria) that warrants therapeutic intervention. However, it is now known that there is a stage of seropositive RA during which circulating biomarkers and other factors (e.g., joint symptoms) can be used to predict if and when an individual who does not currently have IA may develop future clinically apparent IA and classifiable RA. Indeed, the discovery of the "pre-RA" stage of seropositive disease has led to the development of several clinical trials in which individuals are studied to identify ways to delay or prevent the onset of clinically apparent IA/RA. This review focuses on several issues pertinent to understanding the prevention of RA. These include discussion of the pathogenesis of pre-RA development, prediction of the likelihood and timing of future classifiable RA, and a review of completed and ongoing clinical trials in RA prevention. Furthermore, this review discusses challenges and opportunities to be addressed to effect a paradigm shift in RA, where in the near future, proactive risk assessment focused on prevention of RA will become a public health strategy in much the same manner as cardiovascular disease is managed today.
Affiliation(s)
- Kevin D. Deane
  - Division of Rheumatology, University of Colorado Denver School of Medicine, Anschutz Medical Campus, Aurora, Colorado, USA
- V. Michael Holers
  - Division of Rheumatology, University of Colorado Denver School of Medicine, Anschutz Medical Campus, Aurora, Colorado, USA
|
35
|
Nelson CA, Acuna AU, Paul AM, Scott RT, Butte AJ, Cekanaviciute E, Baranzini SE, Costes SV. Knowledge Network Embedding of Transcriptomic Data from Spaceflown Mice Uncovers Signs and Symptoms Associated with Terrestrial Diseases. Life (Basel) 2021; 11:life11010042. [PMID: 33445483 PMCID: PMC7828077 DOI: 10.3390/life11010042] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/10/2020] [Revised: 01/01/2021] [Accepted: 01/04/2021] [Indexed: 12/17/2022] Open
Abstract
There has long been an interest in understanding how the hazards from spaceflight may trigger or exacerbate human diseases. With the goal of advancing our knowledge on physiological changes during space travel, NASA GeneLab provides an open-source repository of multi-omics data from real and simulated spaceflight studies. Alone, this data enables identification of biological changes during spaceflight, but cannot infer how that may impact an astronaut at the phenotypic level. To bridge this gap, Scalable Precision Medicine Oriented Knowledge Engine (SPOKE), a heterogeneous knowledge graph connecting biological and clinical data from over 30 databases, was used in combination with GeneLab transcriptomic data from six studies. This integration identified critical symptoms and physiological changes incurred during spaceflight.
Affiliation(s)
- Charlotte A. Nelson
  - Integrated Program in Quantitative Biology, University of California San Francisco, San Francisco, CA 94143, USA
- Ana Uriarte Acuna
  - Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
  - KBR, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Amber M. Paul
  - Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
  - NASA Postdoctoral Program, Universities Space Research Association (USRA), Mountain View, CA 94043, USA
- Ryan T. Scott
  - Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
  - KBR, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Atul J. Butte
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA 94143, USA
  - Department of Pediatrics, University of California San Francisco, San Francisco, CA 94143, USA
- Egle Cekanaviciute
  - Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
- Sergio E. Baranzini
  - Integrated Program in Quantitative Biology, University of California San Francisco, San Francisco, CA 94143, USA
  - Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA 94143, USA
  - Weill Institute for Neuroscience, Department of Neurology, University of California San Francisco, San Francisco, CA 94143, USA
  - Corresponding author
- Sylvain V. Costes
  - Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
  - Corresponding author
|
36
|
LaPelusa M, Donoviel D, Branzini SE, Carlson PE, Culler S, Cheema AK, Kaddurah-Daouk R, Kelly D, de Cremoux I, Knight R, Krajmalnik-Brown R, Mayo SL, Mazmanian SK, Mayer EA, Petrosino JF, Garrison K. Microbiome for Mars: surveying microbiome connections to healthcare with implications for long-duration human spaceflight, virtual workshop, July 13, 2020. Microbiome 2021; 9:2. [PMID: 33397500 PMCID: PMC7781430 DOI: 10.1186/s40168-020-00951-5] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/19/2020] [Accepted: 12/06/2020] [Indexed: 06/12/2023]
Abstract
The inaugural "Microbiome for Mars" virtual workshop took place on July 13, 2020. This event assembled leaders in microbiome research and development to discuss their work and how it may relate to long-duration human space travel. The conference focused on surveying current microbiome research, future endeavors, and how this growing field could broadly impact human health and space exploration. This report summarizes each speaker's presentation in the order presented at the workshop.
Affiliation(s)
- Michael LaPelusa
  - Department of Medicine, Vanderbilt University Medical Center, One Hundred Oaks - North 719 Thompson Lane Suite 20400, Nashville, TN, 37204, USA
- Dorit Donoviel
  - Department of Pharmacology and Chemical Biology, Center for Space Medicine, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Sergio E Branzini
  - Department of Neurology, Weill Institute for Neurosciences, University of California, San Francisco, CA, 94158, USA
- Paul E Carlson
  - Laboratory of Mucosal Pathogens and Cellular Immunology, Division of Bacterial, Parasitic, and Allergenic Products, Office of Vaccines Research and Review, Center for Biologics Evaluation and Research, United States Food and Drug Administration, Silver Spring, MD, 20993, USA
- Stephanie Culler
  - Persephone Biosciences Inc, JLABS, 3210 Merryfield Row, San Diego, CA, 92121, USA
- Amrita K Cheema
  - Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University Medical Center, Washington, DC, 20007, USA
- Rima Kaddurah-Daouk
  - Department of Psychiatry and Behavioral Sciences, Department of Medicine and the Duke Institute for Brain Sciences, Duke University, Durham, NC, 27708, USA
- Denise Kelly
  - Seventure Partners, 5-7 rue de Monttessuy, 75340 Cedex 07, Paris, France
- Rob Knight
  - Departments of Pediatrics, Bioengineering, and Computer Science & Engineering, University of California San Diego, 9500 Gilman Drive, MC 0763, La Jolla, CA, 92093-0763, USA
- Rosa Krajmalnik-Brown
  - Biodesign Center for Health Through Microbiomes, Arizona State University, Tempe, AZ, USA
  - School of Sustainable Engineering and the Built Environment, Arizona State University, Tempe, AZ, USA
- Stephen L Mayo
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E. California Bl, Pasadena, CA, 91125, USA
- Sarkis K Mazmanian
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E. California Bl, Pasadena, CA, 91125, USA
- Emeran A Mayer
  - G. Oppenheimer Family Center for Neurobiology of Stress and Resilience, Ingestive Behavior and Obesity Program, University of California Los Angeles, Los Angeles, CA, USA
  - Vatche and Tamar Manoukian Division of Digestive Diseases, University of California Los Angeles, Los Angeles, CA, USA
  - David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Joseph F Petrosino
  - Department of Molecular Virology and Microbiology, Alkek Center for Metagenomics and Microbiome Research, Baylor College of Medicine, Houston, Texas, USA
- Keith Garrison
  - Department of Medicine, The University of Texas at Houston Health Sciences Center, 6431 Fannin St, Houston, TX, 77030, USA
|
37
|
Hamamoto R, Suvarna K, Yamada M, Kobayashi K, Shinkai N, Miyake M, Takahashi M, Jinnai S, Shimoyama R, Sakai A, Takasawa K, Bolatkan A, Shozu K, Dozen A, Machino H, Takahashi S, Asada K, Komatsu M, Sese J, Kaneko S. Application of Artificial Intelligence Technology in Oncology: Towards the Establishment of Precision Medicine. Cancers (Basel) 2020; 12:E3532. [PMID: 33256107 PMCID: PMC7760590 DOI: 10.3390/cancers12123532] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 11/01/2020] [Revised: 11/21/2020] [Accepted: 11/24/2020] [Indexed: 02/07/2023] Open
Abstract
In recent years, advances in artificial intelligence (AI) technology have led to the rapid clinical implementation of devices with AI technology in the medical field. More than 60 AI-equipped medical devices have already been approved by the Food and Drug Administration (FDA) in the United States, and the active introduction of AI technology is considered to be an inevitable trend in the future of medicine. In the field of oncology, clinical applications of medical devices using AI technology are already underway, mainly in radiology, and AI technology is expected to be positioned as an important core technology. In particular, "precision medicine," a medical treatment that selects the most appropriate treatment for each patient based on a vast amount of medical data such as genome information, has become a worldwide trend; AI technology is expected to be utilized in the process of extracting truly useful information from a large amount of medical data and applying it to diagnosis and treatment. In this review, we would like to introduce the history of AI technology and the current state of medical AI, especially in the oncology field, as well as discuss the possibilities and challenges of AI technology in the medical field.
Affiliation(s)
- Ryuji Hamamoto
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
  - Department of NCC Cancer Science, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo 113-8510, Japan
- Kruthi Suvarna
  - Indian Institute of Technology Bombay, Powai, Mumbai 400 076, India
- Masayoshi Yamada
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Department of Endoscopy, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku Tokyo 104-0045, Japan
- Kazuma Kobayashi
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
  - Department of NCC Cancer Science, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo 113-8510, Japan
- Norio Shinkai
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
  - Department of NCC Cancer Science, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo 113-8510, Japan
- Mototaka Miyake
  - Department of Diagnostic Radiology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
- Masamichi Takahashi
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Department of Neurosurgery and Neuro-Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
- Shunichi Jinnai
  - Department of Dermatologic Oncology, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
- Ryo Shimoyama
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
- Akira Sakai
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Department of NCC Cancer Science, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo 113-8510, Japan
- Ken Takasawa
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
- Amina Bolatkan
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
- Kanto Shozu
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
- Ai Dozen
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
- Hidenori Machino
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
- Satoshi Takahashi
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
- Ken Asada
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
- Masaaki Komatsu
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
- Jun Sese
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Humanome Lab, 2-4-10 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
- Syuzo Kaneko
  - Division of Molecular Modification and Cancer Biology, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
|
38
|
Afshinnekoo E, Scott RT, MacKay MJ, Pariset E, Cekanaviciute E, Barker R, Gilroy S, Hassane D, Smith SM, Zwart SR, Nelman-Gonzalez M, Crucian BE, Ponomarev SA, Orlov OI, Shiba D, Muratani M, Yamamoto M, Richards SE, Vaishampayan PA, Meydan C, Foox J, Myrrhe J, Istasse E, Singh N, Venkateswaran K, Keune JA, Ray HE, Basner M, Miller J, Vitaterna MH, Taylor DM, Wallace D, Rubins K, Bailey SM, Grabham P, Costes SV, Mason CE, Beheshti A. Fundamental Biological Features of Spaceflight: Advancing the Field to Enable Deep-Space Exploration. Cell 2020; 183:1162-1184. [PMID: 33242416 PMCID: PMC8441988 DOI: 10.1016/j.cell.2020.10.050] [Citation(s) in RCA: 166] [Impact Index Per Article: 41.5] [Received: 09/21/2020] [Revised: 10/28/2020] [Accepted: 10/29/2020] [Indexed: 12/14/2022]
Abstract
Research on astronaut health and model organisms has revealed six features of spaceflight biology that guide our current understanding of fundamental molecular changes that occur during space travel. The features include oxidative stress, DNA damage, mitochondrial dysregulation, epigenetic changes (including gene regulation), telomere length alterations, and microbiome shifts. Here we review the known hazards of human spaceflight, how spaceflight affects living systems through these six fundamental features, and the associated health risks of space exploration. We also discuss the essential issues related to the health and safety of astronauts involved in future missions, especially planned long-duration and Martian missions.
Collapse
Affiliation(s)
- Ebrahim Afshinnekoo
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10021, USA; The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY 10021, USA
| | - Ryan T Scott
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
| | - Matthew J MacKay
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10021, USA; The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY 10021, USA
| | - Eloise Pariset
- Universities Space Research Association (USRA), Mountain View, CA 94043, USA; Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
| | - Egle Cekanaviciute
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
| | - Richard Barker
- Department of Botany, University of Wisconsin, Madison, WI 53706, USA
| | - Simon Gilroy
- Department of Botany, University of Wisconsin, Madison, WI 53706, USA
| | | | - Scott M Smith
- Human Health and Performance Directorate, NASA Johnson Space Center, Houston, TX 77058, USA
| | - Sara R Zwart
- Department of Preventive Medicine and Community Health, University of Texas Medical Branch, Galveston, TX 77555, USA
| | - Mayra Nelman-Gonzalez
- KBR, Human Health and Performance Directorate, NASA Johnson Space Center, Houston, TX 77058, USA
| | - Brian E Crucian
- Human Health and Performance Directorate, NASA Johnson Space Center, Houston, TX 77058, USA
| | - Sergey A Ponomarev
- Institute for the Biomedical Problems, Russian Academy of Sciences, 123007 Moscow, Russia
| | - Oleg I Orlov
- Institute for the Biomedical Problems, Russian Academy of Sciences, 123007 Moscow, Russia
| | - Dai Shiba
- JEM Utilization Center, Human Spaceflight Technology Directorate, Japan Aerospace Exploration Agency (JAXA), Ibaraki 305-8505, Japan
| | - Masafumi Muratani
- Transborder Medical Research Center, and Department of Genome Biology, Faculty of Medicine, University of Tsukuba, Ibaraki 305-8575, Japan
| | - Masayuki Yamamoto
- Department of Medical Biochemistry, Tohoku University Graduate School of Medicine, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8575, Japan; Department of Integrative Genomics, Tohoku Medical Megabank Organization, Tohoku University, 2-1 Seiryo-machi, Aoba-ku, Sendai, Miyagi 980-8573, Japan
| | - Stephanie E Richards
- Bionetics, NASA Kennedy Space Center, Kennedy Space Center, Merritt Island, FL 32899, USA
| | - Parag A Vaishampayan
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
| | - Cem Meydan
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10021, USA; The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY 10021, USA
| | - Jonathan Foox
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10021, USA; The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY 10021, USA
| | - Jacqueline Myrrhe
- European Space Agency, Research and Payloads Group, Data Exploitation and Utilisation Strategy Office, 2200 AG Noordwijk, the Netherlands
| | - Eric Istasse
- European Space Agency, Research and Payloads Group, Data Exploitation and Utilisation Strategy Office, 2200 AG Noordwijk, the Netherlands
| | - Nitin Singh
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109, USA
| | - Kasthuri Venkateswaran
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA 91109, USA
| | - Jessica A Keune
- Space Medicine Operations Division, NASA Johnson Space Center, Houston, TX 77058, USA
| | - Hami E Ray
- ASRC Federal Space and Defense, Inc., Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA
| | - Mathias Basner
- Department of Psychiatry, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Jack Miller
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA; Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Martha Hotz Vitaterna
- Center for Sleep and Circadian Biology, Northwestern University, Evanston, IL 60208, USA; Department of Neurobiology, Northwestern University, Evanston, IL 60208, USA
| | - Deanne M Taylor
- Department of Biomedical Informatics, The Children's Hospital of Philadelphia, PA 19104, USA; Center for Mitochondrial and Epigenomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Douglas Wallace
- Center for Mitochondrial and Epigenomic Medicine, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Department of Pediatrics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Kathleen Rubins
- Astronaut Office, NASA Johnson Space Center, Houston, TX 77058, USA
| | - Susan M Bailey
- Department of Environmental & Radiological Health Sciences, Colorado State University, Fort Collins, CO 80523, USA.
| | - Peter Grabham
- Center for Radiological Research, Department of Oncology, College of Physicians and Surgeons, Columbia University, New York, NY 10027, USA.
| | - Sylvain V Costes
- Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA.
| | - Christopher E Mason
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY 10021, USA; The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY 10021, USA; WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY 10021, USA; The Feil Family Brain and Mind Research Institute, Weill Cornell Medicine, NY 10021, USA.
| | - Afshin Beheshti
- KBR, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.
| |
|
39
|
Ammar N, Shaban-Nejad A. Explainable Artificial Intelligence Recommendation System by Leveraging the Semantics of Adverse Childhood Experiences: Proof-of-Concept Prototype Development. JMIR Med Inform 2020; 8:e18752. [PMID: 33146623 PMCID: PMC7673979 DOI: 10.2196/18752] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/16/2020] [Revised: 08/25/2020] [Accepted: 10/08/2020] [Indexed: 12/26/2022] Open
Abstract
Background The study of adverse childhood experiences and their consequences has emerged over the past 20 years. Although the conclusions from these studies are available, the same is not true of the data. Accordingly, it is a complex problem to build a training set and develop machine-learning models from these studies. Classic machine learning and artificial intelligence techniques cannot provide a full scientific understanding of the inner workings of the underlying models. This raises credibility issues due to the lack of transparency and generalizability. Explainable artificial intelligence is an emerging approach for promoting credibility, accountability, and trust in mission-critical areas such as medicine by combining machine-learning approaches with explanatory techniques that explicitly show what the decision criteria are and why (or how) they have been established. Hence, thinking about how machine learning could benefit from knowledge graphs that combine “common sense” knowledge as well as semantic reasoning and causality models is a potential solution to this problem. Objective In this study, we aimed to leverage explainable artificial intelligence, and propose a proof-of-concept prototype for a knowledge-driven evidence-based recommendation system to improve mental health surveillance. Methods We used concepts from an ontology that we have developed to build and train a question-answering agent using the Google DialogFlow engine. In addition to the question-answering agent, the initial prototype includes knowledge graph generation and recommendation components that leverage third-party graph technology. Results To showcase the framework functionalities, we here present a prototype design and demonstrate the main features through four use case scenarios motivated by an initiative currently implemented at a children’s hospital in Memphis, Tennessee. 
Ongoing development of the prototype requires implementing an optimization algorithm of the recommendations, incorporating a privacy layer through a personal health library, and conducting a clinical trial to assess both usability and usefulness of the implementation. Conclusions This semantic-driven explainable artificial intelligence prototype can enhance health care practitioners’ ability to provide explanations for the decisions they make.
Affiliation(s)
- Nariman Ammar
- University of Tennessee Health Science Center - Oak Ridge National Laboratory, Center for Biomedical Informatics, Department of Pediatrics, College of Medicine, Memphis, TN, United States
| | - Arash Shaban-Nejad
- University of Tennessee Health Science Center - Oak Ridge National Laboratory, Center for Biomedical Informatics, Department of Pediatrics, College of Medicine, Memphis, TN, United States
| |
|
40
|
Drukker L, Noble JA, Papageorghiou AT. Introduction to artificial intelligence in ultrasound imaging in obstetrics and gynecology. ULTRASOUND IN OBSTETRICS & GYNECOLOGY : THE OFFICIAL JOURNAL OF THE INTERNATIONAL SOCIETY OF ULTRASOUND IN OBSTETRICS AND GYNECOLOGY 2020; 56:498-505. [PMID: 32530098 PMCID: PMC7702141 DOI: 10.1002/uog.22122] [Citation(s) in RCA: 89] [Impact Index Per Article: 22.3] [Received: 03/08/2020] [Revised: 05/10/2020] [Accepted: 06/01/2020] [Indexed: 05/05/2023]
Abstract
Artificial intelligence (AI) uses data and algorithms to aim to draw conclusions that are as good as, or even better than, those drawn by humans. AI is already part of our daily life; it is behind face recognition technology, speech recognition in virtual assistants (such as Amazon Alexa, Apple's Siri, Google Assistant and Microsoft Cortana) and self-driving cars. AI software has been able to beat world champions in chess, Go and recently even Poker. Relevant to our community, it is a prominent source of innovation in healthcare, already helping to develop new drugs, support clinical decisions and provide quality assurance in radiology. The list of medical image-analysis AI applications with USA Food and Drug Administration or European Union (soon to fall under European Union Medical Device Regulation) approval is growing rapidly and covers diverse clinical needs, such as detection of arrhythmia using a smartwatch or automatic triage of critical imaging studies to the top of the radiologist's worklist. Deep learning, a leading tool of AI, performs particularly well in image pattern recognition and, therefore, can be of great benefit to doctors who rely heavily on images, such as sonologists, radiographers and pathologists. Although obstetric and gynecological ultrasound are two of the most commonly performed imaging studies, AI has had little impact on this field so far. Nevertheless, there is huge potential for AI to assist in repetitive ultrasound tasks, such as automatically identifying good-quality acquisitions and providing instant quality assurance. For this potential to thrive, interdisciplinary communication between AI developers and ultrasound professionals is necessary. In this article, we explore the fundamentals of medical imaging AI, from theory to applicability, and introduce some key terms to medical professionals in the field of ultrasound. We believe that wider knowledge of AI will help accelerate its integration into healthcare. 
© 2020 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of the International Society of Ultrasound in Obstetrics and Gynecology.
Affiliation(s)
- L. Drukker
- Nuffield Department of Women's & Reproductive Health, University of Oxford, John Radcliffe Hospital, Oxford, UK
| | - J. A. Noble
- Institute of Biomedical Engineering, University of Oxford, Oxford, UK
| | - A. T. Papageorghiou
- Nuffield Department of Women's & Reproductive Health, University of Oxford, John Radcliffe Hospital, Oxford, UK
| |
|
41
|
Bellucci G, Ballerini C, Mechelli R, Bigi R, Rinaldi V, Reniè R, Buscarinu MC, Baranzini SE, Madireddy L, Matarese G, Salvetti M, Ristori G. SARS-CoV-2 meta-interactome suggests disease-specific, autoimmune pathophysiologies and therapeutic targets. F1000Res 2020; 9:992. [PMID: 33456761 PMCID: PMC7791351 DOI: 10.12688/f1000research.25593.1] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Accepted: 08/06/2020] [Indexed: 12/15/2022] Open
Abstract
Background: Severe coronavirus disease 2019 (COVID-19) is associated with multiple comorbidities and is characterized by an auto-aggressive inflammatory state leading to massive collateral damage. To identify preventive and therapeutic strategies against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), it is important to ascertain the molecular interactions between virus and host, and how they translate into disease pathophysiology. Methods: We matched virus-human protein interactions of human coronaviruses and other respiratory viruses with lists of genes associated with autoimmune diseases and with comorbidities associated with a worse COVID-19 course. We then selected the genes included in the statistically significant intersection between the SARS-CoV-2 network and disease-associated gene sets, identifying a meta-interactome. We analyzed the expression of meta-interactome genes in samples derived from lungs of infected humans, and their regulation by IFN-β. Finally, we performed a drug repurposing screening to target the network's most critical nodes. Results: We found a significant enrichment of SARS-CoV-2 interactors in immunological pathways and a strong association with autoimmunity and three prognostically relevant conditions (type 2 diabetes, coronary artery diseases, asthma) that present more independent physiopathological subnetworks. We observed a reduced expression of meta-interactome genes in human lungs after SARS-CoV-2 infection, and a regulatory potential of type I interferons. We also underscored multiple repurposable drugs to tailor the therapeutic strategies. Conclusions: Our data underscored a plausible genetic background that may contribute to the distinct observed pathophysiologies of severe COVID-19. Also, these results may help identify the most promising therapeutic targets and treatments for this condition.
Affiliation(s)
- Gianmarco Bellucci
- Department of Neurosciences, Mental Health and Sensory Organs, Sapienza University of Rome, Rome, 00189, Italy
| | - Chiara Ballerini
- Department of Neurosciences, Mental Health and Sensory Organs, Sapienza University of Rome, Rome, 00189, Italy
| | - Rosella Mechelli
- San Raffaele Roma Open University; IRCCS San Raffaele Pisana, Rome, 00166, Italy
| | - Rachele Bigi
- Department of Neurosciences, Mental Health and Sensory Organs, Sapienza University of Rome, Rome, 00189, Italy
| | - Virginia Rinaldi
- Department of Neurosciences, Mental Health and Sensory Organs, Sapienza University of Rome, Rome, 00189, Italy
| | - Roberta Reniè
- Department of Neurosciences, Mental Health and Sensory Organs, Sapienza University of Rome, Rome, 00189, Italy
| | - Maria Chiara Buscarinu
- Department of Neurosciences, Mental Health and Sensory Organs, Sapienza University of Rome, Rome, 00189, Italy
| | - Sergio E. Baranzini
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California, 94158, USA
| | - Lohith Madireddy
- Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, California, 94158, USA
| | - Giuseppe Matarese
- Dipartimento di Medicina Molecolare e Biotecnologie Mediche, University of Naples Federico II, Naples, 80131, Italy
- Istituto di Endocrinologia e Oncologia Sperimentale, Consiglio Nazionale Delle Ricerche (IEOS-CNR), Naples, 80131, Italy
| | - Marco Salvetti
- Department of Neurosciences, Mental Health and Sensory Organs, Sapienza University of Rome, Rome, 00189, Italy
- Neuromed: IRCCS Istituto Neurologico Mediterraneo (INM), Pozzilli, 86077, Italy
| | - Giovanni Ristori
- Department of Neurosciences, Mental Health and Sensory Organs, Sapienza University of Rome, Rome, 00189, Italy
| |
|
42
|
Abstract
Knowledge-based biomedical data science involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey recent progress in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as progress on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing to construct knowledge graphs, and the expansion of novel knowledge-based approaches to clinical and biological domains.
Affiliation(s)
- Tiffany J Callahan
- Computational Bioscience Program and Department of Pharmacology, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado 80045, USA
| | - Ignacio J Tripodi
- Department of Computer Science, University of Colorado, Boulder, Colorado 80309, USA
| | - Harrison Pielke-Lombardo
- Computational Bioscience Program and Department of Pharmacology, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado 80045, USA
| | - Lawrence E Hunter
- Computational Bioscience Program and Department of Pharmacology, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado 80045, USA
| |
|
43
|
Stafford IS, Kellermann M, Mossotto E, Beattie RM, MacArthur BD, Ennis S. A systematic review of the applications of artificial intelligence and machine learning in autoimmune diseases. NPJ Digit Med 2020; 3:30. [PMID: 32195365 PMCID: PMC7062883 DOI: 10.1038/s41746-020-0229-3] [Citation(s) in RCA: 102] [Impact Index Per Article: 25.5] [Received: 08/12/2019] [Accepted: 01/17/2020] [Indexed: 02/07/2023] Open
Abstract
Autoimmune diseases are chronic, multifactorial conditions. Through machine learning (ML), a branch of the wider field of artificial intelligence, it is possible to extract patterns within patient data, and exploit these patterns to predict patient outcomes for improved clinical management. Here, we surveyed the use of ML methods to address clinical problems in autoimmune disease. A systematic review was conducted using the MEDLINE, Embase, and Computers and Applied Sciences Complete databases. Relevant papers included "machine learning" or "artificial intelligence" and the autoimmune diseases search term(s) in their title, abstract or key words. Exclusion criteria: studies not written in English, no real human patient data included, publication prior to 2001, studies that were not peer reviewed, non-autoimmune disease comorbidity research and review papers. 169 (of 702) studies met the criteria for inclusion. Support vector machines and random forests were the most popular ML methods used. ML models using data on multiple sclerosis, rheumatoid arthritis and inflammatory bowel disease were most common. A small proportion of studies (7.7% or 13/169) combined different data types in the modelling process. Cross-validation, combined with a separate testing set for more robust model evaluation, occurred in 8.3% of papers (14/169). The field may benefit from adopting a best practice of validation, cross-validation and independent testing of ML models. Many models achieved good predictive results in simple scenarios (e.g. classification of cases and controls). Progression to more complex predictive models may be achievable in future through integration of multiple data types.
Affiliation(s)
- I. S. Stafford
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - M. Kellermann
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
| | - E. Mossotto
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - R. M. Beattie
- Department of Paediatric Gastroenterology, Southampton Children’s Hospital, Southampton, UK
| | - B. D. MacArthur
- Institute for Life Sciences, University of Southampton, Southampton, UK
| | - S. Ennis
- Department of Human Genetics and Genomic Medicine, University of Southampton, Southampton, UK
| |
|
44
|
Abstract
PURPOSE OF REVIEW We critically evaluate the future potential of machine learning (ML), deep learning (DL), and artificial intelligence (AI) in precision medicine. The goal of this work is to show progress in ML in digital health, to exemplify future needs and trends, and to identify any essential prerequisites of AI and ML for precision health. RECENT FINDINGS High-throughput technologies are delivering growing volumes of biomedical data, such as large-scale genome-wide sequencing assays; libraries of medical images; or drug perturbation screens of healthy, developing, and diseased tissue. Multi-omics data in biomedicine are deep and complex, offering an opportunity for data-driven insights and automated disease classification. Learning from these data will open our understanding and definition of healthy baselines and disease signatures. State-of-the-art applications of deep neural networks include digital image recognition, single-cell clustering, and virtual drug screens, demonstrating the breadth and power of ML in biomedicine. SUMMARY Significantly, AI and systems biology have embraced big data challenges and may enable novel biotechnology-derived therapies to facilitate the implementation of precision medicine approaches.
Affiliation(s)
- Fabian V. Filipp
- Cancer Systems Biology, Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstraße 1, 85764 München, Germany
- School of Life Sciences Weihenstephan, Technical University München, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
| |
|