1
Meier MJ, Harrill J, Johnson K, Thomas RS, Tong W, Rager JE, Yauk CL. Progress in toxicogenomics to protect human health. Nat Rev Genet 2024:10.1038/s41576-024-00767-1. [PMID: 39223311 DOI: 10.1038/s41576-024-00767-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/23/2024] [Indexed: 09/04/2024]
Abstract
Toxicogenomics measures molecular features, such as transcripts, proteins, metabolites and epigenomic modifications, to understand and predict the toxicological effects of environmental and pharmaceutical exposures. Transcriptomics has become an integral tool in contemporary toxicology research owing to innovations in gene expression profiling that can provide mechanistic and quantitative information at scale. These data can be used to predict toxicological hazards through the use of transcriptomic biomarkers, network inference analyses, pattern-matching approaches and artificial intelligence. Furthermore, emerging approaches, such as high-throughput dose-response modelling, can leverage toxicogenomic data for human health protection even in the absence of predicting specific hazards. Finally, single-cell transcriptomics and multi-omics provide detailed insights into toxicological mechanisms. Here, we review the progress since the inception of toxicogenomics in applying transcriptomics towards toxicology testing and highlight advances that are transforming risk assessment.
Affiliation(s)
- Matthew J Meier
  - Environmental Health Science and Research Bureau, Health Canada, Ottawa, Ontario, Canada
- Joshua Harrill
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, USA
- Kamin Johnson
  - Predictive Safety Center, Corteva Agriscience, Indianapolis, IN, USA
- Russell S Thomas
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, United States Food and Drug Administration, Jefferson, AR, USA
  - Curriculum in Toxicology & Environmental Medicine, School of Medicine, University of North Carolina, Chapel Hill, NC, USA
- Julia E Rager
  - Curriculum in Toxicology & Environmental Medicine, School of Medicine, University of North Carolina, Chapel Hill, NC, USA
  - The Center for Environmental Medicine, Asthma and Lung Biology, School of Medicine, The University of North Carolina, Chapel Hill, NC, USA
  - Department of Environmental Sciences and Engineering, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - The Institute for Environmental Health Solutions, Gillings School of Global Public Health, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Carole L Yauk
  - Department of Biology, University of Ottawa, Ottawa, Ontario, Canada.
2
Callahan TJ, Tripodi IJ, Stefanski AL, Cappelletti L, Taneja SB, Wyrwa JM, Casiraghi E, Matentzoglu NA, Reese J, Silverstein JC, Hoyt CT, Boyce RD, Malec SA, Unni DR, Joachimiak MP, Robinson PN, Mungall CJ, Cavalleri E, Fontana T, Valentini G, Mesiti M, Gillenwater LA, Santangelo B, Vasilevsky NA, Hoehndorf R, Bennett TD, Ryan PB, Hripcsak G, Kahn MG, Bada M, Baumgartner WA, Hunter LE. An open source knowledge graph ecosystem for the life sciences. Sci Data 2024; 11:363. [PMID: 38605048 PMCID: PMC11009265 DOI: 10.1038/s41597-024-03171-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 03/21/2024] [Indexed: 04/13/2024] Open
Abstract
Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
Affiliation(s)
- Tiffany J Callahan
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA.
- Ignacio J Tripodi
  - Computer Science Department, Interdisciplinary Quantitative Biology, University of Colorado Boulder, Boulder, CO, 80301, USA
- Adrianne L Stefanski
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Luca Cappelletti
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Sanya B Taneja
  - Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Jordan M Wyrwa
  - Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Elena Casiraghi
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Justin Reese
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Jonathan C Silverstein
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
- Charles Tapley Hoyt
  - Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, 02115, USA
- Richard D Boyce
  - Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
- Scott A Malec
  - Division of Translational Informatics, University of New Mexico School of Medicine, Albuquerque, NM, 87131, USA
- Deepak R Unni
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Marcin P Joachimiak
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Peter N Robinson
  - Berlin Institute of Health at Charité-Universitätsmedizin, 10117, Berlin, Germany
- Christopher J Mungall
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Emanuele Cavalleri
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Tommaso Fontana
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Giorgio Valentini
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
  - ELLIS, European Laboratory for Learning and Intelligent Systems, Milan Unit, Italy
- Marco Mesiti
  - AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Lucas A Gillenwater
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Brook Santangelo
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Nicole A Vasilevsky
  - Data Collaboration Center, Critical Path Institute, 1840 E River Rd. Suite 100, Tucson, AZ, 85718, USA
- Robert Hoehndorf
  - Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
- Tellen D Bennett
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
  - Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Patrick B Ryan
  - Janssen Research and Development, Raritan, NJ, 08869, USA
- George Hripcsak
  - Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Michael G Kahn
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Michael Bada
  - Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- William A Baumgartner
  - Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
- Lawrence E Hunter
  - Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
  - Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
3
Santangelo B, Bada M, Hunter L, Lozupone C. Hypothesizing mechanistic links between microbes and disease using knowledge graphs. bioRxiv 2023:2023.12.01.569645. [PMID: 38106100 PMCID: PMC10723325 DOI: 10.1101/2023.12.01.569645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2023]
Abstract
Knowledge graphs have found broad biomedical applications, providing useful representations of complex knowledge. Although plentiful evidence exists linking the gut microbiome to disease, mechanistic understanding of those relationships remains generally elusive. Here we demonstrate the potential of knowledge graphs to hypothesize plausible mechanistic accounts of host-microbe interactions in disease. To do so, we constructed a knowledge graph of linked microbes, genes and metabolites called MGMLink. Using a semantically constrained shortest path search through the graph and a novel path prioritization methodology based on cosine similarity, we show that this knowledge supports inference of mechanistic hypotheses that explain observed relationships between microbes and disease phenotypes. We discuss specific applications of this methodology in inflammatory bowel disease and Parkinson's disease. This approach enables mechanistic hypotheses surrounding the complex interactions between gut microbes and disease to be generated in a scalable and comprehensive manner.
4
Taneja SB, Callahan TJ, Paine MF, Kane-Gill SL, Kilicoglu H, Joachimiak MP, Boyce RD. Developing a Knowledge Graph for Pharmacokinetic Natural Product-Drug Interactions. J Biomed Inform 2023; 140:104341. [PMID: 36933632 PMCID: PMC10150409 DOI: 10.1016/j.jbi.2023.104341] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/24/2022] [Revised: 01/09/2023] [Accepted: 03/13/2023] [Indexed: 03/17/2023]
Abstract
BACKGROUND Pharmacokinetic natural product-drug interactions (NPDIs) occur when botanical or other natural products are co-consumed with pharmaceutical drugs. With the growing use of natural products, the risk for potential NPDIs and consequent adverse events has increased. Understanding mechanisms of NPDIs is key to preventing or minimizing adverse events. Although biomedical knowledge graphs (KGs) have been widely used for drug-drug interaction applications, computational investigation of NPDIs is novel. We constructed NP-KG as a first step toward computational discovery of plausible mechanistic explanations for pharmacokinetic NPDIs that can be used to guide scientific research. METHODS We developed a large-scale, heterogeneous KG with biomedical ontologies, linked data, and full texts of the scientific literature. To construct the KG, biomedical ontologies and drug databases were integrated with the Phenotype Knowledge Translator framework. The semantic relation extraction systems, SemRep and Integrated Network and Dynamic Reasoning Assembler, were used to extract semantic predications (subject-relation-object triples) from full texts of the scientific literature related to the exemplar natural products green tea and kratom. A literature-based graph constructed from the predications was integrated into the ontology-grounded KG to create NP-KG. NP-KG was evaluated with case studies of pharmacokinetic green tea- and kratom-drug interactions through KG path searches and meta-path discovery to determine congruent and contradictory information in NP-KG compared to ground truth data. We also conducted an error analysis to identify knowledge gaps and incorrect predications in the KG. RESULTS The fully integrated NP-KG consisted of 745,512 nodes and 7,249,576 edges. Evaluation of NP-KG resulted in congruent (38.98% for green tea, 50% for kratom), contradictory (15.25% for green tea, 21.43% for kratom), and both congruent and contradictory (15.25% for green tea, 21.43% for kratom) information compared to ground truth data. Potential pharmacokinetic mechanisms for several purported NPDIs, including the green tea-raloxifene, green tea-nadolol, kratom-midazolam, kratom-quetiapine, and kratom-venlafaxine interactions were congruent with the published literature. CONCLUSION NP-KG is the first KG to integrate biomedical ontologies with full texts of the scientific literature focused on natural products. We demonstrate the application of NP-KG to identify known pharmacokinetic interactions between natural products and pharmaceutical drugs mediated by drug metabolizing enzymes and transporters. Future work will incorporate context, contradiction analysis, and embedding-based methods to enrich NP-KG. NP-KG is publicly available at https://doi.org/10.5281/zenodo.6814507. The code for relation extraction, KG construction, and hypothesis generation is available at https://github.com/sanyabt/np-kg.
Affiliation(s)
- Sanya B Taneja
  - Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA 15206, USA.
- Tiffany J Callahan
  - Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Mary F Paine
  - Department of Pharmaceutical Sciences, College of Pharmacy and Pharmaceutical Sciences, Washington State University, Spokane, WA 99202, USA
- Halil Kilicoglu
  - School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA
- Marcin P Joachimiak
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Richard D Boyce
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
5
A Rule-Based Inference Framework to Explore and Explain the Biological Related Mechanisms of Potential Drug-Drug Interactions. Computational and Mathematical Methods in Medicine 2022; 2022:9093262. [PMID: 36035294 PMCID: PMC9402322 DOI: 10.1155/2022/9093262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2022] [Revised: 07/24/2022] [Accepted: 07/28/2022] [Indexed: 11/17/2022]
Abstract
As more drugs are developed and the incidence of polypharmacy increases, it is becoming critically important to anticipate potential DDIs before they occur in the clinic, along with those for which effects might go unobserved. However, traditional methods for DDI identification are unable to coalesce interaction mechanisms out of vast lists of potential or known DDIs, much less study them accurately. Computational methods have great promise but have realized only limited clinical utility. This work develops a rule-based inference framework to predict DDI mechanisms and support determination of their clinical relevance. Given a drug pair, our framework interrogates and describes DDI mechanisms based on a knowledge graph that integrates extensive available biomedical resources through semantic web technologies and backward chaining inference, effectively identifying facts within the graph that prove and explain the mechanisms of the drugs' interaction. The framework was evaluated through a case study combining a chemotherapy agent, irinotecan, and a widely used antibiotic, levofloxacin. The mutual interactions identified indicate that our framework can effectively explore and explain the mechanisms of potential DDIs. This approach has the potential to improve drug discovery and design and to support rapid and cost-effective identification of DDIs along with their putative mechanisms, a key step in determining clinical relevance and supporting clinical decision-making.
6
Pan JZ, Edelstein E, Bansky P, Wyner A. A Knowledge Graph Based Approach to Social Science Surveys. Data Intelligence 2021. [DOI: 10.1162/dint_a_00107] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/04/2022] Open
Abstract
Recent success of knowledge graphs has spurred interest in applying them in open science, such as on intelligent survey systems for scientists. However, efforts to understand the quality of candidate survey questions provided by these methods have been limited. Indeed, existing methods do not consider the type of on-the-fly content planning that is possible for face-to-face surveys and hence do not guarantee that selection of subsequent questions is based on response to previous questions in a survey. To address this limitation, we propose a dynamic and informative solution for an intelligent survey system that is based on knowledge graphs. To illustrate our proposal, we look into social science surveys, focusing on ordering the questions of a questionnaire component by their level of acceptance, along with conditional triggers that further customise participants' experience. Our main findings are: (i) evaluation of the proposed approach shows that the dynamic component can be beneficial in terms of lowering the number of questions asked per variable, thus allowing more informative data to be collected in a survey of equivalent length; and (ii) a primary advantage of the proposed approach is that it enables grouping of participants according to their responses, so that participants are not only served appropriate follow-up questions, but their responses to these questions may be analysed in the context of some initial categorisation. We believe that the proposed approach can easily be applied to other social science surveys based on grouping definitions in their contexts. The knowledge-graph-based intelligent survey approach proposed in our work allows online questionnaires to approach face-to-face interaction in their level of informativity and responsiveness, as well as duplicating certain advantages of interview-based data collection.
Affiliation(s)
- Jeff Z. Pan
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9YL, UK
  - Department of Computing Science, University of Aberdeen, Aberdeen AB24 3FX, UK
- Elspeth Edelstein
  - School of Language, Literature, Music and Visual Culture, University of Aberdeen, Aberdeen AB24 3FX, UK
- Patrik Bansky
  - Department of Computing Science, University of Aberdeen, Aberdeen AB24 3FX, UK
- Adam Wyner
  - School of Law and Department of Computer Science, Swansea University, Swansea, West Glamorgan SA2 8PP, UK
7
McGrath SP, Benton ML, Tavakoli M, Tatonetti NP. Predictions, Pivots, and a Pandemic: a Review of 2020's Top Translational Bioinformatics Publications. Yearb Med Inform 2021; 30:219-225. [PMID: 34479393 PMCID: PMC8416221 DOI: 10.1055/s-0041-1726540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022] Open
Abstract
OBJECTIVES Provide an overview of the emerging themes and notable papers which were published in 2020 in the field of Bioinformatics and Translational Informatics (BTI) for the International Medical Informatics Association Yearbook. METHODS A team of 16 individuals scanned the literature from the past year. Using a scoring rubric, papers were evaluated on their novelty, importance, and objective quality. 1,224 Medical Subject Headings (MeSH) terms extracted from these papers were used to identify themes and research focuses. The authors then used the scoring results to select notable papers and trends presented in this manuscript. RESULTS The search phase identified 263 potential papers and central themes of coronavirus disease 2019 (COVID-19), machine learning, and bioinformatics were examined in greater detail. CONCLUSIONS When addressing a once-in-a-century pandemic, scientists worldwide answered the call, with informaticians playing a critical role. Productivity and innovations reached new heights in both TBI and science, but significant research gaps remain.
Affiliation(s)
- Scott P. McGrath
  - CITRIS Health, University of California Berkeley, Berkeley, CA, USA
- Maryam Tavakoli
  - MTERMS Lab, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
8
Abstract
Knowledge-based biomedical data science involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey recent progress in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as progress on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing to construct knowledge graphs, and the expansion of novel knowledge-based approaches to clinical and biological domains.
Affiliation(s)
- Tiffany J Callahan
  - Computational Bioscience Program and Department of Pharmacology, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado 80045, USA
- Ignacio J Tripodi
  - Department of Computer Science, University of Colorado, Boulder, Colorado 80309, USA
- Harrison Pielke-Lombardo
  - Computational Bioscience Program and Department of Pharmacology, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado 80045, USA
- Lawrence E Hunter
  - Computational Bioscience Program and Department of Pharmacology, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado 80045, USA