1
Zakka C, Shad R, Chaurasia A, Dalal AR, Kim JL, Moor M, Fong R, Phillips C, Alexander K, Ashley E, Boyd J, Boyd K, Hirsch K, Langlotz C, Lee R, Melia J, Nelson J, Sallam K, Tullis S, Vogelsong MA, Cunningham JP, Hiesinger W. Almanac - Retrieval-Augmented Language Models for Clinical Medicine. NEJM AI 2024; 1:10.1056/aioa2300068. [PMID: 38343631 PMCID: PMC10857783 DOI: 10.1056/aioa2300068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
BACKGROUND Large language models (LLMs) have recently shown impressive zero-shot capabilities, whereby they can use auxiliary data, without the availability of task-specific training examples, to complete a variety of natural language tasks, such as summarization, dialogue generation, and question answering. However, despite many promising applications of LLMs in clinical medicine, adoption of these models has been limited by their tendency to generate incorrect and sometimes even harmful statements. METHODS We tasked a panel of eight board-certified clinicians and two health care practitioners with evaluating Almanac, an LLM framework augmented with retrieval capabilities from curated medical resources for medical guideline and treatment recommendations. The panel compared responses from Almanac and standard LLMs (ChatGPT-4, Bing, and Bard) versus a novel data set of 314 clinical questions spanning nine medical specialties. RESULTS Almanac showed a significant improvement in performance compared with the standard LLMs across axes of factuality, completeness, user preference, and adversarial safety. CONCLUSIONS Our results show the potential for LLMs with access to domain-specific corpora to be effective in clinical decision-making. The findings also underscore the importance of carefully testing LLMs before deployment to mitigate their shortcomings. (Funded by the National Institutes of Health, National Heart, Lung, and Blood Institute.).
Affiliation(s)
- Cyril Zakka
  - Department of Cardiothoracic Surgery, Stanford Medicine, Stanford, CA
- Rohan Shad
  - Division of Cardiovascular Surgery, Penn Medicine, Philadelphia
- Akash Chaurasia
  - Department of Computer Science, Stanford University, Stanford, CA
- Alex R Dalal
  - Department of Cardiothoracic Surgery, Stanford Medicine, Stanford, CA
- Jennifer L Kim
  - Department of Cardiothoracic Surgery, Stanford Medicine, Stanford, CA
- Michael Moor
  - Department of Computer Science, Stanford University, Stanford, CA
- Robyn Fong
  - Department of Computer Science, Stanford University, Stanford, CA
- Curran Phillips
  - Department of Cardiothoracic Surgery, Stanford Medicine, Stanford, CA
- Kevin Alexander
  - Division of Cardiovascular Medicine, Stanford Medicine, Stanford, CA
- Euan Ashley
  - Division of Cardiovascular Medicine, Stanford Medicine, Stanford, CA
- Jack Boyd
  - Department of Cardiothoracic Surgery, Stanford Medicine, Stanford, CA
- Kathleen Boyd
  - Department of Pediatrics, Stanford Medicine, Stanford, CA
- Karen Hirsch
  - Department of Neurology, Stanford Medicine, Stanford, CA
- Curt Langlotz
  - Department of Radiology and Biomedical Informatics, Stanford Medicine, Stanford, CA
- Rita Lee
  - Department of Cardiothoracic Surgery, Stanford Medicine, Stanford, CA
- Joanna Melia
  - Division of Gastroenterology and Hepatology, Johns Hopkins Medicine, Baltimore
- Joanna Nelson
  - Division of Infectious Diseases, Stanford Medicine, Stanford, CA
- Karim Sallam
  - Division of Cardiovascular Medicine, Stanford Medicine, Stanford, CA
- Stacey Tullis
  - Department of Cardiothoracic Surgery, Stanford Medicine, Stanford, CA
- William Hiesinger
  - Department of Cardiothoracic Surgery, Stanford Medicine, Stanford, CA
2
Choi J, Vendrow EB, Moor M, Spain DA. Development and Validation of a Model to Quantify Injury Severity in Real Time. JAMA Netw Open 2023; 6:e2336196. [PMID: 37812422 PMCID: PMC10562944 DOI: 10.1001/jamanetworkopen.2023.36196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 08/22/2023] [Indexed: 10/10/2023] Open
Abstract
Importance Quantifying injury severity is integral to trauma care benchmarking, decision-making, and research, yet the most prevalent metric to quantify injury severity, the Injury Severity Score (ISS), is impractical to use in real time. Objective To develop and validate a practical model that uses a limited number of injury patterns to quantify injury severity in real time through 3 intuitive outcomes. Design, Setting, and Participants In this cohort study for prediction model development and validation, training, development, and internal validation cohorts comprised 223 545, 74 514, and 74 514 admission encounters, respectively, of adults (age ≥18 years) with a primary diagnosis of traumatic injury hospitalized more than 2 days (2017-2018 National Inpatient Sample). The external validation cohort comprised 3855 adults admitted to a level I trauma center who met criteria for the 2 highest of the institution's 3 trauma activation levels. Main Outcomes and Measures Three outcomes were hospital length of stay, probability of discharge disposition to a facility, and probability of inpatient mortality. The prediction performance metric for length of stay was mean absolute error. Prediction performance metrics for discharge disposition and inpatient mortality were average precision, precision, recall, specificity, F1 score, and area under the receiver operating characteristic curve (AUROC). Calibration was evaluated using calibration plots. Shapley additive explanations analysis and bee swarm plots facilitated model explainability analysis. Results The Length of Stay, Disposition, Mortality (LDM) Injury Index (the model) comprised a multitask deep learning model trained, developed, and internally validated on a data set of 372 573 traumatic injury encounters (mean [SD] age = 68.7 [19.3] years, 56.6% female). The model used 176 potential injuries to output 3 interpretable outcomes: the predicted hospital length of stay, probability of discharge to a facility, and probability of inpatient mortality. For the external validation set, the mean absolute error of the ISS-predicted length of stay was 4.16 (95% CI, 4.13-4.20) days. Compared with the ISS, the model had comparable external validation set discrimination performance (facility discharge AUROC: 0.67 [95% CI, 0.67-0.68] vs 0.65 [95% CI, 0.65-0.66]; recall: 0.59 [95% CI, 0.58-0.61] vs 0.59 [95% CI, 0.58-0.60]; specificity: 0.66 [95% CI, 0.66-0.66] vs 0.62 [95% CI, 0.60-0.63]; mortality AUROC: 0.83 [95% CI, 0.81-0.84] vs 0.82 [95% CI, 0.82-0.82]; recall: 0.74 [95% CI, 0.72-0.77] vs 0.75 [95% CI, 0.75-0.76]; specificity: 0.81 [95% CI, 0.81-0.81] vs 0.76 [95% CI, 0.75-0.77]). The model had excellent calibration for predicting facility discharge disposition, but overestimated inpatient mortality. Explainability analysis found the inputs influencing model predictions matched intuition. Conclusions and Relevance In this cohort study using a limited number of injury patterns, the model quantified injury severity using 3 intuitive outcomes. Further study is required to evaluate the model at scale.
Affiliation(s)
- Jeff Choi
  - Department of Surgery, Stanford University, Stanford, California
- Edward B. Vendrow
  - Department of Computer Science, Stanford University, Stanford, California
- Michael Moor
  - Department of Computer Science, Stanford University, Stanford, California
- David A. Spain
  - Department of Surgery, Stanford University, Stanford, California
3
Moor M, Bennett N, Plečko D, Horn M, Rieck B, Meinshausen N, Bühlmann P, Borgwardt K. Predicting sepsis using deep learning across international sites: a retrospective development and validation study. EClinicalMedicine 2023; 62:102124. [PMID: 37588623 PMCID: PMC10425671 DOI: 10.1016/j.eclinm.2023.102124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2023] [Revised: 06/29/2023] [Accepted: 07/17/2023] [Indexed: 08/18/2023] Open
Abstract
Background When sepsis is detected, organ damage may have progressed to irreversible stages, leading to poor prognosis. The use of machine learning for predicting sepsis early has shown promise; however, international validations are missing. Methods This was a retrospective, observational, multi-centre cohort study. We developed and externally validated a deep learning system for the prediction of sepsis in the intensive care unit (ICU). To our knowledge, our analysis represents the first international, multi-centre in-ICU cohort study for sepsis prediction using deep learning. Our dataset contains 136,478 unique ICU admissions, representing a refined and harmonised subset of four large ICU databases comprising data collected from ICUs in the US, the Netherlands, and Switzerland between 2001 and 2016. Using the international consensus definition Sepsis-3, we derived hourly-resolved sepsis annotations, amounting to 25,694 (18.8%) patient stays with sepsis. We compared our approach to clinical baselines as well as machine learning baselines and performed an extensive internal and external statistical validation within and across databases, reporting area under the receiver-operating-characteristic curve (AUC). Findings Averaged over sites, our model was able to predict sepsis with an AUC of 0.846 (95% confidence interval [CI], 0.841-0.852) on a held-out validation cohort internal to each site, and an AUC of 0.761 (95% CI, 0.746-0.770) when validating externally across sites. Given access to a small fine-tuning set (10% per site), the transfer to target sites was improved to an AUC of 0.807 (95% CI, 0.801-0.813). Our model raised 1.4 false alerts per true alert and detected 80% of the septic patients 3.7 h (95% CI, 3.0-4.3) prior to the onset of sepsis, opening a vital window for intervention. Interpretation By monitoring clinical and laboratory measurements in a retrospective simulation of a real-time prediction scenario, a deep learning system for the detection of sepsis generalised to previously unseen ICU cohorts, internationally. Funding This study was funded by the Personalized Health and Related Technologies (PHRT) strategic focus area of the ETH domain.
Affiliation(s)
- Michael Moor
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
  - SIB Swiss Institute of Bioinformatics, Switzerland
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Nicolas Bennett
  - Seminar for Statistics, Department of Mathematics, ETH Zurich, Switzerland
- Drago Plečko
  - Seminar for Statistics, Department of Mathematics, ETH Zurich, Switzerland
- Max Horn
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
  - SIB Swiss Institute of Bioinformatics, Switzerland
- Bastian Rieck
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
  - SIB Swiss Institute of Bioinformatics, Switzerland
- Peter Bühlmann
  - Seminar for Statistics, Department of Mathematics, ETH Zurich, Switzerland
- Karsten Borgwardt
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
  - SIB Swiss Institute of Bioinformatics, Switzerland
4
Zakka C, Chaurasia A, Shad R, Dalal AR, Kim JL, Moor M, Alexander K, Ashley E, Boyd J, Boyd K, Hirsch K, Langlotz C, Nelson J, Hiesinger W. Almanac: Retrieval-Augmented Language Models for Clinical Medicine. Res Sq 2023:rs.3.rs-2883198. [PMID: 37205549 PMCID: PMC10187428 DOI: 10.21203/rs.3.rs-2883198/v1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/21/2023]
Abstract
Large-language models have recently demonstrated impressive zero-shot capabilities in a variety of natural language tasks such as summarization, dialogue generation, and question-answering. Despite many promising applications in clinical medicine, adoption of these models in real-world settings has been largely limited by their tendency to generate incorrect and sometimes even toxic statements. In this study, we develop Almanac, a large language model framework augmented with retrieval capabilities for medical guideline and treatment recommendations. Performance on a novel dataset of clinical scenarios (n= 130) evaluated by a panel of 5 board-certified and resident physicians demonstrates significant increases in factuality (mean of 18% at p-value < 0.05) across all specialties, with improvements in completeness and safety. Our results demonstrate the potential for large language models to be effective tools in the clinical decision-making process, while also emphasizing the importance of careful testing and deployment to mitigate their shortcomings.
Affiliation(s)
- Cyril Zakka
  - Department of Cardiothoracic Surgery, Stanford Medicine
- Akash Chaurasia
  - Department of Cardiothoracic Surgery, Stanford Medicine
  - Department of Computer Science, Stanford University
- Rohan Shad
  - Division of Cardiovascular Surgery, Penn Medicine
- Alex R. Dalal
  - Department of Cardiothoracic Surgery, Stanford Medicine
- Michael Moor
  - Department of Computer Science, Stanford University
- Euan Ashley
  - Division of Cardiovascular Medicine, Stanford Medicine
- Jack Boyd
  - Department of Cardiothoracic Surgery, Stanford Medicine
- Curt Langlotz
  - Department of Radiology and Biomedical Informatics, Stanford Medicine
5
Moor M, Banerjee O, Abad ZSH, Krumholz HM, Leskovec J, Topol EJ, Rajpurkar P. Foundation models for generalist medical artificial intelligence. Nature 2023; 616:259-265. [PMID: 37045921 DOI: 10.1038/s41586-023-05881-4] [Citation(s) in RCA: 140] [Impact Index Per Article: 140.0] [Received: 11/03/2022] [Accepted: 02/22/2023] [Indexed: 04/14/2023]
Abstract
The exceptionally rapid development of highly flexible, reusable artificial intelligence (AI) models is likely to usher in newfound capabilities in medicine. We propose a new paradigm for medical AI, which we refer to as generalist medical AI (GMAI). GMAI models will be capable of carrying out a diverse set of tasks using very little or no task-specific labelled data. Built through self-supervision on large, diverse datasets, GMAI will flexibly interpret different combinations of medical modalities, including data from imaging, electronic health records, laboratory results, genomics, graphs or medical text. Models will in turn produce expressive outputs such as free-text explanations, spoken recommendations or image annotations that demonstrate advanced medical reasoning abilities. Here we identify a set of high-impact potential applications for GMAI and lay out specific technical capabilities and training datasets necessary to enable them. We expect that GMAI-enabled applications will challenge current strategies for regulating and validating AI devices for medicine and will shift practices associated with the collection of large medical datasets.
Affiliation(s)
- Michael Moor
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Oishi Banerjee
  - Department of Biomedical Informatics, Harvard University, Cambridge, MA, USA
- Zahra Shakeri Hossein Abad
  - Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Harlan M Krumholz
  - Yale University School of Medicine, Center for Outcomes Research and Evaluation, Yale New Haven Hospital, New Haven, CT, USA
- Jure Leskovec
  - Department of Computer Science, Stanford University, Stanford, CA, USA
- Eric J Topol
  - Scripps Research Translational Institute, La Jolla, CA, USA
- Pranav Rajpurkar
  - Department of Biomedical Informatics, Harvard University, Cambridge, MA, USA
6
Frei ER, Gossner MM, Vitasse Y, Queloz V, Dubach V, Gessler A, Ginzler C, Hagedorn F, Meusburger K, Moor M, Samblás Vives E, Rigling A, Uitentuis I, von Arx G, Wohlgemuth T. European beech dieback after premature leaf senescence during the 2018 drought in northern Switzerland. Plant Biol (Stuttg) 2022; 24:1132-1145. [PMID: 36103113 PMCID: PMC10092601 DOI: 10.1111/plb.13467] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/28/2022] [Accepted: 09/04/2022] [Indexed: 06/15/2023]
Abstract
During the particularly severe hot summer drought in 2018, widespread premature leaf senescence was observed in several broadleaved tree species in Central Europe, particularly in European beech (Fagus sylvatica L.). For beech, it is yet unknown whether the drought evoked a decline towards tree mortality or whether trees can recover in the longer term. In this study, we monitored crown dieback, tree mortality and secondary drought damage symptoms in 963 initially live beech trees that exhibited either premature or normal leaf senescence in 2018 in three regions in northern Switzerland from 2018 to 2021. We related the observed damage to multiple climate- and stand-related parameters. Cumulative tree mortality continuously increased up to 7.2% and 1.3% in 2021 for trees with premature and normal leaf senescence in 2018, respectively. Mean crown dieback in surviving trees peaked at 29.2% in 2020 and 8.1% in 2019 for trees with premature and normal leaf senescence, respectively. Thereafter, trees showed first signs of recovery. Crown damage was more pronounced and recovery was slower for trees that showed premature leaf senescence in 2018, for trees growing on drier sites, and for larger trees. The presence of bleeding cankers peaked at 24.6% in 2019 and 10.7% in 2020 for trees with premature and normal leaf senescence, respectively. The presence of bark beetle holes peaked at 22.8% and 14.8% in 2021 for trees with premature and normal leaf senescence, respectively. Both secondary damage symptoms occurred more frequently in trees that had higher proportions of crown dieback and/or showed premature senescence in 2018. Our findings demonstrate context-specific differences in beech mortality and recovery reflecting the importance of regional and local climate and soil conditions. Adapting management to increase forest resilience is gaining importance, given the expected further beech decline on dry sites in northern Switzerland.
Affiliation(s)
- E. R. Frei
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - WSL Institute for Snow and Avalanche Research SLF, Davos Dorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
  - Climate Change and Extremes in Alpine Regions Research Centre CERC, Davos Dorf, Switzerland
- M. M. Gossner
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
  - Department of Environmental Systems Science, ETH Zurich, Zurich, Switzerland
- Y. Vitasse
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
- V. Queloz
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
- V. Dubach
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
- A. Gessler
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
  - Department of Environmental Systems Science, ETH Zurich, Zurich, Switzerland
- C. Ginzler
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
- F. Hagedorn
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
- K. Meusburger
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
- M. Moor
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
- E. Samblás Vives
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - Autonomous University of Barcelona (UAB), Cerdanyola del Valles, Spain
- A. Rigling
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
  - Department of Environmental Systems Science, ETH Zurich, Zurich, Switzerland
- I. Uitentuis
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
- G. von Arx
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
  - Oeschger Centre for Climate Change Research, University of Bern, Bern, Switzerland
- T. Wohlgemuth
  - Swiss Federal Institute for Forest, Snow and Landscape Research WSL, Birmensdorf, Switzerland
  - SwissForestLab, Birmensdorf, Switzerland
7
Moor M, Rieck B, Horn M, Jutzeler CR, Borgwardt K. Early Prediction of Sepsis in the ICU Using Machine Learning: A Systematic Review. Front Med (Lausanne) 2021; 8:607952. [PMID: 34124082 PMCID: PMC8193357 DOI: 10.3389/fmed.2021.607952] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 09/18/2020] [Accepted: 03/04/2021] [Indexed: 12/12/2022] Open
Abstract
Background: Sepsis is among the leading causes of death in intensive care units (ICUs) worldwide and its recognition, particularly in the early stages of the disease, remains a medical challenge. The advent of an affluence of available digital health data has created a setting in which machine learning can be used for digital biomarker discovery, with the ultimate goal to advance the early recognition of sepsis. Objective: To systematically review and evaluate studies employing machine learning for the prediction of sepsis in the ICU. Data Sources: Using Embase, Google Scholar, PubMed/Medline, Scopus, and Web of Science, we systematically searched the existing literature for machine learning-driven sepsis onset prediction for patients in the ICU. Study Eligibility Criteria: All peer-reviewed articles using machine learning for the prediction of sepsis onset in adult ICU patients were included. Studies focusing on patient populations outside the ICU were excluded. Study Appraisal and Synthesis Methods: A systematic review was performed according to the PRISMA guidelines. Moreover, a quality assessment of all eligible studies was performed. Results: Out of 974 identified articles, 22 and 21 met the criteria to be included in the systematic review and quality assessment, respectively. A multitude of machine learning algorithms were applied to refine the early prediction of sepsis. The quality of the studies ranged from "poor" (satisfying ≤ 40% of the quality criteria) to "very good" (satisfying ≥ 90% of the quality criteria). The majority of the studies (n = 19, 86.4%) employed an offline training scenario combined with a horizon evaluation, while two studies implemented an online scenario (n = 2, 9.1%). The massive inter-study heterogeneity in terms of model development, sepsis definition, prediction time windows, and outcomes precluded a meta-analysis. Last, only two studies provided publicly accessible source code and data sources fostering reproducibility. 
Limitations: Articles were only eligible for inclusion when employing machine learning algorithms for the prediction of sepsis onset in the ICU. This restriction led to the exclusion of studies focusing on the prediction of septic shock, sepsis-related mortality, and patient populations outside the ICU. Conclusions and Key Findings: A growing number of studies employ machine learning to optimize the early prediction of sepsis through digital biomarker discovery. This review, however, highlights several shortcomings of the current approaches, including low comparability and reproducibility. Finally, we gather recommendations on how these challenges can be addressed before deploying these models in prospective analyses. Systematic Review Registration Number: CRD42020200133.
Affiliation(s)
- Michael Moor
  - Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zürich (ETH Zurich), Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Bastian Rieck
  - Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zürich (ETH Zurich), Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Max Horn
  - Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zürich (ETH Zurich), Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Catherine R. Jutzeler
  - Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zürich (ETH Zurich), Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Karsten Borgwardt
  - Machine Learning and Computational Biology Lab, Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule Zürich (ETH Zurich), Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
8
Abstract
The last decade saw an enormous boost in the field of computational topology: methods and concepts from algebraic and differential topology, formerly confined to the realm of pure mathematics, have demonstrated their utility in numerous areas such as computational biology, personalised medicine, and time-dependent data analysis, to name a few. The newly-emerging domain comprising topology-based techniques is often referred to as topological data analysis (TDA). Next to their applications in the aforementioned areas, TDA methods have also proven to be effective in supporting, enhancing, and augmenting both classical machine learning and deep learning models. In this paper, we review the state of the art of a nascent field we refer to as "topological machine learning," i.e., the successful symbiosis of topology-based methods and machine learning algorithms, such as deep neural networks. We identify common threads, current applications, and future challenges.
Affiliation(s)
- Felix Hensel
  - Machine Learning and Computational Biology Laboratory, ETH Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Michael Moor
  - Machine Learning and Computational Biology Laboratory, ETH Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Bastian Rieck
  - Machine Learning and Computational Biology Laboratory, ETH Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
9
Gumbsch T, Bock C, Moor M, Rieck B, Borgwardt K. Enhancing statistical power in temporal biomarker discovery through representative shapelet mining. Bioinformatics 2021; 36:i840-i848. [PMID: 33381811 PMCID: PMC7773478 DOI: 10.1093/bioinformatics/btaa815] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/08/2023] Open
Abstract
Motivation Temporal biomarker discovery in longitudinal data is based on detecting reoccurring trajectories, the so-called shapelets. The search for shapelets requires considering all subsequences in the data. While the accompanying issue of multiple testing has been mitigated in previous work, the redundancy and overlap of the detected shapelets result in an a priori unbounded number of highly similar and structurally meaningless shapelets. As a consequence, current temporal biomarker discovery methods are impractical and underpowered. Results We find that the pre- or post-processing of shapelets does not sufficiently increase the power and practical utility. Consequently, we present a novel method for temporal biomarker discovery: Statistically Significant Submodular Subset Shapelet Mining (S5M), which retrieves short subsequences that (i) occur in the data, (ii) are statistically significantly associated with the phenotype and (iii) are of manageable quantity while maximizing structural diversity. Structural diversity is achieved by pruning non-representative shapelets via submodular optimization. This increases the statistical power and utility of S5M compared to state-of-the-art approaches on simulated and real-world datasets. For patients admitted to the intensive care unit (ICU) showing signs of severe organ failure, we find temporal patterns in the sequential organ failure assessment score that are associated with in-ICU mortality. Availability and implementation S5M is an option in the python package of S3M: github.com/BorgwardtLab/S3M.
Affiliation(s)
- Thomas Gumbsch
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Christian Bock
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Michael Moor
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Bastian Rieck
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Karsten Borgwardt
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
10
Abstract
With the biomedical field generating large quantities of time series data, there has been a growing interest in developing and refining machine learning methods that allow its mining and exploitation. Classification is one of the most important and challenging machine learning tasks related to time series. Many biomedical phenomena, such as the brain's activity or blood pressure, change over time. The objective of this chapter is to provide a gentle introduction to time series classification. In the first part we describe the characteristics of time series data and challenges in its analysis. The second part provides an overview of common machine learning methods used for time series classification. A real-world use case, the early recognition of sepsis, demonstrates the applicability of the methods discussed.
Affiliation(s)
- Christian Bock
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Michael Moor
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Catherine R Jutzeler
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Karsten Borgwardt
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
11
|
Kangru T, Moor M, Otto T, Riives J, Vaher K. Modern robot-integrated manufacturing cell according to the needs of Industry 4.0. Proceedings of the Estonian Academy of Sciences 2021; pp. 407–412. [DOI: 10.3176/proc.2021.4.06] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
|
12
|
Hyland SL, Faltys M, Hüser M, Lyu X, Gumbsch T, Esteban C, Bock C, Horn M, Moor M, Rieck B, Zimmermann M, Bodenham D, Borgwardt K, Rätsch G, Merz TM. Early prediction of circulatory failure in the intensive care unit using machine learning. Nat Med 2020; 26:364-373. [DOI: 10.1038/s41591-020-0789-4] [Citation(s) in RCA: 113] [Impact Index Per Article: 28.3] [Received: 04/09/2019] [Accepted: 02/04/2020] [Indexed: 01/12/2023]
|
13
|
Bock C, Gumbsch T, Moor M, Rieck B, Roqueiro D, Borgwardt K. Association mapping in biomedical time series via statistically significant shapelet mining. Bioinformatics 2019; 34:i438-i446. [PMID: 29949972 PMCID: PMC6022601 DOI: 10.1093/bioinformatics/bty246] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/29/2022]
Abstract
Motivation: Most modern intensive care units record the physiological and vital signs of patients. These data can be used to extract signatures, commonly known as biomarkers, that help physicians understand the biological complexity of many syndromes. However, most biological biomarkers suffer from either poor predictive performance or weak explanatory power. Recent developments in time series classification focus on discovering shapelets, i.e. subsequences that are most predictive in terms of class membership. Shapelets have the advantage of combining a high predictive performance with an interpretable component: their shape. Currently, most shapelet discovery methods do not rely on statistical tests to verify the significance of individual shapelets. Therefore, identifying associations between the shapelets of physiological biomarkers and patients that exhibit certain phenotypes of interest enables the discovery and subsequent ranking of physiological signatures that are interpretable, statistically validated and accurate predictors of clinical endpoints. Results: We present a novel and scalable method for scanning time series and identifying discriminative patterns that are statistically significant. The significance of a shapelet is evaluated while considering the problem of multiple hypothesis testing and mitigating it by efficiently pruning untestable shapelet candidates with Tarone's method. We demonstrate the utility of our method by discovering patterns in three of a patient's vital signs (heart rate, respiratory rate and systolic blood pressure) that are indicators of the severity of a future sepsis event, i.e. an inflammatory response to an infective agent that can lead to organ failure and death if not treated in time. Availability and implementation: We make our method and the scripts that are required to reproduce the experiments publicly available at https://github.com/BorgwardtLab/S3M.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Christian Bock
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Switzerland
- Thomas Gumbsch
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Switzerland
- Michael Moor
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Switzerland
- Bastian Rieck
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Switzerland
- Damian Roqueiro
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Switzerland
- Karsten Borgwardt
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Switzerland
|
14
|
Moor M, Brodine S, Garfein R, Rashidi H, Fraga M, Kritz-Silverstein D, Alcaraz J, Elder J. Individual and community factors contributing to anemia among women and children living in a rural community in Baja California, Mexico. Ann Glob Health 2015. [DOI: 10.1016/j.aogh.2015.02.767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
15
|
Buckon CE, Thomas SS, Jakobson-Huston S, Moor M, Sussman M, Aiona M. Comparison of three ankle-foot orthosis configurations for children with spastic diplegia. Dev Med Child Neurol 2004; 46:590-8. [PMID: 15344518 DOI: 10.1017/s0012162204001008] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Indexed: 11/06/2022]
Abstract
This study compared the functional efficacy of three commonly prescribed ankle-foot orthosis (AFO) configurations (solid [SAFO], hinged [HAFO], and posterior leaf spring [PLS]). Sixteen independently ambulatory children (10 males, six females; mean age 8 years 4 months, SD 2 years 4 months; range 4 years 4 months to 11 years 6 months) with spastic diplegia participated in this study. Four children were classified at level I of the Gross Motor Function Classification System (GMFCS; Palisano et al. 1997); the remaining 12 were at level II. Children were assessed barefoot (BF) at baseline (baseline assessment of energy consumption was performed with shoes on, no AFO) and in each orthotic configuration after three months of use, using gait analysis, oxygen consumption, and functional outcome measures. AFO use did not markedly alter joint kinematics or kinetics at the pelvis, hip, or knee. All AFO configurations normalized ankle kinematics in stance, increased step/stride length, decreased cadence, and decreased energy cost of walking. Functionally, all AFO configurations improved the execution of walking/running/jumping skills, upper extremity coordination, and fine motor speed/dexterity. However, the quality of gross motor skill performance and independence in mobility were unchanged. These results suggest that most children with spastic diplegia benefit functionally from AFO use. However, some children at GMFCS level II demonstrated a subtle but detrimental effect on function with HAFO use, shown by an increase in peak knee extensor moment in early stance, excessive ankle dorsiflexion, decreased walking velocity, and greater energy cost. Therefore, constraining ankle motion by using a PLS or SAFO should be considered for most, but not all, children with spastic diplegia.
Affiliation(s)
- Cathleen E Buckon
  - Clinical Research Department, Shriners Hospitals for Children, Portland, OR 97239, USA
|
16
|
Martin SW, Bishop FE, Kerr BM, Moor M, Moore M, Sheffels P, Rashed M, Slatter JG, Berthon-Cédille L, Lepage F, Descombe JJ, Picard M, Baillie TA, Levy RH. Pharmacokinetics and metabolism of the novel anticonvulsant agent N-(2,6-dimethylphenyl)-5-methyl-3-isoxazolecarboxamide (D2624) in rats and humans. Drug Metab Dispos 1997; 25:40-6. [PMID: 9010628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2023]
Abstract
N-(2,6-dimethylphenyl)-5-methyl-3-isoxazolecarboxamide (D2624) belongs to a new series of experimental anticonvulsants related to lidocaine. This study was undertaken to understand the pharmacokinetics and metabolism of D2624 in rats and humans, with emphasis on the possible formation of 2,6-dimethylaniline (2,6-DMA). After oral administration of stable isotope-labeled parent drug to rats and GC/MS analysis of plasma samples, two metabolites were identified: D3017, which is the primary alcohol, and 2,6-DMA, formed by amide bond hydrolysis of either D2624 or D3017. In urine, three metabolites of D2624 were identified, namely D3017, 2,6-DMA, and D3270 (which is the carboxylic acid derivative of D3017). Based on plasma AUC analysis, D3017 and 2,6-DMA accounted for >90% of the dose of D2624. After oral administration, D2624 was found to be well absorbed (93%), but underwent extensive first-pass metabolism in the rat, thus resulting in 5.3% bioavailability. Rat and human liver microsomal preparations were capable of metabolizing D2624 to D3017 and 2,6-DMA. The formation of D3017 was NADPH-dependent, whereas 2,6-DMA formation was NADPH-independent and probably was catalyzed by amidase(s) enzymes. In a single-dose (25-225 mg) human volunteer study, the parent drug (D2624) was not detected in plasma at any dose, whereas 2,6-DMA was detected only at the two highest doses (150 and 225 mg). D3017 was detected after all doses of parent drug, with approximate dose proportionality in AUC and a half-life of 1.3-2.2 hr. The metabolic behavior observed in humans suggests there is a marked species difference in the oxidative and hydrolytic pathways of D2624.
Affiliation(s)
- S W Martin
  - Department of Medicinal Chemistry, University of Washington, Seattle 98195-7610, USA
|
17
|
Moor M, Honegger UE, Wiesmann UN. Organ-specific, qualitative changes in the phospholipid composition of rats after chronic administration of the antidepressant drug desipramine. Biochem Pharmacol 1988; 37:2035-9. [PMID: 2837221 DOI: 10.1016/0006-2952(88)90553-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.7] [Indexed: 01/02/2023]
Abstract
Rats were chronically treated with daily i.p. injections of 10 mg/kg desipramine for 21 days. A 30% decrease in the number of beta-adrenoceptors was observed in brain. A receptor desensitization of similar extent was noted in submaxillary glands and lung. No change in beta-adrenoceptor number was present in heart. Total phospholipid contents were not altered in these organs after chronic drug treatment. However, organ-specific changes were found in the phospholipid composition of submaxillary glands, lung and liver but not in whole brain and heart. The changes were variable but an increase in phosphatidylinositol and decreases in phosphatidylethanolamine and sphingomyelin were consistent. Possible alterations in the phospholipid composition of the brain might have been masked by the large and stable pool of myelin phospholipids. A causal relationship between changes in the phospholipid composition and beta-adrenoceptor desensitization is discussed.
Affiliation(s)
- M Moor
  - Department of Pharmacology, University of Bern, Switzerland
|
18
|
Abstract
Adipose tissue kinetics of chlorpromazine and imipramine, two drugs which are more lipophilic than thiopental, were studied in the rat. After single i.v. doses, the time-course of drug distribution was followed in adipose and various other tissues, until their concentrations in adipose tissues declined. Under these conditions the two drugs behaved almost identically. Among the tissues analyzed, the lowest concentrations were found in adipose tissue, with the exception of plasma. At its maximum concentration after about 30 minutes, total adipose tissue contained only 3% of the dose of administered drugs. Adipose/plasma and adipose/lung concentration ratios were 2-5 and 0.05, respectively. After maximum tolerated oral doses of imipramine for 3 weeks, similar steady state concentration ratios (plasma:adipose:brain:lung 1:3:12:96) were observed. In adipose tissue the imipramine/desmethylimipramine ratio was about 1, and the desmethylimipramine steady state levels did not increase with time. Literature data indicate that many basic lipophilic drugs are not stored in adipose tissue. This is now clearly shown for chlorpromazine and imipramine, even under extreme, subchronic conditions in the case of imipramine.
|
19
|
Moor M. The career/parent dilemma. Nurse and mother--on the job. RN (For Managers) 1982; 45:134. [PMID: 6921853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2023]
|