1
Hegde N, Vardhan M, Nathani D, Rosenzweig E, Speed C, Karthikesalingam A, Seneviratne M. Infusing behavior science into large language models for activity coaching. PLOS Digit Health 2024; 3:e0000431. [PMID: 38564502 PMCID: PMC10986996 DOI: 10.1371/journal.pdig.0000431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2023] [Accepted: 12/14/2023] [Indexed: 04/04/2024]
Abstract
Large language models (LLMs) have shown promise for task-oriented dialogue across a range of domains. The use of LLMs in health and fitness coaching is under-explored. Behavior science frameworks such as COM-B, which conceptualizes behavior change in terms of Capability (C), Opportunity (O) and Motivation (M), can be used to architect coaching interventions in a way that promotes sustained change. Here we aim to incorporate behavior science principles into an LLM using two knowledge infusion techniques: coach message priming (where exemplar coach responses are provided as context to the LLM), and dialogue re-ranking (where the COM-B category of the LLM output is matched to the inferred user need). Simulated conversations were conducted between the primed or unprimed LLM and a member of the research team, and then evaluated by 8 human raters. Ratings for the primed conversations were significantly higher in terms of empathy and actionability. The same raters also compared a single response generated by the unprimed, primed and re-ranked models, finding a significant uplift in actionability and empathy from the re-ranking technique. This is a proof of concept of how behavior science frameworks can be infused into automated conversational agents for a more principled coaching experience.
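The dialogue re-ranking idea in the abstract above (prefer candidate responses whose COM-B category matches the inferred user need) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Candidate` class, its `com_b` and `score` fields, and the `rerank` function are hypothetical names, and the sketch assumes that the category labels and model scores have already been produced by some upstream classifier and LLM.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str     # candidate coach response (hypothetical example data)
    com_b: str    # assumed label: "capability", "opportunity" or "motivation"
    score: float  # assumed model score; higher means more preferred


def rerank(candidates: List[Candidate], user_need: str) -> List[Candidate]:
    """Order candidates so those matching the inferred COM-B need come first.

    Within the matching and non-matching groups, candidates are ordered by
    descending score. Both rules are assumptions made for illustration.
    """
    return sorted(candidates, key=lambda c: (c.com_b != user_need, -c.score))


candidates = [
    Candidate("You can do it!", "motivation", 0.9),
    Candidate("Try a 10-minute walk after lunch.", "opportunity", 0.7),
    Candidate("Here is how to warm up safely.", "capability", 0.8),
]
ranked = rerank(candidates, "opportunity")
```

In this sketch a matching category always outranks a higher raw score, which is one possible design choice; a weighted combination of match and score would be another.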
2
Ktena I, Wiles O, Albuquerque I, Rebuffi SA, Tanno R, Roy AG, Azizi S, Belgrave D, Kohli P, Cemgil T, Karthikesalingam A, Gowal S. Generative models improve fairness of medical classifiers under distribution shifts. Nat Med 2024; 30:1166-1173. [PMID: 38600282 PMCID: PMC11031395 DOI: 10.1038/s41591-024-02838-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 01/26/2024] [Indexed: 04/12/2024]
Abstract
Domain generalization is a ubiquitous challenge for machine learning in healthcare. Model performance in real-world conditions might be lower than expected because of discrepancies between the data encountered during deployment and development. Underrepresentation of some groups or conditions during model development is a common cause of this phenomenon. This challenge is often not readily addressed by targeted data acquisition and 'labeling' by expert clinicians, which can be prohibitively expensive or practically impossible because of the rarity of conditions or the available clinical expertise. We hypothesize that advances in generative artificial intelligence can help mitigate this unmet need in a steerable fashion, enriching our training dataset with synthetic examples that address shortfalls of underrepresented conditions or subgroups. We show that diffusion models can automatically learn realistic augmentations from data in a label-efficient manner. We demonstrate that learned augmentations make models more robust and statistically fair in-distribution and out of distribution. To evaluate the generality of our approach, we studied three distinct medical imaging contexts of varying difficulty: (1) histopathology, (2) chest X-ray and (3) dermatology images. Complementing real samples with synthetic ones improved the robustness of models in all three medical tasks and increased fairness by improving the accuracy of clinical diagnosis within underrepresented groups, especially out of distribution.
3
Maier-Hein L, Reinke A, Godau P, Tizabi MD, Buettner F, Christodoulou E, Glocker B, Isensee F, Kleesiek J, Kozubek M, Reyes M, Riegler MA, Wiesenfarth M, Kavur AE, Sudre CH, Baumgartner M, Eisenmann M, Heckmann-Nötzel D, Rädsch T, Acion L, Antonelli M, Arbel T, Bakas S, Benis A, Blaschko MB, Cardoso MJ, Cheplygina V, Cimini BA, Collins GS, Farahani K, Ferrer L, Galdran A, van Ginneken B, Haase R, Hashimoto DA, Hoffman MM, Huisman M, Jannin P, Kahn CE, Kainmueller D, Kainz B, Karargyris A, Karthikesalingam A, Kofler F, Kopp-Schneider A, Kreshuk A, Kurc T, Landman BA, Litjens G, Madani A, Maier-Hein K, Martel AL, Mattson P, Meijering E, Menze B, Moons KGM, Müller H, Nichyporuk B, Nickel F, Petersen J, Rajpoot N, Rieke N, Saez-Rodriguez J, Sánchez CI, Shetty S, van Smeden M, Summers RM, Taha AA, Tiulpin A, Tsaftaris SA, Van Calster B, Varoquaux G, Jäger PF. Metrics reloaded: recommendations for image analysis validation. Nat Methods 2024; 21:195-212. [PMID: 38347141 DOI: 10.1038/s41592-023-02151-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/09/2023] [Accepted: 12/12/2023] [Indexed: 02/15/2024]
Abstract
Increasing evidence shows that flaws in machine learning (ML) algorithm validation are an underestimated global problem. In biomedical image analysis, chosen performance metrics often do not reflect the domain interest, and thus fail to adequately measure scientific progress and hinder translation of ML techniques into practice. To overcome this, we created Metrics Reloaded, a comprehensive framework guiding researchers in the problem-aware selection of metrics. Developed by a large international consortium in a multistage Delphi process, it is based on the novel concept of a problem fingerprint, a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), dataset and algorithm output. On the basis of the problem fingerprint, users are guided through the process of choosing and applying appropriate validation metrics while being made aware of potential pitfalls. Metrics Reloaded targets image analysis problems that can be interpreted as classification tasks at image, object or pixel level, namely image-level classification, object detection, semantic segmentation and instance segmentation tasks. To improve the user experience, we implemented the framework in the Metrics Reloaded online tool. Following the convergence of ML methodology across application domains, Metrics Reloaded fosters the convergence of validation methodology. Its applicability is demonstrated for various biomedical use cases.
Affiliation(s)
- Lena Maier-Hein
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
  - German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany
  - Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
  - Medical Faculty, Heidelberg University, Heidelberg, Germany
  - National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany
- Annika Reinke
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
  - German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany
  - Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
- Patrick Godau
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
  - Faculty of Mathematics and Computer Science, Heidelberg University, Heidelberg, Germany
  - National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany
- Minu D Tizabi
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
  - National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany
- Florian Buettner
  - German Cancer Consortium (DKTK), partner site Frankfurt/Mainz, a partnership between DKFZ and UCT Frankfurt-Marburg, Frankfurt am Main, Germany
  - German Cancer Research Center (DKFZ) Heidelberg, Heidelberg, Germany
  - Department of Medicine, Goethe University Frankfurt, Frankfurt am Main, Germany
  - Department of Informatics, Goethe University Frankfurt, Frankfurt am Main, Germany
  - Frankfurt Cancer Institute, Frankfurt am Main, Germany
- Evangelia Christodoulou
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
- Ben Glocker
  - Department of Computing, Imperial College London, South Kensington Campus, London, UK
- Fabian Isensee
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
  - German Cancer Research Center (DKFZ) Heidelberg, HI Applied Computer Vision Lab, Heidelberg, Germany
- Jens Kleesiek
  - Institute for AI in Medicine, University Medicine Essen, Essen, Germany
- Michal Kozubek
  - Centre for Biomedical Image Analysis and Faculty of Informatics, Masaryk University, Brno, Czech Republic
- Mauricio Reyes
  - ARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland
  - Department of Radiation Oncology, University Hospital Bern, University of Bern, Bern, Switzerland
- Michael A Riegler
  - Simula Metropolitan Center for Digital Engineering, Oslo, Norway
  - Department of Computer Science, UiT The Arctic University of Norway, Tromsø, Norway
- Manuel Wiesenfarth
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Biostatistics, Heidelberg, Germany
- A Emre Kavur
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
  - German Cancer Research Center (DKFZ) Heidelberg, HI Applied Computer Vision Lab, Heidelberg, Germany
- Carole H Sudre
  - MRC Unit for Lifelong Health and Ageing at UCL and Centre for Medical Image Computing, Department of Computer Science, University College London, London, UK
  - School of Biomedical Engineering and Imaging Science, King's College London, London, UK
- Michael Baumgartner
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
- Matthias Eisenmann
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
- Doreen Heckmann-Nötzel
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
  - National Center for Tumor Diseases (NCT), NCT Heidelberg, a partnership between DKFZ and University Medical Center Heidelberg, Heidelberg, Germany
- Tim Rädsch
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Intelligent Medical Systems, Heidelberg, Germany
  - German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany
- Laura Acion
  - Instituto de Cálculo, CONICET - Universidad de Buenos Aires, Buenos Aires, Argentina
- Michela Antonelli
  - School of Biomedical Engineering and Imaging Science, King's College London, London, UK
  - Centre for Medical Image Computing, University College London, London, UK
- Tal Arbel
  - Centre for Intelligent Machines and MILA (Québec Artificial Intelligence Institute), McGill University, Montréal, Quebec, Canada
- Spyridon Bakas
  - Division of Computational Pathology, Department of Pathology & Laboratory Medicine, Indiana University School of Medicine, IU Health Information and Translational Sciences Building, Indianapolis, IN, USA
  - Center for Biomedical Image Computing and Analytics (CBICA), University of Pennsylvania, Philadelphia, PA, USA
- Arriel Benis
  - Department of Digital Medical Technologies, Holon Institute of Technology, Holon, Israel
  - European Federation for Medical Informatics, Le Mont-sur-Lausanne, Switzerland
- Matthew B Blaschko
  - Center for Processing Speech and Images, Department of Electrical Engineering, KU Leuven, Leuven, Belgium
- M Jorge Cardoso
  - School of Biomedical Engineering and Imaging Science, King's College London, London, UK
- Veronika Cheplygina
  - Department of Computer Science, IT University of Copenhagen, Copenhagen, Denmark
- Beth A Cimini
  - Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Gary S Collins
  - Centre for Statistics in Medicine, University of Oxford, Nuffield Orthopaedic Centre, Oxford, UK
- Keyvan Farahani
  - Center for Biomedical Informatics and Information Technology, National Cancer Institute, Bethesda, MD, USA
- Luciana Ferrer
  - Instituto de Investigación en Ciencias de la Computación (ICC), CONICET-UBA, Ciudad Autónoma de Buenos Aires, Buenos Aires, Argentina
- Adrian Galdran
  - BCN Medtech, Universitat Pompeu Fabra, Barcelona, Spain
  - Australian Institute for Machine Learning AIML, University of Adelaide, Adelaide, South Australia, Australia
- Bram van Ginneken
  - Fraunhofer MEVIS, Bremen, Germany
  - Radboud Institute for Health Sciences, Radboud University Medical Center, Nijmegen, the Netherlands
- Robert Haase
  - Technische Universität (TU) Dresden, DFG Cluster of Excellence 'Physics of Life', Dresden, Germany
  - Center for Systems Biology, Dresden, Germany
  - Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Leipzig University, Leipzig, Germany
- Daniel A Hashimoto
  - Department of Surgery, Perelman School of Medicine, Philadelphia, PA, USA
  - General Robotics Automation Sensing and Perception Laboratory, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Michael M Hoffman
  - Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
  - Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Merel Huisman
  - Department of Radiology and Nuclear Medicine, Radboud University Medical Center, Nijmegen, the Netherlands
- Pierre Jannin
  - Laboratoire Traitement du Signal et de l'Image - UMR_S 1099, Université de Rennes 1, Rennes, France
  - INSERM, Paris, France
- Charles E Kahn
  - Department of Radiology and Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Dagmar Kainmueller
  - Max-Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Biomedical Image Analysis and HI Helmholtz Imaging, Berlin, Germany
  - Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
- Bernhard Kainz
  - Department of Computing, Faculty of Engineering, Imperial College London, London, UK
  - Department AIBE, Friedrich-Alexander-Universität (FAU), Erlangen-Nürnberg, Germany
- Annette Kopp-Schneider
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Biostatistics, Heidelberg, Germany
- Anna Kreshuk
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Tahsin Kurc
  - Department of Biomedical Informatics, Stony Brook University, Health Science Center, Stony Brook, NY, USA
- Geert Litjens
  - Department of Pathology, Radboud University Medical Center, Nijmegen, the Netherlands
- Amin Madani
  - Department of Surgery, University Health Network, Philadelphia, PA, USA
- Klaus Maier-Hein
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
  - Pattern Analysis and Learning Group, Department of Radiation Oncology, Heidelberg University Hospital, Heidelberg, Germany
- Anne L Martel
  - Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Physical Sciences, Sunnybrook Research Institute, Toronto, Ontario, Canada
- Peter Mattson
  - Google, 1600 Amphitheatre Pkwy, Mountain View, CA, USA
- Erik Meijering
  - School of Computer Science and Engineering, University of New South Wales, UNSW Sydney, Kensington, New South Wales, Australia
- Bjoern Menze
  - Department of Quantitative Biomedicine, University of Zurich, Zurich, Switzerland
- Karel G M Moons
  - Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht University, Utrecht, the Netherlands
- Henning Müller
  - Information Systems Institute, University of Applied Sciences Western Switzerland (HES-SO), Sierre, Switzerland
  - Medical Faculty, University of Geneva, Geneva, Switzerland
- Brennan Nichyporuk
  - MILA (Québec Artificial Intelligence Institute), Montréal, Quebec, Canada
- Felix Nickel
  - Department of General, Visceral and Thoracic Surgery, University Medical Center Hamburg-Eppendorf, Hamburg, Germany
- Jens Petersen
  - German Cancer Research Center (DKFZ) Heidelberg, Division of Medical Image Computing, Heidelberg, Germany
- Nasir Rajpoot
  - Tissue Image Analytics Laboratory, Department of Computer Science, University of Warwick, Coventry, UK
- Julio Saez-Rodriguez
  - Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
  - Faculty of Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Clara I Sánchez
  - Informatics Institute, Faculty of Science, University of Amsterdam, Amsterdam, the Netherlands
- Maarten van Smeden
  - Julius Center for Health Sciences and Primary Care, UMC Utrecht, Utrecht University, Utrecht, the Netherlands
- Ronald M Summers
  - National Institutes of Health Clinical Center, Bethesda, MD, USA
- Abdel A Taha
  - Institute of Information Systems Engineering, TU Wien, Vienna, Austria
- Aleksei Tiulpin
  - Research Unit of Health Sciences and Technology, Faculty of Medicine, University of Oulu, Oulu, Finland
  - Neurocenter Oulu, Oulu University Hospital, Oulu, Finland
- Ben Van Calster
  - Department of Development and Regeneration and EPI-centre, KU Leuven, Leuven, Belgium
  - Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands
- Gaël Varoquaux
  - Parietal project team, INRIA Saclay-Île de France, Palaiseau, France
- Paul F Jäger
  - German Cancer Research Center (DKFZ) Heidelberg, HI Helmholtz Imaging, Heidelberg, Germany
  - German Cancer Research Center (DKFZ) Heidelberg, Interactive Machine Learning Group, Heidelberg, Germany
4
Weng WH, Sellergen A, Kiraly AP, D'Amour A, Park J, Pilgrim R, Pfohl S, Lau C, Natarajan V, Azizi S, Karthikesalingam A, Cole-Lewis H, Matias Y, Corrado GS, Webster DR, Shetty S, Prabhakara S, Eswaran K, Celi LAG, Liu Y. An intentional approach to managing bias in general purpose embedding models. Lancet Digit Health 2024; 6:e126-e130. [PMID: 38278614 DOI: 10.1016/s2589-7500(23)00227-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 10/24/2023] [Accepted: 11/02/2023] [Indexed: 01/28/2024]
Abstract
Advances in machine learning for health care have brought concerns about bias from the research community; specifically, the introduction, perpetuation, or exacerbation of care disparities. Reinforcing these concerns is the finding that medical images often reveal signals about sensitive attributes in ways that are hard to pinpoint by both algorithms and people. This finding raises a question about how to best design general purpose pretrained embeddings (GPPEs, defined as embeddings meant to support a broad array of use cases) for building downstream models that are free from particular types of bias. The downstream model should be carefully evaluated for bias, and audited and improved as appropriate. However, in our view, well intentioned attempts to prevent the upstream components (GPPEs) from learning sensitive attributes can have unintended consequences on the downstream models. Despite producing a veneer of technical neutrality, the resultant end-to-end system might still be biased or poorly performing. We present reasons, by building on previously published data, to support the reasoning that GPPEs should ideally contain as much information as the original data contain, and highlight the perils of trying to remove sensitive attributes from a GPPE. We also emphasise that downstream prediction models trained for specific tasks and settings, whether developed using GPPEs or not, should be carefully designed and evaluated to avoid bias that makes models vulnerable to issues such as distributional shift. These evaluations should be done by a diverse team, including social scientists, on a diverse cohort representing the full breadth of the patient population for which the final model is intended.
Affiliation(s)
- Leo A G Celi
  - Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Department of Medicine, Beth Israel Deaconess Medical Center, Boston, MA, USA
- Yun Liu
  - Google, Mountain View, CA, USA
5
Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, Scales N, Tanwani A, Cole-Lewis H, Pfohl S, Payne P, Seneviratne M, Gamble P, Kelly C, Babiker A, Schärli N, Chowdhery A, Mansfield P, Demner-Fushman D, Agüera Y Arcas B, Webster D, Corrado GS, Matias Y, Chou K, Gottweis J, Tomasev N, Liu Y, Rajkomar A, Barral J, Semturs C, Karthikesalingam A, Natarajan V. Large language models encode clinical knowledge. Nature 2023; 620:172-180. [PMID: 37438534 PMCID: PMC10396962 DOI: 10.1038/s41586-023-06291-2] [Citation(s) in RCA: 179] [Impact Index Per Article: 179.0] [Received: 01/25/2023] [Accepted: 06/05/2023] [Indexed: 07/14/2023]
Abstract
Large language models (LLMs) have demonstrated impressive capabilities, but the bar for clinical applications is high. Attempts to assess the clinical knowledge of models typically rely on automated evaluations based on limited benchmarks. Here, to address these limitations, we present MultiMedQA, a benchmark combining six existing medical question answering datasets spanning professional medicine, research and consumer queries and a new dataset of medical questions searched online, HealthSearchQA. We propose a human evaluation framework for model answers along multiple axes including factuality, comprehension, reasoning, possible harm and bias. In addition, we evaluate Pathways Language Model (PaLM, a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA and Measuring Massive Multitask Language Understanding (MMLU) clinical topics), including 67.6% accuracy on MedQA (US Medical Licensing Exam-style questions), surpassing the prior state of the art by more than 17%. However, human evaluation reveals key gaps. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, knowledge recall and reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.
Affiliation(s)
- Tao Tu
  - Google Research, Mountain View, CA, USA
- Jason Wei
  - Google Research, Mountain View, CA, USA
- Yun Liu
  - Google Research, Mountain View, CA, USA
6
Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, Scales N, Tanwani A, Cole-Lewis H, Pfohl S, Payne P, Seneviratne M, Gamble P, Kelly C, Babiker A, Schärli N, Chowdhery A, Mansfield P, Demner-Fushman D, Agüera Y Arcas B, Webster D, Corrado GS, Matias Y, Chou K, Gottweis J, Tomasev N, Liu Y, Rajkomar A, Barral J, Semturs C, Karthikesalingam A, Natarajan V. Publisher Correction: Large language models encode clinical knowledge. Nature 2023; 620:E19. [PMID: 37500979 PMCID: PMC10412443 DOI: 10.1038/s41586-023-06455-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 07/29/2023]
Affiliation(s)
- Tao Tu
  - Google Research, Mountain View, CA, USA
- Jason Wei
  - Google Research, Mountain View, CA, USA
- Yun Liu
  - Google Research, Mountain View, CA, USA
7
Brown A, Tomasev N, Freyberg J, Liu Y, Karthikesalingam A, Schrouff J. Detecting shortcut learning for fair medical AI using shortcut testing. Nat Commun 2023; 14:4314. [PMID: 37463884 DOI: 10.1038/s41467-023-39902-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2022] [Accepted: 06/26/2023] [Indexed: 07/20/2023]
Abstract
Machine learning (ML) holds great promise for improving healthcare, but it is critical to ensure that its use will not propagate or amplify health disparities. An important step is to characterize the (un)fairness of ML models (their tendency to perform differently across subgroups of the population) and to understand its underlying mechanisms. One potential driver of algorithmic unfairness, shortcut learning, arises when ML models base predictions on improper correlations in the training data. Diagnosing this phenomenon is difficult as sensitive attributes may be causally linked with disease. Using multitask learning, we propose a method to directly test for the presence of shortcut learning in clinical ML systems and demonstrate its application to clinical tasks in radiology and dermatology. Finally, our approach reveals instances when shortcutting is not responsible for unfairness, highlighting the need for a holistic approach to fairness mitigation in medical AI.
Affiliation(s)
- Yuan Liu
  - Google Research, Palo Alto, CA, USA
8
Dvijotham KD, Winkens J, Barsbey M, Ghaisas S, Stanforth R, Pawlowski N, Strachan P, Ahmed Z, Azizi S, Bachrach Y, Culp L, Daswani M, Freyberg J, Kelly C, Kiraly A, Kohlberger T, McKinney S, Mustafa B, Natarajan V, Geras K, Witowski J, Qin ZZ, Creswell J, Shetty S, Sieniek M, Spitz T, Corrado G, Kohli P, Cemgil T, Karthikesalingam A. Enhancing the reliability and accuracy of AI-enabled diagnosis via complementarity-driven deferral to clinicians. Nat Med 2023; 29:1814-1820. [PMID: 37460754 DOI: 10.1038/s41591-023-02437-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Accepted: 06/05/2023] [Indexed: 07/20/2023]
Abstract
Predictive artificial intelligence (AI) systems based on deep learning have been shown to achieve expert-level identification of diseases in multiple medical imaging settings, but can make errors in cases accurately diagnosed by clinicians and vice versa. We developed Complementarity-Driven Deferral to Clinical Workflow (CoDoC), a system that can learn to decide between the opinion of a predictive AI model and a clinical workflow. CoDoC enhances accuracy relative to clinician-only or AI-only baselines in clinical workflows that screen for breast cancer or tuberculosis (TB). For breast cancer screening, compared to double reading with arbitration in a screening program in the UK, CoDoC reduced false positives by 25% at the same false-negative rate, while achieving a 66% reduction in clinician workload. For TB triaging, compared to standalone AI and clinical workflows, CoDoC achieved a 5-15% reduction in false positives at the same false-negative rate for three of five commercially available predictive AI systems. To facilitate the deployment of CoDoC in novel futuristic clinical settings, we present results showing that CoDoC's performance gains are sustained across several axes of variation (imaging modality, clinical setting and predictive AI system) and discuss the limitations of our evaluation and where further validation would be needed. We provide an open-source implementation to encourage further research and application.
Affiliation(s)
- Laura Culp
  - Google DeepMind, Toronto, Ontario, Canada
- Jan Witowski
  - NYU Grossman School of Medicine, New York, NY, USA
9
Azizi S, Culp L, Freyberg J, Mustafa B, Baur S, Kornblith S, Chen T, Tomasev N, Mitrović J, Strachan P, Mahdavi SS, Wulczyn E, Babenko B, Walker M, Loh A, Chen PHC, Liu Y, Bavishi P, McKinney SM, Winkens J, Roy AG, Beaver Z, Ryan F, Krogue J, Etemadi M, Telang U, Liu Y, Peng L, Corrado GS, Webster DR, Fleet D, Hinton G, Houlsby N, Karthikesalingam A, Norouzi M, Natarajan V. Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging. Nat Biomed Eng 2023:10.1038/s41551-023-01049-7. [PMID: 37291435 DOI: 10.1038/s41551-023-01049-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 07/22/2022] [Accepted: 05/02/2023] [Indexed: 06/10/2023]
Abstract
Machine-learning models for medical tasks can match or surpass the performance of clinical experts. However, in settings differing from those of the training dataset, the performance of a model can deteriorate substantially. Here we report a representation-learning strategy for machine-learning models applied to medical-imaging tasks that mitigates such 'out of distribution' performance problems and that improves model robustness and training efficiency. The strategy, which we named REMEDIS (for 'Robust and Efficient Medical Imaging with Self-supervision'), combines large-scale supervised transfer learning on natural images and intermediate contrastive self-supervised learning on medical images and requires minimal task-specific customization. We show the utility of REMEDIS in a range of diagnostic-imaging tasks covering six imaging domains and 15 test datasets, and by simulating three realistic out-of-distribution scenarios. REMEDIS improved in-distribution diagnostic accuracies up to 11.5% with respect to strong supervised baseline models, and in out-of-distribution settings required only 1-33% of the data for retraining to match the performance of supervised models retrained using all available data. REMEDIS may accelerate the development lifecycle of machine-learning models for medical imaging.
Affiliation(s)
- Ting Chen
  - Google Research, Mountain View, CA, USA
- Aaron Loh
  - Google Research, Mountain View, CA, USA
- Yuan Liu
  - Google Research, Mountain View, CA, USA
- Fiona Ryan
  - Georgia Institute of Technology, Computer Science, Atlanta, GA, USA
- Mozziyar Etemadi
  - School of Medicine/School of Engineering, Northwestern University, Chicago, IL, USA
- Yun Liu
  - Google Research, Mountain View, CA, USA
- Lily Peng
  - Google Research, Mountain View, CA, USA
|
10
|
Speed C, Arneil T, Harle R, Wilson A, Karthikesalingam A, McConnell M, Phillips J. Measure by measure: Resting heart rate across the 24-hour cycle. PLOS Digit Health 2023; 2:e0000236. [PMID: 37115739 PMCID: PMC10146540 DOI: 10.1371/journal.pdig.0000236] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 02/17/2022] [Accepted: 03/20/2023] [Indexed: 04/29/2023]
Abstract
BACKGROUND Photoplethysmography (PPG) sensors, typically found in wrist-worn devices, can continuously monitor heart rate (HR) in large populations in real-world settings. Resting heart rate (RHR) is an important biomarker of morbidities and mortality, but no universally accepted definition nor measurement criteria exist. In this study, we provide a working definition of RHR and describe a method for accurate measurement of this biomarker, recorded using PPG derived from wristband measurement across the 24-hour cycle. METHODS 433 healthy subjects wore a wrist device that measured activity and HR for up to 3 months. HR during inactivity was recorded and the duration of inactivity needed for HR to stabilise was ascertained. We identified the lowest HR during each 24-hour cycle (true RHR) and examined the time of day or night this occurred. The variation of HR during inactivity through the 24-hour cycle was also assessed. The sample was also subdivided according to daily activity levels for subset analysis. FINDINGS Adequate data was obtained for 19,242 days and 18,520 nights. HR stabilised in most subjects after 4 minutes of inactivity. Mean (SD) RHR for the sample was 54.5 (8.0) bpm (day) and 50.5 (7.6) bpm (night). RHR values were highest in the least active group (lowest MET quartile). A circadian variation of HR during inactivity was confirmed, with the lowest values being between 0300 and 0700 hours for most subjects. INTERPRETATION RHR measured using a PPG-based wrist-worn device is significantly lower at night than in the day, and a circadian rhythm of HR during inactivity was confirmed. Since RHR is such an important health metric, clarity on the definition and measurement methodology used is important. For most subjects, a minimum rest time of 4 minutes provides a reliable measurement of HR during inactivity and true RHR in a 24-hour cycle is best measured between 0300 and 0700 hours. Funding: This study was funded by Google.
Affiliation(s)
- Cathy Speed
  - Google, Mountain View, California, United States of America
  - Cambridge Health & Performance, Cambridge, United Kingdom
  - Cardiff Metropolitan University, Cardiff, United Kingdom
- Thomas Arneil
  - Google, Mountain View, California, United States of America
- Robert Harle
  - Google, Mountain View, California, United States of America
  - Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
- Alex Wilson
  - Google, Mountain View, California, United States of America
- Michael McConnell
  - Google, Mountain View, California, United States of America
  - Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, California, United States of America
|
11
|
Ganapathi S, Palmer J, Alderman JE, Calvert M, Espinoza C, Gath J, Ghassemi M, Heller K, Mckay F, Karthikesalingam A, Kuku S, Mackintosh M, Manohar S, Mateen BA, Matin R, McCradden M, Oakden-Rayner L, Ordish J, Pearson R, Pfohl SR, Rostamzadeh N, Sapey E, Sebire N, Sounderajah V, Summers C, Treanor D, Denniston AK, Liu X. Tackling bias in AI health datasets through the STANDING Together initiative. Nat Med 2022; 28:2232-2233. [PMID: 36163296 DOI: 10.1038/s41591-022-01987-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 01/14/2023]
Affiliation(s)
- Shaswath Ganapathi
  - College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Jo Palmer
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
- Joseph E Alderman
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Melanie Calvert
  - Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
  - Centre for Patient Reported Outcome Research, Institute of Applied Health Research, University of Birmingham, Birmingham, UK
  - NIHR Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
  - NIHR Surgical Reconstruction and Microbiology Research Centre, University of Birmingham, Birmingham, UK
  - NIHR Applied Research Collaborative West Midlands University of Birmingham, Birmingham, UK
- Jacqui Gath
  - Patient Partner, Birmingham, UK
  - Patient Partner, Sheffield, UK
- Marzyeh Ghassemi
  - Department of Electrical Engineering and Computer Science; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Francis Mckay
  - The Ethox Centre and the Wellcome Centre for Ethics and Humanities, Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Stephanie Kuku
  - Institute of Women's Health, University College London, London, UK
  - Hardian Health, London, UK
- Bilal A Mateen
  - Institute of Health Informatics, University College London, London, UK
  - The Wellcome Trust, London, UK
- Rubeta Matin
  - Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- Melissa McCradden
  - Department of Bioethics, Hospital for Sick Children, Toronto, Ontario, Canada
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Lauren Oakden-Rayner
  - Australian Institute for Machine Learning, University of Adelaide, Adelaide, South Australia, Australia
- Johan Ordish
  - Medicines and Healthcare Products Regulatory Agency, London, UK
- Russell Pearson
  - Medicines and Healthcare Products Regulatory Agency, London, UK
- Elizabeth Sapey
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Neil Sebire
  - Health Data Research, London, UK
  - Great Ormond Street Hospital for Children, London, UK
- Viknesh Sounderajah
  - Institute of Global Health Innovation, Imperial College London, London, UK
  - Department of Surgery and Cancer, Imperial College London, London, UK
- Charlotte Summers
  - Wolfson Lung Injury Unit, Heart and Lung Research Institute, University of Cambridge, Cambridge, UK
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Darren Treanor
  - Leeds Teaching Hospitals NHS Trust, Leeds, UK
  - University of Leeds, Leeds, UK
  - Department of Clinical Pathology, and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
  - Center for Medical Image Science and Visualization (CMIV), Linköping University, Linköping, Sweden
- Alastair K Denniston
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
  - NIHR Birmingham Biomedical Research Centre, University of Birmingham, Birmingham, UK
  - Health Data Research, London, UK
- Xiaoxuan Liu
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
|
12
|
Sounderajah V, Ashrafian H, Karthikesalingam A, Markar SR, Normahani P, Collins GS, Bossuyt PM, Darzi A. Developing Specific Reporting Standards in Artificial Intelligence Centred Research. Ann Surg 2022; 275:e547-e548. [PMID: 35120063 DOI: 10.1097/sla.0000000000005294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
Affiliation(s)
- Viknesh Sounderajah
  - Institute of Global Health Innovation, Imperial College London, London, United Kingdom
  - Department of Surgery & Cancer, Imperial College London, London, United Kingdom
- Hutan Ashrafian
  - Institute of Global Health Innovation, Imperial College London, London, United Kingdom
  - Department of Surgery & Cancer, Imperial College London, London, United Kingdom
- Sheraz R Markar
  - Department of Surgery & Cancer, Imperial College London, London, United Kingdom
- Pasha Normahani
  - Department of Surgery & Cancer, Imperial College London, London, United Kingdom
- Gary S Collins
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, United Kingdom
  - NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, United Kingdom
- Patrick M Bossuyt
  - Department of Epidemiology and Data Science, Amsterdam University Medical Centre, University of Amsterdam, the Netherlands
- Ara Darzi
  - Institute of Global Health Innovation, Imperial College London, London, United Kingdom
  - Department of Surgery & Cancer, Imperial College London, London, United Kingdom
|
13
|
Guha Roy A, Ren J, Azizi S, Loh A, Natarajan V, Mustafa B, Pawlowski N, Freyberg J, Liu Y, Beaver Z, Vo N, Bui P, Winter S, MacWilliams P, Corrado GS, Telang U, Liu Y, Cemgil T, Karthikesalingam A, Lakshminarayanan B, Winkens J. Does your dermatology classifier know what it doesn't know? Detecting the long-tail of unseen conditions. Med Image Anal 2021; 75:102274. [PMID: 34731777 DOI: 10.1016/j.media.2021.102274] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/02/2021] [Revised: 08/12/2021] [Accepted: 10/15/2021] [Indexed: 11/15/2022]
Abstract
Supervised deep learning models have proven to be highly effective in classification of dermatological conditions. These models rely on the availability of abundant labeled training examples. However, in the real-world, many dermatological conditions are individually too infrequent for per-condition classification with supervised learning. Although individually infrequent, these conditions may collectively be common and therefore are clinically significant in aggregate. To prevent models from generating erroneous outputs on such examples, there remains a considerable unmet need for deep learning systems that can better detect such infrequent conditions. These infrequent 'outlier' conditions are seen very rarely (or not at all) during training. In this paper, we frame this task as an out-of-distribution (OOD) detection problem. We set up a benchmark ensuring that outlier conditions are disjoint between the model training, validation, and test sets. Unlike traditional OOD detection benchmarks where the task is to detect dataset distribution shift, we aim at the more challenging task of detecting subtle differences resulting from a different pathology or condition. We propose a novel hierarchical outlier detection (HOD) loss, which assigns multiple abstention classes corresponding to each training outlier class and jointly performs a coarse classification of inliers vs. outliers, along with fine-grained classification of the individual classes. We demonstrate that the proposed HOD loss based approach outperforms leading methods that leverage outlier data during training. Further, performance is significantly boosted by using recent representation learning methods (BiT, SimCLR, MICLe). Further, we explore ensembling strategies for OOD detection and propose a diverse ensemble selection process for the best result. 
We also perform a subgroup analysis over conditions of varying risk levels and different skin types to investigate how OOD performance changes over each subgroup and demonstrate the gains of our framework in comparison to baseline. Furthermore, we go beyond traditional performance metrics and introduce a cost matrix for model trust analysis to approximate downstream clinical impact. We use this cost matrix to compare the proposed method against the baseline, thereby making a stronger case for its effectiveness in real-world scenarios.
Affiliation(s)
- Jie Ren
  - Google Research, Brain Team
|
14
|
Sounderajah V, Ashrafian H, Rose S, Shah NH, Ghassemi M, Golub R, Kahn CE, Esteva A, Karthikesalingam A, Mateen B, Webster D, Milea D, Ting D, Treanor D, Cushnan D, King D, McPherson D, Glocker B, Greaves F, Harling L, Ordish J, Cohen JF, Deeks J, Leeflang M, Diamond M, McInnes MDF, McCradden M, Abràmoff MD, Normahani P, Markar SR, Chang S, Liu X, Mallett S, Shetty S, Denniston A, Collins GS, Moher D, Whiting P, Bossuyt PM, Darzi A. A quality assessment tool for artificial intelligence-centered diagnostic test accuracy studies: QUADAS-AI. Nat Med 2021; 27:1663-1665. [PMID: 34635854 DOI: 10.1038/s41591-021-01517-0] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Indexed: 11/09/2022]
Affiliation(s)
- Viknesh Sounderajah
  - Institute of Global Health Innovation, Imperial College London, London, UK
  - Department of Surgery and Cancer, Imperial College London, London, UK
- Hutan Ashrafian
  - Institute of Global Health Innovation, Imperial College London, London, UK
  - Department of Surgery and Cancer, Imperial College London, London, UK
- Sherri Rose
  - Center for Health Policy and Center for Primary Care and Outcomes Research, Stanford University, Stanford, CA, USA
- Nigam H Shah
  - Center for Biomedical Informatics Research, Stanford University, Stanford, CA, USA
- Marzyeh Ghassemi
  - Institute for Medical Engineering & Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Robert Golub
  - Journal of the American Medical Association (JAMA), Chicago, IL, USA
- Charles E Kahn
  - University of Pennsylvania, Philadelphia, Pennsylvania, PA, USA
- Dan Milea
  - Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Daniel Ting
  - Singapore Eye Research Institute, Singapore National Eye Centre, Singapore, Singapore
- Darren Treanor
  - Leeds Teaching Hospitals NHS Trust, Leeds, UK
  - University of Leeds, Leeds, UK
  - Department of Clinical Pathology, and Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
  - Center for Medical Image Science and Visualization (CMIV), Linköping University, Linköping, Sweden
- Dominic King
  - Institute of Global Health Innovation, Imperial College London, London, UK
  - Optum, London, UK
- Ben Glocker
  - Faculty of Engineering, Department of Computing, Imperial College London, London, UK
- Felix Greaves
  - National Institute for Health and Care Excellence, London, UK
- Leanne Harling
  - Institute of Global Health Innovation, Imperial College London, London, UK
  - Department of Surgery and Cancer, Imperial College London, London, UK
- Johan Ordish
  - Medicines and Healthcare Products Regulatory Agency, London, UK
- Jérémie F Cohen
  - Department of Pediatrics, Centre of Research in Epidemiology and Statistics, Inserm UMR 1153, Necker-Enfants Malades Hospital, Assistance Publique-Hôpitaux de Paris, Université de Paris, Paris, France
- Jon Deeks
  - Institute of Applied Health Research, University of Birmingham, Birmingham, UK
- Mariska Leeflang
  - Department of Epidemiology and Data Science, Amsterdam University Medical Centres, University of Amsterdam, Amsterdam, The Netherlands
- Matthew D F McInnes
  - Departments of Radiology and Epidemiology, University of Ottawa, The Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
- Melissa McCradden
  - Department of Bioethics, The Hospital for Sick Kids, Toronto, Ontario, Canada
- Michael D Abràmoff
  - Department of Ophthalmology and Visual Sciences, University of Iowa, Iowa City, IA, USA
- Pasha Normahani
  - Department of Surgery and Cancer, Imperial College London, London, UK
- Sheraz R Markar
  - Department of Surgery and Cancer, Imperial College London, London, UK
- Stephanie Chang
  - Annals of Internal Medicine, American College of Physicians, Philadelphia, PA, USA
- Xiaoxuan Liu
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Health Data Research UK, London, UK
- Susan Mallett
  - Centre for Medical Imaging, University College London, London, UK
- Alastair Denniston
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Health Data Research UK, London, UK
- Gary S Collins
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
  - NIHR Oxford Biomedical Research Centre, Oxford University Hospitals NHS Foundation Trust, Oxford, UK
- David Moher
  - Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
- Penny Whiting
  - Bristol Medical School, University of Bristol, Bristol, UK
- Patrick M Bossuyt
  - Department of Epidemiology and Data Science, Amsterdam University Medical Centres, University of Amsterdam, Amsterdam, The Netherlands
- Ara Darzi
  - Institute of Global Health Innovation, Imperial College London, London, UK
  - Department of Surgery and Cancer, Imperial College London, London, UK
|
15
|
Nikolov S, Blackwell S, Zverovitch A, Mendes R, Livne M, De Fauw J, Patel Y, Meyer C, Askham H, Romera-Paredes B, Kelly C, Karthikesalingam A, Chu C, Carnell D, Boon C, D'Souza D, Moinuddin SA, Garie B, McQuinlan Y, Ireland S, Hampton K, Fuller K, Montgomery H, Rees G, Suleyman M, Back T, Hughes CO, Ledsam JR, Ronneberger O. Clinically Applicable Segmentation of Head and Neck Anatomy for Radiotherapy: Deep Learning Algorithm Development and Validation Study. J Med Internet Res 2021; 23:e26151. [PMID: 34255661 PMCID: PMC8314151 DOI: 10.2196/26151] [Citation(s) in RCA: 95] [Impact Index Per Article: 31.7] [Received: 11/30/2020] [Revised: 02/10/2021] [Accepted: 04/30/2021] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Over half a million individuals are diagnosed with head and neck cancer each year globally. Radiotherapy is an important curative treatment for this disease, but it requires manual time to delineate radiosensitive organs at risk. This planning process can delay treatment while also introducing interoperator variability, resulting in downstream radiation dose differences. Although auto-segmentation algorithms offer a potentially time-saving solution, the challenges in defining, quantifying, and achieving expert performance remain. OBJECTIVE Adopting a deep learning approach, we aim to demonstrate a 3D U-Net architecture that achieves expert-level performance in delineating 21 distinct head and neck organs at risk commonly segmented in clinical practice. METHODS The model was trained on a data set of 663 deidentified computed tomography scans acquired in routine clinical practice and with both segmentations taken from clinical practice and segmentations created by experienced radiographers as part of this research, all in accordance with consensus organ at risk definitions. RESULTS We demonstrated the model's clinical applicability by assessing its performance on a test set of 21 computed tomography scans from clinical practice, each with 21 organs at risk segmented by 2 independent experts. We also introduced surface Dice similarity coefficient, a new metric for the comparison of organ delineation, to quantify the deviation between organ at risk surface contours rather than volumes, better reflecting the clinical task of correcting errors in automated organ segmentations. The model's generalizability was then demonstrated on 2 distinct open-source data sets, reflecting different centers and countries to model training. CONCLUSIONS Deep learning is an effective and clinically applicable technique for the segmentation of the head and neck anatomy for radiotherapy. 
With appropriate validation studies and regulatory approvals, this system could improve the efficiency, consistency, and safety of radiotherapy pathways.
Affiliation(s)
- Ruheena Mendes
  - University College London Hospitals NHS Foundation Trust, London, United Kingdom
- Dawn Carnell
  - University College London Hospitals NHS Foundation Trust, London, United Kingdom
- Cheng Boon
  - Clatterbridge Cancer Centre NHS Foundation Trust, Liverpool, United Kingdom
- Derek D'Souza
  - University College London Hospitals NHS Foundation Trust, London, United Kingdom
- Syed Ali Moinuddin
  - University College London Hospitals NHS Foundation Trust, London, United Kingdom
- Geraint Rees
  - University College London, London, United Kingdom
|
16
|
Wilson M, Chopra R, Wilson MZ, Cooper C, MacWilliams P, Liu Y, Wulczyn E, Florea D, Hughes CO, Karthikesalingam A, Khalid H, Vermeirsch S, Nicholson L, Keane PA, Balaskas K, Kelly CJ. Validation and Clinical Applicability of Whole-Volume Automated Segmentation of Optical Coherence Tomography in Retinal Disease Using Deep Learning. JAMA Ophthalmol 2021; 139:964-973. [PMID: 34236406 PMCID: PMC8444027 DOI: 10.1001/jamaophthalmol.2021.2273] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/14/2022]
Abstract
Question Is deep learning–based segmentation of macular disease in optical coherence tomography (OCT) suitable for clinical use? Findings In this diagnostic study of OCT data from 173 patients with age-related macular degeneration or diabetic macular edema, model segmentations qualitatively ranked better or comparable for clinical applicability to 1 or more expert grader segmentations in 127 scans (73%) by a panel of 3 retinal specialists. Scans with high quantitative accuracy scores were not reliably associated with higher rankings. Meaning These findings suggest that qualitative evaluation adds to quantitative approaches when assessing clinical applicability of segmentation tools and clinician satisfaction in practice. Importance Quantitative volumetric measures of retinal disease in optical coherence tomography (OCT) scans are infeasible to perform owing to the time required for manual grading. Expert-level deep learning systems for automatic OCT segmentation have recently been developed. However, the potential clinical applicability of these systems is largely unknown. Objective To evaluate a deep learning model for whole-volume segmentation of 4 clinically important pathological features and assess clinical applicability. Design, Setting, Participants This diagnostic study used OCT data from 173 patients with a total of 15 558 B-scans, treated at Moorfields Eye Hospital. The data set included 2 common OCT devices and 2 macular conditions: wet age-related macular degeneration (107 scans) and diabetic macular edema (66 scans), covering the full range of severity, and from 3 points during treatment. Two expert graders performed pixel-level segmentations of intraretinal fluid, subretinal fluid, subretinal hyperreflective material, and pigment epithelial detachment, including all B-scans in each OCT volume, taking as long as 50 hours per scan. Quantitative evaluation of whole-volume model segmentations was performed. 
Qualitative evaluation of clinical applicability by 3 retinal experts was also conducted. Data were collected from June 1, 2012, to January 31, 2017, for set 1 and from January 1 to December 31, 2017, for set 2; graded between November 2018 and January 2020; and analyzed from February 2020 to November 2020. Main Outcomes and Measures Rating and stack ranking for clinical applicability by retinal specialists, model-grader agreement for voxelwise segmentations, and total volume evaluated using Dice similarity coefficients, Bland-Altman plots, and intraclass correlation coefficients. Results Among the 173 patients included in the analysis (92 [53%] women), qualitative assessment found that automated whole-volume segmentation ranked better than or comparable to at least 1 expert grader in 127 scans (73%; 95% CI, 66%-79%). A neutral or positive rating was given to 135 model segmentations (78%; 95% CI, 71%-84%) and 309 expert gradings (2 per scan) (89%; 95% CI, 86%-92%). The model was rated neutrally or positively in 86% to 92% of diabetic macular edema scans and 53% to 87% of age-related macular degeneration scans. Intraclass correlations ranged from 0.33 (95% CI, 0.08-0.96) to 0.96 (95% CI, 0.90-0.99). Dice similarity coefficients ranged from 0.43 (95% CI, 0.29-0.66) to 0.78 (95% CI, 0.57-0.85). Conclusions and Relevance This deep learning–based segmentation tool provided clinically useful measures of retinal disease that would otherwise be infeasible to obtain. Qualitative evaluation was additionally important to reveal clinical applicability for both care management and research.
Affiliation(s)
- Reena Chopra
  - Google Health, London, United Kingdom
  - National Institute for Health Research Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS (National Health Service) Foundation Trust, London, United Kingdom
  - University College London Institute of Ophthalmology, London, United Kingdom
- Yun Liu
  - Google Health, Palo Alto, California
- Daniela Florea
  - National Institute for Health Research Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS (National Health Service) Foundation Trust, London, United Kingdom
  - University College London Institute of Ophthalmology, London, United Kingdom
- Hagar Khalid
  - National Institute for Health Research Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS (National Health Service) Foundation Trust, London, United Kingdom
  - University College London Institute of Ophthalmology, London, United Kingdom
- Sandra Vermeirsch
  - National Institute for Health Research Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS (National Health Service) Foundation Trust, London, United Kingdom
  - University College London Institute of Ophthalmology, London, United Kingdom
- Luke Nicholson
  - National Institute for Health Research Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS (National Health Service) Foundation Trust, London, United Kingdom
  - University College London Institute of Ophthalmology, London, United Kingdom
- Pearse A Keane
  - National Institute for Health Research Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS (National Health Service) Foundation Trust, London, United Kingdom
  - University College London Institute of Ophthalmology, London, United Kingdom
- Konstantinos Balaskas
  - National Institute for Health Research Biomedical Research Centre for Ophthalmology, Moorfields Eye Hospital NHS (National Health Service) Foundation Trust, London, United Kingdom
  - University College London Institute of Ophthalmology, London, United Kingdom
|
17
|
Sounderajah V, Ashrafian H, Golub RM, Shetty S, De Fauw J, Hooft L, Moons K, Collins G, Moher D, Bossuyt PM, Darzi A, Karthikesalingam A, Denniston AK, Mateen BA, Ting D, Treanor D, King D, Greaves F, Godwin J, Pearson-Stuttard J, Harling L, McInnes M, Rifai N, Tomasev N, Normahani P, Whiting P, Aggarwal R, Vollmer S, Markar SR, Panch T, Liu X. Developing a reporting guideline for artificial intelligence-centred diagnostic test accuracy studies: the STARD-AI protocol. BMJ Open 2021; 11:e047709. [PMID: 34183345 PMCID: PMC8240576 DOI: 10.1136/bmjopen-2020-047709] [Citation(s) in RCA: 81] [Impact Index Per Article: 27.0] [Received: 12/06/2020] [Accepted: 06/08/2021] [Indexed: 12/13/2022] Open
Abstract
INTRODUCTION Standards for Reporting of Diagnostic Accuracy Study (STARD) was developed to improve the completeness and transparency of reporting in studies investigating diagnostic test accuracy. However, its current form, STARD 2015, does not address the issues and challenges raised by artificial intelligence (AI)-centred interventions. As such, we propose an AI-specific version of the STARD checklist (STARD-AI), which focuses on the reporting of AI diagnostic test accuracy studies. This paper describes the methods that will be used to develop STARD-AI. METHODS AND ANALYSIS The development of the STARD-AI checklist can be distilled into six stages. (1) A project organisation phase has been undertaken, during which a Project Team and a Steering Committee were established; (2) an item generation process has been completed following a literature review, a patient and public involvement and engagement exercise and an online scoping survey of international experts; (3) a three-round modified Delphi consensus methodology is underway, which will culminate in a teleconference consensus meeting of experts; (4) thereafter, the Project Team will draft the initial STARD-AI checklist and the accompanying documents; (5) a piloting phase among expert users will be undertaken to identify items which are either unclear or missing. This process, consisting of surveys and semistructured interviews, will contribute towards the explanation and elaboration document; and (6) on finalisation of the manuscripts, the group's efforts will turn towards an organised dissemination and implementation strategy to maximise end-user adoption. ETHICS AND DISSEMINATION Ethical approval has been granted by the Joint Research Compliance Office at Imperial College London (reference number: 19IC5679). A dissemination strategy will be aimed towards five groups of stakeholders: (1) academia, (2) policy, (3) guidelines and regulation, (4) industry and (5) public and non-specific stakeholders. We anticipate that dissemination will take place in Q3 of 2021.
Affiliation(s)
- Viknesh Sounderajah
  - Department of Surgery and Cancer, Imperial College London, Paddington, UK
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Hutan Ashrafian
  - Department of Surgery and Cancer, Imperial College London, Paddington, UK
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Robert M Golub
  - Journal of the American Medical Association, Chicago, Illinois, USA
- Lotty Hooft
  - Cochrane Netherlands, University Medical Center Utrecht, University of Utrecht, Utrecht, The Netherlands
  - Department of Epidemiology, Julius Center for Health Sciences and Primary Care, Utrecht, The Netherlands
- Karel Moons
  - Cochrane Netherlands, University Medical Center Utrecht, University of Utrecht, Utrecht, The Netherlands
  - Department of Epidemiology, Julius Center for Health Sciences and Primary Care, Utrecht, The Netherlands
- Gary Collins
  - Centre for Statistics in Medicine, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK
- David Moher
  - Centre for Journalology, Ottawa Hospital Research Institute, Ottawa, Ontario, Canada
- Patrick M Bossuyt
  - Department of Epidemiology and Data Science, Amsterdam University Medical Centres, Duivendrecht, The Netherlands
- Ara Darzi
  - Department of Surgery and Cancer, Imperial College London, Paddington, UK
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Alastair K Denniston
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Health Data Research UK, London, UK
  - Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
- Daniel Ting
  - Singapore Eye Research Institute, Singapore National Eye Center, Singapore
- Felix Greaves
  - Department of Primary Care and Public Health, Imperial College London, London, UK
- Leanne Harling
  - Department of Surgery and Cancer, Imperial College London, Paddington, UK
- Matthew McInnes
  - Department of Radiology, University of Ottawa, Ottawa, Ontario, Canada
- Nader Rifai
  - Harvard Medical School, Boston, Massachusetts, USA
- Pasha Normahani
  - Department of Surgery and Cancer, Imperial College London, Paddington, UK
- Penny Whiting
  - School of Social and Community Medicine, University of Bristol, Bristol, UK
- Ravi Aggarwal
  - Department of Surgery and Cancer, Imperial College London, Paddington, UK
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Sheraz R Markar
  - Department of Surgery and Cancer, Imperial College London, Paddington, UK
- Trishan Panch
  - Division of Health Policy and Management, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
- Xiaoxuan Liu
  - Institute of Inflammation and Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
  - University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK
  - Health Data Research UK, London, UK
  - Birmingham Health Partners Centre for Regulatory Science and Innovation, University of Birmingham, Birmingham, UK
18
Roy S, Mincu D, Loreaux E, Mottram A, Protsyuk I, Harris N, Xue Y, Schrouff J, Montgomery H, Connell A, Tomasev N, Karthikesalingam A, Seneviratne M. Multitask prediction of organ dysfunction in the intensive care unit using sequential subnetwork routing. J Am Med Inform Assoc 2021; 28:1936-1946. [PMID: 34151965 PMCID: PMC8363803 DOI: 10.1093/jamia/ocab101] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/23/2020] [Revised: 03/07/2021] [Accepted: 05/14/2021] [Indexed: 12/29/2022] Open
Abstract
Objective Multitask learning (MTL) using electronic health records allows concurrent prediction of multiple endpoints. MTL has shown promise in improving model performance and training efficiency; however, it often suffers from negative transfer – impaired learning if tasks are not appropriately selected. We introduce a sequential subnetwork routing (SeqSNR) architecture that uses soft parameter sharing to find related tasks and encourage cross-learning between them. Materials and Methods Using the MIMIC-III (Medical Information Mart for Intensive Care-III) dataset, we train deep neural network models to predict the onset of 6 endpoints including specific organ dysfunctions and general clinical outcomes: acute kidney injury, continuous renal replacement therapy, mechanical ventilation, vasoactive medications, mortality, and length of stay. We compare single-task (ST) models with naive multitask and SeqSNR in terms of discriminative performance and label efficiency. Results SeqSNR showed a modest yet statistically significant performance boost across 4 of 6 tasks compared with ST and naive multitasking. When the size of the training dataset was reduced for a given task (label efficiency), SeqSNR outperformed ST for all cases showing an average area under the precision-recall curve boost of 2.1%, 2.9%, and 2.1% for tasks using 1%, 5%, and 10% of labels, respectively. Conclusions The SeqSNR architecture shows superior label efficiency compared with ST and naive multitasking, suggesting utility in scenarios in which endpoint labels are difficult to ascertain.
Affiliation(s)
- Yuan Xue
  - Google Health, Mountain View, California, USA
- Hugh Montgomery
  - Centre for Human Health and Performance, University College London, London, United Kingdom
19
Markar SR, Vidal-Diez A, Holt PJ, Karthikesalingam A, Hanna GB. An International Comparison of the Management of Gastrointestinal Surgical Emergencies in Octogenarians-England Versus United States: A National Population-based Cohort Study. Ann Surg 2021; 273:924-932. [PMID: 31188204 DOI: 10.1097/sla.0000000000003396] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/29/2022]
Abstract
OBJECTIVE To compare the United States and England for the utilization of surgical intervention and in-hospital mortality from 5 gastrointestinal emergencies in octogenarians. BACKGROUND The proportion of older adults is growing and will represent a substantial challenge to clinicians in the next decade. METHODS Between 2006 and 2012, the rate of surgical intervention and in-hospital mortality for 5 index conditions for octogenarians were compared between the United States and England: appendicitis, incarcerated/strangulated abdominal hernia, perforation of esophagus, small or large bowel, and peptic ulcer. Univariate and multivariate analyses were performed to adjust for underlying differences in patient demographics. RESULTS Thirty-two thousand one hundred fifty-one admissions of octogenarians in England for 5 index surgical emergencies were compared with 162,142 admissions in the USA. Surgical intervention was significantly more common in the USA than in England for all 5 conditions: appendicitis [odds ratio (OR) 4.63, 95% confidence interval (95% CI) 4.21-5.09], abdominal hernia (OR 2.06, 95% CI 1.97-2.15), perforated esophagus (OR 1.71, 95% CI 1.31-2.24), small and large bowel perforation (OR 4.33, 95% CI 4.12-4.56), and peptic ulcer perforation (OR 4.63, 95% CI 4.27-5.02). In-hospital mortality was significantly more common in England than in the USA for all 5 conditions: appendicitis (OR 3.22, 95% CI 2.73-3.78), abdominal hernia (OR 3.49, 95% CI 3.29-3.70), perforated esophagus (OR 4.06, 95% CI 3.03-5.44), small and large bowel perforation (OR 6.97, 95% CI 6.60-7.37), and peptic ulcer perforation (OR 3.67, 95% CI 3.40-3.96). CONCLUSION Surgery is used less commonly in England for emergency gastrointestinal conditions in octogenarians, which may be associated with a high rate of in-hospital mortality from these conditions compared with the USA.
Affiliation(s)
- Sheraz R Markar
  - Department of Surgery and Cancer, Imperial College, London, UK
- Alberto Vidal-Diez
  - Department of Surgery and Cancer, Imperial College, London, UK
  - Molecular and Clinical Sciences Institute, St George's University of London, Cranmer Terrace, London, UK
- Peter J Holt
  - Molecular and Clinical Sciences Institute, St George's University of London, Cranmer Terrace, London, UK
- Alan Karthikesalingam
  - Molecular and Clinical Sciences Institute, St George's University of London, Cranmer Terrace, London, UK
- George B Hanna
  - Department of Surgery and Cancer, Imperial College, London, UK
20
Aggarwal R, Sounderajah V, Martin G, Ting DSW, Karthikesalingam A, King D, Ashrafian H, Darzi A. Diagnostic accuracy of deep learning in medical imaging: a systematic review and meta-analysis. NPJ Digit Med 2021; 4:65. [PMID: 33828217 PMCID: PMC8027892 DOI: 10.1038/s41746-021-00438-z] [Citation(s) in RCA: 202] [Impact Index Per Article: 67.3] [Received: 10/06/2020] [Accepted: 02/25/2021] [Indexed: 12/19/2022] Open
Abstract
Deep learning (DL) has the potential to transform medical diagnostics. However, the diagnostic accuracy of DL is uncertain. Our aim was to evaluate the diagnostic accuracy of DL algorithms to identify pathology in medical imaging. Searches were conducted in Medline and EMBASE up to January 2020. We identified 11,921 studies, of which 503 were included in the systematic review. Eighty-two studies in ophthalmology, 82 in breast disease and 115 in respiratory disease were included for meta-analysis. Two hundred twenty-four studies in other specialities were included for qualitative review. Peer-reviewed studies that reported on the diagnostic accuracy of DL algorithms to identify pathology using medical imaging were included. Primary outcomes were measures of diagnostic accuracy, study design and reporting standards in the literature. Estimates were pooled using random-effects meta-analysis. In ophthalmology, AUCs ranged between 0.933 and 1 for diagnosing diabetic retinopathy, age-related macular degeneration and glaucoma on retinal fundus photographs and optical coherence tomography. In respiratory imaging, AUCs ranged between 0.864 and 0.937 for diagnosing lung nodules or lung cancer on chest X-ray or CT scan. For breast imaging, AUCs ranged between 0.868 and 0.909 for diagnosing breast cancer on mammogram, ultrasound, MRI and digital breast tomosynthesis. Heterogeneity was high between studies and extensive variation in methodology, terminology and outcome measures was noted. This can lead to an overestimation of the diagnostic accuracy of DL algorithms on medical imaging. There is an immediate need for the development of artificial intelligence-specific EQUATOR guidelines, particularly STARD, in order to provide guidance around key issues in this field.
Affiliation(s)
- Ravi Aggarwal
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Guy Martin
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Daniel S W Ting
  - Singapore Eye Research Institute, Singapore National Eye Center, Singapore, Singapore
- Dominic King
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Hutan Ashrafian
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Ara Darzi
  - Institute of Global Health Innovation, Imperial College London, London, UK
21
McKinney SM, Sieniek M, Godbole V, Godwin J, Antropova N, Ashrafian H, Back T, Chesus M, Corrado GS, Darzi A, Etemadi M, Garcia-Vicente F, Gilbert FJ, Halling-Brown M, Hassabis D, Jansen S, Karthikesalingam A, Kelly CJ, King D, Ledsam JR, Melnick D, Mostofi H, Peng L, Reicher JJ, Romera-Paredes B, Sidebottom R, Suleyman M, Tse D, Young KC, De Fauw J, Shetty S. Addendum: International evaluation of an AI system for breast cancer screening. Nature 2020; 586:E19. [PMID: 33057216 DOI: 10.1038/s41586-020-2679-9] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/09/2022]
Affiliation(s)
- Hutan Ashrafian
  - Department of Surgery and Cancer, Imperial College London, London, UK
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Ara Darzi
  - Department of Surgery and Cancer, Imperial College London, London, UK
  - Institute of Global Health Innovation, Imperial College London, London, UK
  - Cancer Research UK Imperial Centre, Imperial College London, London, UK
- Fiona J Gilbert
  - Department of Radiology, Cambridge Biomedical Research Centre, University of Cambridge, Cambridge, UK
- Sunny Jansen
  - Verily Life Sciences, South San Francisco, CA, USA
- Richard Sidebottom
  - The Royal Marsden Hospital, London, UK
  - Thirlestaine Breast Centre, Cheltenham, UK
22
Grima M, Behrendt CA, Vidal-Diez A, Altreuther M, Björck M, Boyle J, Eldrup N, Karthikesalingam A, Khashram M, Loftus I, Schermerhorn M, Setacci C, Szeberin Z, Debus S, Venermo M, Holt P, Mani K. Assessment of Correlation Between Mean Size of Infrarenal Abdominal Aortic Aneurysm at Time of Intact Repair Against Repair and Rupture Rate in Nine Countries. J Vasc Surg 2020. [DOI: 10.1016/j.jvs.2020.05.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
23
Yim J, Chopra R, Spitz T, Winkens J, Obika A, Kelly C, Askham H, Lukic M, Huemer J, Fasler K, Moraes G, Meyer C, Wilson M, Dixon J, Hughes C, Rees G, Khaw PT, Karthikesalingam A, King D, Hassabis D, Suleyman M, Back T, Ledsam JR, Keane PA, De Fauw J. Predicting conversion to wet age-related macular degeneration using deep learning. Nat Med 2020; 26:892-899. [PMID: 32424211 DOI: 10.1038/s41591-020-0867-7] [Citation(s) in RCA: 126] [Impact Index Per Article: 31.5] [Received: 09/23/2019] [Accepted: 04/01/2020] [Indexed: 12/17/2022]
Abstract
Progression to exudative 'wet' age-related macular degeneration (exAMD) is a major cause of visual deterioration. In patients diagnosed with exAMD in one eye, we introduce an artificial intelligence (AI) system to predict progression to exAMD in the second eye. By combining models based on three-dimensional (3D) optical coherence tomography images and corresponding automatic tissue maps, our system predicts conversion to exAMD within a clinically actionable 6-month time window, achieving a per-volumetric-scan sensitivity of 80% at 55% specificity, and 34% sensitivity at 90% specificity. This level of performance corresponds to true positives in 78% and 41% of individual eyes, and false positives in 56% and 17% of individual eyes at the high sensitivity and high specificity points, respectively. Moreover, we show that automatic tissue segmentation can identify anatomical changes before conversion and high-risk subgroups. This AI system overcomes substantial interobserver variability in expert predictions, performing better than five out of six experts, and demonstrates the potential of using AI to predict disease progression.
Affiliation(s)
- Reena Chopra
  - DeepMind, London, UK
  - NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Marko Lukic
  - NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Josef Huemer
  - NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Katrin Fasler
  - NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Gabriella Moraes
  - NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Peng T Khaw
  - NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Pearse A Keane
  - NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
24
Markar SR, Arhi C, Wiggins T, Vidal-Diez A, Karthikesalingam A, Darzi A, Lagergren J, Hanna GB. Reintervention After Antireflux Surgery for Gastroesophageal Reflux Disease in England. Ann Surg 2020; 271:709-715. [DOI: 10.1097/sla.0000000000003131] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/18/2022]
25
Grima MJ, Behrendt CA, Vidal-Diez A, Altreuther M, Björck M, Boyle JR, Eldrup N, Karthikesalingam A, Khashram M, Loftus I, Schermerhorn M, Setacci C, Szeberin Z, Debus S, Venermo M, Holt P, Mani K. Editor's Choice - Assessment of Correlation Between Mean Size of Infrarenal Abdominal Aortic Aneurysm at Time of Intact Repair Against Repair and Rupture Rate in Nine Countries. Eur J Vasc Endovasc Surg 2020; 59:890-897. [PMID: 32217115 DOI: 10.1016/j.ejvs.2020.01.024] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/04/2019] [Revised: 12/16/2019] [Accepted: 01/17/2020] [Indexed: 01/03/2023]
Abstract
OBJECTIVE This study aimed to analyse the mean abdominal aortic aneurysm (AAA) diameter for repair in nine countries, and to determine variation in mean AAA diameter for elective AAA repair and its relationship to rupture AAA repair rates and aneurysm related mortality in corresponding populations. METHODS Data on intact (iAAA) and ruptured infrarenal AAA (rAAA) repair for the years 2010-2012 were collected from Denmark, England, Finland, Germany, Hungary, New Zealand, Norway, Sweden, and the USA. The rate of iAAA repair and rAAA per 100 000 inhabitants above 59 years old, mean AAA diameter for iAAA repair and rAAA repair, and the national rates of rAAA were assessed. National cause of death statistics were used to estimate aneurysm related mortality. Direct standardisation methods were applied to the national mortality data. Logistic regression and analysis of variance model adjustments were made for age groups, sex, and year. RESULTS There was a variation in the mean diameter of iAAA repair (n = 34 566; range Germany = 57 mm, Denmark = 68 mm). The standardised iAAA repair rate per 100 000 inhabitants varied from 10.4 (Hungary) to 66.5 (Norway), p < .01, and the standardised rAAA repair rate per 100 000 from 5.8 (USA) to 16.9 (England), p < .01. Overall, there was no significant correlation between mean diameter of iAAA repair and standardised iAAA rate (r² = 0.04, p = .3). There was no significant correlation between rAAA repair rate (n = 12 628) with mean diameter of iAAA repair (r² = 0.2, p = .1). CONCLUSION Despite recommendations from learned society guidelines, data indicate variations in mean diameter for AAA repair. There was no significant correlation between mean diameter of AAA repair and rates of iAAA repair and rAAA repair. These analyses are subject to differences in disease prevalence, uncertainties in rupture rates, validations of vascular registries, causes of death and registrations.
Affiliation(s)
- Matthew J Grima
  - St George's Vascular Institute, St George's Hospital NHS Foundation Trust, London, UK
  - Molecular and Clinical Sciences Research Institute, St George's, University of London, UK
- Christian-Alexander Behrendt
  - Department of Vascular Medicine, Research Group GermanVasc, University Heart and Vascular Centre Hamburg, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
- Alberto Vidal-Diez
  - St George's Vascular Institute, St George's Hospital NHS Foundation Trust, London, UK
- Martin Altreuther
  - Department of Vascular Surgery, St Olavs Hospital, Trondheim, Norway
- Martin Björck
  - Department of Surgical Sciences, Vascular Surgery, Uppsala University, Uppsala, Sweden
- Jonathan R Boyle
  - Division of Vascular and Endovascular Surgery, Addenbrooke's Hospital, Cambridge University Hospital Trust, Cambridge, UK
- Nikolaj Eldrup
  - Department of Cardio-Thoracic and Vascular Surgery, Aarhus University Hospital, Aarhus, Denmark
- Alan Karthikesalingam
  - St George's Vascular Institute, St George's Hospital NHS Foundation Trust, London, UK
- Manar Khashram
  - Department of Surgery, The University of Auckland, Waikato, New Zealand
- Ian Loftus
  - St George's Vascular Institute, St George's Hospital NHS Foundation Trust, London, UK
- Marc Schermerhorn
  - Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Centre, Boston, MA, USA
- Carlo Setacci
  - Vascular and Endovascular Surgery Unit, University of Siena, Siena, Italy
- Zoltán Szeberin
  - Department of Vascular Surgery, Semmelweis University, Budapest, Hungary
- Sebastian Debus
  - Department of Vascular Medicine, Research Group GermanVasc, University Heart and Vascular Centre Hamburg, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
- Maarit Venermo
  - Department of Vascular Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
- Peter Holt
  - St George's Vascular Institute, St George's Hospital NHS Foundation Trust, London, UK
- Kevin Mani
  - Department of Surgical Sciences, Vascular Surgery, Uppsala University, Uppsala, Sweden
26
McKinney SM, Sieniek M, Godbole V, Godwin J, Antropova N, Ashrafian H, Back T, Chesus M, Corrado GS, Darzi A, Etemadi M, Garcia-Vicente F, Gilbert FJ, Halling-Brown M, Hassabis D, Jansen S, Karthikesalingam A, Kelly CJ, King D, Ledsam JR, Melnick D, Mostofi H, Peng L, Reicher JJ, Romera-Paredes B, Sidebottom R, Suleyman M, Tse D, Young KC, De Fauw J, Shetty S. International evaluation of an AI system for breast cancer screening. Nature 2020; 577:89-94. [PMID: 31894144 DOI: 10.1038/s41586-019-1799-6] [Citation(s) in RCA: 928] [Impact Index Per Article: 232.0] [Received: 07/27/2019] [Accepted: 11/05/2019] [Indexed: 02/07/2023]
Abstract
Screening mammography aims to identify breast cancer at earlier stages of the disease, when treatment can be more successful [1]. Despite the existence of screening programmes worldwide, the interpretation of mammograms is affected by high rates of false positives and false negatives [2]. Here we present an artificial intelligence (AI) system that is capable of surpassing human experts in breast cancer prediction. To assess its performance in the clinical setting, we curated a large representative dataset from the UK and a large enriched dataset from the USA. We show an absolute reduction of 5.7% and 1.2% (USA and UK) in false positives and 9.4% and 2.7% in false negatives. We provide evidence of the ability of the system to generalize from the UK to the USA. In an independent study of six radiologists, the AI system outperformed all of the human readers: the area under the receiver operating characteristic curve (AUC-ROC) for the AI system was greater than the AUC-ROC for the average radiologist by an absolute margin of 11.5%. We ran a simulation in which the AI system participated in the double-reading process that is used in the UK, and found that the AI system maintained non-inferior performance and reduced the workload of the second reader by 88%. This robust assessment of the AI system paves the way for clinical trials to improve the accuracy and efficiency of breast cancer screening.
Affiliation(s)
- Hutan Ashrafian
  - Department of Surgery and Cancer, Imperial College London, London, UK
  - Institute of Global Health Innovation, Imperial College London, London, UK
- Ara Darzi
  - Department of Surgery and Cancer, Imperial College London, London, UK
  - Institute of Global Health Innovation, Imperial College London, London, UK
  - Cancer Research UK Imperial Centre, Imperial College London, London, UK
- Fiona J Gilbert
  - Department of Radiology, Cambridge Biomedical Research Centre, University of Cambridge, Cambridge, UK
- Sunny Jansen
  - Verily Life Sciences, South San Francisco, CA, USA
- Richard Sidebottom
  - The Royal Marsden Hospital, London, UK
  - Thirlestaine Breast Centre, Cheltenham, UK
27
Joe Grima M, Boufi M, Loftus P, Vidal-Diez A, Loftus I, Thompson MM, Karthikesalingam A, Holt PJ. A Reliable Protocol to Study the Morphology of the Abdominal Aorta in a Three-Dimensional Modality. Eur J Vasc Endovasc Surg 2019. [DOI: 10.1016/j.ejvs.2019.06.767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
28
Abstract
BACKGROUND Artificial intelligence (AI) research in healthcare is accelerating rapidly, with potential applications being demonstrated across various domains of medicine. However, there are currently limited examples of such techniques being successfully deployed into clinical practice. This article explores the main challenges and limitations of AI in healthcare, and considers the steps required to translate these potentially transformative technologies from research to clinical practice. MAIN BODY Key challenges for the translation of AI systems in healthcare include those intrinsic to the science of machine learning, logistical difficulties in implementation, and consideration of the barriers to adoption as well as of the necessary sociocultural or pathway changes. Robust peer-reviewed clinical evaluation as part of randomised controlled trials should be viewed as the gold standard for evidence generation, but conducting these in practice may not always be appropriate or feasible. Performance metrics should aim to capture real clinical applicability and be understandable to intended users. Regulation that balances the pace of innovation with the potential for harm, alongside thoughtful post-market surveillance, is required to ensure that patients are not exposed to dangerous interventions nor deprived of access to beneficial innovations. Mechanisms to enable direct comparisons of AI systems must be developed, including the use of independent, local and representative test sets. Developers of AI algorithms must be vigilant to potential dangers, including dataset shift, accidental fitting of confounders, unintended discriminatory bias, the challenges of generalisation to new populations, and the unintended negative consequences of new algorithms on health outcomes. CONCLUSION The safe and timely translation of AI research into clinically validated and appropriately regulated systems that can benefit everyone is challenging. Robust clinical evaluation, using metrics that are intuitive to clinicians and ideally go beyond measures of technical accuracy to include quality of care and patient outcomes, is essential. Further work is required (1) to identify themes of algorithmic bias and unfairness while developing mitigations to address these, (2) to reduce brittleness and improve generalisability, and (3) to develop methods for improved interpretability of machine learning predictions. If these goals can be achieved, the benefits for patients are likely to be transformational.
29
Tomašev N, Glorot X, Rae JW, Zielinski M, Askham H, Saraiva A, Mottram A, Meyer C, Ravuri S, Protsyuk I, Connell A, Hughes CO, Karthikesalingam A, Cornebise J, Montgomery H, Rees G, Laing C, Baker CR, Peterson K, Reeves R, Hassabis D, King D, Suleyman M, Back T, Nielson C, Ledsam JR, Mohamed S. A clinically applicable approach to continuous prediction of future acute kidney injury. Nature 2019; 572:116-119. [PMID: 31367026 PMCID: PMC6722431 DOI: 10.1038/s41586-019-1390-1] [Citation(s) in RCA: 482] [Impact Index Per Article: 96.4] [Received: 11/13/2018] [Accepted: 06/18/2019] [Indexed: 12/31/2022]
Abstract
The early prediction of deterioration could have an important role in supporting healthcare professionals, as an estimated 11% of deaths in hospital follow a failure to promptly recognize and treat deteriorating patients [1]. To achieve this goal requires predictions of patient risk that are continuously updated and accurate, and delivered at an individual level with sufficient context and enough time to act. Here we develop a deep learning approach for the continuous risk prediction of future deterioration in patients, building on recent work that models adverse events from electronic health records [2-17] and using acute kidney injury (a common and potentially life-threatening condition [18]) as an exemplar. Our model was developed on a large, longitudinal dataset of electronic health records that cover diverse clinical environments, comprising 703,782 adult patients across 172 inpatient and 1,062 outpatient sites. Our model predicts 55.8% of all inpatient episodes of acute kidney injury, and 90.2% of all acute kidney injuries that required subsequent administration of dialysis, with a lead time of up to 48 h and a ratio of 2 false alerts for every true alert. In addition to predicting future acute kidney injury, our model provides confidence assessments and a list of the clinical features that are most salient to each prediction, alongside predicted future trajectories for clinically relevant blood tests [9]. Although the recognition and prompt treatment of acute kidney injury is known to be challenging, our approach may offer opportunities for identifying patients at risk within a time window that enables early treatment.
Affiliation(s)
- Jack W Rae
  - DeepMind, London, UK
  - CoMPLEX, Computer Science, University College London, London, UK
- Hugh Montgomery
  - Institute for Human Health and Performance, University College London, London, UK
- Geraint Rees
  - Institute of Cognitive Neuroscience, University College London, London, UK
- Chris Laing
  - University College London Hospitals, London, UK
- Kelly Peterson
  - VA Salt Lake City Healthcare System, Salt Lake City, UT, USA
  - Division of Epidemiology, University of Utah, Salt Lake City, UT, USA
- Ruth Reeves
  - Department of Veterans Affairs, Nashville, TN, USA
- Christopher Nielson
  - University of Nevada School of Medicine, Reno, NV, USA
  - Department of Veterans Affairs, Salt Lake City, UT, USA
|
30
|
Connell A, Black G, Montgomery H, Martin P, Nightingale C, King D, Karthikesalingam A, Hughes C, Back T, Ayoub K, Suleyman M, Jones G, Cross J, Stanley S, Emerson M, Merrick C, Rees G, Laing C, Raine R. Implementation of a Digitally Enabled Care Pathway (Part 2): Qualitative Analysis of Experiences of Health Care Professionals. J Med Internet Res 2019; 21:e13143. [PMID: 31368443 PMCID: PMC6693304 DOI: 10.2196/13143] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/14/2018] [Revised: 01/29/2019] [Accepted: 03/24/2019] [Indexed: 01/16/2023] Open
Abstract
Background One reason for the introduction of digital technologies into health care has been to try to improve safety and patient outcomes by providing real-time access to patient data and enhancing communication among health care professionals. However, the adoption of such technologies into clinical pathways has been less examined, and the impacts on users and the broader health system are poorly understood. We sought to address this by studying the impacts of introducing a digitally enabled care pathway for patients with acute kidney injury (AKI) at a tertiary referral hospital in the United Kingdom. A dedicated clinical response team—comprising existing nephrology and patient-at-risk and resuscitation teams—received AKI alerts in real time via Streams, a mobile app. Here, we present a qualitative evaluation of the experiences of users and other health care professionals whose work was affected by the implementation of the care pathway. Objective The aim of this study was to qualitatively evaluate the impact of mobile results viewing and automated alerting as part of a digitally enabled care pathway on the working practices of users and their interprofessional relationships. Methods A total of 19 semistructured interviews were conducted with members of the AKI response team and clinicians with whom they interacted across the hospital. Interviews were analyzed using inductive and deductive thematic analysis. Results The digitally enabled care pathway improved access to patient information and expedited early specialist care. Opportunities were identified for more constructive planning of end-of-life care due to the earlier detection and alerting of deterioration. However, the shift toward early detection also highlighted resource constraints and some clinical uncertainty about the value of intervening at this stage. The real-time availability of information altered communication flows within and between clinical teams and across professional groups. 
Conclusions Digital technologies allow early detection of adverse events and of patients at risk of deterioration, with the potential to improve outcomes. They may also increase the efficiency of health care professionals’ working practices. However, when planning and implementing digital information innovations in health care, the following factors should also be considered: the provision of clinical training to effectively manage early detection, resources to cope with additional workload, support to manage perceived information overload, and the optimization of algorithms to minimize unnecessary alerts.
Affiliation(s)
- Alistair Connell
- Centre for Human Health and Performance, University College London, London, United Kingdom; DeepMind Health, London, United Kingdom
- Georgia Black
- Department of Applied Health Research, University College London, London, United Kingdom
- Hugh Montgomery
- Centre for Human Health and Performance, University College London, London, United Kingdom
- Peter Martin
- Department of Applied Health Research, University College London, London, United Kingdom
- Claire Nightingale
- Department of Applied Health Research, University College London, London, United Kingdom; Population Health Research Institute, St. George's, University of London, London, United Kingdom
- Gareth Jones
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Jennifer Cross
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Sarah Stanley
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Mary Emerson
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Charles Merrick
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Geraint Rees
- Faculty of Life Sciences, University College London, London, United Kingdom
- Rosalind Raine
- Department of Applied Health Research, University College London, London, United Kingdom
|
31
|
Connell A, Raine R, Martin P, Barbosa EC, Morris S, Nightingale C, Sadeghi-Alavijeh O, King D, Karthikesalingam A, Hughes C, Back T, Ayoub K, Suleyman M, Jones G, Cross J, Stanley S, Emerson M, Merrick C, Rees G, Montgomery H, Laing C. Implementation of a Digitally Enabled Care Pathway (Part 1): Impact on Clinical Outcomes and Associated Health Care Costs. J Med Internet Res 2019; 21:e13147. [PMID: 31368447 PMCID: PMC6693300 DOI: 10.2196/13147] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 12/14/2018] [Revised: 01/29/2019] [Accepted: 01/30/2019] [Indexed: 01/22/2023] Open
Abstract
Background The development of acute kidney injury (AKI) in hospitalized patients is associated with adverse outcomes and increased health care costs. Simple automated e-alerts indicating its presence do not appear to improve outcomes, perhaps because of a lack of explicitly defined integration with a clinical response. Objective We sought to test this hypothesis by evaluating the impact of a digitally enabled intervention on clinical outcomes and health care costs associated with AKI in hospitalized patients. Methods We developed a care pathway comprising automated AKI detection, mobile clinician notification, in-app triage, and a protocolized specialist clinical response. We evaluated its impact by comparing data from pre- and postimplementation phases (May 2016 to January 2017 and May to September 2017, respectively) at the intervention site and another site not receiving the intervention. Clinical outcomes were analyzed using segmented regression analysis. The primary outcome was recovery of renal function to ≤120% of baseline by hospital discharge. Secondary clinical outcomes were mortality within 30 days of alert, progression of AKI stage, transfer to renal/intensive care units, hospital re-admission within 30 days of discharge, dependence on renal replacement therapy 30 days after discharge, and hospital-wide cardiac arrest rate. Time taken for specialist review of AKI alerts was measured. Impact on health care costs as defined by Patient-Level Information and Costing System data was evaluated using difference-in-differences (DID) analysis. Results The median time to AKI alert review by a specialist was 14.0 min (interquartile range 1.0-60.0 min). There was no impact on the primary outcome (estimated odds ratio [OR] 1.00, 95% CI 0.58-1.71; P=.99). 
Although the hospital-wide cardiac arrest rate fell significantly at the intervention site (OR 0.55, 95% CI 0.38-0.76; P<.001), DID analysis with the comparator site was not significant (OR 1.13, 95% CI 0.63-1.99; P=.69). There was no impact on other secondary clinical outcomes. Mean health care costs per patient were reduced by £2123 (95% CI −£4024 to −£222; P=.03), not including costs of providing the technology. Conclusions The digitally enabled clinical intervention to detect and treat AKI in hospitalized patients reduced health care costs and possibly reduced cardiac arrest rates. Its impact on other clinical outcomes and identification of the active components of the pathway requires clarification through evaluation across multiple sites.
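The difference-in-differences (DID) comparison named in this abstract can be illustrated with a minimal sketch of the basic two-group, two-period estimator. This is only the textbook form of the idea, not the regression model the authors fitted; the function name is hypothetical and the cost figures below are invented purely for illustration and are not taken from the study.

```python
def difference_in_differences(pre_treated: float, post_treated: float,
                              pre_control: float, post_control: float) -> float:
    """Basic DID estimate: change at the intervention site minus
    the change at the comparator site over the same periods."""
    change_treated = post_treated - pre_treated
    change_control = post_control - pre_control
    return change_treated - change_control


# Invented mean costs per patient (NOT study data), for illustration only.
effect = difference_in_differences(
    pre_treated=10000.0, post_treated=9000.0,   # intervention site
    pre_control=10000.0, post_control=10500.0,  # comparator site
)
print(f"DID estimate: {effect:+.0f} per patient")
```

The point of subtracting the comparator site's change is that a fall seen at the intervention site alone can disappear once a background trend shared by both sites is removed, which matches the pattern the abstract reports for cardiac arrest rates.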
Affiliation(s)
- Alistair Connell
- Centre for Human Health and Performance, University College London, London, United Kingdom; DeepMind Health, London, United Kingdom
- Rosalind Raine
- Department of Applied Health Research, University College London, London, United Kingdom
- Peter Martin
- Department of Applied Health Research, University College London, London, United Kingdom
- Estela Capelas Barbosa
- Department of Applied Health Research, University College London, London, United Kingdom
- Stephen Morris
- Department of Applied Health Research, University College London, London, United Kingdom
- Claire Nightingale
- Department of Applied Health Research, University College London, London, United Kingdom; Population Health Research Institute, St George's, University of London, London, United Kingdom
- Gareth Jones
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Jennifer Cross
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Sarah Stanley
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Mary Emerson
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Charles Merrick
- Royal Free London NHS Foundation Trust, London, United Kingdom
- Geraint Rees
- Faculty of Life Sciences, University College London, London, United Kingdom
- Hugh Montgomery
- Centre for Human Health and Performance, University College London, London, United Kingdom
|
32
|
Markar SR, Vidal-Diez A, Sounderajah V, Mackenzie H, Hanna GB, Thompson M, Holt P, Lagergren J, Karthikesalingam A. A population-based cohort study examining the risk of abdominal cancer after endovascular abdominal aortic aneurysm repair. J Vasc Surg 2019; 69:1776-1785.e2. [DOI: 10.1016/j.jvs.2018.09.058] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 05/29/2018] [Accepted: 09/09/2018] [Indexed: 10/27/2022]
|
33
|
Kosmin M, Ledsam J, Romera-Paredes B, Mendes R, Moinuddin S, de Souza D, Gunn L, Kelly C, Hughes C, Karthikesalingam A, Nutting C, Sharma R. Rapid advances in auto-segmentation of organs at risk and target volumes in head and neck cancer. Radiother Oncol 2019; 135:130-140. [DOI: 10.1016/j.radonc.2019.03.004] [Citation(s) in RCA: 52] [Impact Index Per Article: 10.4] [Received: 10/17/2018] [Revised: 02/10/2019] [Accepted: 03/04/2019] [Indexed: 11/25/2022]
|
34
|
Grima MJ, Karthikesalingam A, Holt PJ, Kerr D, Chetter I, Harrison S, Sayers R, Roy I, Vallabhaneni SR, Dominic P, Bachoo P, Griffin J, Lewis D, Hardman J, Rihan A, Brooks M, Woodburn K, Godfrey D, Nordon I, Vidal-Diez A, Stenson K, Bahia S, Patterson B, Oladokun D, De Bruin J, Loftus I, Thompson MM, Lowe C, Ashrafi M, Ghosh J, Ashleigh R. Multicentre Post-EVAR Surveillance Evaluation Study (EVAR-SCREEN). Eur J Vasc Endovasc Surg 2019; 57:521-526. [DOI: 10.1016/j.ejvs.2018.10.032] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 03/10/2017] [Accepted: 10/27/2018] [Indexed: 11/29/2022]
|
35
|
De Fauw J, Ledsam JR, Romera-Paredes B, Nikolov S, Tomasev N, Blackwell S, Askham H, Glorot X, O'Donoghue B, Visentin D, van den Driessche G, Lakshminarayanan B, Meyer C, Mackinder F, Bouton S, Ayoub K, Chopra R, King D, Karthikesalingam A, Hughes CO, Raine R, Hughes J, Sim DA, Egan C, Tufail A, Montgomery H, Hassabis D, Rees G, Back T, Khaw PT, Suleyman M, Cornebise J, Keane PA, Ronneberger O. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med 2018; 24:1342-1350. [PMID: 30104768 DOI: 10.1038/s41591-018-0107-6] [Citation(s) in RCA: 1032] [Impact Index Per Article: 172.0] [Received: 12/19/2017] [Accepted: 06/01/2018] [Indexed: 12/12/2022]
Abstract
The volume and complexity of diagnostic imaging is increasing at a pace faster than the availability of human expertise to interpret it. Artificial intelligence has shown great promise in classifying two-dimensional photographs of some common diseases and typically relies on databases of millions of annotated images. Until now, the challenge of reaching the performance of expert clinicians in a real-world clinical pathway with three-dimensional diagnostic scans has remained unsolved. Here, we apply a novel deep learning architecture to a clinically heterogeneous set of three-dimensional optical coherence tomography scans from patients referred to a major eye hospital. We demonstrate performance in making a referral recommendation that reaches or exceeds that of experts on a range of sight-threatening retinal diseases after training on only 14,884 scans. Moreover, we demonstrate that the tissue segmentations produced by our architecture act as a device-independent representation; referral accuracy is maintained when using tissue segmentations from a different type of device. Our work removes previous barriers to wider clinical use without prohibitive training data requirements across multiple pathologies in a real-world setting.
Affiliation(s)
- Reena Chopra
- NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Cían O Hughes
- DeepMind, London, UK
- University College London, London, UK
- Julian Hughes
- NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Dawn A Sim
- NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Catherine Egan
- NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Adnan Tufail
- NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Peng T Khaw
- NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
- Pearse A Keane
- NIHR Biomedical Research Centre at Moorfields Eye Hospital and UCL Institute of Ophthalmology, London, UK
|
36
|
Grima M, Boufi M, Law M, Jackson D, Stenson K, Patterson B, Loftus I, Thompson M, Karthikesalingam A, Holt P. The Implications of Non-compliance to Endovascular Aneurysm Repair Surveillance: A Systematic Review and Meta-analysis. J Vasc Surg 2018. [DOI: 10.1016/j.jvs.2018.03.402] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/01/2022]
|
37
|
King D, Karthikesalingam A, Hughes C, Montgomery H, Raine R, Rees G. Letter in response to Google DeepMind and healthcare in an age of algorithms. Health Technol 2018. [DOI: 10.1007/s12553-018-0228-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
|
38
|
Karthikesalingam A, Grima MJ, Holt PJ, Vidal-Diez A, Thompson MM, Wanhainen A, Bjorck M, Mani K. Comparative analysis of the outcomes of elective abdominal aortic aneurysm repair in England and Sweden. Br J Surg 2018; 105:520-528. [PMID: 29468657 PMCID: PMC5900926 DOI: 10.1002/bjs.10749] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 04/12/2017] [Revised: 06/21/2017] [Accepted: 10/09/2017] [Indexed: 12/04/2022]
Abstract
Background There is substantial international variation in mortality after abdominal aortic aneurysm (AAA) repair; many non‐operative factors influence risk‐adjusted outcomes. This study compared 90‐day and 5‐year mortality for patients undergoing elective AAA repair in England and Sweden. Methods Patients were identified from English Hospital Episode Statistics and the Swedish Vascular Registry between 2003 and 2012. Ninety‐day mortality and 5‐year survival were compared after adjustment for age and sex. Separate within‐country analyses were performed to examine the impact of co‐morbidity, hospital teaching status and hospital annual caseload. Results The study included 36 249 patients who had AAA treatment in England, with a median age of 74 (i.q.r. 69–79) years, of whom 87·2 per cent were men. There were 7806 patients treated for AAA in Sweden, with a median age of 73 (i.q.r. 68–78) years, of whom 82·9 per cent were men. Ninety‐day mortality rates were poorer in England than in Sweden (5·0 versus 3·9 per cent respectively; P < 0·001), but were not significantly different after 2007. Five‐year survival was poorer in England (70·5 versus 72·8 per cent; P < 0·001). Use of EVAR was initially lower in England, but surpassed that in Sweden after 2010. In both countries, poor outcome was associated with increased age. In England, institutions with higher operative annual volume had lower mortality rates. Conclusion Mortality for elective AAA repair was initially poorer in England than Sweden, but improved over time alongside greater uptake of EVAR, and now there is no difference. Centres performing a greater proportion of EVAR procedures achieved better results in England. Improving in England
Affiliation(s)
- A Karthikesalingam
- St George's Vascular Institute, St George's University of London, London, UK; Molecular and Clinical Sciences Research Institute, St George's University of London, London, UK
- M J Grima
- St George's Vascular Institute, St George's University of London, London, UK; Molecular and Clinical Sciences Research Institute, St George's University of London, London, UK
- P J Holt
- St George's Vascular Institute, St George's University of London, London, UK; Molecular and Clinical Sciences Research Institute, St George's University of London, London, UK
- A Vidal-Diez
- St George's Vascular Institute, St George's University of London, London, UK; Population Health Research Institute, St George's University of London, London, UK
- M M Thompson
- St George's Vascular Institute, St George's University of London, London, UK
- A Wanhainen
- Department of Surgical Sciences, Section of Vascular Surgery, Uppsala University, Uppsala, Sweden
- M Bjorck
- Department of Surgical Sciences, Section of Vascular Surgery, Uppsala University, Uppsala, Sweden
- K Mani
- Department of Surgical Sciences, Section of Vascular Surgery, Uppsala University, Uppsala, Sweden
|
39
|
Grima MJ, Boufi M, Law M, Jackson D, Stenson K, Patterson B, Loftus I, Thompson M, Karthikesalingam A, Holt P. Editor's Choice - The Implications of Non-compliance to Endovascular Aneurysm Repair Surveillance: A Systematic Review and Meta-analysis. Eur J Vasc Endovasc Surg 2018; 55:492-502. [PMID: 29307756 PMCID: PMC6481561 DOI: 10.1016/j.ejvs.2017.11.030] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 03/10/2017] [Accepted: 11/27/2017] [Indexed: 10/25/2022]
Abstract
OBJECTIVE/BACKGROUND Increasingly, reports show that compliance rates with endovascular aneurysm repair (EVAR) surveillance are often suboptimal. The aim of this study was to determine the safety implications of non-compliance with surveillance. METHODS The study was carried out according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. An electronic search was undertaken by two independent authors using Embase, MEDLINE, Cochrane, and Web of Science databases from 1990 to July 2017. Only studies that analysed infrarenal EVAR and had a definition of non-compliance described as weeks or months without imaging surveillance were analysed. Meta-analysis was carried out using the random-effects model and restricted maximum likelihood estimation. RESULTS Thirteen articles (40,730 patients) were eligible for systematic review; of these, seven studies (14,311 patients) were appropriate for comparative meta-analyses of mortality rates. Three studies (8316 patients) were eligible for the comparative meta-analyses of re-intervention rates after EVAR and four studies (12,995 patients) were eligible for meta-analysis of abdominal aortic aneurysm related mortality (ARM). The estimated average non-compliance rate was 42.0% (95% confidence interval [CI] 28-56%). Although there is some evidence that non-compliant patients have better survival rates, there was no statistically significant difference in all cause mortality rates (year 1: odds ratio [OR] 5.77, 95% CI 0.74-45.14; year 3: OR 2.28, 95% CI 0.92-5.66; year 5: OR 1.81, 95% CI 0.88-3.74) and ARM (OR 1.47, 95% CI 0.99-2.19) between compliant and non-compliant patients in the first 5 years after EVAR. The re-intervention rate was statistically significantly higher in compliant patients from 3 to 5 years after EVAR (year 1: OR 6.36, 95% CI 0.23-172.73; year 3: OR 3.94, 95% CI 1.46-10.69; year 5: OR 5.34, 95% CI 1.87-15.29). 
CONCLUSION This systematic review and meta-analysis suggests that patients compliant with EVAR surveillance programmes may have an increased re-intervention rate but do not appear to have better survival rates than non-compliant patients.
Affiliation(s)
- Matthew Joe Grima
- St George's Vascular Institute, St George's Hospital, NHS Foundation Trust, London, UK; Molecular and Clinical Sciences Research Institute, St George's, University of London, London, UK
- Mourad Boufi
- St George's Vascular Institute, St George's Hospital, NHS Foundation Trust, London, UK; Aix-Marseille Université, CNRS, IRPHE UMR 7342, Marseille, France; APHM, Department of Vascular Surgery, University Hospital Nord, Marseille, France
- Martin Law
- MRC Biostatistics Unit, Institute of Public Health, Cambridge, UK
- Dan Jackson
- MRC Biostatistics Unit, Institute of Public Health, Cambridge, UK
- Kate Stenson
- St George's Vascular Institute, St George's Hospital, NHS Foundation Trust, London, UK; Molecular and Clinical Sciences Research Institute, St George's, University of London, London, UK
- Benjamin Patterson
- St George's Vascular Institute, St George's Hospital, NHS Foundation Trust, London, UK; Molecular and Clinical Sciences Research Institute, St George's, University of London, London, UK
- Ian Loftus
- St George's Vascular Institute, St George's Hospital, NHS Foundation Trust, London, UK; Molecular and Clinical Sciences Research Institute, St George's, University of London, London, UK
- Matt Thompson
- St George's Vascular Institute, St George's Hospital, NHS Foundation Trust, London, UK; Molecular and Clinical Sciences Research Institute, St George's, University of London, London, UK
- Alan Karthikesalingam
- St George's Vascular Institute, St George's Hospital, NHS Foundation Trust, London, UK; Molecular and Clinical Sciences Research Institute, St George's, University of London, London, UK
- Peter Holt
- St George's Vascular Institute, St George's Hospital, NHS Foundation Trust, London, UK; Molecular and Clinical Sciences Research Institute, St George's, University of London, London, UK
|
40
|
Markar SR, Mackenzie H, Wiggins T, Askari A, Karthikesalingam A, Faiz O, Griffin SM, Birkmeyer JD, Hanna GB. Influence of national centralization of oesophagogastric cancer on management and clinical outcome from emergency upper gastrointestinal conditions. Br J Surg 2017; 105:113-120. [DOI: 10.1002/bjs.10640] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 03/06/2017] [Revised: 04/27/2017] [Accepted: 06/07/2017] [Indexed: 01/19/2023]
Abstract
Background
In England in 2001 oesophagogastric cancer surgery was centralized. The aim of this study was to evaluate whether centralization of oesophagogastric cancer to high-volume centres has had an effect on mortality from different emergency upper gastrointestinal conditions.
Methods
The Hospital Episode Statistics database was used to identify patients admitted to hospitals in England (1997–2012). The influence of oesophagogastric high-volume cancer centre status (20 or more resections per year) on 30- and 90-day mortality from oesophageal perforation, paraoesophageal hernia and perforated peptic ulcer was analysed.
Results
Over the study interval, 3707, 12 441 and 56 822 patients with oesophageal perforation, paraoesophageal hernia and perforated peptic ulcer respectively were included. There was a passive centralization to high-volume cancer centres for oesophageal perforation (26·9 per cent increase), paraoesophageal hernia (19·5 per cent increase) and perforated peptic ulcer (23·0 per cent increase). Management of oesophageal perforation in high-volume centres was associated with a reduction in 30-day (HR 0·58, 95 per cent c.i. 0·45 to 0·74) and 90-day (HR 0·62, 0·49 to 0·77) mortality. High-volume cancer centre status did not affect mortality from paraoesophageal hernia or perforated peptic ulcer. Annual emergency admission volume thresholds at which mortality improved were observed for oesophageal perforation (5 patients) and paraoesophageal hernia (11). Following centralization, the proportion of patients managed in high-volume cancer centres that reached this volume threshold was 88·0 per cent for oesophageal perforation, but only 30·3 per cent for paraoesophageal hernia.
Conclusion
Centralization of low incidence conditions such as oesophageal perforation to high-volume cancer centres provides a greater level of expertise and ultimately reduces mortality.
Affiliation(s)
- S R Markar
- Department of Surgery and Cancer, Imperial College London, London, UK
- H Mackenzie
- Department of Surgery and Cancer, Imperial College London, London, UK
- T Wiggins
- Department of Surgery and Cancer, Imperial College London, London, UK
- A Askari
- Department of Surgery and Cancer, Imperial College London, London, UK
- St Mark's Hospital and Academic Institute, Harrow, UK
- A Karthikesalingam
- St George's Vascular Institute, St George's, University of London, London, UK
- O Faiz
- Department of Surgery and Cancer, Imperial College London, London, UK
- St Mark's Hospital and Academic Institute, Harrow, UK
- S M Griffin
- Northern Oesophago-Gastric Unit, Royal Victoria Infirmary, Newcastle upon Tyne, UK
- J D Birkmeyer
- Dartmouth Institute for Health Policy and Clinical Practice, Lebanon, New Hampshire, USA
- G B Hanna
- Department of Surgery and Cancer, Imperial College London, London, UK
|
41
|
Grima M, Karthikesalingam A, Holt P, Vidal-Diez A, Thompson M, Wanhainen A, Bjorck M, Mani K. Comparative Analysis of the Outcomes of Elective Abdominal Aortic Aneurysm Repair in England and Sweden: Context for Contemporary Practice. Eur J Vasc Endovasc Surg 2017. [DOI: 10.1016/j.ejvs.2017.08.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
42
|
Attallah O, Karthikesalingam A, Holt PJ, Thompson MM, Sayers R, Bown MJ, Choke EC, Ma X. Using multiple classifiers for predicting the risk of endovascular aortic aneurysm repair re-intervention through hybrid feature selection. Proc Inst Mech Eng H 2017; 231:1048-1063. [PMID: 28925817 DOI: 10.1177/0954411917731592] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 11/15/2022]
Abstract
Feature selection is essential in the medical area; however, the process becomes complicated in the presence of censoring, which is the unique characteristic of survival analysis. Most survival feature selection methods are based on Cox's proportional hazard model, though machine learning classifiers are preferred. They are less employed in survival analysis because censoring prevents them from being applied directly to survival data. Among the few works that employed machine learning classifiers, the partial logistic artificial neural network with auto-relevance determination is a well-known method that deals with censoring and performs feature selection for survival data. However, it depends on data replication to handle censoring, which leads to unbalanced and biased prediction results, especially in highly censored data. Other methods cannot deal with high censoring. Therefore, in this article, a new hybrid feature selection method is proposed which presents a solution to high-level censoring. It combines support vector machine, neural network, and K-nearest neighbor classifiers using simple majority voting and a new weighted majority voting method based on a survival metric to construct a multiple classifier system. The new hybrid feature selection process uses the multiple classifier system as a wrapper method and merges it with an iterated feature ranking filter method to further reduce features. Two endovascular aortic repair datasets containing 91% censored patients collected from two centers were used to construct a multicenter study to evaluate the performance of the proposed approach. The results showed that the proposed technique outperformed individual classifiers and variable selection methods based on Cox's model, such as the Akaike and Bayesian information criteria and the least absolute shrinkage and selection operator, in p values of the log-rank test, sensitivity, and concordance index. This indicates that the proposed classifier is more powerful in correctly predicting the risk of re-intervention, enabling doctors to select patients' future follow-up plans.
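The two ways of combining classifier outputs mentioned in this abstract, simple and weighted majority voting, can be sketched generically as below. This is not the authors' implementation: the function names are hypothetical, and the weights are an assumption standing in for whatever survival-based metric the paper uses (for instance, each classifier's concordance index on validation data).

```python
from collections import defaultdict
from typing import Hashable, Sequence


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label chosen by the most classifiers.

    Ties are broken in favour of the smallest label.
    """
    counts = defaultdict(int)
    for label in predictions:
        counts[label] += 1
    return max(sorted(counts), key=lambda label: counts[label])


def weighted_majority_vote(predictions: Sequence[Hashable],
                           weights: Sequence[float]) -> Hashable:
    """Return the label with the largest total weight.

    Each classifier's vote counts in proportion to its weight
    (assumed here to be a validation score; this is an illustration only).
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical votes from three classifiers (e.g. SVM, neural network, k-NN):
# 1 = high risk of re-intervention, 0 = low risk.
votes = [1, 0, 0]
print(majority_vote(votes))                            # two of three vote 0
print(weighted_majority_vote(votes, [0.9, 0.3, 0.3]))  # the heavier vote wins
```

The example shows why the two schemes can disagree: a single well-performing classifier can outvote two weaker ones once votes are weighted.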
Affiliation(s)
- Omneya Attallah
- 1 Department of Electronics and Communications, College of Engineering and Technology, Arab Academy for Science and Technology, Alexandria, Egypt.,2 School of Engineering and Applied Science, Aston University, Birmingham, UK
| | - Alan Karthikesalingam
- 3 St George's Vascular Institute, St George's University Hospitals NHS Foundation Trust, London, UK
| | - Peter Je Holt
- 3 St George's Vascular Institute, St George's University Hospitals NHS Foundation Trust, London, UK
| | - Matthew M Thompson
- 3 St George's Vascular Institute, St George's University Hospitals NHS Foundation Trust, London, UK
| | - Rob Sayers
- 4 NIHR Leicester Cardiovascular Biomedical Research Unit and Department of Cardiovascular Sciences, University of Leicester, Leicester, UK
| | - Matthew J Bown
- 4 NIHR Leicester Cardiovascular Biomedical Research Unit and Department of Cardiovascular Sciences, University of Leicester, Leicester, UK
| | - Eddie C Choke
- 4 NIHR Leicester Cardiovascular Biomedical Research Unit and Department of Cardiovascular Sciences, University of Leicester, Leicester, UK
| | - Xianghong Ma
- 2 School of Engineering and Applied Science, Aston University, Birmingham, UK
|
43
|
Attallah O, Karthikesalingam A, Holt PJE, Thompson MM, Sayers R, Bown MJ, Choke EC, Ma X. Feature selection through validation and un-censoring of endovascular repair survival data for predicting the risk of re-intervention. BMC Med Inform Decis Mak 2017; 17:115. [PMID: 28774329 PMCID: PMC5543447 DOI: 10.1186/s12911-017-0508-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 07/05/2016] [Accepted: 07/24/2017] [Indexed: 12/25/2022] Open
Abstract
Background The feature selection (FS) process is essential in the medical field, as it reduces the effort and time needed for physicians to measure unnecessary features. Choosing useful variables is a difficult task in the presence of censoring, which is the distinguishing characteristic of survival analysis. Most survival FS methods depend on Cox’s proportional hazards model; however, machine learning techniques (MLT) are preferred but not commonly used due to censoring. Techniques that have been proposed to adapt MLT to perform FS with survival data cannot be used with a high level of censoring. The researcher’s previous publications proposed a technique to deal with a high level of censoring. It also used existing FS techniques to reduce dataset dimension. In this paper, however, a new FS technique is proposed and combined with feature transformation and the proposed uncensoring approaches to select a reduced set of features and produce a stable predictive model. Methods In this paper, an FS technique based on artificial neural network (ANN) MLT is proposed to deal with highly censored Endovascular Aortic Repair (EVAR) survival data. EVAR datasets were collected from 2004 to 2010 from two vascular centers in order to produce a final stable model. They contain almost 91% censored patients. The proposed approach used a wrapper FS method with ANN to select a reduced subset of features that predict the risk of EVAR re-intervention after 5 years for patients from two different centers located in the United Kingdom, to allow it to be potentially applied to cross-center predictions. The proposed model is compared with two popular FS techniques, the Akaike and Bayesian information criteria (AIC, BIC), that are used with Cox’s model. Results The final model outperforms other methods in distinguishing the high and low risk groups; as they both have concordance index and estimated AUC better than the Cox’s model based on AIC, BIC, Lasso, and SCAD approaches. These models have p-values lower than 0.05, meaning that patients in different risk groups can be separated significantly and those who would need re-intervention can be correctly predicted. Conclusion The proposed approach will save the time and effort physicians spend collecting unnecessary variables. The final reduced model was able to predict the long-term risk of aortic complications after EVAR. This predictive model can help clinicians decide on patients’ future observation plans. Electronic supplementary material The online version of this article (doi:10.1186/s12911-017-0508-3) contains supplementary material, which is available to authorized users.
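The concordance index reported in the results can be sketched as follows. This is a plain-Python version of Harrell's C for right-censored data under simplifying assumptions (tied event times are ignored, ties in risk score count as half); the follow-up times, event flags and risk scores are hypothetical and are not taken from the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event and a
    strictly shorter follow-up time than subject j. The pair is concordant
    when the subject with the earlier event also has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: follow-up in years, event flag (1 = re-intervention,
# 0 = censored) and a model risk score for four patients.
times = [2, 4, 6, 8]
events = [1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.1, 0.5]

c_index = concordance_index(times, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering. In heavily censored data, such as the roughly 91% censoring described above, few subjects can anchor a comparable pair, which is one reason the estimate can be unstable.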
Affiliation(s)
- Omneya Attallah
- School of Engineering and Applied Science, Aston University, B4 7ET, Birmingham, UK; Department of Electronics and Communications, College of Engineering and Technology, Arab Academy for Science and Technology, Alexandria, Egypt
| | | | | | | | - Rob Sayers
- St George's Vascular Institute, St George's University Hospitals NHS Foundation Trust, Blackshaw Road, London, SW17 0QT, UK
| | - Matthew J Bown
- Vascular Surgery Group, University of Leicester, Leicester, UK
| | - Eddie C Choke
- Vascular Surgery Group, Robert Kilpatrick Clinical Sciences Building, Leicester Royal Infirmary, University of Leicester, Leicester, LE2 7LX, UK
| | - Xianghong Ma
- School of Engineering and Applied Science, Aston University, B4 7ET, Birmingham, UK.
|
44
|
Connell A, Montgomery H, Morris S, Nightingale C, Stanley S, Emerson M, Jones G, Sadeghi-Alavijeh O, Merrick C, King D, Karthikesalingam A, Hughes C, Ledsam J, Back T, Rees G, Raine R, Laing C. Service evaluation of the implementation of a digitally-enabled care pathway for the recognition and management of acute kidney injury. F1000Res 2017; 6:1033. [PMID: 28751970 PMCID: PMC5510018 DOI: 10.12688/f1000research.11637.2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Accepted: 08/04/2017] [Indexed: 11/27/2022] Open
Abstract
Acute Kidney Injury (AKI), an abrupt deterioration in kidney function, is defined by changes in urine output or serum creatinine. AKI is common (affecting up to 20% of acute hospital admissions in the United Kingdom), associated with significant morbidity and mortality, and expensive (excess costs to the National Health Service in England alone may exceed £1 billion per year). NHS England has mandated the implementation of an automated algorithm to detect AKI based on changes in serum creatinine, and to alert clinicians. It is uncertain, however, whether ‘alerting’ alone improves care quality. We have thus developed a digitally-enabled care pathway as a clinical service to inpatients in the Royal Free Hospital (RFH), a large London hospital. This pathway incorporates a mobile software application - the “Streams-AKI” app, developed by DeepMind Health - that applies the NHS AKI algorithm to routinely collected serum creatinine data in hospital inpatients. Streams-AKI alerts clinicians to potential AKI cases, furnishing them with a trend view of kidney function alongside other relevant data, in real-time, on a mobile device. A clinical response team comprising nephrologists and critical care nurses responds to these AKI alerts by reviewing individual patients and administering interventions according to existing clinical practice guidelines. We propose a mixed methods service evaluation of the implementation of this care pathway. This evaluation will assess how the care pathway meets the health and care needs of service users (RFH inpatients), in terms of clinical outcome, processes of care, and NHS costs. It will also seek to assess acceptance of the pathway by members of the response team and wider hospital community. All analyses will be undertaken by the service evaluation team from UCL (Department of Applied Health Research) and St George’s, University of London (Population Health Research Institute).
Affiliation(s)
- Alistair Connell
- Centre for Human Health and Performance, University College London, 170 Tottenham Court Road, London, W1T 7HA, UK; Institute of Sport, Exercise and Health, London, W1T 7HA, UK
| | - Hugh Montgomery
- Centre for Human Health and Performance, University College London, 170 Tottenham Court Road, London, W1T 7HA, UK; Institute of Sport, Exercise and Health, London, W1T 7HA, UK
| | - Stephen Morris
- Department of Applied Health Research, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
| | - Claire Nightingale
- Department of Applied Health Research, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK; Population Health Research Institute, St George's, University of London, Cranmer Terrace, London, SW17 0RE, UK
| | - Sarah Stanley
- Royal Free London NHS Foundation Trust, Pond Street, London, NW3 2QG, UK
| | - Mary Emerson
- Royal Free London NHS Foundation Trust, Pond Street, London, NW3 2QG, UK
| | - Gareth Jones
- Royal Free London NHS Foundation Trust, Pond Street, London, NW3 2QG, UK
| | | | - Charles Merrick
- Royal Free London NHS Foundation Trust, Pond Street, London, NW3 2QG, UK
| | - Dominic King
- DeepMind Health, 5 New Street Square, London, EC4A 3TW, UK
| | | | - Cian Hughes
- DeepMind Health, 5 New Street Square, London, EC4A 3TW, UK
| | - Joseph Ledsam
- DeepMind Health, 5 New Street Square, London, EC4A 3TW, UK
| | - Trevor Back
- DeepMind Health, 5 New Street Square, London, EC4A 3TW, UK
| | - Geraint Rees
- University College London, Gower Street, London, WC1E 6BT, UK
| | - Rosalind Raine
- Department of Applied Health Research, University College London, 1-19 Torrington Place, London, WC1E 7HB, UK
| | - Christopher Laing
- Royal Free London NHS Foundation Trust, Pond Street, London, NW3 2QG, UK
|
45
|
Patel SR, Allen C, Grima MJ, Brownrigg JRW, Patterson BO, Holt PJE, Thompson MM, Karthikesalingam A. A Systematic Review of Predictors of Reintervention After EVAR: Guidance for Risk-Stratified Surveillance. Vasc Endovascular Surg 2017; 51:417-428. [PMID: 28656809 DOI: 10.1177/1538574417712648] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Indexed: 11/17/2022]
Abstract
BACKGROUND Current surveillance protocols after endovascular aneurysm repair (EVAR) are ineffective and costly. Stratifying surveillance by individual risk of reintervention requires an understanding of the factors involved in developing post-EVAR complications. This systematic review assessed risk factors for reintervention after EVAR and proposals for stratified surveillance. METHODS A systematic search according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines was performed using EMBASE and MEDLINE databases to identify studies reporting on risk factors predicting reintervention after EVAR and proposals for stratified surveillance. RESULTS Twenty-nine studies reporting on 39 898 patients met the primary inclusion criteria for reporting predictors of reintervention or aortic complications with or without suggestions for stratified surveillance. Five secondary studies described external validation of risk scores for reintervention or aortic complications. There was great heterogeneity in reporting risk factors identified at the pre-EVAR, intraoperative, and post-EVAR stages of treatment, although large preoperative abdominal aortic aneurysm diameter was the most commonly observed risk factor for reintervention after EVAR. CONCLUSION Existing data on predictors of post-EVAR complications are generally of poor quality and largely derived from retrospective studies. Few studies describing suggestions for stratified surveillance have been subjected to external validation. There is a need to refine risk prediction for EVAR failure and to conduct prospective comparative studies of personalized surveillance with standard practice.
Affiliation(s)
- Shaneel R Patel
- 1 Department of Outcomes Research, St George's Vascular Institute, St George's Hospital NHS Trust, London, United Kingdom
| | - Chris Allen
- 1 Department of Outcomes Research, St George's Vascular Institute, St George's Hospital NHS Trust, London, United Kingdom
| | - Matthew J Grima
- 1 Department of Outcomes Research, St George's Vascular Institute, St George's Hospital NHS Trust, London, United Kingdom
| | - Jack R W Brownrigg
- 1 Department of Outcomes Research, St George's Vascular Institute, St George's Hospital NHS Trust, London, United Kingdom
| | - Benjamin O Patterson
- 1 Department of Outcomes Research, St George's Vascular Institute, St George's Hospital NHS Trust, London, United Kingdom
| | - Peter J E Holt
- 1 Department of Outcomes Research, St George's Vascular Institute, St George's Hospital NHS Trust, London, United Kingdom
| | - Matt M Thompson
- 1 Department of Outcomes Research, St George's Vascular Institute, St George's Hospital NHS Trust, London, United Kingdom
| | - Alan Karthikesalingam
- 1 Department of Outcomes Research, St George's Vascular Institute, St George's Hospital NHS Trust, London, United Kingdom
|
46
|
Grima MJ, Karthikesalingam A. Type II endoleaks: when and how. J Cardiovasc Surg (Torino) 2017. [PMID: 28627864 DOI: 10.23736/s0021-9509.17.10072-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Although most type II endoleaks are self-limiting, the most common indication for secondary intervention after endovascular aneurysm repair (EVAR) is type II endoleak. However, it is still debatable when to treat them. Furthermore, different intervention techniques are available to treat type II endoleaks. The aim of this review is to look at current evidence and updates on type II endoleaks after EVAR for abdominal aortic aneurysm and their management.
Affiliation(s)
- Matthew J Grima
- St George's Vascular Institute, St George's University of London, London, UK
|
47
|
Boufi M, Patterson BO, Grima MJ, Karthikesalingam A, Hudda MT, Holt PJ, Loftus IM, Thompson MM. Systematic Review of Reintervention After Thoracic Endovascular Repair for Chronic Type B Dissection. Ann Thorac Surg 2017; 103:1992-2004. [DOI: 10.1016/j.athoracsur.2016.12.036] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/14/2016] [Revised: 12/14/2016] [Accepted: 12/19/2016] [Indexed: 10/19/2022]
|
48
|
Karthikesalingam A, Vidal-Diez A, Holt PJ, Loftus IM, Schermerhorn ML, Soden PA, Landon BE, Thompson MM. Thresholds for Abdominal Aortic Aneurysm Repair in England and the United States. N Engl J Med 2016; 375:2051-2059. [PMID: 27959727 PMCID: PMC5177793 DOI: 10.1056/nejmoa1600931] [Citation(s) in RCA: 103] [Impact Index Per Article: 12.9] [Indexed: 11/19/2022]
Abstract
BACKGROUND Thresholds for repair of abdominal aortic aneurysms vary considerably among countries. METHODS We examined differences between England and the United States in the frequency of aneurysm repair, the mean aneurysm diameter at the time of the procedure, and rates of aneurysm rupture and aneurysm-related death. Data on the frequency of repair of intact (nonruptured) abdominal aortic aneurysms, in-hospital mortality among patients who had undergone aneurysm repair, and rates of aneurysm rupture during the period from 2005 through 2012 were extracted from the Hospital Episode Statistics database in England and the U.S. Nationwide Inpatient Sample. Data on the aneurysm diameter at the time of repair were extracted from the U.K. National Vascular Registry (2014 data) and from the U.S. National Surgical Quality Improvement Program (2013 data). Aneurysm-related mortality during the period from 2005 through 2012 was determined from data obtained from the Centers for Disease Control and Prevention and the U.K. Office of National Statistics. Data were adjusted with the use of direct standardization or conditional logistic regression for differences between England and the United States with respect to population age and sex. RESULTS During the period from 2005 through 2012, a total of 29,300 patients in England and 278,921 patients in the United States underwent repair of intact abdominal aortic aneurysms. Aneurysm repair was less common in England than in the United States (odds ratio, 0.49; 95% confidence interval [CI], 0.48 to 0.49; P<0.001), and aneurysm-related death was more common in England than in the United States (odds ratio, 3.60; 95% CI, 3.55 to 3.64; P<0.001). Hospitalization due to an aneurysm rupture occurred more frequently in England than in the United States (odds ratio, 2.23; 95% CI, 2.19 to 2.27; P<0.001), and the mean aneurysm diameter at the time of repair was larger in England (63.7 mm vs. 58.3 mm, P<0.001). 
CONCLUSIONS We found a lower rate of repair of abdominal aortic aneurysms and a larger mean aneurysm diameter at the time of repair in England than in the United States and lower rates of aneurysm rupture and aneurysm-related death in the United States than in England. (Funded by the Circulation Foundation and others.).
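For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the sketch below computes a crude (unadjusted) odds ratio with a Wald interval from a 2x2 table. It does not reproduce the study's analysis, which adjusted for age and sex using direct standardization or conditional logistic regression, and the counts are hypothetical rather than taken from the paper.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Wald confidence interval for a 2x2 table.

    a: group 1 with the outcome      b: group 1 without the outcome
    c: group 2 with the outcome      d: group 2 without the outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
or_value, ci_lower, ci_upper = odds_ratio_wald_ci(20, 80, 10, 90)
print(round(or_value, 2), round(ci_lower, 2), round(ci_upper, 2))
```

The very narrow intervals in the abstract (for example 0.48 to 0.49) reflect the large national datasets involved: the standard error of the log odds ratio shrinks as the cell counts grow.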
Affiliation(s)
- Alan Karthikesalingam
- From St. George's Vascular Institute, St. George's University of London, London (A.K., A.V.-D., P.J.H., I.M.L., M.M.T.); and the Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center and Harvard Medical School (M.L.S., P.A.S.), and the Department of Health Care Policy, Harvard Medical School (B.E.L.) - both in Boston
| | - Alberto Vidal-Diez
- From St. George's Vascular Institute, St. George's University of London, London (A.K., A.V.-D., P.J.H., I.M.L., M.M.T.); and the Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center and Harvard Medical School (M.L.S., P.A.S.), and the Department of Health Care Policy, Harvard Medical School (B.E.L.) - both in Boston
| | - Peter J Holt
- From St. George's Vascular Institute, St. George's University of London, London (A.K., A.V.-D., P.J.H., I.M.L., M.M.T.); and the Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center and Harvard Medical School (M.L.S., P.A.S.), and the Department of Health Care Policy, Harvard Medical School (B.E.L.) - both in Boston
| | - Ian M Loftus
- From St. George's Vascular Institute, St. George's University of London, London (A.K., A.V.-D., P.J.H., I.M.L., M.M.T.); and the Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center and Harvard Medical School (M.L.S., P.A.S.), and the Department of Health Care Policy, Harvard Medical School (B.E.L.) - both in Boston
| | - Marc L Schermerhorn
- From St. George's Vascular Institute, St. George's University of London, London (A.K., A.V.-D., P.J.H., I.M.L., M.M.T.); and the Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center and Harvard Medical School (M.L.S., P.A.S.), and the Department of Health Care Policy, Harvard Medical School (B.E.L.) - both in Boston
| | - Peter A Soden
- From St. George's Vascular Institute, St. George's University of London, London (A.K., A.V.-D., P.J.H., I.M.L., M.M.T.); and the Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center and Harvard Medical School (M.L.S., P.A.S.), and the Department of Health Care Policy, Harvard Medical School (B.E.L.) - both in Boston
| | - Bruce E Landon
- From St. George's Vascular Institute, St. George's University of London, London (A.K., A.V.-D., P.J.H., I.M.L., M.M.T.); and the Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center and Harvard Medical School (M.L.S., P.A.S.), and the Department of Health Care Policy, Harvard Medical School (B.E.L.) - both in Boston
| | - Matthew M Thompson
- From St. George's Vascular Institute, St. George's University of London, London (A.K., A.V.-D., P.J.H., I.M.L., M.M.T.); and the Division of Vascular and Endovascular Surgery, Beth Israel Deaconess Medical Center and Harvard Medical School (M.L.S., P.A.S.), and the Department of Health Care Policy, Harvard Medical School (B.E.L.) - both in Boston
|
49
|
Brownrigg JRW, Hughes CO, Burleigh D, Karthikesalingam A, Patterson BO, Holt PJ, Thompson MM, de Lusignan S, Ray KK, Hinchliffe RJ. Diabetic microvascular triopathy, smoking, and risk of cardiovascular events - Author's reply. Lancet Diabetes Endocrinol 2016; 4:888-889. [PMID: 27793319 DOI: 10.1016/s2213-8587(16)30262-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2016] [Accepted: 09/09/2016] [Indexed: 11/30/2022]
Affiliation(s)
- Jack R W Brownrigg
- St George's Vascular Institute, St George's University of London, London, UK.
| | - Cian O Hughes
- Division of Surgery and Interventional Science, University College London, London, UK
| | - David Burleigh
- Department of Healthcare Management and Policy, University of Surrey, Guildford, UK
| | | | | | - Peter J Holt
- St George's Vascular Institute, St George's University of London, London, UK
| | - Matthew M Thompson
- St George's Vascular Institute, St George's University of London, London, UK
| | - Simon de Lusignan
- Department of Healthcare Management and Policy, University of Surrey, Guildford, UK
| | - Kausik K Ray
- Department of Primary Care and Public Health, Imperial College London, London, UK
| | - Robert J Hinchliffe
- Bristol Centre for Surgical Research, School of Social and Community Medicine, University of Bristol, Bristol, UK
|
50
|
Lewis TL, Fothergill RT, Karthikesalingam A. Ambulance smartphone tool for field triage of ruptured aortic aneurysms (FILTR): study protocol for a prospective observational validation of diagnostic accuracy. BMJ Open 2016; 6:e011308. [PMID: 27797986 PMCID: PMC5093389 DOI: 10.1136/bmjopen-2016-011308] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/16/2022] Open
Abstract
INTRODUCTION Rupture of an abdominal aortic aneurysm (rAAA) carries a considerable mortality rate and is often fatal. rAAA can be treated through open or endovascular surgical intervention and it is possible that more rapid access to definitive intervention might be a key aspect of improving mortality for rAAA. Diagnosis is not always straightforward with up to 42% of rAAA initially misdiagnosed, introducing potentially harmful delay. There is a need for an effective clinical decision support tool for accurate prehospital diagnosis and triage to enable transfer to an appropriate centre. METHODS AND ANALYSIS Prospective multicentre observational study assessing the diagnostic accuracy of a prehospital smartphone triage tool for detection of rAAA. The study will be conducted across London in conjunction with London Ambulance Service (LAS). A logistic score predicting the risk of rAAA by assessing ten key parameters was developed and retrospectively validated through logistic regression analysis of ambulance records and Hospital Episode Statistics data for 2200 patients from 2005 to 2010. The triage tool is integrated into a secure mobile app for major smartphone platforms. Key parameters collected from the app will be retrospectively matched with final hospital discharge diagnosis for each patient encounter. The primary outcome is to assess the sensitivity, specificity and positive predictive value of the rAAA triage tool logistic score in prospective use as a mobile app for prehospital ambulance clinicians. Data collection started in November 2014 and the study will recruit a minimum of 1150 non-consecutive patients over a time period of 2 years. ETHICS AND DISSEMINATION Full ethical approval has been gained for this study. The results of this study will be disseminated in peer-reviewed publications, and international/national presentations. TRIAL REGISTRATION NUMBER CPMS 16459; pre-results.
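The primary outcome measures named in this protocol (sensitivity, specificity and positive predictive value) can be computed from matched tool outputs and discharge diagnoses as in the sketch below. The function name and the eight example encounters are hypothetical; the protocol does not describe its analysis code, and the score threshold used to call a result positive is not given in the abstract.

```python
def diagnostic_accuracy(predicted, actual):
    """Sensitivity, specificity and PPV from paired binary labels.

    predicted: 1 if the triage tool flagged the case, else 0
    actual:    1 if the final discharge diagnosis confirmed it, else 0
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)

    def ratio(numerator, denominator):
        # None signals that the measure is undefined for this sample.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # flagged among true cases
        "specificity": ratio(tn, tn + fp),  # not flagged among non-cases
        "ppv": ratio(tp, tp + fp),          # true cases among flagged
    }


# Hypothetical tool outputs and discharge diagnoses for eight encounters.
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
actual = [1, 0, 0, 0, 1, 1, 0, 0]

metrics = diagnostic_accuracy(predicted, actual)
print(metrics)
```

Positive predictive value, unlike sensitivity and specificity, depends on how common the condition is in the sampled population, so a value measured in one ambulance service may not transfer to another setting.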
Affiliation(s)
- Thomas L Lewis
- St George's Vascular Institute, St George's University of London, London, UK
| | - Rachael T Fothergill
- Clinical Audit & Research Unit, London Ambulance Service NHS Trust, 8-20 Pocock Street, London, UK
| | | | - Alan Karthikesalingam
- St George's Vascular Institute, St George's University of London, London, UK
- Cardiovascular and Cell Sciences Institute, St George's University of London, London, UK
|