1
Velupillai S, Suominen H, Liakata M, Roberts A, Shah AD, Morley K, Osborn D, Hayes J, Stewart R, Downs J, Chapman W, Dutta R. Using clinical Natural Language Processing for health outcomes research: Overview and actionable suggestions for future advances. J Biomed Inform 2018;88:11-19. [PMID: 30368002] [PMCID: PMC6986921] [DOI: 10.1016/j.jbi.2018.10.005]
Abstract
The importance of incorporating Natural Language Processing (NLP) methods in clinical informatics research has been increasingly recognized in recent years, and has led to transformative advances. Typically, clinical NLP systems are developed and evaluated on word-, sentence-, or document-level annotations that model specific attributes and features, such as document content (e.g., patient status, or report type), document section types (e.g., current medications, past medical history, or discharge summary), named entities and concepts (e.g., diagnoses, symptoms, or treatments) or semantic attributes (e.g., negation, severity, or temporality). From a clinical perspective, on the other hand, research studies are typically modelled and evaluated at a patient or population level, such as predicting how a patient group might respond to specific treatments, or monitoring patients over time. While some NLP tasks consider predictions at the individual or group user level, these tasks still constitute a minority. Owing to the discrepancy between the scientific objectives of the two fields, and because of differences in methodological evaluation priorities, there is no clear alignment between these evaluation approaches. Here we provide a broad summary and outline of the challenging issues involved in defining appropriate intrinsic and extrinsic evaluation methods for NLP research that is to be used for clinical outcomes research, and vice versa. A particular focus is placed on mental health research, an area still relatively understudied by the clinical NLP research community, but where NLP methods are of notable relevance. Recent advances in clinical NLP method development have been significant, but we propose that more emphasis be placed on rigorous evaluation for the field to advance further. To enable this, we provide actionable suggestions, including a minimal protocol that could be used when reporting clinical NLP method development and its evaluation.
Affiliation(s)
- Sumithra Velupillai
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; School of Electrical Engineering and Computer Science, KTH, Stockholm, Sweden.
- Hanna Suominen
- College of Engineering and Computer Science, The Australian National University, Data61/CSIRO, University of Canberra, Australia; University of Turku, Finland.
- Maria Liakata
- Department of Computer Science, University of Warwick/Alan Turing Institute, UK.
- Angus Roberts
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK.
- Anoop D Shah
- Institute of Health Informatics, University College London, UK; University College London NHS Foundation Trust, London, UK.
- Katherine Morley
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; Melbourne School of Population and Global Health, The University of Melbourne, Australia.
- David Osborn
- Division of Psychiatry, University College London, UK; Camden and Islington NHS Foundation Trust, London, UK.
- Joseph Hayes
- Division of Psychiatry, University College London, UK; Camden and Islington NHS Foundation Trust, London, UK.
- Robert Stewart
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; South London and Maudsley NHS Foundation Trust, London, UK.
- Johnny Downs
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; South London and Maudsley NHS Foundation Trust, London, UK.
- Wendy Chapman
- Department of Biomedical Informatics, University of Utah, United States.
- Rina Dutta
- Institute of Psychiatry, Psychology & Neuroscience, King's College London, UK; South London and Maudsley NHS Foundation Trust, London, UK.
2
Abstract
New machine learning methods to analyze raw chemical and biological data are now widely accessible as open-source toolkits. This positions researchers to leverage powerful, predictive models in their own domains. We caution, however, that the application of machine learning to experimental research merits careful consideration. Machine learning algorithms readily exploit confounding variables and experimental artifacts instead of relevant patterns, leading to overoptimistic performance and poor model generalization. In parallel to the strong control experiments that remain a cornerstone of experimental research, we advance the concept of adversarial controls for scientific machine learning: the design of exacting and purposeful experiments to ensure that predictive performance arises from meaningful models.
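The adversarial-control idea summarized above can be illustrated with a label-permutation (y-scrambling) test, one common control of this kind: retrain the model on labels decoupled from the features, and treat any surviving "performance" as evidence of leakage or overfitting rather than a meaningful model. This sketch is illustrative only (the dataset, model, and thresholds are not from the paper):

```python
# Label-permutation control: compare cross-validated accuracy on real
# labels vs. labels shuffled to destroy any feature-label relationship.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
# Synthetic signal: the label depends (noisily) on feature 0 only.
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

model = LogisticRegression(max_iter=1000)
real = cross_val_score(model, X, y, cv=5).mean()

# Adversarial control: permuted labels carry no real signal, so a sound
# pipeline should score near chance (0.5) here.
y_perm = rng.permutation(y)
perm = cross_val_score(model, X, y_perm, cv=5).mean()

print(f"real labels CV accuracy:     {real:.2f}")
print(f"permuted labels CV accuracy: {perm:.2f}")
```

If the permuted-label score is well above chance, the evaluation itself is suspect, which is precisely the failure mode the abstract warns about.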
Affiliation(s)
- Kangway V. Chuang
- Department of Pharmaceutical Chemistry, Department of Bioengineering and Therapeutic Sciences, Institute for Neurodegenerative Diseases and Bakar Institute for Computational Health Sciences, University of California—San Francisco, 675 Nelson Rising Lane, San Francisco, California 94158, United States
- Michael J. Keiser
- Department of Pharmaceutical Chemistry, Department of Bioengineering and Therapeutic Sciences, Institute for Neurodegenerative Diseases and Bakar Institute for Computational Health Sciences, University of California—San Francisco, 675 Nelson Rising Lane, San Francisco, California 94158, United States
3
Berry PM, Donneau-Golencer T, Duong K, Gervasio M, Peintner B, Yorke-Smith N. Evaluating intelligent knowledge systems: experiences with a user-adaptive assistant agent. Knowl Inf Syst 2016. [DOI: 10.1007/s10115-016-1011-3]
10
Conrath DW, Sharma RS. Toward a diagnostic instrument for assessing the quality of expert systems. Data Base for Advances in Information Systems 1992. [DOI: 10.1145/134347.134357]
Abstract
In this article we discuss the problem of evaluating the quality of expert systems. The solution we propose is a diagnostic instrument that assesses the relative strengths and weaknesses of an operational expert system. Such a tool is designed to identify the characteristics and features that determine the system's quality (or lack of it) as a means of supporting the knowledge engineering process. Our approach is based on underlying concepts from socio-technical theory, and the instrument has been subjected to a rigorous empirical validation methodology.