1. Laursen MS, Pedersen JS, Hansen RS, Savarimuthu TR, Lynggaard RB, Vinholt PJ. Doctors Identify Hemorrhage Better during Chart Review when Assisted by Artificial Intelligence. Appl Clin Inform 2023;14:743-751. PMID: 37399838; PMCID: PMC10511273; DOI: 10.1055/a-2121-8380.
Abstract
OBJECTIVES This study evaluated whether medical doctors could identify more hemorrhage events during chart review in a clinical setting when assisted by an artificial intelligence (AI) model, and assessed the doctors' perception of using the AI model. METHODS To develop the AI model, sentences from 900 electronic health records were labeled as positive or negative for hemorrhage and categorized into one of 12 anatomical locations. The AI model was evaluated on a test cohort consisting of 566 admissions. Using eye-tracking technology, we investigated medical doctors' reading workflow during manual chart review. Moreover, we performed a clinical use study where medical doctors read two admissions with and without AI assistance to evaluate their performance when using the AI model and their perception of it. RESULTS The AI model had a sensitivity of 93.7% and a specificity of 98.1% on the test cohort. In the use studies, we found that medical doctors missed more than 33% of relevant sentences when doing chart review without AI assistance. Hemorrhage events described in paragraphs were more often overlooked than bullet-pointed hemorrhage mentions. With AI-assisted chart review, medical doctors identified 48 and 49 percentage points more hemorrhage events than without assistance in the two admissions, and they were generally positive toward using the AI model as a supporting tool. CONCLUSION Medical doctors identified more hemorrhage events with AI-assisted chart review, and they were generally positive toward using the AI model.
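The screening step described above (flagging sentences that may mention hemorrhage so reviewers do not overlook them) can be illustrated with a deliberately simple keyword matcher. This is a toy sketch, not the study's model; the term list and the sentence splitter are assumptions for illustration only:

```python
import re

# Illustrative hemorrhage vocabulary -- NOT the study's term list.
HEMORRHAGE_TERMS = re.compile(
    r"\b(hemorrhage|haemorrhage|bleed(?:ing)?|hematoma|melena|epistaxis)\b",
    re.IGNORECASE,
)

def flag_sentences(note: str) -> list[str]:
    """Return the sentences of a note that mention a hemorrhage-related term."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", note.strip())
    return [s for s in sentences if HEMORRHAGE_TERMS.search(s)]

note = ("Patient admitted with epistaxis. Vitals stable. "
        "No melena reported. Plan: observe overnight.")
print(flag_sentences(note))  # → ['Patient admitted with epistaxis.', 'No melena reported.']
```

A real system would also need negation handling ("no melena reported" is flagged here despite being a negative finding), which is part of why the study trained a sentence classifier rather than matching keywords.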
Affiliation(s)
- Martin S. Laursen
- SDU Robotics, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Jannik S. Pedersen
- SDU Robotics, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Rasmus S. Hansen
- Department of Clinical Biochemistry, Odense University Hospital, Odense, Denmark
- Thiusius R. Savarimuthu
- SDU Robotics, The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Rasmus B. Lynggaard
- Department of Clinical Biochemistry, Odense University Hospital, Odense, Denmark
- Pernille J. Vinholt
- Department of Clinical Biochemistry, Odense University Hospital, Odense, Denmark
2. Barr PB, Bigdeli TB, Meyers JL, Peterson RE, Sanchez-Roige S, Mallard TT, Dick DM, Paige Harden K, Wilkinson A, Graham DP, Nielsen DA, Swann A, Lipsky RK, Kosten T, Aslan M, Harvey PD, Kimbrel NA, Beckham JC. Correlates of Risk for Disinhibited Behaviors in the Million Veteran Program Cohort. medRxiv 2023:2023.03.22.23286865. PMID: 37034805; PMCID: PMC10081391; DOI: 10.1101/2023.03.22.23286865.
Abstract
Background Many psychiatric outcomes are thought to share a common etiological pathway reflecting behavioral disinhibition, generally referred to as externalizing disorders (EXT). Recent genome-wide association studies (GWAS) have demonstrated the overlap between EXT and important aspects of veterans' health, such as suicide-related behaviors, substance use disorders, and other medical conditions. Methods We conducted a series of phenome-wide association studies (PheWAS) of polygenic scores (PGS) for EXT and comorbid psychiatric problems (depression, schizophrenia, and suicide attempt) in an ancestrally diverse cohort of U.S. veterans (N = 560,824), using diagnostic codes from electronic health records. We conducted ancestry-specific PheWASs of the EXT PGS in the European, African, and Hispanic/Latin American ancestries. To determine if associations were driven by risk for other comorbid problems, we performed a conditional PheWAS, covarying for comorbid psychiatric problems (European ancestries only). Lastly, to adjust for unmeasured confounders, we performed a within-family analysis of significant associations from the main PheWAS in full-siblings (N = 12,127, European ancestries only). Results The EXT PGS was associated with 619 outcomes across all bodily systems, of which 188 were independent of the PGS for comorbid psychiatric problems. Effect sizes ranged from OR = 1.02 (95% CI = 1.01, 1.03) for overweight/obesity to OR = 1.44 (95% CI = 1.42, 1.47) for viral hepatitis C. Of the significant outcomes, 73 (11.9%) and 26 (4.5%) were also significant in the African and Hispanic/Latin American results, respectively. Within-family analyses uncovered robust associations between EXT and consequences of substance use disorders, including liver disease, chronic airway obstruction, and viral hepatitis C. Conclusion Our results demonstrate a shared polygenic basis of EXT across populations of diverse ancestries, independent of risk for other psychiatric problems. The strongest associations with EXT were for diagnoses related to substance use disorders and their sequelae. Overall, we highlight the potential negative consequences of EXT for health and functioning in the US veteran population.
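Odds ratios like those reported above come from exponentiating a logistic regression coefficient and its Wald confidence limits. A minimal sketch of that arithmetic, under the usual normal-approximation assumption; the beta and standard error here are hypothetical values chosen only to land near the hepatitis C estimate, not numbers from the paper:

```python
import math

def odds_ratio_ci(beta: float, se: float, z: float = 1.96) -> tuple[float, float, float]:
    """Exponentiate a log-odds coefficient and its Wald 95% interval."""
    return (math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se))

# Hypothetical coefficient for a PGS-outcome association.
or_, lo, hi = odds_ratio_ci(beta=0.365, se=0.009)
print(f"OR = {or_:.2f} (95% CI {lo:.2f}, {hi:.2f})")  # → OR = 1.44 (95% CI 1.42, 1.47)
```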
Affiliation(s)
- Peter B. Barr
- VA New York Harbor Healthcare System, Brooklyn, NY
- Department of Psychiatry and Behavioral Sciences, SUNY Downstate Health Sciences University, Brooklyn, NY
- Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
- Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, NY
- Tim B. Bigdeli
- VA New York Harbor Healthcare System, Brooklyn, NY
- Department of Psychiatry and Behavioral Sciences, SUNY Downstate Health Sciences University, Brooklyn, NY
- Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
- Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, NY
- Jacquelyn L. Meyers
- VA New York Harbor Healthcare System, Brooklyn, NY
- Department of Psychiatry and Behavioral Sciences, SUNY Downstate Health Sciences University, Brooklyn, NY
- Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
- Department of Epidemiology and Biostatistics, School of Public Health, SUNY Downstate Health Sciences University, Brooklyn, NY
- Roseann E. Peterson
- VA New York Harbor Healthcare System, Brooklyn, NY
- Department of Psychiatry and Behavioral Sciences, SUNY Downstate Health Sciences University, Brooklyn, NY
- Institute for Genomics in Health (IGH), SUNY Downstate Health Sciences University, Brooklyn, NY
- Sandra Sanchez-Roige
- Department of Psychiatry, University of California San Diego, La Jolla, CA, USA
- Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Travis T. Mallard
- Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Department of Psychiatry, Harvard Medical School, Boston, MA, USA
- Danielle M. Dick
- Department of Psychiatry, Robert Wood Johnson Medical School, Rutgers University, Piscataway, NJ
- Rutgers Addiction Research Center, Rutgers University, Piscataway, NJ
- K. Paige Harden
- Department of Psychology, University of Texas at Austin, Austin, TX
- Population Research Center, University of Texas at Austin, Austin, TX
- Anna Wilkinson
- Michael E. DeBakey VA Medical Center, Houston, TX
- UTHealth Houston School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX
- Michael and Susan Dell Center for Healthy Living, The University of Texas Health Science Center at Houston, Houston, TX
- David P. Graham
- Michael E. DeBakey VA Medical Center, Houston, TX
- Department of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, TX
- David A. Nielsen
- Michael E. DeBakey VA Medical Center, Houston, TX
- Department of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, TX
- Alan Swann
- Michael E. DeBakey VA Medical Center, Houston, TX
- Department of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, TX
- Rachele K. Lipsky
- Michael E. DeBakey VA Medical Center, Houston, TX
- Department of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, TX
- Thomas Kosten
- Michael E. DeBakey VA Medical Center, Houston, TX
- Department of Psychiatry, Neuroscience, Pharmacology, and Immunology and Rheumatology, Baylor College of Medicine, Houston, TX
- Mihaela Aslan
- Clinical Epidemiology Research Center (CERC), VA Connecticut Healthcare System, West Haven, CT
- Yale University School of Medicine, New Haven, CT
- Philip D. Harvey
- Research Service, Bruce W. Carter Miami Veterans Affairs (VA) Medical Center, Miami, FL
- University of Miami Miller School of Medicine, Miami, FL
- Nathan A. Kimbrel
- Durham VA Health Care System, Durham, NC
- VA Mid-Atlantic Mental Illness Research, Education and Clinical Center, Durham, NC
- Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC
- Jean C. Beckham
- Durham VA Health Care System, Durham, NC
- VA Mid-Atlantic Mental Illness Research, Education and Clinical Center, Durham, NC
- Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC
3. Simmelink AM, Gichuki CM, Ampt FH, Manguro G, Lim MSC, Agius P, Hellard M, Jaoko W, Stoové MA, L'Engle K, Temmerman M, Gichangi P, Luchters S. Assessment of the lifetime prevalence and incidence of induced abortion and correlates among female sex workers in Mombasa, Kenya: a secondary cohort analysis. BMJ Open 2022;12:e053218. PMID: 36207033; PMCID: PMC9557798; DOI: 10.1136/bmjopen-2021-053218.
Abstract
INTRODUCTION Prevalence of lifetime induced abortion in female sex workers (FSWs) in Kenya was previously estimated between 43% and 86%. Our analysis aimed to assess the lifetime prevalence and correlates of induced abortion, and its incidence and predictors, among FSWs in Kenya. METHODS This is a secondary prospective cohort analysis using data collected as part of the WHISPER or SHOUT cluster-randomised trial in Mombasa, assessing effectiveness of an SMS-intervention to reduce incidence of unintended pregnancy. Eligible participants were current FSWs, 16-34 years and not pregnant or planning pregnancy. Baseline data on self-reported lifetime abortion, correlates and predictors were collected between September 2016 and May 2017. Abortion incidence was measured at 6-month and 12-month follow-up. A multivariable logistic regression model was used to assess correlates of lifetime abortion and discrete-time survival analysis was used to assess predictors of abortions during follow-up. RESULTS Among 866 eligible participants, lifetime abortion prevalence was 11.9%, while lifetime unintended pregnancy prevalence was 51.2%. Correlates of lifetime abortions were currently not using a highly effective contraceptive (adjusted OR (AOR)=1.76 (95% CI=1.11 to 2.79), p=0.017) and having ever experienced intimate partner violence (IPV) (AOR=2.61 (95% CI=1.35 to 5.06), p=0.005). Incidence of unintended pregnancy and induced abortion were 15.5 and 3.9 per 100 women-years, respectively. No statistically significant associations were found between hazard of abortion and age, sex work duration, partner status, contraceptive use and IPV experience. CONCLUSION Although experience of unintended pregnancy remains high, lifetime prevalence of abortion may have decreased among FSWs in Kenya. Addressing IPV could further decrease induced abortions in this population. TRIAL REGISTRATION NUMBER ACTRN12616000852459.
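The incidence figures above (e.g., 3.9 induced abortions per 100 women-years) are crude rates: events divided by accumulated follow-up time, scaled to 100 person-years. A minimal sketch of that calculation; the event and follow-up counts are illustrative, not the study's data:

```python
def incidence_per_100py(events: int, person_years: float) -> float:
    """Crude incidence rate per 100 person-years of follow-up."""
    return 100 * events / person_years

# Hypothetical cohort: 30 events over 770 women-years of observation.
print(round(incidence_per_100py(30, 770), 1))  # → 3.9
```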
Affiliation(s)
- Caroline M Gichuki
- Department of Population Health, The Aga Khan University, Nairobi, Kenya
- International Centre for Reproductive Health Kenya, Mombasa, Kenya
- Frances H Ampt
- Burnet Institute, Melbourne, Victoria, Australia
- Department of Epidemiology and Preventive Medicine, Monash University, Clayton, Victoria, Australia
- Griffins Manguro
- International Centre for Reproductive Health Kenya, Mombasa, Kenya
- Megan S C Lim
- Burnet Institute, Melbourne, Victoria, Australia
- Department of Epidemiology and Preventive Medicine, Monash University, Clayton, Victoria, Australia
- Paul Agius
- Burnet Institute, Melbourne, Victoria, Australia
- Department of Epidemiology and Preventive Medicine, Monash University, Clayton, Victoria, Australia
- Margaret Hellard
- Burnet Institute, Melbourne, Victoria, Australia
- Department of Epidemiology and Preventive Medicine, Monash University, Clayton, Victoria, Australia
- Department of Infectious Diseases, The Alfred Hospital, Melbourne, Victoria, Australia
- Doherty Institute and School of Population and Global Health, University of Melbourne, Melbourne, Victoria, Australia
- Walter Jaoko
- Department of Medical Microbiology and Immunology, University of Nairobi, Nairobi, Kenya
- Mark A Stoové
- Burnet Institute, Melbourne, Victoria, Australia
- Department of Epidemiology and Preventive Medicine, Monash University, Clayton, Victoria, Australia
- School of Psychology and Public Health, La Trobe University, Melbourne, Victoria, Australia
- Kelly L'Engle
- School of Nursing and Health Professions, University of San Francisco, San Francisco, California, USA
- Marleen Temmerman
- International Centre for Reproductive Health Kenya, Mombasa, Kenya
- Department of Obstetrics and Gynaecology, The Aga Khan University Hospital Nairobi, Nairobi, Kenya
- Department of Public Health and Primary Care, Ghent University, Gent, Belgium
- Peter Gichangi
- International Centre for Reproductive Health Kenya, Mombasa, Kenya
- Department of Public Health and Primary Care, Ghent University, Gent, Belgium
- Technical University of Mombasa, Mombasa, Kenya
- Stanley Luchters
- Burnet Institute, Melbourne, Victoria, Australia
- Department of Public Health and Primary Care, Ghent University, Gent, Belgium
- Centre for Sexual Health and HIV/AIDS Research (CeSHHAR), Harare, Zimbabwe
- Liverpool School of Tropical Medicine (LSTM), Liverpool, UK
4. Aronson JK. Artificial Intelligence in Pharmacovigilance: An Introduction to Terms, Concepts, Applications, and Limitations. Drug Saf 2022;45:407-418. PMID: 35579806; DOI: 10.1007/s40264-022-01156-5.
Abstract
The tools of artificial intelligence (AI) have enormous potential to enhance activities in pharmacovigilance. Pharmacovigilance experts need not be AI experts, but they should know enough about AI to explore the possibilities of collaboration with those who are. Modern concepts of AI date from Alan Turing's work, especially his paper on "the imitation game", in the late 1940s and early 1950s. Its scope today includes computational skills, including the formulation of mathematical proofs; visual perception, including facial recognition and virtual reality; decision making by expert systems; aspects of language, such as language processing, speech recognition, creative composition, and translation; and combinations of these, e.g. in self-driving vehicles. Machines can be programmed with the ability to learn, using neural networks that mimic cognitive actions of the human brain, leading to deep structural learning. Limitations of AI include difficulties with language, arising from the need to understand context and interpret ambiguities, which particularly affect translation, and inadequacies of databases, requiring careful preparation and curation. New techniques may cause unforeseen difficulties via unexpected malfunctioning. Relevant terms and concepts include different types of machine learning, neural networks, natural language processing, ontologies, and expert systems. Adoption of the tools of AI in pharmacovigilance has been slow. Machine learning, in conjunction with natural language processing and data mining, to study adverse drug reactions in databases such as those found in electronic health records, claims databases, and social media, has the potential to enhance the characterization of known adverse effects and reactions and detect new signals.
Affiliation(s)
- Jeffrey K Aronson
- Centre for Evidence-Based Medicine, Nuffield Department of Primary Care Health Sciences, Oxford, UK.
5. Hann A, Meining A. Artificial Intelligence in Endoscopy. Visc Med 2022;37:471-475. PMID: 35083312; DOI: 10.1159/000519407.
Abstract
Background Owing to their rapid development, artificial intelligence (AI) technologies offer great promise for gastroenterology practice and research. At present, AI-guided image interpretation has already been used with success for endoscopic detection of early malignant lesions. Nonetheless, there are complex challenges and possible shortcomings that must be considered before full implementation can be realized. Summary In this review, the current status of AI in endoscopy is summarized. Future perspectives and open questions for further studies are stressed. Key Messages The use of AI algorithms for polyp detection in screening colonoscopy results in a significant increase in the adenoma detection rate, mainly attributed to the identification of diminutive polyps. Computer-aided characterization of colorectal polyps accompanies the detection, but further studies are needed to evaluate the clinical benefit. In contrast to colonoscopy, use of AI in gastroscopy is currently rather limited. Regarding other fields of endoscopic imaging, capsule endoscopy is the ideal imaging platform for AI, given the potential to save time in video analysis.
Affiliation(s)
- Alexander Hann
- Interventional and Experimental Endoscopy (InExEn), Department of Internal Medicine II, Gastroenterology, University Hospital Würzburg, Würzburg, Germany
- Alexander Meining
- Interventional and Experimental Endoscopy (InExEn), Department of Internal Medicine II, Gastroenterology, University Hospital Würzburg, Würzburg, Germany
6. Thapa R, Garikipati A, Shokouhi S, Hurtado M, Barnes G, Hoffman J, Calvert J, Katzmann L, Mao Q, Das R. Usability of Electronic Health Records in Predicting Short-Term Falls: Machine Learning Applications in Senior Care Facilities (Preprint). JMIR Aging 2022;5:e35373. PMID: 35363146; PMCID: PMC9015781; DOI: 10.2196/35373.
Abstract
Background Short-term fall prediction models that use electronic health records (EHRs) may enable the implementation of dynamic care practices that specifically address changes in individualized fall risk within senior care facilities. Objective The aim of this study is to implement machine learning (ML) algorithms that use EHR data to predict a 3-month fall risk in residents from a variety of senior care facilities providing different levels of care. Methods This retrospective study obtained EHR data (2007-2021) from Juniper Communities’ proprietary database of 2785 individuals primarily residing in skilled nursing facilities, independent living facilities, and assisted living facilities across the United States. We assessed the performance of 3 ML-based fall prediction models and the Juniper Communities’ fall risk assessment. Additional analyses were conducted to examine how changes in the input features, training data sets, and prediction windows affected the performance of these models. Results The Extreme Gradient Boosting model exhibited the highest performance, with an area under the receiver operating characteristic curve of 0.846 (95% CI 0.794-0.894), specificity of 0.848, diagnostic odds ratio of 13.40, and sensitivity of 0.706, while achieving the best trade-off in balancing true positive and negative rates. The number of active medications was the most significant feature associated with fall risk, followed by a resident’s number of active diseases and several variables associated with vital signs, including diastolic blood pressure and changes in weight and respiratory rates. The combination of vital signs with traditional risk factors as input features achieved higher prediction accuracy than using either group of features alone. 
Conclusions This study shows that the Extreme Gradient Boosting technique can use a large number of features from EHR data to make short-term fall predictions with a better performance than that of conventional fall risk assessments and other ML models. The integration of routinely collected EHR data, particularly vital signs, into fall prediction models may generate more accurate fall risk surveillance than models without vital signs. Our data support the use of ML models for dynamic, cost-effective, and automated fall predictions in different types of senior care facilities.
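The discrimination metrics reported above (sensitivity 0.706, specificity 0.848, diagnostic odds ratio 13.40) all derive from a 2x2 confusion table. A stdlib sketch of that arithmetic; the confusion counts are hypothetical values chosen only to land near the reported metrics, not the study's data:

```python
def fall_model_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, and diagnostic odds ratio from a 2x2 table."""
    sens = tp / (tp + fn)          # true positive rate
    spec = tn / (tn + fp)          # true negative rate
    dor = (tp * tn) / (fp * fn)    # (TP/FN) / (FP/TN)
    return {"sensitivity": sens, "specificity": spec, "dor": dor}

# Hypothetical counts for a 3-month fall-prediction test set.
m = fall_model_metrics(tp=353, fp=380, fn=147, tn=2120)
print({k: round(v, 3) for k, v in m.items()})
# → {'sensitivity': 0.706, 'specificity': 0.848, 'dor': 13.397}
```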
7. Gharagozloo M, Amrani A, Wittingstall K, Hamilton-Wright A, Gris D. Machine Learning in Modeling of Mouse Behavior. Front Neurosci 2021;15:700253. PMID: 34594182; PMCID: PMC8477014; DOI: 10.3389/fnins.2021.700253.
Abstract
Mouse behavior is a primary outcome in evaluations of therapeutic efficacy. Exhaustive, continuous, multiparametric behavioral phenotyping is a valuable tool for understanding the pathophysiological status of mouse brain diseases. Automated home cage behavior analysis produces highly granulated data both in terms of number of features and sampling frequency. Previously, we demonstrated several ways to reduce feature dimensionality. In this study, we propose novel approaches for analyzing 33-Hz data generated by CleverSys software. We hypothesized that behavioral patterns within short time windows are reflective of physiological state, and that computer modeling of mouse behavioral routines can serve as a predictive tool in classification tasks. To remove bias due to researcher decisions, our data flow is indifferent to the quality, value, and importance of any given feature in isolation. To classify day and night behavior, as an example application, we developed a data preprocessing flow and utilized logistic regression (LG), support vector machines (SVM), random forest (RF), and one-dimensional convolutional neural networks paired with bidirectional long short-term memory deep neural networks (1DConvBiLSTM). We determined that a 5-min video clip is sufficient to classify mouse behavior with high accuracy. LG, SVM, and RF performed similarly, predicting mouse behavior with 85% accuracy, and combining the three algorithms in an ensemble procedure increased accuracy to 90%. The best performance was achieved by combining the 1DConv and BiLSTM algorithms yielding 96% accuracy. Our findings demonstrate that computer modeling of the home-cage ethome can clearly define mouse physiological state. Furthermore, we showed that continuous behavioral data can be analyzed using approaches similar to natural language processing. These data provide proof of concept for future research in diagnostics of complex pathophysiological changes that are accompanied by changes in behavioral profile.
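The ensemble step described above (combining LG, SVM, and RF predictions) is commonly implemented as a per-sample majority vote. A minimal sketch of that idea; the day/night labels are made up for illustration, not the study's data:

```python
from collections import Counter

def majority_vote(*prediction_lists: list[str]) -> list[str]:
    """Combine several classifiers' label sequences by per-sample majority vote.

    Ties resolve to the earliest-seen label for that sample (Counter order).
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*prediction_lists)]

# Hypothetical day/night labels from three base classifiers (LG, SVM, RF).
lg  = ["day", "day", "night", "night"]
svm = ["day", "night", "night", "night"]
rf  = ["day", "day", "day", "night"]
print(majority_vote(lg, svm, rf))  # → ['day', 'day', 'night', 'night']
```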
Affiliation(s)
- Marjan Gharagozloo
- Department of Neurology, Johns Hopkins University, Baltimore, MD, United States
- Abdelaziz Amrani
- Department of Pediatrics, Faculty of Medicine, Université de Sherbrooke, Sherbrooke, QC, Canada
- Kevin Wittingstall
- Department of Radiology, Sherbrooke Molecular Imaging Center, Université de Sherbrooke, Sherbrooke, QC, Canada
- Denis Gris
- Department of Pharmacology and Physiology, Faculty of Medicine, Université de Sherbrooke, Sherbrooke, QC, Canada
8. Liu F, Zhou P, Baccei SJ, Masciocchi MJ, Amornsiripanitch N, Kiefe CI, Rosen MP. Qualifying Certainty in Radiology Reports through Deep Learning-Based Natural Language Processing. AJNR Am J Neuroradiol 2021;42:1755-1761. PMID: 34413062; DOI: 10.3174/ajnr.a7241.
Abstract
BACKGROUND AND PURPOSE Communication gaps exist between radiologists and referring physicians in conveying diagnostic certainty. We aimed to explore deep learning-based bidirectional contextual language models for automatically assessing diagnostic certainty expressed in radiology reports to facilitate the precision of communication. MATERIALS AND METHODS We randomly sampled 594 head MR imaging reports from an academic medical center. We asked 3 board-certified radiologists to read sentences from the Impression section and assign each sentence 1 of 4 certainty categories: "Non-Definitive," "Definitive-Mild," "Definitive-Strong," or "Other." Using the 2352 annotated sentences, we developed and validated a natural language-processing system based on the state-of-the-art bidirectional encoder representations from transformers (BERT), which can capture contextual uncertainty semantics beyond the lexicon level. Finally, we evaluated 3 BERT variant models and reported standard metrics including sensitivity, specificity, and area under the curve. RESULTS A κ score of 0.74 was achieved for interannotator agreement on uncertainty interpretations among 3 radiologists. For the 3 BERT variant models, the biomedical variant (BioBERT) achieved the best macro-average area under the curve of 0.931 (compared with 0.928 for the BERT-base and 0.925 for the clinical variant [ClinicalBERT]) on the validation data. All 3 models yielded high macro-average specificity (93.13%-93.65%), while the BERT-base obtained the highest macro-average sensitivity of 79.46% (compared with 79.08% for BioBERT and 78.52% for ClinicalBERT). The BioBERT model showed great generalizability on the held-out test data with a macro-average sensitivity of 77.29%, specificity of 92.89%, and area under the curve of 0.93. CONCLUSIONS A deep transfer learning model can be developed to reliably assess the level of uncertainty communicated in a radiology report.
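The macro-averaged sensitivity reported above is the unweighted mean of one-vs-rest recall across the four certainty categories. A small stdlib sketch of that metric; the labels below are toy stand-ins for the study's annotations, not its data:

```python
def macro_sensitivity(y_true: list[str], y_pred: list[str]) -> float:
    """Macro-average recall: mean one-vs-rest sensitivity over the label set."""
    labels = sorted(set(y_true))
    per_class = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        per_class.append(tp / (tp + fn))
    return sum(per_class) / len(per_class)

# Toy certainty labels standing in for the paper's four categories.
truth = ["Non-Definitive", "Definitive-Mild", "Definitive-Strong", "Other",
         "Non-Definitive", "Definitive-Mild"]
pred  = ["Non-Definitive", "Definitive-Mild", "Definitive-Strong", "Other",
         "Definitive-Mild", "Definitive-Mild"]
print(round(macro_sensitivity(truth, pred), 3))  # → 0.875
```

Macro-averaging weights each class equally, which is why a rare category such as "Other" can move the headline number as much as a common one.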
Affiliation(s)
- F Liu
- From the Department of Population and Quantitative Health Sciences (F.L., C.I.K.), University of Massachusetts Medical School, Worcester, Massachusetts
- Department of Radiology (F.L., P.Z., S.J.B., M.J.M., N.A., M.P.R.), University of Massachusetts Medical School, Worcester, Massachusetts
- P Zhou
- Department of Radiology (F.L., P.Z., S.J.B., M.J.M., N.A., M.P.R.), University of Massachusetts Medical School, Worcester, Massachusetts
- S J Baccei
- Department of Radiology (F.L., P.Z., S.J.B., M.J.M., N.A., M.P.R.), University of Massachusetts Medical School, Worcester, Massachusetts
- Department of Radiology (S.J.B., M.J.M., N.A., M.P.R.), UMass Memorial Medical Center, Worcester, Massachusetts
- M J Masciocchi
- Department of Radiology (F.L., P.Z., S.J.B., M.J.M., N.A., M.P.R.), University of Massachusetts Medical School, Worcester, Massachusetts
- Department of Radiology (S.J.B., M.J.M., N.A., M.P.R.), UMass Memorial Medical Center, Worcester, Massachusetts
- N Amornsiripanitch
- Department of Radiology (F.L., P.Z., S.J.B., M.J.M., N.A., M.P.R.), University of Massachusetts Medical School, Worcester, Massachusetts
- Department of Radiology (S.J.B., M.J.M., N.A., M.P.R.), UMass Memorial Medical Center, Worcester, Massachusetts
- C I Kiefe
- From the Department of Population and Quantitative Health Sciences (F.L., C.I.K.), University of Massachusetts Medical School, Worcester, Massachusetts
- M P Rosen
- Department of Radiology (F.L., P.Z., S.J.B., M.J.M., N.A., M.P.R.), University of Massachusetts Medical School, Worcester, Massachusetts
- Department of Radiology (S.J.B., M.J.M., N.A., M.P.R.), UMass Memorial Medical Center, Worcester, Massachusetts
9. Sarker A, Al-Garadi MA, Yang YC, Choi J, Quyyumi AA, Martin GS. Defining Patient-Oriented Natural Language Processing: A New Paradigm for Research and Development to Facilitate Adoption and Use by Medical Experts. JMIR Med Inform 2021;9:e18471. PMID: 34581670; PMCID: PMC8512184; DOI: 10.2196/18471.
Abstract
The capabilities of natural language processing (NLP) methods have expanded significantly in recent years, and progress has been particularly driven by advances in data science and machine learning. However, NLP is still largely underused in patient-oriented clinical research and care (POCRC). A key reason behind this is that clinical NLP methods are typically developed, optimized, and evaluated with narrowly focused data sets and tasks (eg, those for the detection of specific symptoms in free texts). Such research and development (R&D) approaches may be described as problem oriented, and the developed systems perform specialized tasks well. As standalone systems, however, they generally do not comprehensively meet the needs of POCRC. Thus, there is often a gap between the capabilities of clinical NLP methods and the needs of patient-facing medical experts. We believe that to increase the practical use of biomedical NLP, future R&D efforts need to be broadened to a new research paradigm: one that explicitly incorporates characteristics that are crucial for POCRC. We present our viewpoint about 4 such interrelated characteristics that can increase the suitability of NLP systems for POCRC (3 that represent NLP system properties and 1 associated with the R&D process): (1) interpretability (the ability to explain system decisions), (2) patient centeredness (the capability to characterize diverse patients), (3) customizability (the flexibility for adapting to distinct settings, problems, and cohorts), and (4) multitask evaluation (the validation of system performance based on multiple tasks involving heterogeneous data sets). By using the NLP task of clinical concept detection as an example, we detail these characteristics and discuss how they may result in the increased uptake of NLP systems for POCRC.
Affiliation(s)
- Abeed Sarker
- Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA, United States
- Mohammed Ali Al-Garadi
- Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA, United States
- Yuan-Chi Yang
- Department of Biomedical Informatics, School of Medicine, Emory University, Atlanta, GA, United States
- Jinho Choi
- Department of Computer Science, College of Arts and Sciences, Emory University, Atlanta, GA, United States
- Arshed A Quyyumi
- Emory Clinical Cardiovascular Institute, Division of Cardiology, Department of Medicine, School of Medicine, Emory University, Atlanta, GA, United States
- Greg S Martin
- Predictive Health Institute and Center for Health Discovery and Well Being, Department of Medicine, School of Medicine, Emory University, Atlanta, GA, United States
10
Wilson A, Saeed H, Pringle C, Eleftheriou I, Bromiley PA, Brass A. Artificial intelligence projects in healthcare: 10 practical tips for success in a clinical environment. BMJ Health Care Inform 2021; 28:e100323. [PMID: 34326160 PMCID: PMC8323348 DOI: 10.1136/bmjhci-2021-100323] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2021] [Accepted: 05/23/2021] [Indexed: 11/06/2022] Open
Abstract
There is much discussion concerning 'digital transformation' in healthcare and the potential of artificial intelligence (AI) in healthcare systems. Yet it remains rare to find AI solutions deployed in routine healthcare settings. This is in part due to the numerous challenges inherent in delivering an AI project in a clinical environment. In this article, several UK healthcare professionals and academics reflect on the challenges they have faced in building AI solutions using routinely collected healthcare data. These personal reflections are summarised as 10 practical tips. In our experience, these are essential considerations for an AI healthcare project to succeed. They are organised into four phases: conceptualisation, data management, AI application and clinical deployment. There is a focus on conceptualisation, reflecting our view that initial set-up is vital to success. We hope that our personal experiences will provide useful insights to others looking to improve patient care through optimal data use.
Affiliation(s)
- Anthony Wilson
- Department of Adult Critical Care, Manchester University NHS Foundation Trust, Manchester, UK
- Haroon Saeed
- Department of Pediatric Ear Nose and Throat Surgery, Royal Manchester Children's Hospital, Manchester, UK
- Catherine Pringle
- Children's Brain Tumour Research Network, Royal Manchester Children's Hospital, Manchester, UK
- Division of Informatics, Imaging and Data Sciences, The University of Manchester, Manchester, UK
- Iliada Eleftheriou
- Division of Informatics, Imaging and Data Sciences, The University of Manchester, Manchester, UK
- Paul A Bromiley
- Division of Informatics, Imaging and Data Sciences, The University of Manchester, Manchester, UK
- Andy Brass
- Division of Informatics, Imaging and Data Sciences, The University of Manchester, Manchester, UK
11
Pedersen JS, Laursen MS, Rajeeth Savarimuthu T, Hansen RS, Alnor AB, Bjerre KV, Kjær IM, Gils C, Thorsen AF, Andersen ES, Nielsen CB, Andersen LC, Just SA, Vinholt PJ. Deep learning detects and visualizes bleeding events in electronic health records. Res Pract Thromb Haemost 2021; 5:e12505. [PMID: 34013150 PMCID: PMC8114029 DOI: 10.1002/rth2.12505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2021] [Revised: 02/21/2021] [Accepted: 03/02/2021] [Indexed: 12/01/2022] Open
Abstract
BACKGROUND Bleeding is associated with significantly increased morbidity and mortality. Bleeding events are often described in the unstructured text of electronic health records, which makes them difficult to identify by manual inspection. OBJECTIVES To develop a deep learning model that detects and visualizes bleeding events in electronic health records. PATIENTS/METHODS Three hundred electronic health records with International Classification of Diseases, Tenth Revision diagnosis codes for bleeding or leukemia were extracted. Each sentence in the electronic health record was annotated as positive or negative for bleeding. The annotated sentences were used to develop a deep learning model that detects bleeding at sentence and note level. RESULTS On a balanced test set of 1178 sentences, the best-performing deep learning model achieved a sensitivity of 0.90, specificity of 0.90, and negative predictive value of 0.90. On a test set consisting of 700 notes, of which 49 were positive for bleeding, the model achieved a note-level sensitivity of 1.00, specificity of 0.52, and negative predictive value of 1.00. By using a sentence-level model on a note level, the model can explain its predictions by visualizing the exact sentence in a note that contains information regarding bleeding. Moreover, we found that the model performed consistently well across different types of bleedings. CONCLUSIONS A deep learning model can be used to detect and visualize bleeding events in the free text of electronic health records. The deep learning model can thus facilitate systematic assessment of bleeding risk, and thereby optimize patient care and safety.
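The sentence- and note-level figures above are standard confusion-matrix metrics. As a reference for how sensitivity, specificity, and negative predictive value relate to the underlying counts, here is a minimal sketch; the counts below are illustrative only, not taken from the study:

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int):
    """Confusion-matrix metrics for a binary classifier (e.g., bleeding vs. no bleeding)."""
    sensitivity = tp / (tp + fn)  # fraction of true positives detected
    specificity = tn / (tn + fp)  # fraction of true negatives correctly passed
    npv = tn / (tn + fn)          # P(truly negative | predicted negative)
    return sensitivity, specificity, npv

# Illustrative counts for a balanced test set; not the study's data.
sens, spec, npv = binary_metrics(tp=530, fp=59, tn=530, fn=59)
print(round(sens, 2), round(spec, 2), round(npv, 2))  # 0.9 0.9 0.9
```

Note that a note-level NPV of 1.00, as reported above, simply means that no note the model predicted negative actually contained a bleeding event (fn = 0).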
Affiliation(s)
- Jannik S. Pedersen
- The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Martin S. Laursen
- The Maersk Mc-Kinney Moller Institute, University of Southern Denmark, Odense, Denmark
- Rasmus Søgaard Hansen
- Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Odense, Denmark
- Anne Bryde Alnor
- Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Odense, Denmark
- Kristian Voss Bjerre
- Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Odense, Denmark
- Ina Mathilde Kjær
- Department of Clinical Biochemistry and Immunology, Lillebaelt Hospital, Denmark
- Charlotte Gils
- Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Odense, Denmark
- Pernille Just Vinholt
- Department of Clinical Biochemistry and Pharmacology, Odense University Hospital, Odense, Denmark
12
Mitra A, Rawat BPS, McManus D, Kapoor A, Yu H. Bleeding Entity Recognition in Electronic Health Records: A Comprehensive Analysis of End-to-End Systems. AMIA Annu Symp Proc 2021; 2020:860-869. [PMID: 33936461 PMCID: PMC8075442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/12/2023]
Abstract
A bleeding event is a common adverse drug reaction amongst patients on anticoagulation and factors critically into a clinician's decision to prescribe or continue anticoagulation for atrial fibrillation. However, bleeding events are not uniformly captured in the administrative data of electronic health records (EHR). As manual review is prohibitively expensive, we investigate the effectiveness of various natural language processing (NLP) methods for automatic extraction of bleeding events. Using our expert-annotated 1,079 de-identified EHR notes, we evaluated state-of-the-art NLP models such as biLSTM-CRF with language modeling, and different BERT variants for six entity types. On our dataset, the biLSTM-CRF surpassed other models, resulting in a macro F1-score of 0.75, whereas the performance difference is negligible for sentence- and document-level predictions, with best macro F1-scores of 0.84 and 0.96, respectively. Our error analyses suggest that the models' incorrect predictions can be attributed to variability in entity spans, memorization, and missing negation signals.
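The macro F1-score quoted above weights each of the six entity types equally, regardless of how often each occurs. A minimal sketch of the computation; the per-entity scores below are hypothetical, not the paper's numbers:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall for one entity type."""
    return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0

def macro_f1(per_class_f1: list) -> float:
    """Unweighted mean of per-class F1, as used in entity-level NER evaluation."""
    return sum(per_class_f1) / len(per_class_f1)

# Hypothetical per-entity F1 scores for six entity types (not the paper's numbers):
scores = [0.82, 0.71, 0.90, 0.65, 0.74, 0.68]
print(round(macro_f1(scores), 2))  # 0.75
```

A micro-averaged F1 would instead pool counts across entity types, so frequent entities dominate; macro averaging penalizes models that fail on rare entity types.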
Affiliation(s)
- Avijit Mitra
- College of Information and Computer Science, University of Massachusetts Amherst, Amherst, MA, United States
- Bhanu Pratap Singh Rawat
- College of Information and Computer Science, University of Massachusetts Amherst, Amherst, MA, United States
- David McManus
- Department of Medicine, University of Massachusetts Medical School, Worcester, MA, United States
- Alok Kapoor
- Department of Medicine, University of Massachusetts Medical School, Worcester, MA, United States
- Hong Yu
- College of Information and Computer Science, University of Massachusetts Amherst, Amherst, MA, United States
- Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, United States
- Department of Medicine, University of Massachusetts Medical School, Worcester, MA, United States
- Center for Healthcare Organization and Implementation Research, Bedford Veterans Affairs Medical Center, Bedford, MA, United States
13
Wu S, Roberts K, Datta S, Du J, Ji Z, Si Y, Soni S, Wang Q, Wei Q, Xiang Y, Zhao B, Xu H. Deep learning in clinical natural language processing: a methodical review. J Am Med Inform Assoc 2021; 27:457-470. [PMID: 31794016 DOI: 10.1093/jamia/ocz200] [Citation(s) in RCA: 167] [Impact Index Per Article: 55.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2019] [Revised: 10/15/2019] [Accepted: 11/09/2019] [Indexed: 02/07/2023] Open
Abstract
OBJECTIVE This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research. MATERIALS AND METHODS We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to NLP problems in electronic health records. After screening 1,737 articles, we collected data on 25 variables across 212 papers. RESULTS DL in clinical NLP publications more than doubled each year, through 2018. Recurrent neural networks (60.8%) and word2vec embeddings (74.1%) were the most popular methods; the information extraction tasks of text classification, named entity recognition, and relation extraction were dominant (89.2%). However, there was a "long tail" of other methods and specific tasks. Most contributions were methodological variants or applications, but 20.8% were new methods of some kind. The earliest adopters were in the NLP community, but the medical informatics community was the most prolific. DISCUSSION Our analysis shows growing acceptance of deep learning as a baseline for NLP research, and of DL-based NLP in the medical community. A number of common associations were substantiated (eg, the preference of recurrent neural networks for sequence-labeling named entity recognition), while others were surprisingly nuanced (eg, the scarcity of French language clinical NLP with deep learning). CONCLUSION Deep learning has not yet fully penetrated clinical NLP and is growing rapidly. This review highlighted both the popular and unique trends in this active field.
Affiliation(s)
- Stephen Wu
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Kirk Roberts
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Surabhi Datta
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Jingcheng Du
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Zongcheng Ji
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Yuqi Si
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Sarvesh Soni
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Qiong Wang
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Qiang Wei
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Yang Xiang
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Bo Zhao
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
- Hua Xu
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas, USA
14
Jang R, Kim N, Jang M, Lee KH, Lee SM, Lee KH, Noh HN, Seo JB. Assessment of the Robustness of Convolutional Neural Networks in Labeling Noise by Using Chest X-Ray Images From Multiple Centers. JMIR Med Inform 2020; 8:e18089. [PMID: 32749222 PMCID: PMC7435602 DOI: 10.2196/18089] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2020] [Revised: 06/08/2020] [Accepted: 06/21/2020] [Indexed: 12/25/2022] Open
Abstract
Background Computer-aided diagnosis on chest x-ray images using deep learning is a widely studied modality in medicine. Many studies are based on public datasets, such as the National Institutes of Health (NIH) dataset and the Stanford CheXpert dataset. However, these datasets are preprocessed by classical natural language processing, which may introduce a certain extent of label errors. Objective This study aimed to investigate the robustness of deep convolutional neural networks (CNNs) for binary classification of posteroanterior chest x-rays under random incorrect labeling. Methods We trained and validated the CNN architecture with different levels of label noise in 3 datasets, namely, Asan Medical Center-Seoul National University Bundang Hospital (AMC-SNUBH), NIH, and CheXpert, and tested the models with each test set. Diseases in each chest x-ray in our dataset were confirmed by a thoracic radiologist using computed tomography (CT). Receiver operating characteristic (ROC) curves and area under the curve (AUC) were evaluated in each test. Randomly chosen chest x-rays from the public datasets were evaluated by 3 physicians and 1 thoracic radiologist. Results Whereas the AUCs on the public NIH and CheXpert datasets did not drop significantly at up to 16% label noise, the AUC on the AMC-SNUBH dataset decreased significantly from 2% label noise onward. Evaluation of the public datasets by 3 physicians and 1 thoracic radiologist showed an accuracy of 65%-80%. Conclusions The deep learning–based computer-aided diagnosis model is sensitive to label noise, and computer-aided diagnosis with inaccurate labels is not credible. Furthermore, open datasets such as NIH and CheXpert need to be distilled before being used for deep learning–based computer-aided diagnosis.
Affiliation(s)
- Ryoungwoo Jang
- Department of Biomedical Engineering, College of Medicine, University of Ulsan, Seoul, Republic of Korea
- Namkug Kim
- Department of Convergence Medicine, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Miso Jang
- Department of Biomedical Engineering, College of Medicine, University of Ulsan, Seoul, Republic of Korea
- Kyung Hwa Lee
- Department of Biomedical Engineering, College of Medicine, University of Ulsan, Seoul, Republic of Korea
- Sang Min Lee
- Department of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
- Kyung Hee Lee
- Department of Radiology, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, Republic of Korea
- Han Na Noh
- Department of Health Screening and Promotion Center, Asan Medical Center, Seoul, Republic of Korea
- Joon Beom Seo
- Department of Radiology, Asan Medical Center, University of Ulsan College of Medicine, Seoul, Republic of Korea
15
Ong CJ, Orfanoudaki A, Zhang R, Caprasse FPM, Hutch M, Ma L, Fard D, Balogun O, Miller MI, Minnig M, Saglam H, Prescott B, Greer DM, Smirnakis S, Bertsimas D. Machine learning and natural language processing methods to identify ischemic stroke, acuity and location from radiology reports. PLoS One 2020; 15:e0234908. [PMID: 32559211 PMCID: PMC7304623 DOI: 10.1371/journal.pone.0234908] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2019] [Accepted: 06/04/2020] [Indexed: 12/20/2022] Open
Abstract
Accurate, automated extraction of clinical stroke information from unstructured text has several important applications. ICD-9/10 codes can misclassify ischemic stroke events and do not distinguish acuity or location. Expeditious, accurate data extraction could provide considerable improvement in identifying stroke in large datasets, triaging critical clinical reports, and quality improvement efforts. In this study, we developed and report a comprehensive framework studying the performance of simple and complex stroke-specific Natural Language Processing (NLP) and Machine Learning (ML) methods to determine presence, location, and acuity of ischemic stroke from radiographic text. We collected 60,564 Computed Tomography and Magnetic Resonance Imaging Radiology reports from 17,864 patients from two large academic medical centers. We used standard techniques to featurize unstructured text and developed neurovascular specific word GloVe embeddings. We trained various binary classification algorithms to identify stroke presence, location, and acuity using 75% of 1,359 expert-labeled reports. We validated our methods internally on the remaining 25% of reports and externally on 500 radiology reports from an entirely separate academic institution. In our internal population, GloVe word embeddings paired with deep learning (Recurrent Neural Networks) had the best discrimination of all methods for our three tasks (AUCs of 0.96, 0.98, 0.93 respectively). Simpler NLP approaches (Bag of Words) performed best with interpretable algorithms (Logistic Regression) for identifying ischemic stroke (AUC of 0.95), MCA location (AUC 0.96), and acuity (AUC of 0.90). Similarly, GloVe and Recurrent Neural Networks (AUC 0.92, 0.89, 0.93) generalized better in our external test set than BOW and Logistic Regression for stroke presence, location and acuity, respectively (AUC 0.89, 0.86, 0.80). Our study demonstrates a comprehensive assessment of NLP techniques for unstructured radiographic text. 
Our findings suggest that NLP/ML methods can be used to discriminate stroke features from large data cohorts for both clinical and research-related investigations.
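The AUCs reported above can be read as rank statistics: an AUC of 0.96 means a randomly chosen stroke-positive report receives a higher model score than a randomly chosen negative report 96% of the time. A minimal sketch of that equivalence via the Mann-Whitney formulation; the scores below are illustrative, not from the study:

```python
def auc(pos_scores: list, neg_scores: list) -> float:
    """AUC as the probability that a random positive outscores a random
    negative (Mann-Whitney U formulation); ties count half."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos_scores
        for n in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))

# Illustrative model scores only; 7.5 of 9 pairs are correctly ordered.
print(auc([0.9, 0.8, 0.7], [0.6, 0.8, 0.2]))  # ≈ 0.83
```

For large score sets this pairwise loop is O(n·m); library implementations sort the scores instead, but the quantity computed is the same.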
Affiliation(s)
- Charlene Jennifer Ong
- Boston University School of Medicine, Boston, Massachusetts, United States of America
- Boston Medical Center, Boston, Massachusetts, United States of America
- Harvard Medical School, Boston, Massachusetts, United States of America
- Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Agni Orfanoudaki
- Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Rebecca Zhang
- Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Francois Pierre M. Caprasse
- Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Meghan Hutch
- Boston University School of Medicine, Boston, Massachusetts, United States of America
- Boston Medical Center, Boston, Massachusetts, United States of America
- Liang Ma
- Boston University School of Medicine, Boston, Massachusetts, United States of America
- Darian Fard
- Boston University School of Medicine, Boston, Massachusetts, United States of America
- Oluwafemi Balogun
- Boston University School of Medicine, Boston, Massachusetts, United States of America
- Boston Medical Center, Boston, Massachusetts, United States of America
- Matthew I. Miller
- Boston University School of Medicine, Boston, Massachusetts, United States of America
- Margaret Minnig
- Boston University School of Medicine, Boston, Massachusetts, United States of America
- Hanife Saglam
- Harvard Medical School, Boston, Massachusetts, United States of America
- Brenton Prescott
- Boston Medical Center, Boston, Massachusetts, United States of America
- David M. Greer
- Boston University School of Medicine, Boston, Massachusetts, United States of America
- Boston Medical Center, Boston, Massachusetts, United States of America
- Stelios Smirnakis
- Harvard Medical School, Boston, Massachusetts, United States of America
- Dimitris Bertsimas
- Operations Research Center, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Sloan School of Management, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
16
Hu B, Bajracharya A, Yu H. Generating Medical Assessments Using a Neural Network Model: Algorithm Development and Validation. JMIR Med Inform 2020; 8:e14971. [PMID: 31939742 PMCID: PMC7006435 DOI: 10.2196/14971] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2019] [Revised: 09/28/2019] [Accepted: 10/19/2019] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND Since its inception, artificial intelligence has aimed to use computers to help make clinical diagnoses. Evidence-based medical reasoning is important for patient care. Inferring clinical diagnoses is a crucial step during the patient encounter. Previous works mainly used expert systems or machine learning-based methods to predict the International Classification of Diseases - Clinical Modification codes based on electronic health records. We report an alternative approach: inference of clinical diagnoses from patients' reported symptoms and physicians' clinical observations. OBJECTIVE We aimed to report a natural language processing system for generating medical assessments based on patient information described in the electronic health record (EHR) notes. METHODS We processed EHR notes into the Subjective, Objective, Assessment, and Plan sections. We trained a neural network model for medical assessment generation (N2MAG). Our N2MAG is an innovative deep neural model that uses the Subjective and Objective sections of an EHR note to automatically generate an "expert-like" assessment of the patient. N2MAG can be trained in an end-to-end fashion and does not require feature engineering and external knowledge resources. RESULTS We evaluated N2MAG and the baseline models both quantitatively and qualitatively. Evaluated by both the Recall-Oriented Understudy for Gisting Evaluation metrics and domain experts, our results show that N2MAG outperformed the existing state-of-the-art baseline models. CONCLUSIONS N2MAG could generate a medical assessment from the Subjective and Objective section descriptions in EHR notes. Future work will assess its potential for providing clinical decision support.
Affiliation(s)
- Baotian Hu
- Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, United States
- Adarsha Bajracharya
- Department of Medicine, University of Massachusetts Medical School, Worcester, MA, United States
- Hong Yu
- Department of Computer Science, University of Massachusetts Lowell, Lowell, MA, United States
- Bedford Veterans Affairs Medical Center, Bedford, MA, United States
- School of Computer Science, University of Massachusetts Amherst, Amherst, MA, United States
17
Qiao N, Song M, Ye Z, He W, Ma Z, Wang Y, Zhang Y, Shou X. Deep Learning for Automatically Visual Evoked Potential Classification During Surgical Decompression of Sellar Region Tumors. Transl Vis Sci Technol 2019; 8:21. [PMID: 31788350 PMCID: PMC6871542 DOI: 10.1167/tvst.8.6.21] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2019] [Accepted: 09/17/2019] [Indexed: 12/22/2022] Open
Abstract
Purpose Detection of the huge amount of data generated by real-time visual evoked potential (VEP) monitoring requires labor-intensive work and experienced electrophysiologists. This study aims to build an automatic VEP classification system by using a deep learning algorithm. Methods Patients with sellar region tumor and optic chiasm compression were enrolled. Flash VEP monitoring was applied during surgical decompression. Sequential VEP images were fed into three neural network algorithms to train VEP classification models. Results We included 76 patients. During surgical decompression, we observed 68 eyes with increased VEP amplitude, 47 eyes with a transient decrease, and 37 eyes without change. We generated 2,843 sequences (39,802 images) in total (887 sequences with increasing VEP, 276 sequences with decreasing VEP, and 1,680 sequences without change). The model combining convolutional and recurrent neural network had the highest accuracy (87.4%; 95% confidence interval, 84.2%–90.1%). The sensitivity of predicting no change VEP, increasing VEP, and decreasing VEP was 92.6%, 78.9%, and 83.7%, respectively. The specificity of predicting no change VEP, increasing VEP, and decreasing VEP was 80.5%, 93.3%, and 100.0%, respectively. The class activation map visualization technique showed that the P2-N3-P3 complex was important in determining the output. Conclusions We identified three VEP responses (no change, increase, and decrease) during transsphenoidal surgical decompression of sellar region tumors. We developed a deep learning model to classify the sequential changes of intraoperative VEP. Translational Relevance Our model may have the potential to be applied in real-time monitoring during surgical resection of sellar region tumors.
Affiliation(s)
- Nidan Qiao
- Shanghai Pituitary Tumor Center, Shanghai Neurosurgical Research Institute, Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China & Neuroendocrine Unit, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Mengju Song
- Department of Ophthalmology, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China & Putuo Oculopathy Dental Disease Prevention & Cure Clinic, Shanghai, China
- Zhao Ye
- Shanghai Pituitary Tumor Center, Shanghai Neurosurgical Research Institute, Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
- Wenqiang He
- Shanghai Pituitary Tumor Center, Shanghai Neurosurgical Research Institute, Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
- Zengyi Ma
- Shanghai Pituitary Tumor Center, Shanghai Neurosurgical Research Institute, Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
- Yongfei Wang
- Shanghai Pituitary Tumor Center, Shanghai Neurosurgical Research Institute, Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
- Yuyan Zhang
- Department of Ophthalmology, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
- Xuefei Shou
- Shanghai Pituitary Tumor Center, Shanghai Neurosurgical Research Institute, Department of Neurosurgery, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China