1
Schiffer-Kane K, Liu C, Callahan TJ, Ta C, Nestor JG, Weng C. Converting OMOP CDM to phenopackets: A model alignment and patient data representation evaluation. J Biomed Inform 2024; 155:104659. [PMID: 38777085 PMCID: PMC11181468 DOI: 10.1016/j.jbi.2024.104659] [Received: 12/06/2023] [Revised: 05/11/2024] [Accepted: 05/18/2024] [Indexed: 05/25/2024]
Abstract
OBJECTIVE This study aims to promote interoperability in precision medicine and translational research by aligning the Observational Medical Outcomes Partnership (OMOP) and Phenopackets data models. Phenopackets is an expert knowledge-driven schema designed to facilitate the storage and exchange of multimodal patient data, and support downstream analysis. The first goal of this paper is to explore model alignment by characterizing the common data models using a newly developed data transformation process and evaluation method. Second, using OMOP normalized clinical data, we evaluate the mapping of real-world patient data to Phenopackets. We evaluate the suitability of Phenopackets as a patient data representation for real-world clinical cases.
METHODS We identified mappings between OMOP and Phenopackets and applied them to a real patient dataset to assess the transformation's success. We analyzed gaps between the models and identified key considerations for transforming data between them. Further, to improve ambiguous alignment, we incorporated Unified Medical Language System (UMLS) semantic type-based filtering to direct individual concepts to their most appropriate domain and conducted a domain-expert evaluation of the mapping's clinical utility.
RESULTS The OMOP to Phenopacket transformation pipeline was executed for 1,000 Alzheimer's disease patients and successfully mapped all required entities. However, due to missing values in OMOP for required Phenopacket attributes, 10.2% of records were lost. The use of UMLS semantic type filtering for ambiguous alignment of individual concepts resulted in 96% agreement with clinical thinking, increased from 68% when mapping exclusively by domain correspondence.
CONCLUSION This study presents a pipeline to transform data from OMOP to Phenopackets. We identified considerations for the transformation to ensure data quality, handling restrictions for successful Phenopacket validation and discrepant data formats. We identified unmappable Phenopacket attributes that focus on specialty use cases, such as genomics or oncology, which OMOP does not currently support. We introduce UMLS semantic type filtering to resolve ambiguous alignment to Phenopacket entities to be most appropriate for real-world interpretation. We provide a systematic approach to align OMOP and Phenopackets schemas. Our work facilitates future use of Phenopackets in clinical applications by addressing key barriers to interoperability when deriving a Phenopacket from real-world patient data.
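The semantic type filtering described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the two lookup tables, the `OmopConcept` class, the `route` function, and the placeholder concept IDs are all hypothetical, and the target names only loosely follow Phenopacket top-level fields. The idea shown is that a concept's UMLS semantic type, when available, overrides the default target implied by its OMOP domain.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default routing from an OMOP domain to a Phenopacket field.
DOMAIN_TO_ENTITY = {
    "Condition": "diseases",
    "Observation": "phenotypic_features",
    "Measurement": "measurements",
    "Drug": "medical_actions",
    "Procedure": "medical_actions",
}

# Hypothetical overrides keyed by UMLS semantic type name.
SEMANTIC_TYPE_TO_ENTITY = {
    "Sign or Symptom": "phenotypic_features",
    "Finding": "phenotypic_features",
    "Disease or Syndrome": "diseases",
}


@dataclass
class OmopConcept:
    concept_id: int  # placeholder IDs below, not real OMOP concept IDs
    name: str
    domain: str
    semantic_type: Optional[str] = None


def route(concept: OmopConcept) -> Optional[str]:
    """Pick a target field, preferring the semantic type over the domain."""
    if concept.semantic_type in SEMANTIC_TYPE_TO_ENTITY:
        return SEMANTIC_TYPE_TO_ENTITY[concept.semantic_type]
    # Fall back to domain correspondence; None means "unmappable".
    return DOMAIN_TO_ENTITY.get(concept.domain)


headache = OmopConcept(1, "Headache", "Condition", "Sign or Symptom")
dementia = OmopConcept(2, "Alzheimer's disease", "Condition", "Disease or Syndrome")
print(route(headache), route(dementia))
```

Under these assumed tables, a symptom recorded in the OMOP Condition domain is routed to phenotypic features rather than diseases, which is the kind of ambiguity the abstract says domain-only mapping handles poorly.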
Affiliation(s)
- Kayla Schiffer-Kane
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Cong Liu
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Tiffany J Callahan
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Casey Ta
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
- Jordan G Nestor
- Department of Medicine, Division of Nephrology, Columbia University Irving Medical Center, New York, NY, USA
- Chunhua Weng
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA.
2
Cashaback JGA, Allen JL, Chou AHY, Lin DJ, Price MA, Secerovic NK, Song S, Zhang H, Miller HL. NSF DARE-transforming modeling in neurorehabilitation: a patient-in-the-loop framework. J Neuroeng Rehabil 2024; 21:23. [PMID: 38347597 PMCID: PMC10863253 DOI: 10.1186/s12984-024-01318-9] [Received: 07/10/2023] [Accepted: 01/25/2024] [Indexed: 02/15/2024] [Open access]
Abstract
In 2023, the National Science Foundation (NSF) and the National Institutes of Health (NIH) brought together engineers, scientists, and clinicians by sponsoring a conference on computational modelling in neurorehabilitation. To facilitate multidisciplinary collaborations and improve patient care, in this perspective piece we identify where and how computational modelling can support neurorehabilitation. To address the where, we developed a patient-in-the-loop framework that uses multiple and/or continual measurements to update diagnostic and treatment model parameters, treatment type, and treatment prescription, with the goal of maximizing clinically-relevant functional outcomes. This patient-in-the-loop framework has several key features: (i) it includes diagnostic and treatment models, (ii) it is clinically-grounded with the International Classification of Functioning, Disability and Health (ICF) and patient involvement, (iii) it uses multiple or continual data measurements over time, and (iv) it is applicable to a range of neurological and neurodevelopmental conditions. To address the how, we identify the state of the art and highlight promising avenues of future research across the realms of sensorimotor adaptation, neuroplasticity, musculoskeletal, and sensory & pain computational modelling. We also discuss both the importance of and how to perform model validation, as well as challenges to overcome when implementing computational models within a clinical setting. The patient-in-the-loop approach offers a unifying framework to guide multidisciplinary collaboration between computational and clinical stakeholders in the field of neurorehabilitation.
Affiliation(s)
- Joshua G A Cashaback
- Biomedical Engineering, Mechanical Engineering, Kinesiology and Applied Physiology, Biomechanics and Movement Science Program, Interdisciplinary Neuroscience Graduate Program, University of Delaware, 540 S College Ave, Newark, DE, 19711, USA.
- Jessica L Allen
- Department of Mechanical Engineering, University of Florida, Gainesville, USA
- David J Lin
- Division of Neurocritical Care and Stroke Service, Department of Neurology, Center for Neurotechnology and Neurorecovery, Massachusetts General Hospital, Harvard Medical School, Boston, USA
- Department of Veterans Affairs, Center for Neurorestoration and Neurotechnology, Rehabilitation Research and Development Service, Providence, USA
- Mark A Price
- Department of Mechanical and Industrial Engineering, Department of Kinesiology, University of Massachusetts Amherst, Amherst, USA
- Natalija K Secerovic
- School of Electrical Engineering, The Mihajlo Pupin Institute, University of Belgrade, Belgrade, Serbia
- Laboratory for Neuroengineering, Institute for Robotics and Intelligent Systems ETH Zürich, Zurich, Switzerland
- Seungmoon Song
- Mechanical and Industrial Engineering, Northeastern University, Boston, USA
- Haohan Zhang
- Department of Mechanical Engineering, University of Utah, Salt Lake City, USA
- Haylie L Miller
- School of Kinesiology, University of Michigan, 830 N University Ave, Ann Arbor, MI, 48109, USA.
3
Parrella NF, Hill AT, Dipnall LM, Loke YJ, Enticott PG, Ford TC. Inhibitory dysfunction and social processing difficulties in autism: A comprehensive narrative review. J Psychiatr Res 2024; 169:113-125. [PMID: 38016393 DOI: 10.1016/j.jpsychires.2023.11.014] [Received: 04/13/2023] [Revised: 09/04/2023] [Accepted: 11/15/2023] [Indexed: 11/30/2023]
Abstract
The primary inhibitory neurotransmitter γ-aminobutyric acid (GABA) has a prominent role in regulating neural development and function, with disruption to GABAergic signalling linked to behavioural phenotypes associated with neurodevelopmental disorders, particularly autism. Such neurochemical disruption, likely resulting from diverse genetic and molecular mechanisms, particularly during early development, can subsequently affect the cellular balance of excitation and inhibition in neuronal circuits, which may account for the social processing difficulties observed in autism and related conditions. This comprehensive narrative review integrates diverse streams of research from several disciplines, including molecular neurobiology, genetics, epigenetics, and systems neuroscience. In so doing it aims to elucidate the relevance of inhibitory dysfunction to autism, with specific focus on social processing difficulties that represent a core feature of this disorder. Many of the social processing difficulties experienced in autism have been linked to higher levels of the excitatory neurotransmitter glutamate and/or lower levels of inhibitory GABA. While current therapeutic options for social difficulties in autism are largely limited to behavioural interventions, this review highlights the psychopharmacological studies that explore the utility of GABA modulation in alleviating such difficulties.
Affiliation(s)
- Aron T Hill
- Cognitive Neuroscience Unit, School of Psychology, Deakin University, Geelong, Australia; Department of Psychiatry, Central Clinical School, Monash University, Melbourne, Victoria, Australia
- Lillian M Dipnall
- Cognitive Neuroscience Unit, School of Psychology, Deakin University, Geelong, Australia; Early Life Epigenetics Group, Deakin University, Geelong, Australia
- Yuk Jing Loke
- Epigenetics Group, Murdoch Children's Research Institute, Melbourne, Victoria, Australia; Department of Paediatrics, The University of Melbourne, Melbourne, Victoria, Australia
- Peter G Enticott
- Cognitive Neuroscience Unit, School of Psychology, Deakin University, Geelong, Australia
- Talitha C Ford
- Cognitive Neuroscience Unit, School of Psychology, Deakin University, Geelong, Australia; Centre for Human Psychopharmacology, Faculty of Health, Arts and Design, Swinburne University of Technology, Melbourne, Victoria, Australia
4
Wang Y, Stroh JN, Hripcsak G, Low Wang CC, Bennett TD, Wrobel J, Der Nigoghossian C, Mueller SW, Claassen J, Albers DJ. A methodology of phenotyping ICU patients from EHR data: High-fidelity, personalized, and interpretable phenotypes estimation. J Biomed Inform 2023; 148:104547. [PMID: 37984547 PMCID: PMC10802138 DOI: 10.1016/j.jbi.2023.104547] [Received: 08/24/2023] [Revised: 11/13/2023] [Accepted: 11/16/2023] [Indexed: 11/22/2023]
Abstract
OBJECTIVE Computing phenotypes that provide high-fidelity, time-dependent characterizations and yield personalized interpretations is challenging, especially given the complexity of physiological and healthcare systems and clinical data quality. This paper develops a methodological pipeline to estimate unmeasured physiological parameters and produce high-fidelity, personalized phenotypes anchored to physiological mechanics from electronic health record (EHR) data.
METHODS A methodological phenotyping pipeline is developed that computes new phenotypes defined with unmeasurable computational biomarkers quantifying specific physiological properties in real time. Working within the inverse problem framework, this pipeline is applied to the glucose-insulin system for ICU patients using data assimilation to estimate an established mathematical physiological model with stochastic optimization. This produces physiological model parameter vectors of clinically unmeasured endocrine properties, here insulin secretion, clearance, and resistance, estimated for each individual patient. These physiological parameter vectors are used as inputs to unsupervised machine learning methods to produce phenotypic labels and discrete physiological phenotypes. These phenotypes are inherently interpretable because they are based on parametric physiological descriptors. To establish potential clinical utility, the computed phenotypes are evaluated with external EHR data for consistency and reliability and with clinician face validation.
RESULTS The phenotype computation was performed on a cohort of 109 ICU patients who received no or short-acting insulin therapy, rendering continuous and discrete physiological phenotypes as specific computational biomarkers of unmeasured insulin secretion, clearance, and resistance on time windows of three days. Six, six, and five discrete phenotypes were found in the first, middle, and last three-day periods of ICU stays, respectively. Computed phenotypic labels were predictive with an average accuracy of 89%. External validation of discrete phenotypes showed coherence and consistency in clinically observable differences based on laboratory measurements and ICD 9/10 codes and clinical concordance from face validity. A particularly clinically impactful parameter, insulin secretion, had a concordance accuracy of 83% ± 27%.
CONCLUSION The new physiological phenotypes computed with individual patient ICU data and defined by estimates of mechanistic model parameters have high physiological fidelity, are continuous, time-specific, personalized, interpretable, and predictive. This methodology is generalizable to other clinical and physiological settings and opens the door for discovering deeper physiological information to personalize medical care.
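The step in which per-patient parameter vectors are passed to unsupervised learning to obtain discrete phenotype labels can be sketched as follows. The abstract does not name the clustering method, so plain k-means (Lloyd's algorithm) is swapped in here purely for illustration. The function name `kmeans`, the choice of two clusters, and the toy (secretion, clearance, resistance) vectors are all assumptions and are not data or settings from the study.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy, unitless parameter vectors: (secretion, clearance, resistance).
toy_vectors = [
    (0.1, 0.2, 0.1),
    (0.9, 0.8, 0.9),
    (0.2, 0.1, 0.2),
    (0.8, 0.9, 0.8),
]
labels, centroids = kmeans(toy_vectors, k=2)
print(labels)  # [0, 1, 0, 1]
```

Because each cluster centroid is itself a vector of physiological parameters, a label can be read back as, for example, "low secretion, low clearance, low resistance", which is the sense in which the abstract calls such phenotypes interpretable.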
Affiliation(s)
- Yanran Wang
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, 13001 East 17th Place, 3rd Floor, Mail Stop B119, Aurora, CO 80045, United States of America; Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America.
- J N Stroh
- Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America; Department of Biomedical Engineering, University of Colorado, 12705 East Montview Boulevard, Suite 100, Aurora, CO 80045, United States of America
- George Hripcsak
- Biomedical Informatics, Columbia University, 622 W. 168th Street, PH20, New York, NY 10032, United States of America
- Cecilia C Low Wang
- Division of Endocrinology, Metabolism and Diabetes, Department of Medicine, University of Colorado School of Medicine, 12801 East 17th Avenue, 7103, Aurora, CO 80045, United States of America
- Tellen D Bennett
- Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America
- Julia Wrobel
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Rd, NE Atlanta, GA 30322, United States of America
- Caroline Der Nigoghossian
- Columbia University School of Nursing, 560 West 168th Street, New York, NY 10032, United States of America
- Scott W Mueller
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, 12850 East Montview Boulevard, Aurora, CO 80045, United States of America
- Jan Claassen
- The Neurological Institute of New York, Columbia University Irving Medical Center, 710 West 168th Street, New York NY 10032, United States of America
- D J Albers
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, 13001 East 17th Place, 3rd Floor, Mail Stop B119, Aurora, CO 80045, United States of America; Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America; Department of Biomedical Engineering, University of Colorado, 12705 East Montview Boulevard, Suite 100, Aurora, CO 80045, United States of America; Biomedical Informatics, Columbia University, 622 W. 168th Street, PH20, New York, NY 10032, United States of America
5
van Loo HM, de Vries YA, Taylor J, Todorovic L, Dollinger C, Kendler KS. Clinical characteristics indexing genetic differences in bipolar disorder - a systematic review. Mol Psychiatry 2023; 28:3661-3670. [PMID: 37968345 DOI: 10.1038/s41380-023-02297-4] [Received: 05/01/2023] [Revised: 09/29/2023] [Accepted: 10/06/2023] [Indexed: 11/17/2023]
Abstract
Bipolar disorder is a heterogeneous condition with a varied clinical presentation. While progress has been made in identifying genetic variants associated with bipolar disorder, most common genetic variants have not yet been identified. More detailed phenotyping (beyond diagnosis) may increase the chance of finding genetic variants. Our aim therefore was to identify clinical characteristics that index genetic differences in bipolar disorder. We performed a systematic review of all genome-wide molecular genetic, family, and twin studies investigating familial/genetic influences on the clinical characteristics of bipolar disorder. We performed an electronic database search of PubMed and PsycInfo until October 2022. We reviewed title/abstracts of 2693 unique records and full texts of 391 reports, identifying 445 relevant analyses from 142 different reports. These reports described 199 analyses from family studies, 183 analyses from molecular genetic studies and 63 analyses from other types of studies. We summarized the overall evidence per phenotype considering study quality, power, and number of studies. We found moderate to strong evidence for a positive association of age at onset, subtype (bipolar I versus bipolar II), psychotic symptoms and manic symptoms with familial/genetic risk of bipolar disorder. Sex was not associated with overall genetic risk but could indicate qualitative genetic differences. Assessment of genetically relevant clinical characteristics of patients with bipolar disorder can be used to increase the phenotypic and genetic homogeneity of the sample in future genetic studies, which may yield more power, increase specificity, and improve understanding of the genetic architecture of bipolar disorder.
Affiliation(s)
- Hanna M van Loo
- Department of Psychiatry and Interdisciplinary Center Psychopathology and Emotion regulation, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands.
- Ymkje Anna de Vries
- Department of Child and Adolescent Psychiatry, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Jacob Taylor
- Department of Psychiatry, Brigham and Women's Hospital, Boston, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Luka Todorovic
- Department of Psychiatry and Interdisciplinary Center Psychopathology and Emotion regulation, University of Groningen, University Medical Center Groningen, Groningen, the Netherlands
- Department of Child and Adolescent Psychiatry, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Camille Dollinger
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Kenneth S Kendler
- Virginia Institute for Psychiatric and Behavioral Genetics and Department of Psychiatry, Virginia Commonwealth University, Richmond, VA, USA
6
Murlanova K, Pletnikov MV. Modeling psychotic disorders: Environment x environment interaction. Neurosci Biobehav Rev 2023; 152:105310. [PMID: 37437753 DOI: 10.1016/j.neubiorev.2023.105310] [Received: 12/14/2022] [Revised: 06/26/2023] [Accepted: 07/05/2023] [Indexed: 07/14/2023]
Abstract
Schizophrenia is a major psychotic disorder with multifactorial etiology that includes interactions between genetic vulnerability and environmental risk factors. In addition, interplay of multiple environmental adversities affects neurodevelopment and may increase the individual risk of developing schizophrenia. Consistent with the two-hit hypothesis of schizophrenia, we review rodent models that combine maternal immune activation as the first hit with other adverse environmental exposures as the second hit. We discuss the strengths and pitfalls of the current animal models of environment x environment interplay and propose some future directions to advance the field.
Affiliation(s)
- Kateryna Murlanova
- Department of Physiology and Biophysics, Jacobs School of Medicine and Biomedical Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Mikhail V Pletnikov
- Department of Physiology and Biophysics, Jacobs School of Medicine and Biomedical Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.
7
Wang Y, Stroh JN, Hripcsak G, Low Wang CC, Bennett TD, Wrobel J, Der Nigoghossian C, Mueller S, Claassen J, Albers DJ. A methodology of phenotyping ICU patients from EHR data: high-fidelity, personalized, and interpretable phenotypes estimation. medRxiv 2023:2023.03.15.23287315. [PMID: 37662404 PMCID: PMC10473766 DOI: 10.1101/2023.03.15.23287315] [Indexed: 09/05/2023]
Abstract
Objective Computing phenotypes that provide high-fidelity, time-dependent characterizations and yield personalized interpretations is challenging, especially given the complexity of physiological and healthcare systems and clinical data quality. This paper develops a methodological pipeline to estimate unmeasured physiological parameters and produce high-fidelity, personalized phenotypes anchored to physiological mechanics from electronic health record (EHR) data.
Methods A methodological phenotyping pipeline is developed that computes new phenotypes defined with unmeasurable computational biomarkers quantifying specific physiological properties in real time. Working within the inverse problem framework, this pipeline is applied to the glucose-insulin system for ICU patients using data assimilation to estimate an established mathematical physiological model with stochastic optimization. This produces physiological model parameter vectors of clinically unmeasured endocrine properties, here insulin secretion, clearance, and resistance, estimated for each individual patient. These physiological parameter vectors are used as inputs to unsupervised machine learning methods to produce phenotypic labels and discrete physiological phenotypes. These phenotypes are inherently interpretable because they are based on parametric physiological descriptors. To establish potential clinical utility, the computed phenotypes are evaluated with external EHR data for consistency and reliability and with clinician face validation.
Results The phenotype computation was performed on a cohort of 109 ICU patients who received no or short-acting insulin therapy, rendering continuous and discrete physiological phenotypes as specific computational biomarkers of unmeasured insulin secretion, clearance, and resistance on time windows of three days. Six, six, and five discrete phenotypes were found in the first, middle, and last three-day periods of ICU stays, respectively. Computed phenotypic labels were predictive with an average accuracy of 89%. External validation of discrete phenotypes showed coherence and consistency in clinically observable differences based on laboratory measurements and ICD 9/10 codes and clinical concordance from face validity. A particularly clinically impactful parameter, insulin secretion, had a concordance accuracy of 83% ± 27%.
Conclusion The new physiological phenotypes computed with individual patient ICU data and defined by estimates of mechanistic model parameters have high physiological fidelity, are continuous, time-specific, personalized, interpretable, and predictive. This methodology is generalizable to other clinical and physiological settings and opens the door for discovering deeper physiological information to personalize medical care.
8
Callahan TJ, Stefanski AL, Wyrwa JM, Zeng C, Ostropolets A, Banda JM, Baumgartner WA, Boyce RD, Casiraghi E, Coleman BD, Collins JH, Deakyne Davies SJ, Feinstein JA, Lin AY, Martin B, Matentzoglu NA, Meeker D, Reese J, Sinclair J, Taneja SB, Trinkley KE, Vasilevsky NA, Williams AE, Zhang XA, Denny JC, Ryan PB, Hripcsak G, Bennett TD, Haendel MA, Robinson PN, Hunter LE, Kahn MG. Ontologizing health systems data at scale: making translational discovery a reality. NPJ Digit Med 2023; 6:89. [PMID: 37208468 PMCID: PMC10196319 DOI: 10.1038/s41746-023-00830-x] [Received: 09/09/2022] [Accepted: 04/28/2023] [Indexed: 05/21/2023] [Open access]
Abstract
Common data models solve many challenges of standardizing electronic health record (EHR) data but are unable to semantically integrate all of the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OBO ontologies requires significant manual curation and domain expertise. We introduce OMOP2OBO, an algorithm for mapping Observational Medical Outcomes Partnership (OMOP) vocabularies to OBO ontologies. Using OMOP2OBO, we produced mappings for 92,367 conditions, 8,611 drug ingredients, and 10,673 measurement results, which covered 68-99% of concepts used in clinical practice when examined across 24 hospitals. When used to phenotype rare disease patients, the mappings helped systematically identify undiagnosed patients who might benefit from genetic testing. By aligning OMOP vocabularies to OBO ontologies, our algorithm presents new opportunities to advance EHR-based deep phenotyping.
Affiliation(s)
- Tiffany J Callahan
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA.
- Adrianne L Stefanski
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Jordan M Wyrwa
- Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Chenjie Zeng
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Anna Ostropolets
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Juan M Banda
- Department of Computer Science, Georgia State University, Atlanta, GA, 30303, USA
- William A Baumgartner
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Richard D Boyce
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15260, USA
- Elena Casiraghi
- Computer Science, Università degli Studi di Milano, Milan, Italy
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Ben D Coleman
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Janine H Collins
- Department of Haematology, University of Cambridge, Cambridge, UK
- Sara J Deakyne Davies
- Department of Research Informatics & Data Science, Analytics Resource Center, Children's Hospital Colorado, Aurora, CO, 80045, USA
- James A Feinstein
- Adult and Child Center for Health Outcomes Research and Delivery Science (ACCORDS), University of Colorado Anschutz School of Medicine, Aurora, CO, 80045, USA
- Asiyah Y Lin
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Blake Martin
- Departments of Biomedical Informatics and Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Justin Reese
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Sanya B Taneja
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Katy E Trinkley
- Department of Family Medicine, University of Colorado Anschutz School of Medicine, Aurora, CO, 80045, USA
- Nicole A Vasilevsky
- Translational and Integrative Sciences Lab, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Andrew E Williams
- Tufts Institute for Clinical Research and Health Policy Studies, Tufts University, Boston, MA, 02155, USA
- Xingmin A Zhang
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Joshua C Denny
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Patrick B Ryan
- Janssen Research and Development, Raritan, NJ, 08869, USA
- George Hripcsak
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
- Tellen D Bennett
- Departments of Biomedical Informatics and Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Melissa A Haendel
- Departments of Biomedical Informatics and Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Lawrence E Hunter
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Michael G Kahn
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
9
Daniali M, Galer PD, Lewis-Smith D, Parthasarathy S, Kim E, Salvucci DD, Miller JM, Haag S, Helbig I. Enriching representation learning using 53 million patient notes through human phenotype ontology embedding. Artif Intell Med 2023; 139:102523. [PMID: 37100502 PMCID: PMC10782859 DOI: 10.1016/j.artmed.2023.102523] [Received: 07/01/2022] [Revised: 02/17/2023] [Accepted: 02/23/2023] [Indexed: 03/04/2023]
Abstract
The Human Phenotype Ontology (HPO) is a dictionary of >15,000 clinical phenotypic terms with defined semantic relationships, developed to standardize phenotypic analysis. Over the last decade, the HPO has been used to accelerate the implementation of precision medicine into clinical practice. In addition, recent research in representation learning, specifically in graph embedding, has led to notable progress in automated prediction via learned features. Here, we present a novel approach to phenotype representation by incorporating phenotypic frequencies based on 53 million full-text health care notes from >1.5 million individuals. We demonstrate the efficacy of our proposed phenotype embedding technique by comparing our work to existing phenotypic similarity-measuring methods. Using phenotype frequencies in our embedding technique, we are able to identify phenotypic similarities that surpass current computational models. Furthermore, our embedding technique exhibits a high degree of agreement with domain experts' judgment. By transforming complex and multidimensional phenotypes from the HPO format into vectors, our proposed method enables efficient representation of these phenotypes for downstream tasks that require deep phenotyping. This is demonstrated in a patient similarity analysis and can further be applied to disease trajectory and risk prediction.
Affiliation(s)
- Maryam Daniali
  - Department of Computer Science, Drexel University, Philadelphia, PA, USA
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Peter D Galer
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, PA, USA
- David Lewis-Smith
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Translational and Clinical Research Institute, Newcastle University, Newcastle-upon-Tyne, UK
  - Department of Clinical Neurosciences, Royal Victoria Infirmary, Newcastle-upon-Tyne, UK
- Shridhar Parthasarathy
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Edward Kim
  - Department of Computer Science, Drexel University, Philadelphia, PA, USA
- Dario D Salvucci
  - Department of Computer Science, Drexel University, Philadelphia, PA, USA
- Jeffrey M Miller
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Scott Haag
  - Department of Computer Science, Drexel University, Philadelphia, PA, USA
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
- Ingo Helbig
  - Department of Biomedical and Health Informatics (DBHi), Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - The Epilepsy Neuro Genetics Initiative (ENGIN), Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Neurology, University of Pennsylvania, Perelman School of Medicine, Philadelphia, PA, USA
10
Kim DH, Jensen A, Jones K, Raghavan S, Phillips LS, Hung A, Sun YV, Li G, Reaven P, Zhou H, Zhou JJ. A platform for phenotyping disease progression and associated longitudinal risk factors in large-scale EHRs, with application to incident diabetes complications in the UK Biobank. JAMIA Open 2023; 6:ooad006. [PMID: 36789288 PMCID: PMC9912368 DOI: 10.1093/jamiaopen/ooad006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Revised: 01/19/2023] [Accepted: 01/31/2023] [Indexed: 02/12/2023] Open
Abstract
Objective Modern healthcare data reflect massive multi-level and multi-scale information collected over many years. The majority of the existing phenotyping algorithms use case-control definitions of disease. This paper aims to study the time to disease onset and progression and identify the time-varying risk factors that drive them. Materials and Methods We developed an algorithmic approach to phenotyping the incidence of diseases by consolidating data sources from the UK Biobank (UKB), including primary care electronic health records (EHRs). We focused on defining events, event dates, and their censoring time, including relevant terms and existing phenotypes, excluding generic, rare, or semantically distant terms, forward-mapping terminology terms, and expert review. We applied our approach to phenotyping diabetes complications, including a composite cardiovascular disease (CVD) outcome, diabetic kidney disease (DKD), and diabetic retinopathy (DR), in the UKB study. Results We identified 49 049 participants with diabetes. Among them, 1023 had type 1 diabetes (T1D), and 40 193 had type 2 diabetes (T2D). A total of 23 833 diabetes subjects had linked primary care records. There were 3237, 3113, and 4922 patients with CVD, DKD, and DR events, respectively. The risk prediction performance for each outcome was assessed, and our results are consistent with the prediction area under the ROC (receiver operating characteristic) curve (AUC) of standard risk prediction models using cohort studies. Discussion and Conclusion Our publicly available pipeline and platform enable streamlined curation of incidence events, identification of time-varying risk factors underlying disease progression, and the definition of a relevant cohort for time-to-event analyses. These important steps need to be considered simultaneously to study disease progression.
Affiliation(s)
- Do Hyun Kim
  - Department of Biostatistics, University of California, Los Angeles, California, USA
- Aubrey Jensen
  - Department of Biostatistics, University of California, Los Angeles, California, USA
- Kelly Jones
  - Department of Computer Science, Columbia University, New York, New York, USA
- Sridharan Raghavan
  - Division of Hospital Medicine, University of Colorado School of Medicine, Aurora, Colorado, USA
  - Rocky Mountain Regional VA Medical Center, Aurora, Colorado, USA
- Lawrence S Phillips
  - Division of Endocrinology, Emory University School of Medicine, Atlanta, Georgia, USA
  - Atlanta VA Medical Center, Decatur, Georgia, USA
- Adriana Hung
  - VA Tennessee Valley Healthcare System, Nashville, Tennessee, USA
  - Vanderbilt University, Nashville, Tennessee, USA
- Yan V Sun
  - Department of Epidemiology, Emory University, Atlanta, Georgia, USA
- Gang Li
  - Department of Biostatistics, University of California, Los Angeles, California, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- Peter Reaven
  - Phoenix VA Health Care System, Phoenix, Arizona, USA
- Hua Zhou
  - Department of Biostatistics, University of California, Los Angeles, California, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, California, USA
- Jin J Zhou
  - Department of Biostatistics, University of California, Los Angeles, California, USA
  - Phoenix VA Health Care System, Phoenix, Arizona, USA
  - Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, California, USA
11
Yang S, Varghese P, Stephenson E, Tu K, Gronsbell J. Machine learning approaches for electronic health records phenotyping: a methodical review. J Am Med Inform Assoc 2023; 30:367-381. [PMID: 36413056 PMCID: PMC9846699 DOI: 10.1093/jamia/ocac216] [Citation(s) in RCA: 23] [Impact Index Per Article: 23.0] [Received: 07/28/2022] [Revised: 09/27/2022] [Accepted: 10/27/2022] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE Accurate and rapid phenotyping is a prerequisite to leveraging electronic health records for biomedical research. While early phenotyping relied on rule-based algorithms curated by experts, machine learning (ML) approaches have emerged as an alternative to improve scalability across phenotypes and healthcare settings. This study evaluates ML-based phenotyping with respect to (1) the data sources used, (2) the phenotypes considered, (3) the methods applied, and (4) the reporting and evaluation methods used. MATERIALS AND METHODS We searched PubMed and Web of Science for articles published between 2018 and 2022. After screening 850 articles, we recorded 37 variables on 100 studies. RESULTS Most studies utilized data from a single institution and included information in clinical notes. Although chronic conditions were most commonly considered, ML also enabled the characterization of nuanced phenotypes such as social determinants of health. Supervised deep learning was the most popular ML paradigm, while semi-supervised and weakly supervised learning were applied to expedite algorithm development and unsupervised learning to facilitate phenotype discovery. ML approaches did not uniformly outperform rule-based algorithms, but deep learning offered a marginal improvement over traditional ML for many conditions. DISCUSSION Despite the progress in ML-based phenotyping, most articles focused on binary phenotypes and few articles evaluated external validity or used multi-institution data. Study settings were infrequently reported and analytic code was rarely released. CONCLUSION Continued research in ML-based phenotyping is warranted, with emphasis on characterizing nuanced phenotypes, establishing reporting and evaluation standards, and developing methods to accommodate misclassified phenotypes due to algorithm errors in downstream applications.
Affiliation(s)
- Siyue Yang
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
- Ellen Stephenson
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Karen Tu
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
- Jessica Gronsbell
  - Department of Statistical Sciences, University of Toronto, Toronto, Ontario, Canada
  - Department of Family & Community Medicine, University of Toronto, Toronto, Ontario, Canada
  - Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
12
Maintz L, Welchowski T, Herrmann N, Brauer J, Traidl-Hoffmann C, Havenith R, Müller S, Rhyner C, Dreher A, Schmid M, Bieber T. IL-13, periostin and dipeptidyl-peptidase-4 reveal endotype-phenotype associations in atopic dermatitis. Allergy 2023; 78:1554-1569. [PMID: 36647778 DOI: 10.1111/all.15647] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/09/2022] [Revised: 11/28/2022] [Accepted: 12/10/2022] [Indexed: 01/18/2023]
Abstract
BACKGROUND The heterogeneous (endo)phenotypes of atopic dermatitis (AD) require precision medicine. Currently, systemic therapy is recommended to patients with an Eczema Area and Severity Index (EASI) ≥ 16. Previous studies have demonstrated an improved treatment response to the anti-interleukin (IL)-13 antibody tralokinumab in AD subgroups with elevated levels of the IL-13-related biomarkers dipeptidyl-peptidase (DPP)-4 and periostin. METHODS Herein, 373 AD patients aged ≥ 12 years were stratified by IL-13^high, periostin^high and DPP-4^high endotypes using cross-sectional data from the ProRaD cohort Bonn. "High" was defined as >80th quantile of 47 non-atopic controls. We analyzed endotype-phenotype associations using machine-learning gradient boosting compared to logistic regression. RESULTS AD severity and eosinophils correlated with IL-13 and periostin levels. Correlations of IL-13 with EASI were stronger in patients with increased (r_s = 0.482) than with normal (r_s = 0.342) periostin levels. We identified eosinophilia > 6% and an EASI range of 5.5-17, dependent on the biomarker combination, to be associated with increasing probabilities of biomarker^high endotypes. Patients with mild-to-low-moderate severity (EASI < 16) also featured increased biomarkers (IL-13^high: 41%, periostin^high: 48.4%, DPP-4^high: 22.3%). Hertoghe sign (adjusted odds ratio (aOR) = 1.89, 95% confidence interval (CI) [1.14-3.14]) and maternal allergic rhinitis (aOR = 2.79-4.47) increased the probability of an IL-13^high endotype; "dirty neck" (aOR = 2.83 [1.32-6.07]), orbital darkening (aOR = 2.43 [1.08-5.50]), keratosis pilaris (aOR = 2.21 [1.1-4.42]) and perleche (aOR = 3.44 [1.72-6.86]) increased the probability of a DPP-4^high endotype. CONCLUSIONS A substantial proportion of patients with EASI < 16 featured high biomarker levels, suggesting systemic impact of skin inflammation already below the current cut-off for systemic therapy. Our findings facilitate the identification of patients with distinct endotypes potentially linked to response to IL-13-targeted therapy.
Affiliation(s)
- Laura Maintz
  - Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Bonn, Germany
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
- Thomas Welchowski
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
  - Department of Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Venusberg-Campus 1, Bonn, Germany
- Nadine Herrmann
  - Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Bonn, Germany
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
- Juliette Brauer
  - Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Bonn, Germany
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
- Claudia Traidl-Hoffmann
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
  - Environmental Medicine, Faculty of Medicine, University of Augsburg, Stenglinstraße 2, Augsburg, Germany
  - Institute of Environmental Medicine, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health, Augsburg, Germany
- Regina Havenith
  - Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Bonn, Germany
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
- Svenja Müller
  - Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Bonn, Germany
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
- Claudio Rhyner
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
  - Davos Biosciences, Herman-Burchard-Str. 1, 7265, Davos, Switzerland
- Anita Dreher
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
  - Davos Biosciences, Herman-Burchard-Str. 1, 7265, Davos, Switzerland
- Matthias Schmid
  - Department of Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Venusberg-Campus 1, Bonn, Germany
- Thomas Bieber
  - Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Bonn, Germany
  - Christine Kühne Center for Allergy Research and Education Davos (CK-CARE), Herman-Burchard-Str. 1, 7265, Davos, Switzerland
  - Davos Biosciences, Herman-Burchard-Str. 1, 7265, Davos, Switzerland
13
Azizi S, Hier DB, Wunsch II DC. Enhanced neurologic concept recognition using a named entity recognition model based on transformers. Front Digit Health 2022; 4:1065581. [PMID: 36569804 PMCID: PMC9772022 DOI: 10.3389/fdgth.2022.1065581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Accepted: 11/21/2022] [Indexed: 12/12/2022] Open
Abstract
Although deep learning has been applied to the recognition of diseases and drugs in electronic health records and the biomedical literature, relatively little study has been devoted to the utility of deep learning for the recognition of signs and symptoms. The recognition of signs and symptoms is critical to the success of deep phenotyping and precision medicine. We have developed a named entity recognition model that uses deep learning to identify text spans containing neurological signs and symptoms and then maps these text spans to the clinical concepts of a neuro-ontology. We compared a model based on convolutional neural networks to one based on bidirectional encoder representations from transformers. Models were evaluated for accuracy of text span identification on three text corpora: physician notes from an electronic health record, case histories from neurologic textbooks, and clinical synopses from an online database of genetic diseases. Both models performed best on the professionally-written clinical synopses and worst on the physician-written clinical notes. Both models performed better when signs and symptoms were represented as shorter text spans. Consistent with prior studies that examined the recognition of diseases and drugs, the model based on bidirectional encoder representations from transformers outperformed the model based on convolutional neural networks for recognizing signs and symptoms. Recall for signs and symptoms ranged from 59.5% to 82.0% and precision ranged from 61.7% to 80.4%. With further advances in NLP, fully automated recognition of signs and symptoms in electronic health records and the medical literature should be feasible.
Affiliation(s)
- Sima Azizi
  - Applied Computational Intelligence Laboratory, Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, United States
- Daniel B. Hier
  - Applied Computational Intelligence Laboratory, Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, United States
  - Department of Neurology and Rehabilitation, University of Illinois at Chicago, Chicago, IL, United States
- Donald C. Wunsch II
  - Applied Computational Intelligence Laboratory, Department of Electrical & Computer Engineering, Missouri University of Science & Technology, Rolla, MO, United States
  - National Science Foundation, ECCS Division, Arlington, VA, United States
14
Bavaresco RS, Barbosa JLV. Ubiquitous computing in light of human phenotypes: foundations, challenges, and opportunities. Journal of Ambient Intelligence and Humanized Computing 2022; 14:2341-2349. [PMID: 36530468 PMCID: PMC9735054 DOI: 10.1007/s12652-022-04489-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Accepted: 11/28/2022] [Indexed: 06/17/2023]
Abstract
The interest in human phenotypes has leveraged interdisciplinary efforts encouraging a better understanding of the broad spectrum of psychological and behavioral disorders. Moreover, the usage of mobile and wearable devices along with unobtrusive computational capabilities provides an extensive amount of information that allows the characterization of phenotypes. This article describes the human phenotype through the lens of computational range and reviews state-of-the-art computational phenotyping. Furthermore, the article discusses computational phenotyping's extension concerning the combination of intelligent environments and personal mobile devices, addressing technical, managerial, and ethical challenges. This combination reinforces ubiquitous computational capabilities for phenotyping as a facilitator of interdisciplinary information convergence in favor of clinical and biomedical research.
Affiliation(s)
- Rodrigo Simon Bavaresco
  - Applied Computing Graduate Program - PPGCA, University of Vale do Rio dos Sinos - UNISINOS, Av. Unisinos, São Leopoldo, Rio Grande do Sul, 93.022-000 Brazil
- Jorge Luis Victória Barbosa
  - Applied Computing Graduate Program - PPGCA, University of Vale do Rio dos Sinos - UNISINOS, Av. Unisinos, São Leopoldo, Rio Grande do Sul, 93.022-000 Brazil
15
Bragazzi NL, Garbarino S, Puce L, Trompetto C, Marinelli L, Currà A, Jahrami H, Trabelsi K, Mellado B, Asgary A, Wu J, Kong JD. Planetary sleep medicine: Studying sleep at the individual, population, and planetary level. Front Public Health 2022; 10:1005100. [PMID: 36330122 PMCID: PMC9624384 DOI: 10.3389/fpubh.2022.1005100] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/28/2022] [Accepted: 09/20/2022] [Indexed: 01/27/2023] Open
Abstract
Circadian rhythms are a series of endogenous autonomous oscillators that are generated by the molecular circadian clock which coordinates and synchronizes internal time with the external environment in a 24-h daily cycle (that can also be shorter or longer than 24 h). Besides daily rhythms, other biological rhythms with different time scales also exist, including seasonal and annual rhythms. Circadian and other biological rhythms deeply permeate human life at every level, spanning from the molecular, subcellular, cellular, tissue, and organismal level to environmental exposures and behavioral lifestyles. Humans are immersed in what has been called the "circadian landscape," with circadian rhythms being highly pervasive and ubiquitous, and affecting every ecosystem on the planet, from plants to insects, fishes, birds, mammals, and other animals. Anthropogenic behaviors have been producing a cascading and compounding series of effects, including detrimental impacts on human health. However, the effects of climate change on sleep have been relatively overlooked. In the present narrative review paper, we wanted to offer a way to re-read/re-think sleep medicine from a planetary health perspective. Climate change, through a complex series of either direct or indirect mechanisms, including (i) pollution- and poor air quality-induced oxygen saturation variability/hypoxia, (ii) changes in light conditions and increases in the nighttime, (iii) fluctuating temperatures, warmer values, and heat due to extreme weather, and (iv) psychological distress imposed by disasters (like floods, wildfires, droughts, hurricanes, and infectious outbreaks by emerging and reemerging pathogens), may contribute to inducing mismatches between internal time and external environment, and disrupting sleep, causing poor sleep quantity and quality and sleep disorders, such as insomnia, and sleep-related breathing issues, among others. Climate change will generate relevant costs and impact more vulnerable populations in underserved areas, thus widening already existing global geographic, age-, sex-, and gender-related inequalities.
Affiliation(s)
- Nicola Luigi Bragazzi (corresponding author)
  - Laboratory for Industrial and Applied Mathematics (LIAM), Department of Mathematics and Statistics, York University, Toronto, ON, Canada
- Sergio Garbarino
  - Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics and Maternal/Child Sciences (DINOGMI), University of Genoa, Genoa, Italy
- Luca Puce
  - Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics and Maternal/Child Sciences (DINOGMI), University of Genoa, Genoa, Italy
- Carlo Trompetto
  - Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics and Maternal/Child Sciences (DINOGMI), University of Genoa, Genoa, Italy
  - Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
- Lucio Marinelli
  - Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics and Maternal/Child Sciences (DINOGMI), University of Genoa, Genoa, Italy
  - Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
- Antonio Currà
  - Department of Medical-Surgical Sciences and Biotechnologies, Academic Neurology Unit, Ospedale A. Fiorini, Terracina, Italy
  - Sapienza University of Rome, Rome, Italy
- Haitham Jahrami
  - Ministry of Health, Manama, Bahrain
  - College of Medicine and Medical Sciences, Arabian Gulf University, Manama, Bahrain
- Khaled Trabelsi
  - High Institute of Sport and Physical Education of Sfax, University of Sfax, Sfax, Tunisia
  - Research Laboratory: Education, Motricity, Sport and Health, EM2S, LR19JS01, University of Sfax, Sfax, Tunisia
- Bruce Mellado
  - School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa
  - Subatomic Physics, iThemba Laboratory for Accelerator Based Sciences, Somerset West, South Africa
- Ali Asgary
  - Disaster and Emergency Management Area and Advanced Disaster, Emergency and Rapid-Response Simulation (ADERSIM), School of Administrative Studies, York University, Toronto, ON, Canada
- Jianhong Wu
  - Laboratory for Industrial and Applied Mathematics (LIAM), Department of Mathematics and Statistics, York University, Toronto, ON, Canada
- Jude Dzevela Kong
  - Laboratory for Industrial and Applied Mathematics (LIAM), Department of Mathematics and Statistics, York University, Toronto, ON, Canada
16
Abraham A, Le B, Kosti I, Straub P, Velez-Edwards DR, Davis LK, Newton JM, Muglia LJ, Rokas A, Bejan CA, Sirota M, Capra JA. Dense phenotyping from electronic health records enables machine learning-based prediction of preterm birth. BMC Med 2022; 20:333. [PMID: 36167547 PMCID: PMC9516830 DOI: 10.1186/s12916-022-02522-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/10/2022] [Accepted: 08/10/2022] [Indexed: 11/16/2022] Open
Abstract
BACKGROUND Identifying pregnancies at risk for preterm birth, one of the leading causes of worldwide infant mortality, has the potential to improve prenatal care. However, we lack broadly applicable methods to accurately predict preterm birth risk. The dense longitudinal information present in electronic health records (EHRs) is enabling scalable and cost-efficient risk modeling of many diseases, but EHR resources have been largely untapped in the study of pregnancy. METHODS Here, we apply machine learning to diverse data from EHRs with 35,282 deliveries to predict singleton preterm birth. RESULTS We find that machine learning models based on billing codes alone can predict preterm birth risk at various gestational ages (e.g., ROC-AUC = 0.75, PR-AUC = 0.40 at 28 weeks of gestation) and outperform comparable models trained using known risk factors (e.g., ROC-AUC = 0.65, PR-AUC = 0.25 at 28 weeks). Examining the patterns learned by the model reveals it stratifies deliveries into interpretable groups, including high-risk preterm birth subtypes enriched for distinct comorbidities. Our machine learning approach also predicts preterm birth subtypes (spontaneous vs. indicated), mode of delivery, and recurrent preterm birth. Finally, we demonstrate the portability of our approach by showing that the prediction models maintain their accuracy on a large, independent cohort (5978 deliveries) from a different healthcare system. CONCLUSIONS By leveraging rich phenotypic and genetic features derived from EHRs, we suggest that machine learning algorithms have great potential to improve medical care during pregnancy. However, further work is needed before these models can be applied in clinical settings.
Affiliation(s)
- Abin Abraham
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, 37235, USA
  - Vanderbilt University Medical Center, Vanderbilt University, Nashville, TN, 37232, USA
- Brian Le
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
- Idit Kosti
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA
- Peter Straub
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, 37235, USA
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Digna R Velez-Edwards
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, 37235, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN, USA
- Lea K Davis
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, 37235, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Psychiatry and Behavioral Sciences, Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- J M Newton
  - Department of Obstetrics and Gynecology, Vanderbilt University Medical Center, Nashville, TN, USA
- Louis J Muglia
  - Burroughs-Wellcome Fund, Research Triangle Park, NC, USA
- Antonis Rokas
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Biological Sciences, Vanderbilt University, Nashville, USA
- Cosmin A Bejan
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Marina Sirota
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, CA, USA
  - Department of Pediatrics, University of California, San Francisco, San Francisco, CA, USA
- John A Capra
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN, 37235, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Biological Sciences, Vanderbilt University, Nashville, USA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, USA
17
Liang C, Weissman S, Olatosi B, Poon EG, Yarrington ME, Li X. Curating a knowledge base for individuals with coinfection of HIV and SARS-CoV-2: a study protocol of EHR-based data mining and clinical implementation. BMJ Open 2022; 12:e067204. [PMID: 36100301 PMCID: PMC9471209 DOI: 10.1136/bmjopen-2022-067204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 08/25/2022] [Indexed: 11/04/2022] Open
Abstract
INTRODUCTION Despite a higher risk of severe COVID-19 disease in individuals with HIV, the interactions between SARS-CoV-2 and HIV infections remain unclear. To delineate these interactions, multicentre Electronic Health Records (EHR) hold existing promise to provide full-spectrum and longitudinal clinical data, demographics and sociobehavioural data at individual level. Presently, a comprehensive EHR-based cohort for the HIV/SARS-CoV-2 coinfection has not been established; EHR integration and data mining methods tailored for studying the coinfection are urgently needed yet remain underdeveloped. METHODS AND ANALYSIS The overarching goal of this exploratory/developmental study is to establish an EHR-based cohort for individuals with HIV/SARS-CoV-2 coinfection and perform large-scale EHR-based data mining to examine the interactions between HIV and SARS-CoV-2 infections and systematically identify and validate factors contributing to the severe clinical course of the coinfection. We will use a nationwide EHR database in the USA, namely, National COVID Cohort Collaborative (N3C). Ultimately, collected clinical evidence will be implemented and used to pilot test a clinical decision support prototype to assist providers in screening and referral of at-risk patients in real-world clinics. ETHICS AND DISSEMINATION The study was approved by the institutional review boards at the University of South Carolina (Pro00121828) as non-human subject study. Study findings will be presented at academic conferences and published in peer-reviewed journals. This study will disseminate urgently needed clinical evidence for guiding clinical practice for individuals with the coinfection at Prisma Health, a healthcare system in collaboration.
Affiliation(s)
- Chen Liang
- Department of Health Services Policy and Management, University of South Carolina, Columbia, South Carolina, USA
- Big Data Health Science Center, University of South Carolina, Columbia, South Carolina, USA
| | - Sharon Weissman
- Big Data Health Science Center, University of South Carolina, Columbia, South Carolina, USA
- Department of Internal Medicine, University of South Carolina, Columbia, South Carolina, USA
| | - Bankole Olatosi
- Department of Health Services Policy and Management, University of South Carolina, Columbia, South Carolina, USA
- Big Data Health Science Center, University of South Carolina, Columbia, South Carolina, USA
| | - Eric G Poon
- Department of Medicine, Duke University, Durham, North Carolina, USA
| | | | - Xiaoming Li
- Big Data Health Science Center, University of South Carolina, Columbia, South Carolina, USA
- Department of Health Promotion Education and Behavior, University of South Carolina, Columbia, South Carolina, USA
|
18
|
Karalexi MA, Eberhard-Gran M, Valdimarsdóttir UA, Karlsson H, Munk-Olsen T, Skalkidou A. Perinatal mental health: how Nordic data sources have contributed to existing evidence and future avenues to explore. Nord J Psychiatry 2022; 76:423-432. [PMID: 35057712 DOI: 10.1080/08039488.2021.1998616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
PURPOSE Perinatal mental health disorders affect a significant number of women with debilitating and potentially life-threatening consequences. Researchers in Nordic countries have access to high quality, population-based data sources and the possibility to link data, and are thus uniquely positioned to fill current evidence gaps. We aimed to review how Nordic studies have contributed to existing evidence on perinatal mental health. METHODS We summarized examples of published evidence on perinatal mental health derived from large population-based longitudinal and register-based data from Denmark, Finland, Iceland, Norway and Sweden. RESULTS Nordic datasets, such as the Danish National Birth Cohort, the FinnBrain Birth Cohort Study, the Icelandic SAGA cohort, the Norwegian MoBa and ABC studies, as well as the Swedish BASIC and Mom2B studies facilitate the study of prevalence of perinatal mental disorders, and further provide opportunity to prospectively test etiological hypotheses, yielding comprehensive suggestions about the underlying causal mechanisms. The large sample size, extensive follow-up, multiple measurement points, large geographic coverage, biological sampling and the possibility to link data to national registries renders them unique. The use of novel approaches, such as the digital phenotyping data in the novel application-based Mom2B cohort recording even voice qualities and digital phenotyping, or the Danish study design paralleling a natural experiment are considered strengths of such research. CONCLUSIONS Nordic data sources have contributed substantially to the existing evidence, and can guide future work focused on the study of background, genetic and environmental factors to ultimately define vulnerable groups at risk for psychiatric disorders following childbirth.
Affiliation(s)
- Maria A Karalexi
- Department for Women's and Children's Health, Uppsala University, Uppsala, Sweden
| | - Malin Eberhard-Gran
- Norwegian Research Centre for Women's Health, Women and Children's Division, Oslo University Hospital, Oslo, Norway
- Institute of Clinical Medicine, University of Oslo, Oslo, Norway
| | - Unnur Anna Valdimarsdóttir
- Center of Public Health Sciences, Faculty of Medicine, University of Iceland, Reykjavik, Iceland
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Department of Epidemiology, Harvard TH Chan School of Public Health, Harvard University, Boston, MA, USA
| | - Hasse Karlsson
- Department of Psychiatry and Centre for Population Health Research, University of Turku and Turku University Hospital, Turku, Finland
| | - Trine Munk-Olsen
- The National Centre for Register-based Research, Aarhus University, Aarhus, Denmark
- Department of Clinical Research, University of Southern Denmark, Odense, Denmark
| | - Alkistis Skalkidou
- Department for Women's and Children's Health, Uppsala University, Uppsala, Sweden
|
19
|
Digital tools for the assessment of pharmacological treatment for depressive disorder: State of the art. Eur Neuropsychopharmacol 2022; 60:100-116. [PMID: 35671641 DOI: 10.1016/j.euroneuro.2022.05.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/29/2021] [Revised: 05/13/2022] [Accepted: 05/17/2022] [Indexed: 12/23/2022]
Abstract
Depression is an invalidating disorder, marked by phenotypic heterogeneity. Clinical assessments for treatment adjustments and data-collection for pharmacological research often rely on subjective representations of functioning. Better phenotyping through digital applications may add unseen information and facilitate disentangling the clinical characteristics and impact of depression and its pharmacological treatment in everyday life. Researchers, physicians, and patients benefit from well-understood digital phenotyping approaches to assess the treatment efficacy and side-effects. This review discusses the current possibilities and pitfalls of wearables and technology for the assessment of the pharmacological treatment of depression. Their applications in the whole spectrum of treatment for depression, including diagnosis, treatment of an episode, and monitoring of relapse risk and prevention are discussed. Multiple aspects are to be considered, including concerns that come with collecting sensitive data and health recordings. Also, privacy and trust are addressed. Available applications range from questionnaire-like apps to objective assessment of behavioural patterns and promises in handling suicidality. Nonetheless, interpretation and integration of this high-resolution information with other phenotyping levels, remains challenging. This review provides a state-of-the-art description of wearables and technology in digital phenotyping for monitoring pharmacological treatment in depression, focusing on the challenges and opportunities of its application in clinical trials and research.
|
20
|
Krantz MS, Kerchberger VE, Wei WQ. Novel Analysis Methods to Mine Immune-Mediated Phenotypes and Find Genetic Variation Within the Electronic Health Record (Roadmap for Phenotype to Genotype: Immunogenomics). J Allergy Clin Immunol Pract 2022; 10:1757-1762. [PMID: 35487368 PMCID: PMC9624141 DOI: 10.1016/j.jaip.2022.04.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/27/2022] [Revised: 04/13/2022] [Accepted: 04/18/2022] [Indexed: 06/14/2023]
Abstract
The field of immunogenomics has the opportunity for accelerated genetic discovery aided by the maturation of electronic health records (EHRs) linked to DNA biobanks. Novel analysis methods in deep phenotyping of EHR data allow the full realization of the paired and increasingly dense genetic/phenotypic information available. This enables researchers to uncover genetic risk factors for the prevention and optimal treatment of immune-mediated diseases and immune-mediated adverse drug reactions. This article reviews the background of EHRs linked to DNA biobanks, potential applications to immunogenomic discovery, and current and emerging techniques in EHR-based deep phenotyping.
Affiliation(s)
- Matthew S Krantz
- Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tenn.
| | - V Eric Kerchberger
- Division of Allergy, Pulmonary and Critical Care Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, Tenn
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tenn
| | - Wei-Qi Wei
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tenn
|
21
|
Schalkamp AK, Rahman N, Monzón-Sandoval J, Sandor C. Deep phenotyping for precision medicine in Parkinson's disease. Dis Model Mech 2022; 15:dmm049376. [PMID: 35647913 PMCID: PMC9178512 DOI: 10.1242/dmm.049376] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/11/2022] Open
Abstract
A major challenge in medical genomics is to understand why individuals with the same disorder have different clinical symptoms and why those who carry the same mutation may be affected by different disorders. In every complex disorder, identifying the contribution of different genetic and non-genetic risk factors is a key obstacle to understanding disease mechanisms. Genetic studies rely on precise phenotypes and are unable to uncover the genetic contributions to a disorder when phenotypes are imprecise. To address this challenge, deeply phenotyped cohorts have been developed for which detailed, fine-grained data have been collected. These cohorts help us to investigate the underlying biological pathways and risk factors to identify treatment targets, and thus to advance precision medicine. The neurodegenerative disorder Parkinson's disease has a diverse phenotypical presentation and modest heritability, and its underlying disease mechanisms are still being debated. As such, considerable efforts have been made to develop deeply phenotyped cohorts for this disorder. Here, we focus on Parkinson's disease and explore how deep phenotyping can help address the challenges raised by genetic and phenotypic heterogeneity. We also discuss recent methods for data collection and computation, as well as methodological challenges that have to be overcome.
Affiliation(s)
| | | | | | - Cynthia Sandor
- UK Dementia Research Institute at Cardiff University, Division of Psychological Medicine and Clinical Neuroscience, Haydn Ellis Building, Maindy Road, Cardiff CF24 4HQ, UK
|
22
|
Deng L, Zhang X, Yang T, Liu M, Chen L, Jiang T. PIAT: an evolutionarily intelligent system for deep phenotyping of Chinese electronic health records. IEEE J Biomed Health Inform 2022; 26:4142-4152. [PMID: 35609107 DOI: 10.1109/jbhi.2022.3177421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
Abstract
Electronic health record (EHR) resources are valuable but remain underexplored because most clinical information, especially phenotype information, is buried in the free text of EHRs. An intelligent annotation tool plays an important role in unlocking the full potential of EHRs by transforming free-text phenotype information into a computer-readable form. Deep phenotyping has shown its advantage in representing phenotype information in EHRs with high fidelity; however, most existing annotation tools are not suitable for the deep phenotyping task. Here, we developed an intelligent annotation tool named PIAT with a major focus on the deep phenotyping of Chinese EHRs. PIAT can improve the annotation efficiency for EHR-based deep phenotyping with a simple but effective interactive interface, automatic preannotation support, and a learning mechanism. Specifically, experts can proofread automatic annotation results from the annotation algorithm in the web-based interactive interface, and EHRs reviewed by experts can be used for evolving the underlying annotation algorithm. In this way, the annotation process of deep phenotyping EHRs will become easier. In conclusion, we create a powerful intelligent system for the deep phenotyping of Chinese EHRs. It is hoped that our work will inspire further studies in constructing intelligent systems for deep phenotyping English and non-English EHRs.
|
23
|
Khoury P, Srinivasan R, Kakumanu S, Ochoa S, Keswani A, Sparks R, Rider NL. A Framework for Augmented Intelligence in Allergy and Immunology Practice and Research—A Work Group Report of the AAAAI Health Informatics, Technology, and Education Committee. J Allergy Clin Immunol Pract 2022; 10:1178-1188. [PMID: 35300959 PMCID: PMC9205719 DOI: 10.1016/j.jaip.2022.01.047] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/16/2021] [Revised: 01/19/2022] [Accepted: 01/20/2022] [Indexed: 10/18/2022]
Abstract
Artificial and augmented intelligence (AI) and machine learning (ML) methods are expanding into the health care space. Big data are increasingly used in patient care applications, diagnostics, and treatment decisions in allergy and immunology. How these technologies will be evaluated, approved, and assessed for their impact is an important consideration for researchers and practitioners alike. With the potential of ML, deep learning, natural language processing, and other assistive methods to redefine health care usage, a scaffold for the impact of AI technology on research and patient care in allergy and immunology is needed. An American Academy of Asthma Allergy and Immunology Health Information Technology and Education subcommittee workgroup was convened to perform a scoping review of AI within health care as well as the specialty of allergy and immunology to address impacts on allergy and immunology practice and research as well as potential challenges including education, AI governance, ethical and equity considerations, and potential opportunities for the specialty. There are numerous potential clinical applications of AI in allergy and immunology that range from disease diagnosis to multidimensional data reduction in electronic health records or immunologic datasets. For appropriate application and interpretation of AI, specialists should be involved in the design, validation, and implementation of AI in allergy and immunology. Challenges include incorporation of data science and bioinformatics into training of future allergists-immunologists.
|
24
|
Methods to Improve Molecular Diagnosis in Genomic Cold Cases in Pediatric Neurology. Genes (Basel) 2022; 13:genes13020333. [PMID: 35205378 PMCID: PMC8871714 DOI: 10.3390/genes13020333] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/03/2022] [Revised: 02/06/2022] [Accepted: 02/07/2022] [Indexed: 02/04/2023] Open
Abstract
During the last decade, genetic testing has emerged as an important etiological diagnostic tool for Mendelian diseases, including pediatric neurological conditions. A genetic diagnosis has a considerable impact on disease management and treatment; however, many cases remain undiagnosed after applying standard diagnostic sequencing techniques. This review discusses various methods to improve the molecular diagnostic rates in these genomic cold cases. We discuss extended analysis methods to consider, non-Mendelian inheritance models, mosaicism, dual/multiple diagnoses, periodic re-analysis, artificial intelligence tools, and deep phenotyping, in addition to integrating various omics methods to improve variant prioritization. Last, novel genomic technologies, including long-read sequencing, artificial long-read sequencing, and optical genome mapping are discussed. In conclusion, a more comprehensive molecular analysis and a timely re-analysis of unsolved cases are imperative to improve diagnostic rates. In addition, our current understanding of the human genome is still limited due to restrictions in technologies. Novel technologies are now available that improve upon some of these limitations and can capture all human genomic variation more accurately. Last, we recommend a more routine implementation of high molecular weight DNA extraction methods that is coherent with the ability to use and/or optimally benefit from these novel genomic methods.
|
25
|
Li S, Deng L, Zhang X, Chen L, Yang T, Qi Y, Jiang T. Deep Phenotyping on Chinese Electronic Health Records by Recognizing Linguistic Patterns of Phenotypic Narratives with a Sequence Motif Discovery Tool: Algorithm Development and Validation (Preprint). J Med Internet Res 2022; 24:e37213. [PMID: 35657661 PMCID: PMC9206202 DOI: 10.2196/37213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2022] [Revised: 04/21/2022] [Accepted: 05/12/2022] [Indexed: 11/23/2022] Open
Abstract
Background Phenotype information in electronic health records (EHRs) is mainly recorded in unstructured free text, which cannot be directly used for clinical research. EHR-based deep-phenotyping methods can structure phenotype information in EHRs with high fidelity, making it the focus of medical informatics. However, developing a deep-phenotyping method for non-English EHRs (ie, Chinese EHRs) is challenging. Although numerous EHR resources exist in China, fine-grained annotation data that are suitable for developing deep-phenotyping methods are limited. It is challenging to develop a deep-phenotyping method for Chinese EHRs in such a low-resource scenario. Objective In this study, we aimed to develop a deep-phenotyping method with good generalization ability for Chinese EHRs based on limited fine-grained annotation data. Methods The core of the methodology was to identify linguistic patterns of phenotype descriptions in Chinese EHRs with a sequence motif discovery tool and perform deep phenotyping of Chinese EHRs by recognizing linguistic patterns in free text. Specifically, 1000 Chinese EHRs were manually annotated based on a fine-grained information model, PhenoSSU (Semantic Structured Unit of Phenotypes). The annotation data set was randomly divided into a training set (n=700, 70%) and a testing set (n=300, 30%). The process for mining linguistic patterns was divided into three steps. First, free text in the training set was encoded as single-letter sequences (P: phenotype, A: attribute). Second, a biological sequence analysis tool—MEME (Multiple Expectation Maximums for Motif Elicitation)—was used to identify motifs in the single-letter sequences. Finally, the identified motifs were reduced to a series of regular expressions representing linguistic patterns of PhenoSSU instances in Chinese EHRs. 
Based on the discovered linguistic patterns, we developed a deep-phenotyping method for Chinese EHRs, including a deep learning–based method for named entity recognition and a pattern recognition–based method for attribute prediction. Results In total, 51 sequence motifs with statistical significance were mined from 700 Chinese EHRs in the training set and were combined into six regular expressions. It was found that these six regular expressions could be learned from a mean of 134 (SD 9.7) annotated EHRs in the training set. The deep-phenotyping algorithm for Chinese EHRs could recognize PhenoSSU instances with an overall accuracy of 0.844 on the test set. For the subtask of entity recognition, the algorithm achieved an F1 score of 0.898 with the Bidirectional Encoder Representations from Transformers–bidirectional long short-term memory and conditional random field model; for the subtask of attribute prediction, the algorithm achieved a weighted accuracy of 0.940 with the linguistic pattern–based method. Conclusions We developed a simple but effective strategy to perform deep phenotyping of Chinese EHRs with limited fine-grained annotation data. Our work will promote the second use of Chinese EHRs and give inspiration to other non–English-speaking countries.
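The pattern-recognition step described in this abstract can be illustrated with a minimal sketch. It assumes that tokens have already been labeled as phenotype (P), attribute (A), or other (O), and it uses a single hypothetical regular expression, `A*PA*`, as a stand-in; the six expressions actually derived in the study are not given in the abstract, and the function names and example tokens below are illustrative assumptions only.

```python
import re

# Hypothetical stand-in pattern (NOT one of the study's six expressions):
# a phenotype token with any adjacent attribute tokens on either side.
PATTERN = re.compile(r"A*PA*")


def encode(tokens):
    """Encode labeled tokens as a single-letter sequence (P, A, or O)."""
    return "".join(label for _, label in tokens)


def extract_units(tokens, pattern=PATTERN):
    """Return (phenotype, attributes) pairs for each pattern match.

    `tokens` is a list of (text, label) pairs; the labels are assumed to come
    from an upstream named entity recognition step.
    """
    sequence = encode(tokens)
    units = []
    for match in pattern.finditer(sequence):
        span = tokens[match.start():match.end()]
        phenotype = next(text for text, label in span if label == "P")
        attributes = [text for text, label in span if label == "A"]
        units.append((phenotype, attributes))
    return units


# Illustrative, made-up example tokens.
example = [
    ("no", "A"),
    ("fever", "P"),
    (",", "O"),
    ("cough", "P"),
    ("for 3 days", "A"),
]
units = extract_units(example)
```

Because each token maps to exactly one letter, match offsets in the encoded sequence index directly back into the token list, which is what lets a motif found on the letter sequence be turned into a structured phenotype unit.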
Affiliation(s)
- Shicheng Li
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
| | - Lizong Deng
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
| | - Xu Zhang
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
| | - Luming Chen
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
- Guangzhou Laboratory, Guangzhou, China
| | - Tao Yang
- Guangzhou Laboratory, Guangzhou, China
- Guangzhou Medical University, Guangzhou, China
| | - Yifan Qi
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
| | - Taijiao Jiang
- Institute of Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, China
- Suzhou Institute of Systems Medicine, Suzhou, China
- Guangzhou Laboratory, Guangzhou, China
|
26
|
Tang AS, Oskotsky T, Havaldar S, Mantyh WG, Bicak M, Solsberg CW, Woldemariam S, Zeng B, Hu Z, Oskotsky B, Dubal D, Allen IE, Glicksberg BS, Sirota M. Deep phenotyping of Alzheimer's disease leveraging electronic medical records identifies sex-specific clinical associations. Nat Commun 2022; 13:675. [PMID: 35115528 PMCID: PMC8814236 DOI: 10.1038/s41467-022-28273-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/15/2021] [Accepted: 01/18/2022] [Indexed: 12/14/2022] Open
Abstract
Alzheimer's Disease (AD) is a neurodegenerative disorder that is still not fully understood. Sex modifies AD vulnerability, but the reasons for this are largely unknown. We utilize two independent electronic medical record (EMR) systems across 44,288 patients to perform deep clinical phenotyping and network analysis to gain insight into clinical characteristics and sex-specific clinical associations in AD. Embeddings and network representation of patient diagnoses demonstrate greater comorbidity interactions in AD in comparison to matched controls. Enrichment analysis identifies multiple known and new diagnostic, medication, and lab result associations across the whole cohort and in a sex-stratified analysis. With this data-driven method of phenotyping, we can represent AD complexity and generate hypotheses of clinical factors that can be followed-up for further diagnostic and predictive analyses, mechanistic understanding, or drug repurposing and therapeutic approaches.
Affiliation(s)
- Alice S Tang
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA.
- Graduate Program in Bioengineering, UCSF, San Francisco, CA, USA.
- School of Medicine, UCSF, San Francisco, CA, USA.
| | - Tomiko Oskotsky
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
- Department of Pediatrics, UCSF, San Francisco, CA, USA
| | - Shreyas Havaldar
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - William G Mantyh
- Department of Neurology, University of Minnesota School of Medicine, Minneapolis, MN, USA
| | - Mesude Bicak
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Caroline Warly Solsberg
- Pharmaceutical Sciences and Pharmacogenomics, UCSF, San Francisco, CA, USA
- Department of Neurology and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94158, USA
- Memory and Aging Center, UCSF, San Francisco, CA, USA
| | - Sarah Woldemariam
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
| | - Billy Zeng
- School of Medicine, UCSF, San Francisco, CA, USA
| | - Zicheng Hu
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
| | - Boris Oskotsky
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA
| | - Dena Dubal
- Department of Neurology and Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94158, USA
| | - Isabel E Allen
- Department of Epidemiology and Biostatistics, UCSF, San Francisco, CA, USA
| | - Benjamin S Glicksberg
- Hasso Plattner Institute for Digital Health at Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Marina Sirota
- Bakar Computational Health Sciences Institute, UCSF, San Francisco, CA, USA.
- Department of Pediatrics, UCSF, San Francisco, CA, USA.
|
27
|
Maintz L, Welchowski T, Herrmann N, Brauer J, Kläschen AS, Fimmers R, Schmid M, Bieber T, Schmid-Grendelmeier P, Traidl-Hoffmann C, Akdis C, Lauener R, Brüggen MC, Rhyner C, Bersuch E, Renner E, Reiger M, Dreher A, Hammel G, Luschkova D, Lang C. Machine Learning-Based Deep Phenotyping of Atopic Dermatitis: Severity-Associated Factors in Adolescent and Adult Patients. JAMA Dermatol 2021; 157:1414-1424. [PMID: 34757407 DOI: 10.1001/jamadermatol.2021.3668] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Indexed: 11/14/2022]
Abstract
Importance Atopic dermatitis (AD) is the most common chronic inflammatory skin disease and is driven by a complex pathophysiology underlying highly heterogeneous phenotypes. Current advances in precision medicine emphasize the need for stratification. Objective To perform deep phenotyping and identification of severity-associated factors in adolescent and adult patients with AD. Design, Setting, and Participants Cross-sectional data from the baseline visit of a prospective longitudinal study investigating the phenotype among inpatients and outpatients with AD from the Department of Dermatology and Allergy of the University Hospital Bonn enrolled between November 2016 and February 2020. Main Outcomes and Measures Patients were stratified by severity groups using the Eczema Area and Severity Index (EASI). The associations of 130 factors with AD severity were analyzed applying a machine learning-gradient boosting approach with cross-validation-based tuning as well as multinomial logistic regression. Results A total of 367 patients (157 male [42.8%]; mean [SD] age, 39 [17] years; 94% adults) were analyzed. Among the participants, 177 (48.2%) had mild disease (EASI ≤7), 120 (32.7%) had moderate disease (EASI >7 and ≤ 21), and 70 (19.1%) had severe disease (EASI >21). Atopic stigmata (cheilitis: odds ratio [OR], 8.10; 95% CI, 3.35-10.59; white dermographism: OR, 4.42; 95% CI, 1.68-11.64; Hertoghe sign: OR, 2.75; 95% CI, 1.27-5.93; nipple eczema: OR, 4.97; 95% CI, 1.56-15.78) was associated with increased probability of severe AD, while female sex was associated with reduced probability (OR, 0.30; 95% CI, 0.13-0.66). The probability of severe AD was associated with total serum immunoglobulin E levels greater than 1708 IU/mL and eosinophil values greater than 6.8%. Patients aged 12 to 21 years or older than 52 years had an elevated probability of severe AD; patients aged 22 to 51 years had an elevated probability of mild AD. 
Age at AD onset older than 12 years was associated with increased probability of severe AD up to a peak at 30 years; age at onset older than 33 years was associated with moderate to severe AD; and childhood onset was associated with mild AD (peak, 7 years). Lifestyle factors associated with severe AD were physical activity less than once per week and (former) smoking. Alopecia areata was associated with moderate (OR, 5.23; 95% CI, 1.53-17.88) and severe (OR, 4.67; 95% CI, 1.01-21.56) AD. Predictive performance of machine learning-gradient boosting vs multinomial logistic regression differed only slightly (mean multiclass area under the curve value: 0.71 [95% CI, 0.69-0.72] vs 0.68 [0.66-0.70], respectively). Conclusions and Relevance The associations found in this cross-sectional study among patients with AD might contribute to a deeper disease understanding, closer monitoring of predisposed patients, and personalized prevention and therapy.
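The severity grouping reported in this abstract (mild: EASI ≤7; moderate: EASI >7 and ≤21; severe: EASI >21) can be written as a small helper. This is a sketch for illustration only, not the study's code; the function name is hypothetical, and the 0 to 72 bounds check reflects the standard range of the EASI score rather than anything stated in the abstract.

```python
def easi_severity(score: float) -> str:
    """Map an EASI score to the severity group used in the abstract.

    Thresholds follow the abstract: <=7 mild, >7 and <=21 moderate, >21 severe.
    The 0-72 range check is an added assumption based on the EASI scale.
    """
    if not 0 <= score <= 72:
        raise ValueError(f"EASI score out of range: {score}")
    if score <= 7:
        return "mild"
    if score <= 21:
        return "moderate"
    return "severe"


# Boundary values fall into the lower group, matching the <= thresholds.
groups = [easi_severity(s) for s in (0, 7, 7.1, 21, 21.5, 72)]
```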
Affiliation(s)
- Laura Maintz
- Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Germany
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
| | - Thomas Welchowski
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
- Department of Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Venusberg-Campus 1, Germany
| | - Nadine Herrmann
- Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Germany
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
| | - Juliette Brauer
- Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Germany
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
| | - Anna Sophie Kläschen
- Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Germany
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
| | - Rolf Fimmers
- Department of Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Venusberg-Campus 1, Germany
| | - Matthias Schmid
- Department of Medical Biometry, Informatics and Epidemiology, University Hospital Bonn, Venusberg-Campus 1, Germany
| | - Thomas Bieber
- Department of Dermatology and Allergy, University Hospital Bonn, Venusberg-Campus 1, Germany
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
| | | | - Peter Schmid-Grendelmeier
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
- Allergy Unit, Department of Dermatology, University Hospital of Zürich, Zürich, Switzerland
| | - Claudia Traidl-Hoffmann
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
- Department of Environmental Medicine, Faculty of Medicine, University of Augsburg, Augsburg, Germany
- Institute of Environmental Medicine, Helmholtz Zentrum Muenchen, Augsburg, Germany
| | - Cezmi Akdis
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
- Swiss Institute of Allergy and Asthma Research (SIAF), Davos, Switzerland
| | - Roger Lauener
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
- Children's Hospital of Eastern Switzerland, St Gallen, Switzerland
| | - Marie-Charlotte Brüggen
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
- Allergy Unit, Department of Dermatology, University Hospital of Zürich, Zürich, Switzerland
- Faculty of Medicine, University of Zurich, Zürich, Switzerland
| | - Claudio Rhyner
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
| | - Eugen Bersuch
- Allergy Unit, Department of Dermatology, University Hospital of Zürich, Zürich, Switzerland
| | - Ellen Renner
- Translational Immunology in Environmental Medicine, School of Medicine, Technical University of Munich, Munich, Germany
- Hochgebirgsklinik Davos, Davos, Switzerland
| | - Matthias Reiger
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland.,Department of Environmental Medicine, Faculty of Medicine, University of Augsburg, Augsburg, Germany.,Institute of Environmental Medicine, Helmholtz Zentrum Muenchen, Augsburg, Germany
| | - Anita Dreher
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland
| | - Gertrud Hammel
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland.,Department of Environmental Medicine, Faculty of Medicine, University of Augsburg, Augsburg, Germany.,Institute of Environmental Medicine, Helmholtz Zentrum Muenchen, Augsburg, Germany
| | - Daria Luschkova
- Christine Kühne-Center for Allergy Research and Education Davos (CK-CARE), Davos, Switzerland.,Department of Environmental Medicine, Faculty of Medicine, University of Augsburg, Augsburg, Germany.,Institute of Environmental Medicine, Helmholtz Zentrum Muenchen, Augsburg, Germany
| | - Claudia Lang
- Allergy Unit, Department of Dermatology, University Hospital of Zürich, Zürich, Switzerland
| |
Collapse
|
28
|
Su C, Xu Z, Hoffman K, Goyal P, Safford MM, Lee J, Alvarez-Mulett S, Gomez-Escobar L, Price DR, Harrington JS, Torres LK, Martinez FJ, Campion TR, Wang F, Schenck EJ. Identifying organ dysfunction trajectory-based subphenotypes in critically ill patients with COVID-19. Sci Rep 2021; 11:15872. [PMID: 34354174 PMCID: PMC8342520 DOI: 10.1038/s41598-021-95431-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/05/2020] [Accepted: 07/14/2021] [Indexed: 12/13/2022] Open
Abstract
COVID-19-associated respiratory failure offers an unprecedented opportunity to evaluate the differential host response to a uniform pathogenic insult. Understanding whether there are distinct subphenotypes of severe COVID-19 may offer insight into its pathophysiology. The Sequential Organ Failure Assessment (SOFA) score is an objective and comprehensive measure of dysfunction severity across six organ systems, i.e., cardiovascular, central nervous system, coagulation, liver, renal, and respiratory. Our aim was to identify and characterize distinct subphenotypes of COVID-19 critical illness defined by the post-intubation trajectory of SOFA score. Intubated COVID-19 patients at two hospitals in New York City served as development and validation cohorts. Patients were grouped into mild, intermediate, and severe strata by their baseline post-intubation SOFA. Hierarchical agglomerative clustering was performed within each stratum to detect subphenotypes based on similarities amongst SOFA score trajectories evaluated by Dynamic Time Warping. Distinct worsening and recovering subphenotypes were identified within each stratum, which had distinct 7-day post-intubation SOFA progression trends. Patients in the worsening subphenotypes had a higher mortality than those in the recovering subphenotypes within each stratum (mild stratum, 29.7% vs. 10.3%, p = 0.033; intermediate stratum, 29.3% vs. 8.0%, p = 0.002; severe stratum, 53.7% vs. 22.2%, p < 0.001). Pathophysiologic biomarkers associated with progression were distinct at each stratum, including findings suggestive of inflammation in low baseline severity of illness versus hemophagocytic lymphohistiocytosis in higher baseline severity of illness. The findings suggest that there are clear worsening and recovering subphenotypes of COVID-19 respiratory failure after intubation, which are more predictive of outcomes than baseline severity of illness. Distinct progression biomarkers at differential baseline severity of illness suggest a heterogeneous pathobiology in the progression of COVID-19 respiratory failure.
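The clustering step named in this abstract (agglomerative clustering of score trajectories compared by Dynamic Time Warping) can be illustrated with a small sketch. This is not the authors' code: the abstract does not state the linkage criterion or the local cost function, so average linkage and absolute difference are assumptions here, and the trajectories are made-up numbers rather than study data. The function names `dtw_distance` and `agglomerative_clusters` are hypothetical.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Dynamic Time Warping distance with absolute-difference local cost (assumed)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def agglomerative_clusters(series, k):
    """Merge the two closest clusters until k remain (average linkage, assumed)."""
    dist = {
        (i, j): dtw_distance(series[i], series[j])
        for i, j in combinations(range(len(series)), 2)
    }
    clusters = [[i] for i in range(len(series))]

    def linkage(c1, c2):
        # Mean pairwise DTW distance between members of the two clusters.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(dist[p] for p in pairs) / len(pairs)

    while len(clusters) > k:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical 7-day SOFA trajectories (illustrative values only).
trajectories = [
    [6, 7, 8, 9, 10, 11, 12],  # worsening
    [6, 6, 8, 9, 11, 11, 13],  # worsening
    [8, 7, 6, 5, 4, 3, 2],     # recovering
    [8, 8, 6, 5, 3, 3, 2],     # recovering
]
groups = agglomerative_clusters(trajectories, k=2)
print(groups)
```

In this toy input the two rising trajectories and the two falling trajectories end up in separate groups, which mirrors the worsening and recovering split described in the abstract. A production analysis would more likely rely on an established clustering library than on this hand-rolled loop.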
Affiliation(s)
- Chang Su
  - Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA
- Zhenxing Xu
  - Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA
- Katherine Hoffman
  - Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA
- Parag Goyal
  - Division of General Internal Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Monika M Safford
  - Division of General Internal Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
  - New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
- Jerry Lee
  - Weill Cornell Medical College, Weill Cornell Medicine, New York, NY, USA
- Sergio Alvarez-Mulett
  - New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
  - Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Luis Gomez-Escobar
  - New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
  - Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- David R Price
  - New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
  - Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- John S Harrington
  - New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
  - Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Lisa K Torres
  - New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
  - Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Fernando J Martinez
  - New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
  - Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Thomas R Campion
  - Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA
- Fei Wang
  - Department of Population Health Sciences, Weill Cornell Medicine, 425 E 61 St., New York, NY, 10065, USA
- Edward J Schenck
  - New York-Presbyterian Hospital, Weill Cornell Medicine, 1300 York Ave., Box 96, New York, NY, 10065, USA
  - Division of Pulmonary and Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
29
Su C, Zhang Y, Flory JH, Weiner MG, Kaushal R, Schenck EJ, Wang F. Clinical subphenotypes in COVID-19: derivation, validation, prediction, temporal patterns, and interaction with social determinants of health. NPJ Digit Med 2021; 4:110. [PMID: 34262117 PMCID: PMC8280198 DOI: 10.1038/s41746-021-00481-w] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/09/2021] [Accepted: 06/21/2021] [Indexed: 02/08/2023] Open
Abstract
Coronavirus disease 2019 (COVID-19) is heterogeneous, and our understanding of the biological mechanisms of host response to the viral infection remains limited. Identification of meaningful clinical subphenotypes may benefit pathophysiological study, clinical practice, and clinical trials. Here, our aim was to derive and validate COVID-19 subphenotypes using machine learning and routinely collected clinical data, assess temporal patterns of these subphenotypes during the pandemic course, and examine their interaction with social determinants of health (SDoH). We retrospectively analyzed 14418 COVID-19 patients in five major medical centers in New York City (NYC), between March 1 and June 12, 2020. Using clustering analysis, 4 biologically distinct subphenotypes were derived in the development cohort (N = 8199). Importantly, the identified subphenotypes were highly predictive of clinical outcomes (especially 60-day mortality). Sensitivity analyses in the development cohort, and rederivation and prediction in the internal (N = 3519) and external (N = 3519) validation cohorts confirmed the reproducibility and usability of the subphenotypes. Further analyses showed varying subphenotype prevalence across the peak of the outbreak in NYC. We also found that SDoH specifically influenced mortality outcome in Subphenotype IV, which is associated with older age, worse clinical manifestation, and high comorbidity burden. Our findings may lead to a better understanding of how COVID-19 causes disease in different populations and potentially benefit clinical trial development. The temporal patterns and SDoH implications of the subphenotypes may add insights to health policy to reduce social disparity in the pandemic.
Affiliation(s)
- Chang Su
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Yongkang Zhang
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- James H Flory
  - Memorial Sloan-Kettering Cancer Center, New York, NY, USA
- Mark G Weiner
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Rainu Kaushal
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
  - New York-Presbyterian Hospital, Weill Cornell Medicine, New York, NY, USA
  - Department of Medicine, Weill Cornell Medical College, New York, NY, USA
- Edward J Schenck
  - New York-Presbyterian Hospital, Weill Cornell Medicine, New York, NY, USA
  - Division of Pulmonary & Critical Care Medicine, Joan and Sanford I. Weill Department of Medicine, Weill Cornell Medicine, New York, NY, USA
- Fei Wang
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
30
DeLozier S, Bland HT, McPheeters M, Wells Q, Farber-Eger E, Bejan CA, Fabbri D, Rosenbloom T, Roden D, Johnson KB, Wei WQ, Peterson J, Bastarache L. Phenotyping coronavirus disease 2019 during a global health pandemic: Lessons learned from the characterization of an early cohort. J Biomed Inform 2021; 117:103777. [PMID: 33838341 PMCID: PMC8026248 DOI: 10.1016/j.jbi.2021.103777] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/16/2020] [Revised: 02/09/2021] [Accepted: 04/03/2021] [Indexed: 01/08/2023]
Abstract
From the start of the coronavirus disease 2019 (COVID-19) pandemic, researchers have looked to electronic health record (EHR) data as a way to study possible risk factors and outcomes. To ensure the validity and accuracy of research using these data, investigators need to be confident that the phenotypes they construct are reliable and accurate, reflecting the healthcare settings from which they are ascertained. We developed a COVID-19 registry at a single academic medical center and used data from March 1 to June 5, 2020 to assess differences in population-level characteristics in pandemic and non-pandemic years respectively. Median EHR length, previously shown to impact phenotype performance in type 2 diabetes, was significantly shorter in the SARS-CoV-2 positive group relative to a 2019 influenza tested group (median 3.1 years vs 8.7; Wilcoxon rank sum P = 1.3e-52). Using three phenotyping methods of increasing complexity (billing codes alone and domain-specific algorithms provided by an EHR vendor and clinical experts), common medical comorbidities were abstracted from COVID-19 EHRs, defined by the presence of a positive laboratory test (positive predictive value 100%, recall 93%). After combining performance data across phenotyping methods, we observed significantly lower false negative rates for those records billed for a comprehensive care visit (p = 4e-11) and those with complete demographics data recorded (p = 7e-5). In an early COVID-19 cohort, we found that phenotyping performance of nine common comorbidities was influenced by median EHR length, consistent with previous studies, as well as by data density, which can be measured using portable metrics including CPT codes. Here we present those challenges and potential solutions to creating deeply phenotyped, acute COVID-19 cohorts.
Affiliation(s)
- Sarah DeLozier
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Harris T Bland
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Melissa McPheeters
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Quinn Wells
  - Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Pierce Avenue, 383 Preston Research Building, Nashville, TN 37232, USA
- Eric Farber-Eger
  - Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Pierce Avenue, 383 Preston Research Building, Nashville, TN 37232, USA
- Cosmin A Bejan
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Daniel Fabbri
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Trent Rosenbloom
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Dan Roden
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
  - Division of Cardiovascular Medicine, Vanderbilt University Medical Center, Pierce Avenue, 383 Preston Research Building, Nashville, TN 37232, USA
- Kevin B Johnson
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Wei-Qi Wei
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Josh Peterson
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
- Lisa Bastarache
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, West End Ave, Suite 1475, Nashville, TN 37203, USA
31
Park J, You SC, Jeong E, Weng C, Park D, Roh J, Lee DY, Cheong JY, Choi JW, Kang M, Park RW. A Framework (SOCRATex) for Hierarchical Annotation of Unstructured Electronic Health Records and Integration Into a Standardized Medical Database: Development and Usability Study. JMIR Med Inform 2021; 9:e23983. [PMID: 33783361 PMCID: PMC8044740 DOI: 10.2196/23983] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/31/2020] [Revised: 11/14/2020] [Accepted: 01/23/2021] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Although electronic health records (EHRs) have been widely used in secondary assessments, clinical documents are relatively less utilized owing to the lack of standardized clinical text frameworks across different institutions. OBJECTIVE This study aimed to develop a framework for processing unstructured clinical documents of EHRs and integration with standardized structured data. METHODS We developed a framework known as Staged Optimization of Curation, Regularization, and Annotation of clinical text (SOCRATex). SOCRATex has the following four aspects: (1) extracting clinical notes for the target population and preprocessing the data, (2) defining the annotation schema with a hierarchical structure, (3) performing document-level hierarchical annotation using the annotation schema, and (4) indexing annotations for a search engine system. To test the usability of the proposed framework, proof-of-concept studies were performed on EHRs. We defined three distinctive patient groups and extracted their clinical documents (ie, pathology reports, radiology reports, and admission notes). The documents were annotated and integrated into the Observational Medical Outcomes Partnership (OMOP)-common data model (CDM) database. The annotations were used for creating Cox proportional hazard models with different settings of clinical analyses to measure (1) all-cause mortality, (2) thyroid cancer recurrence, and (3) 30-day hospital readmission. RESULTS Overall, 1055 clinical documents of 953 patients were extracted and annotated using the defined annotation schemas. The generated annotations were indexed into an unstructured textual data repository. Using the annotations of pathology reports, we identified that node metastasis and lymphovascular tumor invasion were associated with all-cause mortality among colon and rectum cancer patients (both P=.02). 
The other analyses, which measured thyroid cancer recurrence using radiology reports and 30-day hospital readmission using admission notes in patients with depressive disorder, also showed results consistent with previous findings. CONCLUSIONS We propose a framework for hierarchical annotation of textual data and integration into a standardized OMOP-CDM medical database. The proof-of-concept studies demonstrated that our framework can effectively process and integrate diverse clinical documents with standardized structured data for clinical research.
Affiliation(s)
- Jimyung Park
  - Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Republic of Korea
- Seng Chan You
  - Department of Preventive Medicine and Public Health, Yonsei University College of Medicine, Seoul, Republic of Korea
- Eugene Jeong
  - Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, United States
- Chunhua Weng
  - Department of Biomedical Informatics, Columbia University, New York, NY, United States
- Dongsu Park
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Republic of Korea
- Jin Roh
  - Department of Pathology, Ajou University Hospital, Suwon, Republic of Korea
- Dong Yun Lee
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Republic of Korea
- Jae Youn Cheong
  - Department of Gastroenterology, Ajou University School of Medicine, Suwon, Republic of Korea
- Jin Wook Choi
  - Department of Radiology, Ajou University School of Medicine, Suwon, Republic of Korea
- Mira Kang
  - Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, Republic of Korea
- Rae Woong Park
  - Department of Biomedical Sciences, Ajou University Graduate School of Medicine, Suwon, Republic of Korea
  - Department of Biomedical Informatics, Ajou University School of Medicine, Suwon, Republic of Korea
32
Geva A, Liu M, Panickan VA, Avillach P, Cai T, Mandl KD. A high-throughput phenotyping algorithm is portable from adult to pediatric populations. J Am Med Inform Assoc 2021; 28:1265-1269. [PMID: 33594412 DOI: 10.1093/jamia/ocaa343] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/12/2020] [Revised: 11/27/2020] [Accepted: 12/28/2020] [Indexed: 11/14/2022] Open
Abstract
OBJECTIVE Multimodal automated phenotyping (MAP) is a scalable, high-throughput phenotyping method, developed using electronic health record (EHR) data from an adult population. We tested transportability of MAP to a pediatric population. MATERIALS AND METHODS Without additional feature engineering or supervised training, we applied MAP to a pediatric population enrolled in a biobank and evaluated performance against physician-reviewed medical records. We also compared performance of MAP at the pediatric institution and the original adult institution where MAP was developed, including for 6 phenotypes validated at both institutions against physician-reviewed medical records. RESULTS MAP performed equally well in the pediatric setting (average AUC 0.98) as it did at the general adult hospital system (average AUC 0.96). MAP's performance in the pediatric sample was similar across the 6 specific phenotypes also validated against gold-standard labels in the adult biobank. CONCLUSIONS MAP is highly transportable across diverse populations and has potential for wide-scale use.
Affiliation(s)
- Alon Geva
  - Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA
  - Division of Critical Care Medicine, Department of Anesthesiology, Critical Care, and Pain Medicine, Boston Children's Hospital, Boston, Massachusetts, USA
  - Department of Anaesthesia, Harvard Medical School, Boston, Massachusetts, USA
- Molei Liu
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, USA
- Vidul A Panickan
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Paul Avillach
  - Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
  - Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- Tianxi Cai
  - Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
- Kenneth D Mandl
  - Computational Health Informatics Program, Boston Children's Hospital, Boston, Massachusetts, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, USA
  - Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
33
Majnarić LT, Babič F, O’Sullivan S, Holzinger A. AI and Big Data in Healthcare: Towards a More Comprehensive Research Framework for Multimorbidity. J Clin Med 2021; 10:766. [PMID: 33672914 PMCID: PMC7918668 DOI: 10.3390/jcm10040766] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 12/28/2020] [Revised: 02/02/2021] [Accepted: 02/11/2021] [Indexed: 12/11/2022] Open
Abstract
Multimorbidity refers to the coexistence of two or more chronic diseases in one person. Therefore, patients with multimorbidity have multiple and special care needs. However, in practice it is difficult to meet these needs because the organizational processes of current healthcare systems tend to be tailored to a single disease. To improve clinical decision making and patient care in multimorbidity, a radical change in the problem-solving approach to medical research and treatment is needed. In addition to the traditional reductionist approach, we propose interactive research supported by artificial intelligence (AI) and advanced big data analytics. Such a research approach, when applied to data routinely collected in healthcare settings, provides an integrated platform for research tasks related to multimorbidity. This may include, for example, prediction, correlation, and classification problems based on multiple interaction factors. However, to realize the idea of this paradigm shift in multimorbidity research, the optimization, standardization, and most importantly, the integration of electronic health data into a common national and international research infrastructure is needed. Ultimately, there is a need for the integration and implementation of efficient AI approaches, particularly deep learning, into clinical routine directly within the workflows of the medical professionals.
Affiliation(s)
- Ljiljana Trtica Majnarić
  - Department of Internal Medicine, Family Medicine and the History of Medicine, Faculty of Medicine, University Josip Juraj Strossmayer, 31000 Osijek, Croatia
  - Department of Public Health, Faculty of Dental Medicine, University Josip Juraj Strossmayer, 31000 Osijek, Croatia
- František Babič
  - Department of Cybernetics and Artificial Intelligence, Faculty of Electrical Engineering and Informatics, Technical University of Košice, 066 01 Košice, Slovakia
  - Correspondence: Tel.: +421-55-602-4220
- Shane O’Sullivan
  - Department of Pathology, Faculdade de Medicina, Universidade de São Paulo, 05508-220 São Paulo, Brazil
- Andreas Holzinger
  - Institute for Medical Informatics, Statistics and Documentation, Medical University of Graz, 8036 Graz, Austria
34
Chu J, Chen J, Chen X, Dong W, Shi J, Huang Z. Knowledge-aware multi-center clinical dataset adaptation: Problem, method, and application. J Biomed Inform 2021; 115:103710. [PMID: 33581323 DOI: 10.1016/j.jbi.2021.103710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2020] [Revised: 02/05/2021] [Accepted: 02/06/2021] [Indexed: 11/30/2022]
Abstract
Adaptable utilization of clinical data collected from multiple centers, prompted by the need to overcome the shifts between the dataset distributions, and exploit these different datasets for potential clinical applications, has received significant attention in recent years. In this study, we propose a novel approach to this task by infusing an external knowledge graph (KG) into multi-center clinical data mining. Specifically, we propose an adversarial learning model to capture shared patient feature representations from multi-center heterogeneous clinical datasets, and employ an external KG to enrich the semantics of the patient sample by providing both clinical center-specific and center-general knowledge features, which are trained with a graph convolutional autoencoder. We evaluate the proposed model on a real clinical dataset extracted from the general cardiology wards of a Chinese hospital and a well-known public clinical dataset (MIMIC III, pertaining to ICU clinical settings) for the task of predicting acute kidney injury in patients with heart failure. The achieved experimental results demonstrate the efficacy of our proposed model.
Affiliation(s)
- Jiebin Chu
  - College of Biomedical Engineering and Instrument Science, Zhejiang University, China
- Jinbiao Chen
  - College of Biomedical Engineering and Instrument Science, Zhejiang University, China
- Xiaofang Chen
  - College of Biomedical Engineering and Instrument Science, Zhejiang University, China
- Wei Dong
  - Department of Cardiology, Chinese PLA General Hospital, China
- Jinlong Shi
  - Department of Medical Innovation Research, Medical Big Data Center, Chinese PLA General Hospital, China
- Zhengxing Huang
  - College of Biomedical Engineering and Instrument Science, Zhejiang University, China
35
Ma X, Imai T, Shinohara E, Kasai S, Kato K, Kagawa R, Ohe K. EHR2CCAS: A framework for mapping EHR to disease knowledge presenting causal chain of disorders - chronic kidney disease example. J Biomed Inform 2021; 115:103692. [PMID: 33548543 DOI: 10.1016/j.jbi.2021.103692] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/29/2020] [Revised: 01/26/2021] [Accepted: 01/27/2021] [Indexed: 10/22/2022]
Abstract
OBJECTIVE The goal of this work was to capture diseases in patients by comprehending the fine-grained medical conditions and disease progression manifested by transitions in medical conditions. We realize this by introducing our earlier work on a state-of-the-art knowledge presentation, which defines a disease as a causal chain of abnormal states (CCAS). Here, we propose a framework, EHR2CCAS, for constructing a system to map electronic health record (EHR) data to CCAS. MATERIALS AND METHODS EHR2CCAS is a framework consisting of modules that access heterogeneous EHR to estimate the presence of abnormal states in a CCAS for a patient in a given time window. EHR2CCAS applies expert-driven (rule-based) and data-driven (machine learning) methods to identify abnormal states from structured and unstructured EHR data. It features data-driven approaches for unlocking clinical texts and imputations based on the EHR temporal properties and the causal CCAS structure. This study presents the CCAS of chronic kidney disease as an example. A mapping system between the EHR from the University of Tokyo Hospital and CCAS of chronic kidney disease was constructed and evaluated against expert annotation. RESULTS The system achieved high prediction performance in identifying abnormal states that had strong agreement among annotators. Our handling of narrative varieties in texts and our imputation of the presence of an abnormal state markedly improved the prediction performance. EHR2CCAS presents patient data describing the temporal presence of abnormal states in CCAS, which is useful in individual disease progression management. Further analysis of the differentiation of transition among abnormal states outputted by EHR2CCAS can contribute to detecting disease subtypes. CONCLUSION This work represents the first step toward combining disease knowledge and EHR to extract abnormality related to a disease defined as fine-grained abnormal states and transitions among them. 
This can aid in disease progression management and deep phenotyping.
Affiliation(s)
- Xiaojun Ma
- Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
- Takeshi Imai
- Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan.
- Emiko Shinohara
- Department of Artificial Intelligence in Healthcare, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Satoshi Kasai
- Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Department of Healthcare Information Management, The University of Tokyo Hospital, Tokyo, Japan
- Kosuke Kato
- Department of Obstetrics and Gynecology, The University of Tokyo Hospital, Tokyo, Japan
- Rina Kagawa
- Department of Biomedical Informatics and Management, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan
- Kazuhiko Ohe
- Department of Biomedical Informatics, Graduate School of Medicine, The University of Tokyo, Tokyo, Japan; Department of Healthcare Information Management, The University of Tokyo Hospital, Tokyo, Japan
|
36
|
Davidson L, Boland MR. Towards deep phenotyping pregnancy: a systematic review on artificial intelligence and machine learning methods to improve pregnancy outcomes. Brief Bioinform 2021; 22:6065792. [PMID: 33406530 PMCID: PMC8424395 DOI: 10.1093/bib/bbaa369] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 05/22/2020] [Revised: 10/13/2020] [Accepted: 11/18/2020] [Indexed: 12/16/2022]
Abstract
Objective Development of novel informatics methods focused on improving pregnancy outcomes remains an active area of research. The purpose of this study is to systematically review the ways that artificial intelligence (AI) and machine learning (ML), including deep learning (DL), methodologies can inform patient care during pregnancy and improve outcomes. Materials and methods We searched English articles on EMBASE, PubMed and SCOPUS. Search terms included ML, AI, pregnancy and informatics. We included research articles and book chapters, excluding conference papers, editorials and notes. Results We identified 127 distinct studies from our queries that were relevant to our topic and included in the review. We found that supervised learning methods were more popular (n = 69) than unsupervised methods (n = 9). Popular methods included support vector machines (n = 30), artificial neural networks (n = 22), regression analysis (n = 17) and random forests (n = 16). Methods such as DL are beginning to gain traction (n = 13). Common areas within the pregnancy domain where AI and ML methods were used the most include prenatal care (e.g. fetal anomalies, placental functioning) (n = 73); perinatal care, birth and delivery (n = 20); and preterm birth (n = 13). Efforts to translate AI into clinical care include clinical decision support systems (n = 24) and mobile health applications (n = 9). Conclusions Overall, we found that ML and AI methods are being employed to optimize pregnancy outcomes, including modern DL methods (n = 13). Future research should focus on less-studied pregnancy domain areas, including postnatal and postpartum care (n = 2). Also, more work on clinical adoption of AI methods and the ethical implications of such adoption is needed.
Affiliation(s)
- Lena Davidson
- MS degree at College of St. Scholastica, Duluth, MN, USA
- Mary Regina Boland
- Department of Biostatistics, Epidemiology, and Informatics at the University of Pennsylvania
|
37
|
Abstract
PURPOSE OF REVIEW Healthcare has already been impacted by the fourth industrial revolution, exemplified by tip-of-the-spear technologies such as artificial intelligence and quantum computing. Yet there is much to be accomplished, as systems remain suboptimal and full interoperability of digital records has not been realized. Given the footprint of technology in healthcare, the field of clinical immunology will certainly see improvements related to these tools. RECENT FINDINGS Biomedical informatics spans the gamut of technology in biomedicine. Within this distinct field, advances are being made that allow for the engineering of systems to automate disease detection, create computable phenotypes and improve record portability. Within clinical immunology, technologies are emerging along these lines and are expected to continue to do so. SUMMARY This review highlights advancements in digital health, including learning health systems, electronic phenotyping, artificial intelligence and the use of registries. Technological advancements for improving the diagnosis and care of patients with primary immunodeficiency diseases are also highlighted.
|