1
Oyovwi MOS, Ohwin EP, Rotu RA, Olowe TG. Internet-Based Abnormal Chromosomal Diagnosis During Pregnancy Using a Noninvasive Innovative Approach to Detecting Chromosomal Abnormalities in the Fetus: Scoping Review. JMIR BIOINFORMATICS AND BIOTECHNOLOGY 2024; 5:e58439. [PMID: 39412876 PMCID: PMC11525087 DOI: 10.2196/58439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 06/13/2024] [Accepted: 08/18/2024] [Indexed: 10/18/2024]
Abstract
BACKGROUND Chromosomal abnormalities are genetic disorders caused by chromosome errors, leading to developmental delays, birth defects, and miscarriages. Currently, invasive procedures such as amniocentesis or chorionic villus sampling are mostly used, which carry a risk of miscarriage. This has led to the need for a noninvasive and innovative approach to detect and prevent chromosomal abnormalities during pregnancy. OBJECTIVE This review aims to describe and appraise the potential of internet-based abnormal chromosomal preventive measures as a noninvasive approach to detecting and preventing chromosomal abnormalities during pregnancy. METHODS A thorough review of existing literature and research on chromosomal abnormalities and noninvasive approaches to prenatal diagnosis and therapy was conducted. Electronic databases such as PubMed, Google Scholar, ScienceDirect, CENTRAL, CINAHL, Embase, OVID MEDLINE, OVID PsycINFO, Scopus, ACM, and IEEE Xplore were searched for relevant studies and articles published in the last 5 years. The keywords used included chromosomal abnormalities, prenatal diagnosis, noninvasive, internet-based, and diagnosis. RESULTS The review of literature revealed that internet-based abnormal chromosomal diagnosis is a potential noninvasive approach to detecting and preventing chromosomal abnormalities during pregnancy. This innovative approach involves the use of advanced technology, including high-resolution ultrasound, cell-free DNA testing, and bioinformatics, to analyze fetal DNA from maternal blood samples. It allows early detection of chromosomal abnormalities, enabling timely interventions and treatment to prevent adverse outcomes. Furthermore, with the advancement of technology, internet-based abnormal chromosomal diagnosis has emerged as a safe alternative with benefits including its cost-effectiveness, increased accessibility and convenience, potential for earlier detection and intervention, and ethical considerations.
CONCLUSIONS Internet-based abnormal chromosomal diagnosis has the potential to revolutionize prenatal care by offering a safe and noninvasive alternative to invasive procedures. It has the potential to improve the detection of chromosomal abnormalities, leading to better pregnancy outcomes and reduced risk of miscarriage. Further research and development in this field is needed to make this approach more accessible and affordable for pregnant women.
Affiliation(s)
- Ejiro Peggy Ohwin
- Department of Human Physiology, Faculty of Basic Medical Science, Delta State University, Abraka, Nigeria
- Temitope Gideon Olowe
- Department of Obstetrics & Gynaecology, University of Medical Sciences, Ondo, Nigeria
2
Martinson AK, Chin AT, Butte MJ, Rider NL. Artificial Intelligence and Machine Learning for Inborn Errors of Immunity: Current State and Future Promise. THE JOURNAL OF ALLERGY AND CLINICAL IMMUNOLOGY. IN PRACTICE 2024; 12:2695-2704. [PMID: 39127104 DOI: 10.1016/j.jaip.2024.08.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2024] [Revised: 07/10/2024] [Accepted: 08/01/2024] [Indexed: 08/12/2024]
Abstract
Artificial intelligence (AI) and machine learning (ML) research within medicine has exponentially increased over the last decade, with studies showcasing the potential of AI/ML algorithms to improve clinical practice and outcomes. Ongoing research and efforts to develop AI-based models have expanded to aid in the identification of inborn errors of immunity (IEI). The use of larger electronic health record data sets, coupled with advances in phenotyping precision and enhancements in ML techniques, has the potential to significantly improve the early recognition of IEI, thereby increasing access to equitable care. In this review, we provide a comprehensive examination of AI/ML for IEI, covering the spectrum from data preprocessing for AI/ML analysis to current applications within immunology, and address the challenges associated with implementing clinical decision support systems to refine the diagnosis and management of IEI.
Affiliation(s)
- Aaron T Chin
- Department of Pediatrics, Division of Immunology, Allergy and Rheumatology, University of California, Los Angeles, Los Angeles, Calif
- Manish J Butte
- Department of Pediatrics, Division of Immunology, Allergy and Rheumatology, University of California, Los Angeles, Los Angeles, Calif
- Nicholas L Rider
- Department of Health Systems & Implementation Science, Virginia Tech Carilion School of Medicine, Roanoke, Va; Department of Medicine, Division of Allergy-Immunology, Carilion Clinic, Roanoke, Va.
3
Gülşen M, Yalçın SS. Fostering Tomorrow: Uniting Artificial Intelligence and Social Pediatrics for Comprehensive Child Well-being. Turk Arch Pediatr 2024; 59:345-352. [PMID: 39110287 PMCID: PMC11332429 DOI: 10.5152/turkarchpediatr.2024.24076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 05/29/2024] [Indexed: 08/21/2024]
Abstract
This comprehensive review explores the integration of artificial intelligence (AI) in the field of social pediatrics, emphasizing its potential to revolutionize child healthcare. Social pediatrics, a specialized branch within the discipline, focuses on the significant influence of societal, environmental, and economic factors on children's health and development. This field adopts a holistic approach, integrating medical, psychological, and environmental considerations. This review aims to explore the potential of AI in revolutionizing child healthcare from a social pediatrics perspective. To achieve that, we explored AI applications in preventive care, growth monitoring, nutritional guidance, environmental risk factor prediction, and early detection of child abuse. The findings highlight AI's significant contributions in various areas of social pediatrics. Artificial intelligence's proficiency in handling large datasets is shown to enhance diagnostic processes, personalize treatments, and improve overall healthcare management. Notable advancements are observed in preventive care, growth monitoring, nutritional counseling, predicting environmental risks, and early child abuse detection. We find that integrating AI into social pediatric healthcare aims to enhance the effectiveness, accessibility, and equity of pediatric health services. This integration ensures high-quality care for every child, regardless of their social background. The study elucidates AI's multifaceted applications in social pediatrics, including natural language processing, machine learning algorithms for health outcome predictions, and AI-driven tools for health and environmental monitoring, collectively fostering a more efficient, informed, and responsive pediatric healthcare system.
Affiliation(s)
- Murat Gülşen
- Department of Autism, Special Mental Needs and Rare Diseases, Turkish Ministry of Health, Ankara, Türkiye
- Division of Social Pediatrics, Department of Pediatrics, Hacettepe University Faculty of Medicine, Ankara, Türkiye
- Sıddıka Songül Yalçın
- Division of Social Pediatrics, Department of Pediatrics, Hacettepe University Faculty of Medicine, Ankara, Türkiye
4
Shyr C, Hu Y, Bastarache L, Cheng A, Hamid R, Harris P, Xu H. Identifying and Extracting Rare Diseases and Their Phenotypes with Large Language Models. JOURNAL OF HEALTHCARE INFORMATICS RESEARCH 2024; 8:438-461. [PMID: 38681753 PMCID: PMC11052982 DOI: 10.1007/s41666-023-00155-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 10/24/2023] [Accepted: 11/13/2023] [Indexed: 05/01/2024]
Abstract
Purpose Phenotyping is critical for informing rare disease diagnosis and treatment, but disease phenotypes are often embedded in unstructured text. While natural language processing (NLP) can automate extraction, a major bottleneck is developing annotated corpora. Recently, prompt learning with large language models (LLMs) has been shown to lead to generalizable results without any (zero-shot) or few annotated samples (few-shot), but none have explored this for rare diseases. Our work is the first to study prompt learning for identifying and extracting rare disease phenotypes in the zero- and few-shot settings. Methods We compared the performance of prompt learning with ChatGPT and fine-tuning with BioClinicalBERT. We engineered novel prompts for ChatGPT to identify and extract rare diseases and their phenotypes (e.g., diseases, symptoms, and signs), established a benchmark for evaluating its performance, and conducted an in-depth error analysis. Results Overall, fine-tuning BioClinicalBERT resulted in higher performance (F1 of 0.689) than ChatGPT (F1 of 0.472 and 0.610 in the zero- and few-shot settings, respectively). However, ChatGPT achieved higher accuracy for rare diseases and signs in the one-shot setting (F1 of 0.778 and 0.725). Conversational, sentence-based prompts generally achieved higher accuracy than structured lists. Conclusion Prompt learning using ChatGPT has the potential to match or outperform fine-tuning BioClinicalBERT at extracting rare diseases and signs with just one annotated sample. Given its accessibility, ChatGPT could be leveraged to extract these entities without relying on a large, annotated corpus. While LLMs can support rare disease phenotyping, researchers should critically evaluate model outputs to ensure phenotyping accuracy.
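The F1 values quoted in this abstract combine precision and recall into one number. The following is a minimal sketch of how such a score can be computed for extracted entities. It assumes exact matching of (text, type) pairs, which may differ from the matching scheme the authors used; the function names and the toy annotations are hypothetical.

```python
def entity_counts(gold: set, predicted: set) -> tuple:
    """Count true positives, false positives, and false negatives
    under exact matching of (text, entity_type) pairs (an assumption)."""
    tp = len(gold & predicted)
    fp = len(predicted - gold)
    fn = len(gold - predicted)
    return tp, fp, fn


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # no correct extraction: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy annotations, not taken from the study.
gold = {("cystic fibrosis", "rare_disease"), ("cough", "symptom"), ("clubbing", "sign")}
predicted = {("cystic fibrosis", "rare_disease"), ("cough", "sign")}

tp, fp, fn = entity_counts(gold, predicted)
print(tp, fp, fn, round(f1_score(tp, fp, fn), 3))
```

In the toy example one of two predictions is correct (precision 0.5) and one of three gold entities is found (recall 1/3), so F1 is 0.4.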
Affiliation(s)
- Cathy Shyr
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Yan Hu
- School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX 77225 USA
- Lisa Bastarache
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Alex Cheng
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Rizwan Hamid
- Division of Medical Genetics and Genomic Medicine, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Paul Harris
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37203 USA
- Department of Biomedical Engineering, Vanderbilt University Medical Center, 2525 West End Avenue, Nashville, TN 37203 USA
- Hua Xu
- Section of Biomedical Informatics and Data Science, Yale School of Medicine, 100 College Street, New Haven, CT 06510 USA
5
Faviez C, Chen X, Garcelon N, Zaidan M, Billot K, Petzold F, Faour H, Douillet M, Rozet JM, Cormier-Daire V, Attié-Bitach T, Lyonnet S, Saunier S, Burgun A. Objectivizing issues in the diagnosis of complex rare diseases: lessons learned from testing existing diagnosis support systems on ciliopathies. BMC Med Inform Decis Mak 2024; 24:134. [PMID: 38789985 PMCID: PMC11127295 DOI: 10.1186/s12911-024-02538-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 05/17/2024] [Indexed: 05/26/2024] Open
Abstract
BACKGROUND There are approximately 8,000 different rare diseases that affect roughly 400 million people worldwide. Many of them suffer from delayed diagnosis. Ciliopathies are rare monogenic disorders characterized by a significant phenotypic and genetic heterogeneity that raises an important challenge for clinical diagnosis. Diagnosis support systems (DSS) applied to electronic health record (EHR) data may help identify undiagnosed patients, which is of paramount importance to improve patients' care. Our objective was to evaluate three online-accessible rare disease DSSs using phenotypes derived from EHRs for the diagnosis of ciliopathies. METHODS Two datasets of ciliopathy cases, either proven or suspected, and two datasets of controls were used to evaluate the DSSs. Patient phenotypes were automatically extracted from their EHRs and converted to Human Phenotype Ontology terms. We tested the ability of the DSSs to diagnose cases in contrast to controls based on Orphanet ontology. RESULTS A total of 79 cases and 38 controls were selected. Performances of the DSSs on ciliopathy real world data (best DSS with area under the ROC curve = 0.72) were not as good as published performances on the test set used in the DSS development phase. None of these systems obtained results which could be described as "expert-level". Patients with multisystemic symptoms were generally easier to diagnose than patients with isolated symptoms. Diseases easily confused with ciliopathy generally affected multiple organs and had overlapping phenotypes. Four challenges need to be considered to improve the performances: to make the DSSs interoperable with EHR systems, to validate the performances in real-life settings, to deal with data quality, and to leverage methods and resources for rare and complex diseases. 
CONCLUSION Our study provides insights into the complexities of diagnosing highly heterogeneous rare diseases and offers lessons derived from evaluating existing DSSs in real-world settings. These insights are not only beneficial for ciliopathy diagnosis but also hold relevance for the enhancement of DSSs for various complex rare disorders, by guiding the development of more clinically relevant rare disease DSSs that could support early diagnosis and ultimately make more patients eligible for treatment.
Affiliation(s)
- Carole Faviez
- Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France.
- HeKA, Inria Paris, Paris, F-75012, France.
- Universite Paris Cite, Paris, France.
- Xiaoyi Chen
- Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
- HeKA, Inria Paris, Paris, F-75012, France
- Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
- Nicolas Garcelon
- Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
- HeKA, Inria Paris, Paris, F-75012, France
- Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
- Mohamad Zaidan
- Service de Néphrologie, Dialyse et Transplantation, Hôpital Universitaire Bicêtre, Assistance Publique-Hôpitaux de Paris (AP-HP), Kremlin Bicêtre, F-94270, France
- Katy Billot
- Laboratory of Renal Hereditary Diseases, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Friederike Petzold
- Laboratory of Renal Hereditary Diseases, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Division of Nephrology, University of Leipzig Medical Center, Leipzig, Germany
- Hassan Faour
- Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
- Maxime Douillet
- Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
- Jean-Michel Rozet
- Laboratory of Genetics in Ophthalmology, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Valérie Cormier-Daire
- Reference Centre for Constitutional Bone Diseases, laboratory of Osteochondrodysplasia, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Service de médecine génomique des maladies rares, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
- Tania Attié-Bitach
- Service d'Histologie-Embryologie-Cytogénétique, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
- Stanislas Lyonnet
- Service de médecine génomique des maladies rares, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
- Laboratory of Embryology and Genetics of Congenital Malformations, INSERM UMR 1163, Imagine Institute, Paris Cité, Paris, F-75015, France
- Sophie Saunier
- Laboratory of Renal Hereditary Diseases, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Anita Burgun
- Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
- HeKA, Inria Paris, Paris, F-75012, France
- Department of Medical Informatics, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
6
Sullivan J, Benítez A, Roth J, Andrews JS, Shah D, Butcher E, Jones A, Cross JH. A systematic literature review on the global epidemiology of Dravet syndrome and Lennox-Gastaut syndrome: Prevalence, incidence, diagnosis, and mortality. Epilepsia 2024; 65:1240-1263. [PMID: 38252068 DOI: 10.1111/epi.17866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 12/14/2023] [Accepted: 12/14/2023] [Indexed: 01/23/2024]
Abstract
Dravet syndrome (DS) and Lennox-Gastaut syndrome (LGS) are rare developmental and epileptic encephalopathies associated with seizure and nonseizure symptoms. A comprehensive understanding of how many individuals are affected globally, the diagnostic journey they face, and the extent of mortality associated with these conditions is lacking. Here, we summarize and evaluate published data on the epidemiology of DS and LGS in terms of prevalence, incidence, diagnosis, genetic mutations, and mortality and sudden unexpected death in epilepsy (SUDEP) rates. The full study protocol is registered on PROSPERO (CRD42022316930). After screening 2172 deduplicated records, 91 unique records were included; 67 provided data on DS only, 17 provided data on LGS only, and seven provided data on both. Case definitions varied considerably across studies, particularly for LGS. Incidence and prevalence estimates per 100 000 individuals were generally higher for LGS than for DS (LGS: incidence proportion = 14.5-28, prevalence = 5.8-60.8; DS: incidence proportion = 2.2-6.5, prevalence = 1.2-6.5). Diagnostic delay was frequently reported for LGS, with a wider age range at diagnosis reported than for DS (DS, 1.6-9.2 years; LGS, 2-15 years). Genetic screening data were reported by 63 studies; all screened for SCN1A variants, and only one study specifically focused on individuals with LGS. Individuals with DS had a higher mortality estimate per 1000 person-years than individuals with LGS (DS, 15.84; LGS, 6.12) and a lower median age at death. SUDEP was the most frequently reported cause of death for individuals with DS. Only four studies reported mortality information for LGS, none of which included SUDEP. This systematic review highlights the paucity of epidemiological data available for DS and especially LGS, demonstrating the need for further research and adoption of standardized diagnostic criteria.
Affiliation(s)
- Joseph Sullivan
- Department of Neurology, University of California, San Francisco, San Francisco, California, USA
- Arturo Benítez
- Takeda Development Center Americas, Cambridge, Massachusetts, USA
- Jeannine Roth
- Takeda Pharmaceuticals International, Zurich, Switzerland
- J Scott Andrews
- Takeda Development Center Americas, Cambridge, Massachusetts, USA
- Drishti Shah
- Takeda Development Center Americas, Cambridge, Massachusetts, USA
- J Helen Cross
- University College London, National Institute for Health and Care Research Biomedical Research Centre, London, UK
7
Cohen AM, Kaner J, Miller R, Kopesky JW, Hersh W. Automatically pre-screening patients for the rare disease aromatic l-amino acid decarboxylase deficiency using knowledge engineering, natural language processing, and machine learning on a large EHR population. J Am Med Inform Assoc 2024; 31:692-704. [PMID: 38134953 PMCID: PMC10873832 DOI: 10.1093/jamia/ocad244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2023] [Revised: 11/28/2023] [Accepted: 12/01/2023] [Indexed: 12/24/2023] Open
Abstract
OBJECTIVES Electronic health record (EHR) data may facilitate the identification of rare diseases in patients, such as aromatic l-amino acid decarboxylase deficiency (AADCd), an autosomal recessive disease caused by pathogenic variants in the dopa decarboxylase gene. Deficiency of the AADC enzyme results in combined severe reductions in monoamine neurotransmitters: dopamine, serotonin, epinephrine, and norepinephrine. This leads to widespread neurological complications affecting motor, behavioral, and autonomic function. The goal of this study was to use EHR data to identify previously undiagnosed patients who may have AADCd without available training cases for the disease. MATERIALS AND METHODS A multiple symptom and related disease annotated dataset was created and used to train individual concept classifiers on annotated sentence data. A multistep algorithm was then used to combine concept predictions into a single patient rank value. RESULTS Using an 8000-patient dataset that the algorithms had not seen before ranking, the top and bottom 200 ranked patients were manually reviewed for clinical indications of performing an AADCd diagnostic screening test. The top-ranked patients were 22.5% positively assessed for diagnostic screening, with 0% for the bottom-ranked patients. This result is statistically significant at P < .0001. CONCLUSION This work validates the approach that large-scale rare-disease screening can be accomplished by combining predictions for relevant individual symptoms and related conditions which are much more common and for which training data is easier to create.
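The reported comparison (22.5% of the 200 top-ranked patients, i.e. 45, versus 0 of the 200 bottom-ranked patients) can be checked with a simple significance test. The sketch below uses a one-sided Fisher exact test; this is an assumption for illustration, since the abstract does not say which test the authors applied. The function name is hypothetical.

```python
from math import comb


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is hypergeometric: the number of positives
    falling in the first row when row and column totals are held fixed.
    """
    row1 = a + b            # size of the first group
    col1 = a + c            # total number of positives
    n = a + b + c + d       # total sample size
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


# Counts derived from the abstract: 45 of 200 top-ranked patients were
# positively assessed, versus 0 of 200 bottom-ranked patients.
p_value = fisher_one_sided(45, 155, 0, 200)
print(f"one-sided p = {p_value:.3g}")
```

Under this assumed test the p-value falls far below the .0001 threshold quoted in the abstract.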
Affiliation(s)
- Aaron M Cohen
- Department of Medical Informatics and Clinical Epidemiology, School of Medicine, Oregon Health & Science University, Portland, OR 97239, United States
- Jolie Kaner
- Department of Medical Informatics and Clinical Epidemiology, School of Medicine, Oregon Health & Science University, Portland, OR 97239, United States
- Ryan Miller
- PTC Therapeutics, South Plainfield, NJ 07080, United States
- William Hersh
- Department of Medical Informatics and Clinical Epidemiology, School of Medicine, Oregon Health & Science University, Portland, OR 97239, United States
8
Faviez C, Vincent M, Garcelon N, Boyer O, Knebelmann B, Heidet L, Saunier S, Chen X, Burgun A. Performance and clinical utility of a new supervised machine-learning pipeline in detecting rare ciliopathy patients based on deep phenotyping from electronic health records and semantic similarity. Orphanet J Rare Dis 2024; 19:55. [PMID: 38336713 PMCID: PMC10858490 DOI: 10.1186/s13023-024-03063-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 02/03/2024] [Indexed: 02/12/2024] Open
Abstract
BACKGROUND Rare diseases affect approximately 400 million people worldwide. Many of them suffer from delayed diagnosis. Among them, NPHP1-related renal ciliopathies need to be diagnosed as early as possible as potential treatments have been recently investigated with promising results. Our objective was to develop a supervised machine learning pipeline for the detection of NPHP1 ciliopathy patients from a large number of nephrology patients using electronic health records (EHRs). METHODS AND RESULTS We designed a pipeline combining a phenotyping module re-using unstructured EHR data, a semantic similarity module to address the phenotype dependence, a feature selection step to deal with high dimensionality, an undersampling step to address the class imbalance, and a classification step with multiple train-test split for the small number of rare cases. The pipeline was applied to thirty NPHP1 patients and 7231 controls and achieved good performances (sensitivity 86% with specificity 90%). A qualitative review of the EHRs of 40 misclassified controls showed that 25% had phenotypes belonging to the ciliopathy spectrum, which demonstrates the ability of our system to detect patients with similar conditions. CONCLUSIONS Our pipeline reached very encouraging performance scores for pre-diagnosing ciliopathy patients. The identified patients could then undergo genetic testing. The same data-driven approach can be adapted to other rare diseases facing underdiagnosis challenges.
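Two ingredients named in this abstract, undersampling for class imbalance and evaluation by sensitivity and specificity, can be sketched in a few lines. This is a simplified illustration and not the authors' pipeline: the 1:1 ratio, the fixed seed, the function names, and the placeholder patient identifiers are all assumptions.

```python
import random


def undersample(cases: list, controls: list, ratio: int = 1, seed: int = 0):
    """Randomly keep at most ratio * len(cases) controls (majority class)."""
    rng = random.Random(seed)
    n_keep = min(len(controls), ratio * len(cases))
    return cases, rng.sample(controls, n_keep)


def sensitivity_specificity(y_true: list, y_pred: list) -> tuple:
    """Return (sensitivity, specificity) for binary labels, 1 = case.

    Assumes both classes are present in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Placeholder identifiers matching the cohort sizes in the abstract.
cases = [f"case_{i}" for i in range(30)]
controls = [f"control_{i}" for i in range(7231)]

_, kept_controls = undersample(cases, controls)
print(len(cases), len(kept_controls))

# Hypothetical predictions for a tiny held-out set.
print(sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0]))
```

Undersampling is applied only to the training data in typical practice, so that sensitivity and specificity are still measured against the real case-to-control ratio.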
Affiliation(s)
- Carole Faviez
- Centre de Recherche des Cordeliers, Université Paris Cité, Sorbonne Université, INSERM UMR 1138, 75006, Paris, France.
- Inria, 75012, Paris, France.
- Marc Vincent
- Université Paris Cité, Imagine Institute, Data Science Platform, INSERM UMR 1163, 75015, Paris, France
- Nicolas Garcelon
- Centre de Recherche des Cordeliers, Université Paris Cité, Sorbonne Université, INSERM UMR 1138, 75006, Paris, France
- Inria, 75012, Paris, France
- Université Paris Cité, Imagine Institute, Data Science Platform, INSERM UMR 1163, 75015, Paris, France
- Olivia Boyer
- Department of Pediatric Nephrology, APHP-Centre, Reference Center for Inherited Renal Diseases (MARHEA), Imagine Institute, Hôpital Necker-Enfants Malades, Université Paris Cité, 75015, Paris, France
- Laboratory of Renal Hereditary Diseases, INSERM UMR 1163, Imagine Institute, Université Paris Cité, 75015, Paris, France
- Bertrand Knebelmann
- Nephrology and Transplantation Department, MARHEA, Hôpital Necker-Enfants Malades, AP-HP, Université Paris Cité, 75015, Paris, France
- Laurence Heidet
- Department of Pediatric Nephrology, APHP-Centre, Reference Center for Inherited Renal Diseases (MARHEA), Imagine Institute, Hôpital Necker-Enfants Malades, Université Paris Cité, 75015, Paris, France
- Sophie Saunier
- Laboratory of Renal Hereditary Diseases, INSERM UMR 1163, Imagine Institute, Université Paris Cité, 75015, Paris, France
- Xiaoyi Chen
- Centre de Recherche des Cordeliers, Université Paris Cité, Sorbonne Université, INSERM UMR 1138, 75006, Paris, France
- Inria, 75012, Paris, France
- Université Paris Cité, Imagine Institute, Data Science Platform, INSERM UMR 1163, 75015, Paris, France
- Anita Burgun
- Centre de Recherche des Cordeliers, Université Paris Cité, Sorbonne Université, INSERM UMR 1138, 75006, Paris, France
- Inria, 75012, Paris, France
- Département d'informatique Médicale, Hôpital Necker-Enfants Malades, AP-HP, 75015, Paris, France
9
Lo Barco T, Garcelon N, Neuraz A, Nabbout R. Natural history of rare diseases using natural language processing of narrative unstructured electronic health records: The example of Dravet syndrome. Epilepsia 2024; 65:350-361. [PMID: 38065926 DOI: 10.1111/epi.17855] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/04/2023] [Revised: 12/07/2023] [Accepted: 12/07/2023] [Indexed: 12/31/2023]
Abstract
OBJECTIVE The increasing implementation of electronic health records allows the use of advanced text-mining methods for establishing new patient phenotypes and stratification, and for revealing outcome correlations. In this study, we aimed to explore the electronic narrative clinical reports of a cohort of patients with Dravet syndrome (DS) longitudinally followed at our center, to identify the capacity of this methodology to retrace the natural history of DS during the early years. METHODS We used a document-based clinical data warehouse employing natural language processing to recognize the phenotype concepts in the narrative medical reports. We included patients with DS who have a medical report produced before the age of 2 years and a follow-up after the age of 3 years ("DS cohort," 56 individuals). We selected two control populations, a "general control cohort" (275 individuals) and a "neurological control cohort" (281 individuals), with similar characteristics in terms of gender, number of reports, and age at last report. To find concepts specifically associated with DS, we performed a phenome-wide association study using Cox regression, comparing the reports of the three cohorts. We then performed a qualitative analysis of the surviving concepts based on their median age at first appearance. RESULTS A total of 76 concepts were prevalent in the reports of children with DS. Concepts appearing during the first 2 years were mostly related to the epilepsy features at the onset of DS (convulsive and prolonged seizures triggered by fever, often requiring in-hospital care). Subsequently, concepts related to new types of seizures and to drug resistance appeared. A series of non-seizure-related concepts emerged after the age of 2-3 years, referring to the nonseizure comorbidities classically associated with DS.
SIGNIFICANCE The extraction of clinical terms from narrative reports of children with DS allows the known natural history of this rare disease in early childhood to be outlined. This original model of "longitudinal phenotyping" could be applied to other rare and very rare conditions whose natural history is poorly described.
Affiliation(s)
- Tommaso Lo Barco
- Department of Pediatric Neurology, Necker-Enfants Malades Hospital, Assistance Publique-Hôpitaux de Paris, Reference Center for Rare Epilepsies, Member of European Reference Network EpiCARE, Université Paris Cité, Paris, France
- Nicolas Garcelon
- Data Science Platform, Institut National de la Santé et de la Recherche Médicale Unité Mixte de Recherche 1163, Imagine Institute, Université Paris Cité, Paris, France
- Antoine Neuraz
- Data Science Platform, Institut National de la Santé et de la Recherche Médicale Unité Mixte de Recherche 1163, Imagine Institute, Université Paris Cité, Paris, France
- Rima Nabbout
- Department of Pediatric Neurology, Necker-Enfants Malades Hospital, Assistance Publique-Hôpitaux de Paris, Reference Center for Rare Epilepsies, Member of European Reference Network EpiCARE, Université Paris Cité, Paris, France
- Translational Research for Neurological Disorders, Institut National de la Santé et de la Recherche Médicale Unité Mixte de Recherche 1163, Imagine Institute, Université Paris Cité, Paris, France
10
Mora S, Turrisi R, Chiarella L, Consales A, Tassi L, Mai R, Nobili L, Barla A, Arnulfo G. NLP-based tools for localization of the epileptogenic zone in patients with drug-resistant focal epilepsy. Sci Rep 2024; 14:2349. [PMID: 38287042 PMCID: PMC10825198 DOI: 10.1038/s41598-024-51846-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 01/10/2024] [Indexed: 01/31/2024] Open
Abstract
Epilepsy surgery is an option for people with focal-onset drug-resistant (DR) seizures, but a delayed or incorrect diagnosis of epileptogenic zone (EZ) location limits its efficacy. Seizure semiological manifestations and their chronological appearance contain valuable information on the putative EZ location, but their interpretation relies on extensive experience. The aim of our work is to support the localization of the EZ in DR patients by automatically analyzing the semiological descriptions of seizures contained in video-EEG reports. Our sample is composed of 536 descriptions of seizures extracted from the electronic medical records of 122 patients. We devised numerical representations of anamnestic records and seizure descriptions, exploiting Natural Language Processing (NLP) techniques, and used them to feed Machine Learning (ML) models. We performed three binary classification tasks: localizing the EZ in the right or left hemisphere, in temporal or extra-temporal regions, and in frontal or posterior regions. Our computational pipeline reached performances above 70% in all tasks. These results show that NLP-based numerical representation combined with ML-based classification models may help in localizing the origin of seizures relying only on seizure-related semiological text data. Accurate early recognition of the EZ could enable more appropriate patient management and faster access to epilepsy surgery for potential candidates.
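As a rough illustration of the kind of pipeline this abstract describes (free-text seizure descriptions turned into numerical features, then fed to a binary classifier), the sketch below uses a bag-of-words representation and a multinomial naive Bayes model written with the Python standard library only. The abstract does not name the representation or the model the authors used, so both are assumptions here, and the four toy reports are invented for illustration and carry no clinical meaning.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (w.strip(".,;:()").lower() for w in text.split())
    return [t for t in tokens if t]

def train_naive_bayes(samples):
    """samples: list of (text, label) pairs. Returns word counts, document counts, vocabulary."""
    word_counts = defaultdict(Counter)
    doc_counts = Counter()
    for text, label in samples:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocab

def predict(model, text):
    """Return the label with the highest log-posterior (Laplace-smoothed)."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for w in tokenize(text):
            if w in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][w] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy data (hypothetical), labelled with the hemisphere of the EZ.
TOY_REPORTS = [
    ("clonic jerking of the right arm followed by head version to the right", "left_hemisphere"),
    ("right hand dystonic posturing with speech arrest", "left_hemisphere"),
    ("left arm tonic posturing and head turning to the left", "right_hemisphere"),
    ("left face clonic movements with preserved speech", "right_hemisphere"),
]

model = train_naive_bayes(TOY_REPORTS)
prediction = predict(model, "clonic movements of the left hand")
print(prediction)  # right_hemisphere
```

The same two functions could serve the other two tasks in the abstract (temporal vs. extra-temporal, frontal vs. posterior) by changing only the labels, which is one reason to treat each task as an independent binary problem.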
Affiliation(s)
- Sara Mora
- Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, 16145, Genoa, Italy.
| | - Rosanna Turrisi
- Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, 16145, Genoa, Italy
- MaLGa Machine Learning Genoa Center, University of Genoa, 16146, Genoa, Italy
| | - Lorenzo Chiarella
- Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Child and Maternal Health (DINOGMI), University of Genoa, 16132, Genoa, Italy
- Child Neuropsychiatry Unit, IRCCS Istituto Giannina Gaslini, Member of the European Reference Network EpiCARE, 16147, Genoa, Italy
| | - Alessandro Consales
- Division of Neurosurgery, IRCCS Istituto Giannina Gaslini, 16147, Genoa, Italy
| | - Laura Tassi
- "Claudio Munari" Epilepsy Surgery Center, Niguarda Hospital, 20162, Milan, Italy
| | - Roberto Mai
- "Claudio Munari" Epilepsy Surgery Center, Niguarda Hospital, 20162, Milan, Italy
| | - Lino Nobili
- Department of Neuroscience, Rehabilitation, Ophthalmology, Genetics, Child and Maternal Health (DINOGMI), University of Genoa, 16132, Genoa, Italy
- Child Neuropsychiatry Unit, IRCCS Istituto Giannina Gaslini, Member of the European Reference Network EpiCARE, 16147, Genoa, Italy
| | - Annalisa Barla
- Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, 16145, Genoa, Italy
- MaLGa Machine Learning Genoa Center, University of Genoa, 16146, Genoa, Italy
| | - Gabriele Arnulfo
- Department of Informatics, Bioengineering, Robotics and System Engineering (DIBRIS), University of Genoa, 16145, Genoa, Italy
- Neuroscience Center, Helsinki Institute of Life Science (HiLife), University of Helsinki, 00014, Helsinki, Finland
|
11
|
Bazoge A, Morin E, Daille B, Gourraud PA. Applying Natural Language Processing to Textual Data From Clinical Data Warehouses: Systematic Review. JMIR Med Inform 2023; 11:e42477. [PMID: 38100200 PMCID: PMC10757232 DOI: 10.2196/42477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Revised: 01/16/2023] [Accepted: 09/07/2023] [Indexed: 12/17/2023]
Abstract
BACKGROUND In recent years, health data collected during the clinical care process have often been repurposed for secondary use through clinical data warehouses (CDWs), which interconnect disparate data from different sources. A large amount of information of high clinical value is stored in unstructured text format. Natural language processing (NLP), which implements algorithms that can operate on massive unstructured textual data, has the potential to structure the data and make clinical information more accessible. OBJECTIVE The aim of this review was to provide an overview of studies applying NLP to textual data from CDWs. It focuses on identifying the (1) NLP tasks applied to data from CDWs and (2) NLP methods used to tackle these tasks. METHODS This review was performed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. We searched for relevant articles in 3 bibliographic databases: PubMed, Google Scholar, and ACL Anthology. We reviewed the titles and abstracts and included articles according to the following inclusion criteria: (1) focus on NLP applied to textual data from CDWs, (2) articles published between 1995 and 2021, and (3) written in English. RESULTS We identified 1353 articles, of which 194 (14.34%) met the inclusion criteria. Among all identified NLP tasks in the included papers, information extraction from clinical text (112/194, 57.7%) and the identification of patients (51/194, 26.3%) were the most frequent tasks. To address the various tasks, symbolic methods were the most common NLP methods (124/232, 53.4%), showing that some tasks can be partially achieved with classical NLP techniques, such as regular expressions or pattern matching that exploit specialized lexica, such as drug lists and terminologies. Machine learning (70/232, 30.2%) and deep learning (38/232, 16.4%) have been increasingly used in recent years, including the most recent approaches based on transformers. NLP methods were mostly applied to English language data (153/194, 78.9%). CONCLUSIONS CDWs are central to the secondary use of clinical texts for research purposes. Although the use of NLP on data from CDWs is growing, there remain challenges in this field, especially with regard to languages other than English. Clinical NLP is an effective strategy for accessing, extracting, and transforming data from CDWs. Information retrieved with NLP can assist in clinical research and have an impact on clinical practice.
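The "symbolic" approach mentioned in the results (regular expressions driven by a specialized lexicon such as a drug list) can be sketched in a few lines. This is a minimal example under stated assumptions: the three-entry lexicon, the dose pattern, and the function name `extract_drugs` are all hypothetical and are not taken from any of the reviewed studies.

```python
import re

# Hypothetical mini-lexicon; a real system would load a curated drug list or terminology.
DRUG_LEXICON = ["valproate", "clobazam", "stiripentol"]

# Lexicon terms as a word-bounded alternation, with an optional dose such as "500 mg".
DRUG_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, DRUG_LEXICON)) + r")\b"
    r"(?:\s+(\d+(?:\.\d+)?)\s*(mg|g)\b)?",
    re.IGNORECASE,
)

def extract_drugs(note):
    """Return (drug, dose, unit) tuples found in a free-text note; dose and unit may be None."""
    results = []
    for m in DRUG_PATTERN.finditer(note):
        dose = float(m.group(2)) if m.group(2) else None
        unit = m.group(3).lower() if m.group(3) else None
        results.append((m.group(1).lower(), dose, unit))
    return results

note = "Started Valproate 500 mg twice daily; clobazam continued."
mentions = extract_drugs(note)
print(mentions)  # [('valproate', 500.0, 'mg'), ('clobazam', None, None)]
```

A pattern like this is transparent and needs no training data, which fits the review's observation that some tasks can be partially achieved with classical techniques, but it only finds spellings that are present in the lexicon.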
Affiliation(s)
- Adrien Bazoge
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
- Nantes Université, CHU de Nantes, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, F-44000 Nantes, France
| | - Emmanuel Morin
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
| | - Béatrice Daille
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000 Nantes, France
| | - Pierre-Antoine Gourraud
- Nantes Université, CHU de Nantes, Pôle Hospitalo-Universitaire 11: Santé Publique, Clinique des données, INSERM, CIC 1413, F-44000 Nantes, France
- Nantes Université, INSERM, CHU de Nantes, École Centrale Nantes, Centre de Recherche Translationnelle en Transplantation et Immunologie, CR2TI, F-44000 Nantes, France
|
12
|
Macri CZ, Teoh SC, Bacchi S, Tan I, Casson R, Sun MT, Selva D, Chan W. A case study in applying artificial intelligence-based named entity recognition to develop an automated ophthalmic disease registry. Graefes Arch Clin Exp Ophthalmol 2023; 261:3335-3344. [PMID: 37535181 PMCID: PMC10587337 DOI: 10.1007/s00417-023-06190-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2022] [Revised: 06/23/2023] [Accepted: 07/23/2023] [Indexed: 08/04/2023]
Abstract
PURPOSE Advances in artificial intelligence (AI)-based named entity recognition (NER) have improved the ability to extract diagnostic entities from unstructured, narrative, free-text data in electronic health records. However, there is a lack of ready-to-use tools and workflows to encourage their use among clinicians, who often lack experience and training in AI. We sought to demonstrate a case study for developing an automated registry of ophthalmic diseases accompanied by a ready-to-use low-code tool for clinicians. METHODS We extracted deidentified electronic clinical records from a single centre's adult outpatient ophthalmology clinic from November 2019 to May 2022. We used a low-code annotation software tool (Prodigy) to annotate diagnoses and train a bespoke spaCy NER model to extract diagnoses and create an ophthalmic disease registry. RESULTS A total of 123,194 diagnostic entities were extracted from 33,455 clinical records. After decapitalisation and removal of non-alphanumeric characters, there were 5070 distinct extracted diagnostic entities. The NER model achieved a precision of 0.8157, recall of 0.8099, and F score of 0.8128. CONCLUSION We presented a case study using low-code artificial intelligence-based NLP tools to produce an automated ophthalmic disease registry. The workflow created a NER model with a moderate overall ability to extract diagnoses from free-text electronic clinical records. We have produced a ready-to-use tool for clinicians to implement this low-code workflow in their institutions and encourage the uptake of artificial intelligence methods for case finding in electronic health records.
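Two steps reported in this abstract lend themselves to a short sketch: the normalisation used to count distinct entities (decapitalisation and removal of non-alphanumeric characters) and the relation between the reported precision, recall, and F score. The code below is illustrative only; keeping spaces during normalisation is an assumption, the example entity strings are invented, and the F score is assumed to be the usual harmonic mean (F1).

```python
import re

def normalise_entity(text):
    """Lowercase and drop every character that is not a letter, digit, or space.

    Keeping spaces is an assumption; the abstract does not say how they were handled.
    """
    cleaned = re.sub(r"[^a-z0-9 ]", "", text.lower())
    return " ".join(cleaned.split())  # collapse repeated whitespace

def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)

# Invented example strings, standing in for raw NER output.
raw_entities = ["Cataract", "cataract.", "Glaucoma (POAG)", "glaucoma poag"]
distinct = sorted({normalise_entity(e) for e in raw_entities})
print(distinct)  # ['cataract', 'glaucoma poag']

# Precision and recall as reported in the abstract.
reported_f = round(f_score(0.8157, 0.8099), 4)
print(reported_f)  # 0.8128
```

The computed value agrees with the reported F score of 0.8128, which is consistent with the authors reporting F1.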
Affiliation(s)
- Carmelo Z Macri
- Discipline of Ophthalmology and Visual Sciences, The University of Adelaide, Adelaide, South Australia, Australia.
- Department of Ophthalmology, The Royal Adelaide Hospital, Adelaide, South Australia, Australia.
| | - Sheng Chieh Teoh
- Department of Ophthalmology, The Royal Adelaide Hospital, Adelaide, South Australia, Australia
| | - Stephen Bacchi
- Discipline of Ophthalmology and Visual Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Ophthalmology, The Royal Adelaide Hospital, Adelaide, South Australia, Australia
| | - Ian Tan
- Department of Ophthalmology, The Royal Adelaide Hospital, Adelaide, South Australia, Australia
| | - Robert Casson
- Discipline of Ophthalmology and Visual Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Ophthalmology, The Royal Adelaide Hospital, Adelaide, South Australia, Australia
| | - Michelle T Sun
- Discipline of Ophthalmology and Visual Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Ophthalmology, The Royal Adelaide Hospital, Adelaide, South Australia, Australia
| | - Dinesh Selva
- Discipline of Ophthalmology and Visual Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Ophthalmology, The Royal Adelaide Hospital, Adelaide, South Australia, Australia
| | - WengOnn Chan
- Discipline of Ophthalmology and Visual Sciences, The University of Adelaide, Adelaide, South Australia, Australia
- Department of Ophthalmology, The Royal Adelaide Hospital, Adelaide, South Australia, Australia
|
14
|
Domaradzki J, Walkowiak D. Emotional experiences of family caregivers of children with Dravet syndrome. Epilepsy Behav 2023; 142:109193. [PMID: 37028149 DOI: 10.1016/j.yebeh.2023.109193] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/08/2023] [Revised: 03/21/2023] [Accepted: 03/23/2023] [Indexed: 04/09/2023]
Abstract
BACKGROUND Since the psychosocial implications of Dravet syndrome (DS) are much more serious and far-reaching than in other types of epilepsy, caring for a DS child seriously affects the entire family. This study describes the emotional experiences of family caregivers of DS children and evaluates the way caregiving affects their perceived quality of life. METHODS An anonymous, self-administered online questionnaire was sent to family caregivers of DS children through the online patient advocacy organization the Association for People with Severe Refractory Epilepsy DRAVET.PL. It focussed on the psychosocial impact of caregiving for DS children, the perceived burden of caregiving, caregivers' emotional experiences and feelings related to caregiving, and the impact of DS on the perceived quality of life. RESULTS Caregivers stressed that caring for a DS child is associated with a significant psychosocial and emotional burden that affects the entire family. Although most caregivers reported that the child's health problems and behavioral and psychological disorders were the most challenging aspects of caregiving, they were also burdened by the lack of emotional support. As caregivers were profoundly engaged in caregiving, they experienced a variety of distressing emotions, including feelings of helplessness, anxiety and fear, anticipated grief, depression, and impulsivity. Many caregivers also reported that their children's disease disrupted their relationships with their spouses, family, and healthy children. As caregivers reported experiencing role overload, physical fatigue, and mental exhaustion, they stressed the extent to which caregiving for DS children impaired their quality of life and their social and professional life, and was a source of financial burden. CONCLUSIONS As this study identified specific burden domains affecting DS caregivers' well-being, family carers often need special attention, support, and help. To alleviate the humanistic burden of DS carers, a bio-psychosocial approach focusing on physical, mental, and psychosocial interventions should include both DS children and their caregivers.
Affiliation(s)
- Jan Domaradzki
- Department of Social Sciences and Humanities, Poznan University of Medical Sciences, Poznań, Poland.
| | - Dariusz Walkowiak
- Department of Organization and Management in Health Care, Poznan University of Medical Sciences, Poznań, Poland
|
15
|
Gombolay GY, Gopalan N, Bernasconi A, Nabbout R, Megerian JT, Siegel B, Hallman-Cooper J, Bhalla S, Gombolay MC. Review of Machine Learning and Artificial Intelligence (ML/AI) for the Pediatric Neurologist. Pediatr Neurol 2023; 141:42-51. [PMID: 36773406 PMCID: PMC10040433 DOI: 10.1016/j.pediatrneurol.2023.01.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/22/2022] [Revised: 01/03/2023] [Accepted: 01/09/2023] [Indexed: 01/15/2023]
Abstract
Artificial intelligence (AI) and a popular branch of AI known as machine learning (ML) are increasingly being utilized in medicine and to inform medical research. This review provides an overview of AI and ML (AI/ML), including definitions of common terms. We discuss the history of AI and provide instances of how AI/ML can be applied to pediatric neurology. Examples include imaging in neuro-oncology, autism diagnosis, diagnosis from charts, epilepsy, cerebral palsy, and neonatal neurology. Topics such as supervised learning, unsupervised learning, and reinforcement learning are discussed.
Affiliation(s)
- Grace Y Gombolay
- Division of Neurology, Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia; Division of Pediatric Neurology, Children's Healthcare of Atlanta, Atlanta, Georgia
| | - Nakul Gopalan
- Georgia Institute of Technology, Interactive Computing, Atlanta, Georgia
| | - Andrea Bernasconi
- Neuroimaging of Epilepsy Laboratory, McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, Montreal, Canada
| | - Rima Nabbout
- Department of Pediatric Neurology, Necker Enfants Malades Hospital, Reference Centre for Rare Epilepsies and Member of the ERN EpiCARE, Imagine Institute UMR1163, Paris Descartes University, Paris, France
| | - Jonathan T Megerian
- Department of Pediatrics, CHOC Children's, Irvine School of Medicine, University of California, Orange, California
| | - Benjamin Siegel
- Division of Neurology, Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia; Division of Pediatric Neurology, Children's Healthcare of Atlanta, Atlanta, Georgia
| | - Jamika Hallman-Cooper
- Division of Neurology, Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia; Division of Pediatric Neurology, Children's Healthcare of Atlanta, Atlanta, Georgia
| | - Sonam Bhalla
- Division of Neurology, Department of Pediatrics, Emory University School of Medicine, Atlanta, Georgia; Division of Pediatric Neurology, Children's Healthcare of Atlanta, Atlanta, Georgia
| | - Matthew C Gombolay
- Georgia Institute of Technology, Interactive Computing, Atlanta, Georgia
|
16
|
Yew ANJ, Schraagen M, Otte WM, van Diessen E. Transforming epilepsy research: A systematic review on natural language processing applications. Epilepsia 2023; 64:292-305. [PMID: 36462150 PMCID: PMC10108221 DOI: 10.1111/epi.17474] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 07/11/2022] [Revised: 11/23/2022] [Accepted: 12/01/2022] [Indexed: 12/05/2022]
Abstract
Despite improved ancillary investigations in epilepsy care, patients' narratives remain indispensable for diagnosis and treatment monitoring. This wealth of information is typically stored in electronic health records and accumulated in medical journals in an unstructured manner, thereby restricting complete utilization in clinical decision-making. To this end, clinical researchers increasingly apply natural language processing (NLP), a branch of artificial intelligence, as it removes ambiguity, derives context, and imbues standardized meaning from free-narrative clinical texts. This systematic review presents an overview of the current NLP applications in epilepsy and discusses the opportunities and drawbacks of NLP alongside its future implications. We searched the PubMed and Embase databases with a "natural language processing" and "epilepsy" query (March 4, 2022) and included original research articles describing the application of NLP techniques for textual analysis in epilepsy. Twenty-six studies were included. Fifty-eight percent of these studies used NLP to classify clinical records into predefined categories, improving patient identification and treatment decisions. Other applications of NLP included structured clinical information retrieval from electronic health records, scientific papers, and online posts of patients. Challenges and opportunities of NLP applications for enhancing epilepsy care and research are discussed. The field could further benefit from NLP by replicating successes in other health care domains, such as NLP-aided quality evaluation for clinical decision-making, outcome prediction, and clinical record summarization.
Affiliation(s)
- Arister N J Yew
- University College Utrecht, Utrecht University, Utrecht, The Netherlands
| | - Marijn Schraagen
- Department of Information and Computing Sciences, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Willem M Otte
- Department of Child Neurology, Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
| | - Eric van Diessen
- Department of Child Neurology, Brain Center, University Medical Center Utrecht and Utrecht University, Utrecht, The Netherlands
|
17
|
Friedlander L, Vincent M, Berdal A, Cormier-Daire V, Lyonnet S, Garcelon N. Consideration of oral health in rare disease expertise centres: a retrospective study on 39 rare diseases using text mining extraction method. Orphanet J Rare Dis 2022; 17:317. [PMID: 35987771 PMCID: PMC9392290 DOI: 10.1186/s13023-022-02467-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/04/2022] [Accepted: 08/13/2022] [Indexed: 11/13/2022]
Abstract
Background Around 8000 rare diseases are currently defined. In the context of individual vulnerability, and more specifically the vulnerability induced by rare diseases, ensuring oral health is a particularly important issue. The objective of the study is to evaluate the oral health care pathway of patients with any rare genetic disease. The description of oral phenotypic signs, which predict a theoretical dental health care pathway, and effective referral to oral health care were evaluated.
Materials and methods We set up a retrospective cohort study to describe the consideration given to oral health, and potential referral to an oral health care pathway, for patients seen at least once between 1 January 2017 and 1 January 2020 at Necker-Enfants Malades Hospital. We recruited patients for this study using the hospital's data warehouse, Dr Warehouse® (DrWH).
Results The study sample included 39 rare diseases and 2712 patients (54.7% girls and 45.3% boys). In the sample studied, 27.9% of patients had an acquisition delay or a pervasive developmental disorder. Among the patient files studied, oral and dental phenotypic signs were described for 18.40% of patients, and a referral to oral health care was made for 15.60% of patients. The overall "network" effect was significantly associated with the description of phenotypic signs (corrected p = 1.44e−77) and referral to oral health care (corrected p = 23.58e−44). Taking the Defiscience network (rare diseases of cerebral development and intellectual disability) as the reference for the odds ratio analysis, the OSCAR, TETECOU, FILNEMUS, FIMARAD, and MHEMO networks stand out from the other networks for their significantly higher consideration of oral phenotypic signs and referral to oral health care.
Conclusion To our knowledge, no study has explored the management of oral health in so many rare diseases. The expected benefits of this study include a better understanding and knowledge of oral care, or at least of the consideration given to oral care, in patients with rare diseases. Moreover, given the drive to improve knowledge of genetic diseases, oral health must have a major place in deep patient phenotyping. Therefore, interdisciplinary consultations with health professionals from different fields are crucial.
|
18
|
Chen X, Faviez C, Vincent M, Briseño-Roa L, Faour H, Annereau JP, Lyonnet S, Zaidan M, Saunier S, Garcelon N, Burgun A. Patient-Patient Similarity-Based Screening of a Clinical Data Warehouse to Support Ciliopathy Diagnosis. Front Pharmacol 2022; 13:786710. [PMID: 35401179 PMCID: PMC8993144 DOI: 10.3389/fphar.2022.786710] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/30/2021] [Accepted: 02/21/2022] [Indexed: 11/13/2022]
Abstract
A timely diagnosis is a key challenge for many rare diseases. As an expanding group of rare and severe monogenic disorders with a broad spectrum of clinical manifestations, ciliopathies, notably renal ciliopathies, suffer from substantial underdiagnosis. Our objective is to develop an approach for screening large-scale clinical data warehouses and detecting patients with clinical manifestations similar to those of diagnosed ciliopathy patients. We expect that the top-ranked similar patients will benefit from genetic testing for an early diagnosis. The dependence and relatedness between phenotypes were taken into account in our similarity model through medical concept embedding. The relevance of each phenotype to each patient was also considered by adjusted aggregation of phenotype similarity into patient similarity. A ranking model based on the best-subtype-average similarity was proposed to address the phenotypic overlapping and heterogeneity of ciliopathies. Our results showed that, using less than one-tenth of the learning sources, our language- and center-specific embedding provided comparable or better performance than other existing medical concept embeddings. Combined with the best-subtype-average ranking model, our patient-patient similarity-based screening approach was shown to be effective in two large-scale unbalanced datasets containing approximately 10,000 and 60,000 controls with kidney manifestations in the clinical data warehouse (prevalence of about 2% and 0.4%, respectively). Our approach will offer the opportunity to identify candidate patients who could go through genetic testing for ciliopathy. Earlier diagnosis, before irreversible end-stage kidney disease, will enable these patients to benefit from appropriate follow-up and novel treatments that could alleviate kidney dysfunction.
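One plausible reading of the "best-subtype-average" ranking named in this abstract is: for each candidate patient, average the similarity to the diagnosed cases of each subtype, then keep the best of those averages as the screening score. The sketch below implements that reading only. The abstract does not give the formula, so the scoring rule, the use of plain cosine similarity on patient vectors (instead of the authors' aggregated phenotype similarity from concept embeddings), and all vectors and identifiers are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_subtype_average(candidate, cases_by_subtype):
    """Mean similarity to the cases of each subtype, then the best (max) of those means."""
    return max(
        sum(cosine(candidate, case) for case in cases) / len(cases)
        for cases in cases_by_subtype.values()
    )

def rank_candidates(candidates, cases_by_subtype):
    """Return candidate ids sorted from most to least similar to any one subtype."""
    scored = {pid: best_subtype_average(vec, cases_by_subtype) for pid, vec in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Hypothetical 3-dimensional patient vectors; real ones would come from phenotype embeddings.
CASES = {
    "subtype_a": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "subtype_b": [[0.0, 1.0, 1.0]],
}
CANDIDATES = {"p1": [0.0, 0.9, 1.0], "p2": [0.5, 0.5, 0.0], "p3": [0.0, 0.0, 1.0]}

ranking = rank_candidates(CANDIDATES, CASES)
print(ranking)  # ['p1', 'p2', 'p3']
```

Taking the maximum over subtypes, instead of averaging over all diagnosed cases, keeps a candidate who closely resembles one subtype from being penalised for not resembling the others, which matches the heterogeneity concern raised in the abstract.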
Affiliation(s)
- Xiaoyi Chen
- Centre de Recherche des Cordeliers, INSERM, Sorbonne Université, Université de Paris, Paris, France; HeKA, Inria, Paris, France; Data Science Platform, Imagine Institute, Université de Paris, INSERM UMR 1163, Paris, France
| | - Carole Faviez
- Centre de Recherche des Cordeliers, INSERM, Sorbonne Université, Université de Paris, Paris, France; HeKA, Inria, Paris, France
| | - Marc Vincent
- Data Science Platform, Imagine Institute, Université de Paris, INSERM UMR 1163, Paris, France
| | | | - Hassan Faour
- Data Science Platform, Imagine Institute, Université de Paris, INSERM UMR 1163, Paris, France
| | | | | | - Mohamad Zaidan
- Service de Néphrologie, Hôpital Universitaire Bicêtre, Kremlin Bicêtre, France
| | - Sophie Saunier
- Laboratory of Renal Hereditary Diseases, Imagine Institute, Université de Paris, INSERM UMR 1163, Paris, France
| | - Nicolas Garcelon
- Centre de Recherche des Cordeliers, INSERM, Sorbonne Université, Université de Paris, Paris, France; HeKA, Inria, Paris, France; Data Science Platform, Imagine Institute, Université de Paris, INSERM UMR 1163, Paris, France
| | - Anita Burgun
- Centre de Recherche des Cordeliers, INSERM, Sorbonne Université, Université de Paris, Paris, France; HeKA, Inria, Paris, France; Department of Medical Informatics, Hôpital Necker-Enfants Malades, AP-HP, Paris, France
|
19
|
Dong T, Zhu M, Li R, Wang X. Challenges of Utilizing Medical Big Data in Reproductive Health Research. Front Reprod Health 2022; 4:800760. [PMID: 36303614 PMCID: PMC9580750 DOI: 10.3389/frph.2022.800760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2021] [Accepted: 02/04/2022] [Indexed: 11/28/2022]
Abstract
Against the background of the “Three-Child Policy” introduced by the Chinese government, reproductive health has become one of the most important public health issues. With the promotion of digitized management in medical care institutions for women and children in the country, there will be opportunities to acquire medical big data in obstetrics and pediatrics. Here the authors present their opinions on the challenges of managing and utilizing reproductive big data.
Affiliation(s)
- Tianyu Dong
- Tripod (Nanjing) Clinical Research Co., Ltd., Nanjing, China
| | - Min Zhu
- Department of Health IT Solution, Shanghai Synyi Medical Technology Co., Ltd., Shanghai, China
| | - Rui Li
- Department of Health IT Solution, Shanghai Synyi Medical Technology Co., Ltd., Shanghai, China
| | - Xu Wang
- Department of Endocrinology, Children's Hospital of Nanjing Medical University, Nanjing, China
- *Correspondence: Xu Wang
|
20
|
Crema C, Attardi G, Sartiano D, Redolfi A. Natural language processing in clinical neuroscience and psychiatry: A review. Front Psychiatry 2022; 13:946387. [PMID: 36186874 PMCID: PMC9515453 DOI: 10.3389/fpsyt.2022.946387] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/17/2022] [Accepted: 08/22/2022] [Indexed: 11/13/2022]
Abstract
Natural language processing (NLP) is rapidly becoming an important topic in the medical community. The ability to automatically analyze any type of medical document could be the key factor in fully exploiting the data it contains. Cutting-edge artificial intelligence (AI) architectures, particularly machine learning and deep learning, have begun to be applied to this topic and have yielded promising results. Our literature search identified 1,024 papers that used NLP technology in neuroscience and psychiatry from 2010 to early 2022. After a selection process, 115 papers were evaluated. Each publication was classified into one of three categories: information extraction, classification, and data inference. Automated understanding of clinical reports in electronic health records has the potential to improve healthcare delivery. Overall, the performance of NLP applications is high, with an average F1-score and AUC above 85%. We also derived a composite measure in the form of Z-scores to better compare the performance of NLP models and their different classes as a whole. No statistical differences were found in the unbiased comparison. Strong asymmetry between English and non-English models, difficulty in obtaining high-quality annotated data, and training biases causing low generalizability are the main limitations. This review suggests that NLP could be an effective tool to help clinicians gain insights from medical reports, clinical research forms, and more, and thereby improve the quality of healthcare services.
Affiliation(s)
- Claudio Crema
- Laboratory of Neuroinformatics, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
| | | | - Daniele Sartiano
- Istituto di Informatica e Telematica, Consiglio Nazionale delle Ricerche, Pisa, Italy
| | - Alberto Redolfi
- Laboratory of Neuroinformatics, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
|