1
Esteban-Medina M, de la Oliva Roque VM, Herráiz-Gil S, Peña-Chilet M, Dopazo J, Loucera C. drexml: A command line tool and Python package for drug repurposing. Comput Struct Biotechnol J 2024; 23:1129-1143. [PMID: 38510973 PMCID: PMC10950807 DOI: 10.1016/j.csbj.2024.02.027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 02/27/2024] [Accepted: 02/27/2024] [Indexed: 03/22/2024] Open
Abstract
We introduce drexml, a command line tool and Python package for rational data-driven drug repurposing. The package employs machine learning and mechanistic signal transduction modeling to identify drug targets capable of regulating a particular disease. In addition, it employs explainability tools to contextualize potential drug targets within the functional landscape of the disease. The methodology is validated in Fanconi Anemia and Familial Melanoma, two distinct rare diseases where there is a pressing need for solutions. In the Fanconi Anemia case, the model successfully predicts previously validated repurposed drugs, while in the Familial Melanoma case, it identifies a promising set of drugs for further investigation.
Affiliation(s)
- Marina Esteban-Medina
  - Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
  - Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocío, Seville, Spain
- Víctor Manuel de la Oliva Roque
  - Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
  - Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocío, Seville, Spain
- Sara Herráiz-Gil
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER-ISCIII), U714, Madrid, Spain
  - Departamento de Bioingeniería, Universidad Carlos III de Madrid (UC3M), Madrid, Spain
  - Regenerative Medicine and Tissue Engineering Group, Instituto de Investigación Sanitaria-Fundación Jiménez Díaz University Hospital (IIS-FJD), Madrid, Spain
  - Epithelial Biomedicine Division, Centro de Investigaciones Energéticas, Medioambientales y Tecnológicas (CIEMAT), Madrid, Spain
- María Peña-Chilet
  - Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
  - Platform of Big Data, AI and Biostatistics, Health Research Institute La Fe (IISLAFE), Valencia, Spain
- Joaquín Dopazo
  - Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
  - Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocío, Seville, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER-ISCIII), U715, Seville, Spain
  - FPS/ELIXIR-es, Hospital Virgen del Rocío, Seville, Spain
- Carlos Loucera
  - Platform for Computational Medicine, Andalusian Public Foundation Progress and Health-FPS, Seville, Spain
  - Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocío, Seville, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER-ISCIII), U715, Seville, Spain
2
Becerra-Muñoz VM, Gómez Sáenz JT, Escribano Subías P. [The importance of data in Pulmonary Arterial Hypertension: from international registries to Machine Learning]. Med Clin (Barc) 2024; 162:591-598. [PMID: 38383269 DOI: 10.1016/j.medcli.2023.12.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 12/03/2023] [Accepted: 12/05/2023] [Indexed: 02/23/2024]
Abstract
Real-world registries have been critical to building the scientific knowledge of rare diseases, including Pulmonary Arterial Hypertension (PAH). Over the past four decades, a considerable number of registries on this condition have made it possible to improve the definition of the disease and its subgroups, to advance the understanding of its pathophysiology, to develop prognostic scales, and to check the transferability of clinical trial results to clinical practice. However, at a time when a huge amount of data from multiple sources is available, registries do not always take these data into account. For that reason, Machine Learning (ML) offers a unique opportunity to manage all these data and, ultimately, to obtain tools that may help reach an earlier diagnosis, estimate prognosis, and advance Personalized Medicine. We therefore present a narrative review that aims, on the one hand, to summarize the aspects in which data extraction is important in rare diseases, focusing on the knowledge gained from PAH real-world registries, and, on the other hand, to describe some of the achievements and potential uses of ML techniques in PAH.
Affiliation(s)
- Víctor Manuel Becerra-Muñoz
  - Centro de Investigación Biomédica en Red de Enfermedades Cardiovasculares (CIBERCV), Madrid, Spain; Servicio de Cardiología, Instituto de Investigación Biomédica de Málaga (IBIMA), Málaga, Spain; Hospital Universitario Virgen de la Victoria, Universidad de Málaga (UMA), Málaga, Spain
- José Tomás Gómez Sáenz
  - Centro de Salud de Nájera, La Rioja, Spain; Sociedad Española de Médicos de Atención Primaria (SEMERGEN), Madrid, Spain
- Pilar Escribano Subías
  - Centro de Investigación Biomédica en Red de Enfermedades Cardiovasculares (CIBERCV), Madrid, Spain; Hospital Universitario 12 de Octubre, Madrid, Spain
3
Faviez C, Chen X, Garcelon N, Zaidan M, Billot K, Petzold F, Faour H, Douillet M, Rozet JM, Cormier-Daire V, Attié-Bitach T, Lyonnet S, Saunier S, Burgun A. Objectivizing issues in the diagnosis of complex rare diseases: lessons learned from testing existing diagnosis support systems on ciliopathies. BMC Med Inform Decis Mak 2024; 24:134. [PMID: 38789985 PMCID: PMC11127295 DOI: 10.1186/s12911-024-02538-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Accepted: 05/17/2024] [Indexed: 05/26/2024] Open
Abstract
BACKGROUND There are approximately 8,000 different rare diseases that affect roughly 400 million people worldwide. Many of them suffer from delayed diagnosis. Ciliopathies are rare monogenic disorders characterized by a significant phenotypic and genetic heterogeneity that raises an important challenge for clinical diagnosis. Diagnosis support systems (DSS) applied to electronic health record (EHR) data may help identify undiagnosed patients, which is of paramount importance to improve patients' care. Our objective was to evaluate three online-accessible rare disease DSSs using phenotypes derived from EHRs for the diagnosis of ciliopathies. METHODS Two datasets of ciliopathy cases, either proven or suspected, and two datasets of controls were used to evaluate the DSSs. Patient phenotypes were automatically extracted from their EHRs and converted to Human Phenotype Ontology terms. We tested the ability of the DSSs to diagnose cases in contrast to controls based on Orphanet ontology. RESULTS A total of 79 cases and 38 controls were selected. Performances of the DSSs on ciliopathy real world data (best DSS with area under the ROC curve = 0.72) were not as good as published performances on the test set used in the DSS development phase. None of these systems obtained results which could be described as "expert-level". Patients with multisystemic symptoms were generally easier to diagnose than patients with isolated symptoms. Diseases easily confused with ciliopathy generally affected multiple organs and had overlapping phenotypes. Four challenges need to be considered to improve the performances: to make the DSSs interoperable with EHR systems, to validate the performances in real-life settings, to deal with data quality, and to leverage methods and resources for rare and complex diseases. 
CONCLUSION Our study provides insights into the complexities of diagnosing highly heterogeneous rare diseases and offers lessons derived from evaluating existing DSSs in real-world settings. These insights are not only beneficial for ciliopathy diagnosis but also relevant to improving DSSs for various complex rare disorders, by guiding the development of more clinically relevant rare disease DSSs that could support early diagnosis and ultimately make more patients eligible for treatment.
Affiliation(s)
- Carole Faviez
  - Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
  - HeKA, Inria Paris, Paris, F-75012, France
  - Université Paris Cité, Paris, France
- Xiaoyi Chen
  - Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
  - HeKA, Inria Paris, Paris, F-75012, France
  - Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
- Nicolas Garcelon
  - Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
  - HeKA, Inria Paris, Paris, F-75012, France
  - Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
- Mohamad Zaidan
  - Service de Néphrologie, Dialyse et Transplantation, Hôpital Universitaire Bicêtre, Assistance Publique-Hôpitaux de Paris (AP-HP), Kremlin Bicêtre, F-94270, France
- Katy Billot
  - Laboratory of Renal Hereditary Diseases, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Friederike Petzold
  - Laboratory of Renal Hereditary Diseases, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
  - Division of Nephrology, University of Leipzig Medical Center, Leipzig, Germany
- Hassan Faour
  - Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
- Maxime Douillet
  - Data Science Platform, Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, F-75015, France
- Jean-Michel Rozet
  - Laboratory of Genetics in Ophthalmology, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Valérie Cormier-Daire
  - Reference Centre for Constitutional Bone Diseases, laboratory of Osteochondrodysplasia, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
  - Service de médecine génomique des maladies rares, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
- Tania Attié-Bitach
  - Service d'Histologie-Embryologie-Cytogénétique, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
- Stanislas Lyonnet
  - Service de médecine génomique des maladies rares, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
  - Laboratory of Embryology and Genetics of Congenital Malformations, INSERM UMR 1163, Imagine Institute, Paris Cité, Paris, F-75015, France
- Sophie Saunier
  - Laboratory of Renal Hereditary Diseases, Imagine Institute, INSERM UMR 1163, Université Paris Cité, Paris, F-75015, France
- Anita Burgun
  - Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université Paris Cité, Paris, F-75006, France
  - HeKA, Inria Paris, Paris, F-75012, France
  - Department of Medical Informatics, Hôpital Necker-Enfants Malades, AP-HP, Paris, F-75015, France
4
Rei L, Pita Costa J, Zdolšek Draksler T. Automatic Classification and Visualization of Text Data on Rare Diseases. J Pers Med 2024; 14:545. [PMID: 38793127 PMCID: PMC11121901 DOI: 10.3390/jpm14050545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 05/04/2024] [Accepted: 05/15/2024] [Indexed: 05/26/2024] Open
Abstract
More than 7000 rare diseases affect over 400 million people, posing significant challenges for medical research and healthcare. The integration of precision medicine with artificial intelligence offers promising solutions. This work introduces a classifier developed to discern whether research and news articles pertain to rare or non-rare diseases. Our methodology involves extracting 709 rare disease MeSH terms from Mondo and MeSH to improve rare disease categorization. We evaluate our classifier on abstracts from PubMed/MEDLINE and an expert-annotated news dataset, which includes news articles on four selected rare neurodevelopmental disorders (NDDs)-considered the largest category of rare diseases-from a total of 16 analyzed. We achieved F1 scores of 85% for abstracts and 71% for news articles, demonstrating robustness across both datasets and highlighting the potential of integrating artificial intelligence and ontologies to improve disease classification. Although the results are promising, they also indicate the need for further refinement in managing data heterogeneity. Our classifier improves the identification and categorization of medical information, essential for advancing research, enhancing information access, influencing policy, and supporting personalized treatments. Future work will focus on expanding disease classification to distinguish between attributes such as infectious and hereditary diseases, addressing data heterogeneity, and incorporating multilingual capabilities.
Affiliation(s)
- Luis Rei
  - Jožef Stefan Institute (IJS), Jamova 39, 1000 Ljubljana, Slovenia
  - Jožef Stefan Institute, Jožef Stefan International Postgraduate School, Jamova 39, 1000 Ljubljana, Slovenia
- Joao Pita Costa
  - International Research Centre for Artificial Intelligence under the Auspices of UNESCO (IRCAI), Jamova 39, 1000 Ljubljana, Slovenia
  - Quintelligence, Teslova 30, 1000 Ljubljana, Slovenia
- Tanja Zdolšek Draksler
  - Jožef Stefan Institute (IJS), Jamova 39, 1000 Ljubljana, Slovenia
  - International Research Centre for Artificial Intelligence under the Auspices of UNESCO (IRCAI), Jamova 39, 1000 Ljubljana, Slovenia
  - IDefine Europe, Jamova 39, 1000 Ljubljana, Slovenia
5
Raycheva R, Kostadinov K, Mitova E, Iskrov G, Stefanov G, Vakevainen M, Elomaa K, Man YS, Gross E, Zschüntzsch J, Röttger R, Stefanov R. Landscape analysis of available European data sources amenable for machine learning and recommendations on usability for rare diseases screening. Orphanet J Rare Dis 2024; 19:147. [PMID: 38582900 PMCID: PMC10998425 DOI: 10.1186/s13023-024-03162-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 03/30/2024] [Indexed: 04/08/2024] Open
Abstract
BACKGROUND Patient registries and databases are essential tools for advancing clinical research in the area of rare diseases, as well as for enhancing patient care and healthcare planning. The primary aim of this study is a landscape analysis of available European data sources amenable to machine learning (ML) and their usability for Rare Diseases screening, in terms of findable, accessible, interoperable, reusable (FAIR), legal, and business considerations. Second, recommendations will be proposed to provide a better understanding of the health data ecosystem. METHODS From March 2022 to December 2022, a cross-sectional study using a semi-structured questionnaire was conducted among potential respondents, identified as the main contact persons of health-related databases. The design of the self-completed questionnaire was based on information drawn from relevant scientific publications, quantitative and qualitative research, and a scoping review on challenges in mapping European rare disease (RD) databases. To determine database characteristics associated with adherence to the FAIR principles and with legal and business aspects of database management, Bayesian models were fitted. RESULTS In total, 330 unique replies were processed and analyzed, reflecting the same number of distinct databases (no duplicates included). In terms of geographical scope, we observed 24.2% (n = 80) national, 10.0% (n = 33) regional, 8.8% (n = 29) European, and 5.5% (n = 18) international registries coordinated in Europe. Over 80.0% (n = 269) of the databases were still active, with approximately 60.0% (n = 191) established after the year 2000 and 71.0% last collecting new data in 2022. Regarding their geographical scope, European registries were associated with the highest overall FAIR adherence, while registries with regional and "other" geographical scope were ranked at the bottom of the list with the lowest proportion.
Responders' willingness to share data as a contribution to the goals of the Screen4Care project was evaluated at the end of the survey. This question was completed by 108 respondents; however, only 18 of them (16.7%) expressed a direct willingness to contribute to the project by sharing their databases. Among them, an equal split between pro-bono and paid services was observed. CONCLUSIONS The most important results of our study demonstrate insufficient adherence to the FAIR principles and a low willingness of EU health databases to share patient information, combined with some legislative shortcomings, resulting in barriers to the secondary use of data.
Affiliation(s)
- Ralitsa Raycheva
  - Department of Social Medicine and Public Health, Faculty of Public Health, Medical University of Plovdiv, Plovdiv, Bulgaria
  - Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Kostadin Kostadinov
  - Department of Social Medicine and Public Health, Faculty of Public Health, Medical University of Plovdiv, Plovdiv, Bulgaria
  - Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Elena Mitova
  - Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Georgi Iskrov
  - Department of Social Medicine and Public Health, Faculty of Public Health, Medical University of Plovdiv, Plovdiv, Bulgaria
  - Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Georgi Stefanov
  - Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
- Merja Vakevainen
  - Pfizer Biopharmaceuticals Group, Medical Affairs, Helsinki, Finland
- Yuen-Sum Man
  - Global Medical Affairs Rare Disease, Novo Nordisk Health Care AG, Zurich, Switzerland
- Edith Gross
  - EURORDIS - Rare Diseases Europe, 96 Rue Didot, Paris, 75014, France
- Jana Zschüntzsch
  - Department of Neurology, University Medical Center, Göttingen, Germany
- Richard Röttger
  - Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Rumen Stefanov
  - Department of Social Medicine and Public Health, Faculty of Public Health, Medical University of Plovdiv, Plovdiv, Bulgaria
  - Bulgarian Association for Promotion of Education and Science, Institute for Rare Disease, Plovdiv, Bulgaria
6
Ceskoutsé RFT, Bomgni AB, Gnimpieba Zanfack DR, Agany DDM, Bouetou Bouetou T, Gnimpieba Zohim E. Sub-clustering based recommendation system for stroke patient: Identification of a specific drug class for a given patient. Comput Biol Med 2024; 171:108117. [PMID: 38335820 PMCID: PMC10981530 DOI: 10.1016/j.compbiomed.2024.108117] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/23/2023] [Revised: 01/29/2024] [Accepted: 02/04/2024] [Indexed: 02/12/2024]
Abstract
Stroke is one of the leading causes of death worldwide. Previous studies have explored machine learning techniques for early detection of stroke patients using content-based recommendation systems. However, these models often struggle with timely detection of medications, which can be critical for patient management and decision-making regarding the prescription of new drugs. In this study, we developed a content-based recommendation model using three machine learning algorithms: Gaussian Mixture Model (GMM), Affinity Propagation (AP), and K-Nearest Neighbors (KNN), to aid Healthcare Professionals (HCP) in quickly detecting medications based on the symptoms of a patient with stroke. Our model focused on three classes of drugs: antihypertensive, anticoagulant, and fibrate. Each machine learning algorithm was used to accomplish specific tasks, thereby reducing the partial search space and computational cost and detecting a primary drug class without loss of precision or accuracy. Our proposed model, called CRGANNC (Clustering Recommendation Gaussian Affinity Nearest Neighbors Classifier), effectively addresses the sparsity and scalability issues faced by content-based recommendation models. The CRGANNC model dynamically partitions clusters into a variable number of sub-clusters depending on the group, and can diagnose healthy, sick, and at-risk patients and recommend drugs to the HCP. In addition to our analysis, we developed a semi-artificial dataset with new features such as weakness, dizziness, headache, nausea, and vomiting, using a pipeline. This dataset serves as a valuable resource for researchers in the sensitive domain of stroke, providing a starting point for building and testing models when real data is often restricted. Our work not only contributes to the development of predictive models for stroke but also establishes a framework for creating similar datasets in other sensitive domains, accelerating research efforts and improving patient care.
Our experiments were conducted on our dataset consisting of 9691 patient records, with 1206 records for stroke attacks and 8485 healthy patients. The CRGANNC model achieved an average precision of 0.98, recall of 0.95, and F1-score of 0.96 across all three drug classes. Furthermore, our model demonstrated a significant improvement in computational efficiency compared to existing content-based recommendation models, reducing the processing time by 25.80%. These results indicate the effectiveness of our model in accurately detecting medications for stroke patients based on their symptoms.
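As a reading aid for the metrics quoted in this abstract, the sketch below shows how an F1-score relates to precision and recall. It is not the authors' evaluation code: the function name `f1_score` is a hypothetical helper, and because the abstract reports values averaged over three drug classes, the harmonic mean of the averaged precision and recall is only expected to land near the reported average F1, not to reproduce it exactly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Averaged values quoted in the abstract; per-class values are not given there.
value = f1_score(0.98, 0.95)
print(round(value, 3))  # prints 0.965
```

The result is in line with the reported F1-score of about 0.96, within the rounding and averaging caveats noted above.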
Affiliation(s)
- Ribot Fleury T Ceskoutsé
  - Ecole Nationale Supérieure Polytechnique, University of Yaounde I, P.O. Box. 8390, Yaoundé, Cameroon
- Alain Bertrand Bomgni
  - University of South Dakota, 4800 N Career Avenue, 57107, SD, USA; Department of Mathematics and Computer Science, University of Dschang, P.O. Box. 67, Dschang, Cameroon
- David R Gnimpieba Zanfack
  - Laboratory of Innovative Technologies (LTI), University of Picardie Jules Verne (UPJV), 48 Rue Raspail, 02100 Saint Quentin, France
- Diing D M Agany
  - University of South Dakota, 4800 N Career Avenue, 57107, SD, USA
- Thomas Bouetou Bouetou
  - Ecole Nationale Supérieure Polytechnique, University of Yaounde I, P.O. Box. 8390, Yaoundé, Cameroon
7
Abdullahi T, Singh R, Eickhoff C. Learning to Make Rare and Complex Diagnoses With Generative AI Assistance: Qualitative Study of Popular Large Language Models. JMIR MEDICAL EDUCATION 2024; 10:e51391. [PMID: 38349725 PMCID: PMC10900078 DOI: 10.2196/51391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Revised: 11/07/2023] [Accepted: 12/11/2023] [Indexed: 02/15/2024]
Abstract
BACKGROUND Patients with rare and complex diseases often experience delayed diagnoses and misdiagnoses because comprehensive knowledge about these diseases is limited to only a few medical experts. In this context, large language models (LLMs) have emerged as powerful knowledge aggregation tools with applications in clinical decision support and education domains. OBJECTIVE This study aims to explore the potential of 3 popular LLMs, namely Bard (Google LLC), ChatGPT-3.5 (OpenAI), and GPT-4 (OpenAI), in medical education to enhance the diagnosis of rare and complex diseases while investigating the impact of prompt engineering on their performance. METHODS We conducted experiments on publicly available complex and rare cases to achieve these objectives. We implemented various prompt strategies to evaluate the performance of these models using both open-ended and multiple-choice prompts. In addition, we used a majority voting strategy to leverage diverse reasoning paths within language models, aiming to enhance their reliability. Furthermore, we compared their performance with the performance of human respondents and MedAlpaca, a generative LLM specifically designed for medical tasks. RESULTS Notably, all LLMs outperformed the average human consensus and MedAlpaca, with a minimum margin of 5% and 13%, respectively, across all 30 cases from the diagnostic case challenge collection. On the frequently misdiagnosed cases category, Bard tied with MedAlpaca but surpassed the human average consensus by 14%, whereas GPT-4 and ChatGPT-3.5 outperformed MedAlpaca and the human respondents on the moderately often misdiagnosed cases category with minimum accuracy scores of 28% and 11%, respectively. The majority voting strategy, particularly with GPT-4, demonstrated the highest overall score across all cases from the diagnostic complex case collection, surpassing that of other LLMs. 
On the Medical Information Mart for Intensive Care-III data sets, Bard and GPT-4 achieved the highest diagnostic accuracy scores, with multiple-choice prompts scoring 93%, whereas ChatGPT-3.5 and MedAlpaca scored 73% and 47%, respectively. Furthermore, our results demonstrate that there is no one-size-fits-all prompting approach for improving the performance of LLMs and that a single strategy does not universally apply to all LLMs. CONCLUSIONS Our findings shed light on the diagnostic capabilities of LLMs and the challenges associated with identifying an optimal prompting strategy that aligns with each language model's characteristics and specific task requirements. The significance of prompt engineering is highlighted, providing valuable insights for researchers and practitioners who use these language models for medical training. Furthermore, this study represents a crucial step toward understanding how LLMs can enhance diagnostic reasoning in rare and complex medical cases, paving the way for developing effective educational tools and accurate diagnostic aids to improve patient care and outcomes.
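The majority voting strategy mentioned in this abstract can be illustrated with a minimal sketch. It assumes the model has already been sampled several times for the same case and that answers are compared after simple whitespace and case normalization. The function name `majority_vote` and the example diagnoses are hypothetical, and the study's actual prompting and aggregation pipeline is not described here.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent answer among several sampled responses.

    Hypothetical helper: answers are normalized by stripping whitespace
    and lowercasing before counting.
    """
    if not answers:
        raise ValueError("need at least one answer")
    normalized = [a.strip().lower() for a in answers]
    # most_common sorts by count; equal counts keep first-seen order.
    winner, _count = Counter(normalized).most_common(1)[0]
    return winner


# Made-up responses from five independent samples of the same prompt.
sampled = ["Sarcoidosis", "sarcoidosis ", "Tuberculosis", "Lymphoma", "Sarcoidosis"]
print(majority_vote(sampled))  # prints sarcoidosis
```

Real free-text answers would need stronger normalization (synonyms, abbreviations) before counting; exact string matching is the simplifying assumption in this sketch.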
Affiliation(s)
- Tassallah Abdullahi
  - Department of Computer Science, Brown University, Providence, RI, United States
- Ritambhara Singh
  - Department of Computer Science, Brown University, Providence, RI, United States
  - Center for Computational Molecular Biology, Brown University, Providence, RI, United States
8
Waters MR, Inkman M, Jayachandran K, Kowalchuk RM, Robinson C, Schwarz JK, Swamidass SJ, Griffith OL, Szymanski JJ, Zhang J. GAiN: An integrative tool utilizing generative adversarial neural networks for augmented gene expression analysis. PATTERNS (NEW YORK, N.Y.) 2024; 5:100910. [PMID: 38370125 PMCID: PMC10873154 DOI: 10.1016/j.patter.2023.100910] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 10/23/2023] [Accepted: 12/07/2023] [Indexed: 02/20/2024]
Abstract
Big genomic data and artificial intelligence (AI) are ushering in an era of precision medicine, providing opportunities to study previously under-represented subtypes and rare diseases rather than categorize them as variances. However, clinical researchers face challenges in accessing such novel technologies as well as reliable methods to study small datasets or subcohorts with unique phenotypes. To address this need, we developed an integrative approach, GAiN, to capture patterns of gene expression from small datasets on the basis of an ensemble of generative adversarial networks (GANs) while leveraging big population data. Where conventional biostatistical methods fail, GAiN reliably discovers differentially expressed genes (DEGs) and enriched pathways between two cohorts with limited numbers of samples (n = 10) when benchmarked against a gold standard. GAiN is freely available at GitHub. Thus, GAiN may serve as a crucial tool for gene expression analysis in scenarios with limited samples, as in the context of rare diseases, under-represented populations, or limited investigator resources.
Affiliation(s)
- Michael R. Waters
  - Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO 63108, USA
- Matthew Inkman
  - Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO 63108, USA
- Kay Jayachandran
  - Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO 63108, USA
- Clifford Robinson
  - Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO 63108, USA
  - Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, USA
- Julie K. Schwarz
  - Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO 63108, USA
  - Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Cell Biology and Physiology, Washington University School of Medicine, St. Louis, MO 63108, USA
- S. Joshua Swamidass
  - Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63105, USA
  - Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO 63105, USA
- Obi L. Griffith
  - Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
  - McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO 63110, USA
- Jeffrey J. Szymanski
  - Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO 63108, USA
  - Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, USA
- Jin Zhang
  - Department of Radiation Oncology, Washington University School of Medicine, St. Louis, MO 63108, USA
  - Alvin J. Siteman Cancer Center, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Institute for Informatics (I), Washington University School of Medicine, St. Louis, MO 63110, USA
9
Liu Y, Li L, Li J, Liu H, Geru A, Wang Y, Li Y, Sia CH, Lip GYH, Yang Q, Zhou X. Development and Validation of a Predictive Model for Intracranial Haemorrhage in Patients on Direct Oral Anticoagulants. Clin Appl Thromb Hemost 2024; 30:10760296241271338. [PMID: 39140863 PMCID: PMC11325470 DOI: 10.1177/10760296241271338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2024] Open
Abstract
BACKGROUND Intracranial haemorrhage (ICH) poses a significant threat to patients on Direct Oral Anticoagulants (DOACs), with existing risk scores inadequately predicting ICH risk in these patients. We aim to develop and validate a predictive model for ICH risk in DOAC-treated patients. METHODS A total of 24,794 patients treated with a DOAC were identified in a province-wide electronic medical and health data platform in Tianjin, China. The cohort was randomly split in a 4:1 ratio for model development and validation. We utilized forward stepwise selection, Least Absolute Shrinkage and Selection Operator (LASSO), and eXtreme Gradient Boosting (XGBoost) to select predictors. Model performance was compared using the area under the curve (AUC) and net reclassification index (NRI). The optimal model was stratified and compared with the DOAC model. RESULTS The median age is 68.0 years, and 50.4% of participants are male. The XGBoost model, incorporating six independent factors (history of hemorrhagic stroke, peripheral artery disease, venous thromboembolism, hypertension, age, low-density lipoprotein cholesterol levels), demonstrated superior performance in the development dataset. It showed moderate discrimination (AUC: 0.68, 95% CI: 0.64-0.73), outperforming existing DOAC scores (ΔAUC = 0.063, P = 0.003; NRI = 0.374, P < 0.001). Risk categories significantly stratified ICH risk (low risk: 0.26%, moderate risk: 0.74%, high risk: 5.51%). Finally, the model demonstrated consistent predictive performance in the internal validation. CONCLUSION In a real-world Chinese population using DOAC therapy, this study presents a reliable predictive model for ICH risk. The XGBoost model, integrating six key risk factors, offers a valuable tool for individualized risk assessment in the context of oral anticoagulation therapy.
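The random 4:1 development/validation split described in the METHODS can be sketched as follows. This is an illustrative assumption, not the authors' procedure: the function `split_cohort`, the seed, and the use of integer patient indices are hypothetical, and the abstract does not say whether the split was stratified by outcome.

```python
import random


def split_cohort(ids, dev_fraction=0.8, seed=0):
    """Randomly split identifiers into development and validation sets.

    Hypothetical helper: dev_fraction=0.8 corresponds to a 4:1 ratio.
    """
    rng = random.Random(seed)  # local generator, so the split is reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]


# Cohort size taken from the abstract; the indices stand in for patient IDs.
dev, val = split_cohort(range(24794))
print(len(dev), len(val))  # prints 19835 4959
```

With a rare outcome such as ICH, a stratified split is a common alternative so that both subsets contain events; that variation is not shown here.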
Affiliation(s)
- Yuanyuan Liu
- Department of Cardiology, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin 300052, China
- Department of Cardiology, Qingzhou People's Hospital, Weifang, Shandong 262500, China
| | - Linjie Li
- Department of Cardiology, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin 300052, China
| | - Jingge Li
- Department of Cardiology, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin 300052, China
| | - Hangkuan Liu
- Department of Cardiology, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin 300052, China
| | - A Geru
- Department of Cardiology, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin 300052, China
| | - Yulong Wang
- Department of Cardiology, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin 300052, China
| | - Yongle Li
- Department of Cardiology, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin 300052, China
| | - Ching-Hui Sia
- Yong Loo-Lin School of Medicine, National University of Singapore, 1E Kent Ridge Road, Singapore 119228, Singapore
- Department of Cardiology, National University Heart Centre, 5 Lower Kent Ridge Rd, Singapore 119074, Singapore
| | - Gregory Y H Lip
- Liverpool Centre for Cardiovascular Science at University of Liverpool, Liverpool John Moores University and Liverpool Heart & Chest Hospital, Liverpool, UK
- Danish Center for Health Services Research, Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
| | - Qing Yang
- Department of Cardiology, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin 300052, China
| | - Xin Zhou
- Department of Cardiology, Tianjin Medical University General Hospital, 154 Anshan Road, Heping District, Tianjin 300052, China
10
|
Lee H, Yang HL, Ryu HG, Jung CW, Cho YJ, Yoon SB, Yoon HK, Lee HC. Real-time machine learning model to predict in-hospital cardiac arrest using heart rate variability in ICU. NPJ Digit Med 2023; 6:215. [PMID: 37993540 PMCID: PMC10665411 DOI: 10.1038/s41746-023-00960-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 06/20/2023] [Accepted: 11/05/2023] [Indexed: 11/24/2023] Open
Abstract
Predicting in-hospital cardiac arrest in patients admitted to an intensive care unit (ICU) allows prompt interventions to improve patient outcomes. We developed and validated a machine learning-based real-time model for in-hospital cardiac arrest predictions using electrocardiogram (ECG)-based heart rate variability (HRV) measures. The HRV measures, including time/frequency domains and nonlinear measures, were calculated from 5 min epochs of ECG signals from ICU patients. A light gradient boosting machine (LGBM) algorithm was used to develop the proposed model for predicting in-hospital cardiac arrest within 0.5-24 h. The LGBM model using 33 HRV measures achieved an area under the receiver operating characteristic curve of 0.881 (95% CI: 0.875-0.887) and an area under the precision-recall curve of 0.104 (95% CI: 0.093-0.116). The most important feature was the baseline width of the triangular interpolation of the RR interval histogram. As our model uses only ECG data, it can be easily applied in clinical practice.
Affiliation(s)
- Hyeonhoon Lee
- Department of Anesthesiology and Pain Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Department of Data Science Research, Innovative Medical Technology Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
| | - Hyun-Lim Yang
- Department of Anesthesiology and Pain Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Department of Medical Device Development Support, Innovative Medical Technology Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
| | - Ho Geol Ryu
- Department of Anesthesiology and Pain Medicine, Seoul National University College of Medicine, Seoul National University Hospital, Seoul, Republic of Korea
- Department of Critical Care Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Chul-Woo Jung
- Department of Anesthesiology and Pain Medicine, Seoul National University College of Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Youn Joung Cho
- Department of Anesthesiology and Pain Medicine, Seoul National University College of Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Soo Bin Yoon
- Department of Anesthesiology and Pain Medicine, Seoul National University College of Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Hyun-Kyu Yoon
- Department of Anesthesiology and Pain Medicine, Seoul National University College of Medicine, Seoul National University Hospital, Seoul, Republic of Korea
| | - Hyung-Chul Lee
- Department of Data Science Research, Innovative Medical Technology Research Institute, Seoul National University Hospital, Seoul, Republic of Korea.
- Department of Anesthesiology and Pain Medicine, Seoul National University College of Medicine, Seoul National University Hospital, Seoul, Republic of Korea.
11
|
Addad VV, Palma LMP, Vaisbich MH, Pacheco Barbosa AM, da Rocha NC, de Almeida Cardoso MM, de Almeida JTC, de Paula de Sordi MA, Machado-Rugolo J, Arantes LF, de Andrade LGM. A comprehensive model for assessing and classifying patients with thrombotic microangiopathy: the TMA-INSIGHT score. Thromb J 2023; 21:119. [PMID: 37993892 PMCID: PMC10664252 DOI: 10.1186/s12959-023-00564-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 11/13/2023] [Indexed: 11/24/2023] Open
Abstract
BACKGROUND Thrombotic Microangiopathy (TMA) is a syndrome characterized by the presence of anemia, thrombocytopenia and organ damage and has multiple etiologies. The primary aim is to develop an algorithm to classify TMA (TMA-INSIGHT score). METHODS This was a single-center retrospective cohort study of hospitalized patients with TMA. We included all consecutive patients diagnosed with TMA between 2012 and 2021. TMA was defined based on the presence of anemia (hemoglobin level < 10 g/dL) and thrombocytopenia (platelet count < 150,000/µL), signs of hemolysis, and organ damage. We classified patients into eight categories: infections; Malignant Hypertension; Transplant; Malignancy; Pregnancy; Thrombotic Thrombocytopenic Purpura (TTP); Shiga toxin-mediated hemolytic uremic syndrome (STEC-HUS) and Complement Mediated TMA (aHUS). We fitted a model to classify patients using clinical characteristics, biochemical exams, and mean arterial pressure at presentation. RESULTS We retrospectively retrieved TMA phenotypes using automatic strategies in electronic health records over almost 10 years (n = 2407). Secondary TMA was found in 97.5% of the patients. Primary TMA was found in 2.47% of the patients (TTP and aHUS). The best model was LightGBM with accuracy of 0.979, and multiclass ROC-AUC of 0.966. The predictions had higher accuracy in most TMA classes, although the confidence was lower in aHUS and STEC-HUS cases. CONCLUSION Secondary conditions were the most common etiologies of TMA. We retrieved comorbidities, associated conditions, and mean arterial pressure to fit a model to predict TMA and define TMA phenotypic characteristics. This is the first multiclass model to predict TMA including primary and secondary conditions.
Affiliation(s)
- Vanessa Vilani Addad
- Department of Internal Medicine - UNESP, Univ Estadual Paulista, Rubião Jr, s/n, Botucatu/SP, 18618-687, Brazil
| | - Lilian Monteiro Pereira Palma
- Department of Pediatrics, Universidade Estadual de Campinas, R. Tessália Vieira de Camargo, 126 - Cidade Universitária, Campinas/SP, 13083-887, Brazil
| | - Maria Helena Vaisbich
- Pediatric Nephrology Service, Child Institute, University of São Paulo, Av. Dr. Enéas Carvalho de Aguiar, 647, São Paulo, SP, 05403-000, Brazil
| | | | - Naila Camila da Rocha
- Department of Internal Medicine - UNESP, Univ Estadual Paulista, Rubião Jr, s/n, Botucatu/SP, 18618-687, Brazil
| | | | | | | | - Juliana Machado-Rugolo
- Health Technology Assessment Center of Hospital das Clínicas - HCFMB, Botucatu, SP, Brazil
12
|
Abdulazeem H, Whitelaw S, Schauberger G, Klug SJ. A systematic review of clinical health conditions predicted by machine learning diagnostic and prognostic models trained or validated using real-world primary health care data. PLoS One 2023; 18:e0274276. [PMID: 37682909 PMCID: PMC10491005 DOI: 10.1371/journal.pone.0274276] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/23/2022] [Accepted: 08/29/2023] [Indexed: 09/10/2023] Open
Abstract
With the advances in technology and data science, machine learning (ML) is being rapidly adopted by the health care sector. However, there is a lack of literature addressing the health conditions targeted by the ML prediction models within primary health care (PHC) to date. To fill this gap in knowledge, we conducted a systematic review following the PRISMA guidelines to identify health conditions targeted by ML in PHC. We searched the Cochrane Library, Web of Science, PubMed, Elsevier, BioRxiv, Association of Computing Machinery (ACM), and IEEE Xplore databases for studies published from January 1990 to January 2022. We included primary studies addressing ML diagnostic or prognostic predictive models that were supplied completely or partially by real-world PHC data. Study selection, data extraction, and risk of bias assessment using the prediction model study risk of bias assessment tool were performed by two investigators. Health conditions were categorized according to the International Classification of Diseases (ICD-10). Extracted data were analyzed quantitatively. We identified 106 studies investigating 42 health conditions. These studies included 207 ML prediction models supplied by the PHC data of 24.2 million participants from 19 countries. We found that 92.4% of the studies were retrospective and 77.3% of the studies reported diagnostic predictive ML models. A majority (76.4%) of all the studies addressed model development without conducting external validation. Risk of bias assessment revealed that 90.8% of the studies were of high or unclear risk of bias. The most frequently reported health conditions were diabetes mellitus (19.8%) and Alzheimer's disease (11.3%). Our study provides a summary of the presently available ML prediction models within PHC. We draw the attention of digital health policy makers, ML model developers, and health care professionals to the need for more interdisciplinary research collaboration in this regard.
Affiliation(s)
- Hebatullah Abdulazeem
- Chair of Epidemiology, Department of Sport and Health Sciences, Technical University of Munich (TUM), Munich, Germany
| | - Sera Whitelaw
- Faculty of Medicine and Health Sciences, McGill University, Montreal, Quebec, Canada
| | - Gunther Schauberger
- Chair of Epidemiology, Department of Sport and Health Sciences, Technical University of Munich (TUM), Munich, Germany
| | - Stefanie J. Klug
- Chair of Epidemiology, Department of Sport and Health Sciences, Technical University of Munich (TUM), Munich, Germany
13
|
Banerjee J, Taroni JN, Allaway RJ, Prasad DV, Guinney J, Greene C. Machine learning in rare disease. Nat Methods 2023:10.1038/s41592-023-01886-z. [PMID: 37248386 DOI: 10.1038/s41592-023-01886-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/16/2021] [Accepted: 04/22/2023] [Indexed: 05/31/2023]
Abstract
High-throughput profiling methods (such as genomics or imaging) have accelerated basic research and made deep molecular characterization of patient samples routine. These approaches provide a rich portrait of genes, molecular pathways and cell types involved in disease phenotypes. Machine learning (ML) can be a useful tool for extracting disease-relevant patterns from high-dimensional datasets. However, depending upon the complexity of the biological question, machine learning often requires many samples to identify recurrent and biologically meaningful patterns. Rare diseases are inherently limited in clinical cases, leading to few samples to study. In this Perspective, we outline the challenges and emerging solutions for using ML for small sample sets, specifically in rare diseases. Advances in ML methods for rare diseases are likely to be informative for applications beyond rare diseases for which few samples exist with high-dimensional data. We propose that the method community prioritize the development of ML techniques for rare disease research.
Affiliation(s)
| | - Jaclyn N Taroni
- Childhood Cancer Data Lab, Alex's Lemonade Stand Foundation, Philadelphia, PA, USA
| | | | | | | | - Casey Greene
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, USA.
14
|
Valero-Tena E, Roca-Espiau M, Verdú-Díaz J, Diaz-Manera J, Andrade-Campos M, Giraldo P. Advantages of digital technology in the assessment of bone marrow involvement in Gaucher's disease. Front Med (Lausanne) 2023; 10:1098472. [PMID: 37250646 PMCID: PMC10213682 DOI: 10.3389/fmed.2023.1098472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2022] [Accepted: 04/10/2023] [Indexed: 05/31/2023] Open
Abstract
Gaucher disease (GD) is a genetic lysosomal disorder characterized by high bone marrow (BM) involvement and skeletal complications. The pathophysiology of these complications is not fully elucidated. Magnetic resonance imaging (MRI) is the gold standard to evaluate BM. This study aimed to apply machine-learning techniques in a cohort of Spanish GD patients, using a structured bone marrow MRI reporting model at diagnosis and follow-up, to predict the evolution of the bone disease. In total, 441 digitalized MRI studies from 131 patients (M: 69, F: 62) were reevaluated by a blinded expert radiologist who applied a structured report template. The studies were classified into categories according to the stage at which they were carried out: A: baseline; B: between 1 and 4 years of follow-up; C: between 5 and 9 years; and D: after 10 years of follow-up. Demographics, genetics, biomarkers, clinical data, and cumulative years of therapy were included in the model. At the baseline study, the mean age was 37.3 years (1-80), and the median Spanish MRI score (S-MRI) was 8.40 (male patients: 9.10 vs. female patients: 7.71) (p < 0.001). BM clearance was faster and deeper in women during follow-up. Genotypes that do not include the c.1226A>G variant have a higher degree of infiltration and complications (p = 0.017). A random forest machine-learning model identified that BM infiltration degree, age at the start of therapy, and femur infiltration were the most important factors to predict the risk and severity of the bone disease. In conclusion, structured bone marrow MRI reporting in GD is useful to standardize the collected data and facilitate clinical management and academic collaboration. Artificial intelligence methods applied to these studies can help to predict bone disease complications.
Affiliation(s)
- Esther Valero-Tena
- Departamento de Medicina Interna y Reumatología, Hospital MAZ, Zaragoza, Spain
- Fundación Española para el Estudio y Terapéutica de la Enfermedad de Gaucher y otras Lisosomales (FEETEG), Zaragoza, Spain
| | - Mercedes Roca-Espiau
- Fundación Española para el Estudio y Terapéutica de la Enfermedad de Gaucher y otras Lisosomales (FEETEG), Zaragoza, Spain
| | - Jose Verdú-Díaz
- John Walton Muscular Dystrophy Research Center, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Jordi Diaz-Manera
- John Walton Muscular Dystrophy Research Center, Newcastle University, Newcastle upon Tyne, United Kingdom
| | - Marcio Andrade-Campos
- Fundación Española para el Estudio y Terapéutica de la Enfermedad de Gaucher y otras Lisosomales (FEETEG), Zaragoza, Spain
- Grupo Español de Enfermedades de Depósito Lisosomal de la SEHH (GEEDL), Madrid, Spain
- Grupo de Investigación en Hematología, Instituto de Investigación Hospital del Mar, IMIM-Parc de Salut Mar, Barcelona, Spain
| | - Pilar Giraldo
- Fundación Española para el Estudio y Terapéutica de la Enfermedad de Gaucher y otras Lisosomales (FEETEG), Zaragoza, Spain
- Grupo Español de Enfermedades de Depósito Lisosomal de la SEHH (GEEDL), Madrid, Spain
15
|
Yapar D, Yapar A, Tokgöz MA, Bilge U. Decision tree-based data mining approach for the evaluation of survival in primary malignant bone tumors: A surveillance, epidemiology and end results database study. J Orthop Surg (Hong Kong) 2023; 31:10225536231189780. [PMID: 37548295 DOI: 10.1177/10225536231189780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023] Open
Abstract
PURPOSE This study aimed to conduct a large-scale population-based study to understand the epidemiological characteristics of Primary Malignant Bone Tumors (PMBTs) and determine the prognostic factors by concurrently using the classical statistical method and data mining methods. METHODS Patients included in this study were extracted from the National Cancer Institute's Surveillance, Epidemiology and End Results (SEER) database: "Incidence-SEER Research Data, 18 Registries, Nov 2020 Sub". Patients with unclassified and incomplete information were excluded. This search algorithm resulted in a dataset comprising 6234 cases. Survival analyses were performed with Kaplan-Meier curves and the Log-rank test. Multivariate Cox regression analysis determined the independent prognostic factors of PMBT. A decision tree-based data mining technique was used in this study to confirm the prognostic factors. RESULTS The 5-year survival rate was 63.6% and the 10-year survival rate was 55.3% in the patients with PMBT. Sex, age, median household income, histology, primary site, grade, stage, metastasis, and the total number of malignant tumors were determined as independent risk factors associated with overall survival (OS) in the multivariate Cox regression analysis. The prognostic factors resulting in five terminal nodes in the decision tree (DT) included stage, age, and grade. The stage was the most important determining factor for vital status. The terminal node with the smallest number of surviving patients included 801 (72.3%) deaths in 1102 patients with distant stage, and the hazard ratio was calculated as 5.4 (95% CI: 4.9-5.9; p < .001). These patients had a median survival of only 17 months. CONCLUSIONS Rules extracted from DTs provide information about risk factors in specific patient groups and can be used by clinicians making decisions on individual patients. We recommend using DTs in combination with Cox regression analysis to determine risk factors and the effect of these factors on survival.
Affiliation(s)
- Dilek Yapar
- Turkish Ministry of Health, Muratpasa District Health Directorate, Antalya, Turkey
- Department of Biostatistics and Medical Informatics, Faculty of Medicine, Akdeniz University, Antalya, Turkey
| | - Aliekber Yapar
- Department of Orthopaedics and Traumatology, Antalya Training and Research Hospital, Antalya, Turkey
| | - Mehmet Ali Tokgöz
- Department of Orthopaedics and Traumatology, Faculty of Medicine, Gazi University, Ankara, Turkey
| | - Uğur Bilge
- Department of Biostatistics and Medical Informatics, Faculty of Medicine, Akdeniz University, Antalya, Turkey
16
|
Bogatu A, Wysocka M, Wysocki O, Butterworth H, Pillai M, Allison J, Landers D, Kilgour E, Thistlethwaite F, Freitas A. Meta-analysis informed machine learning: Supporting cytokine storm detection during CAR-T cell Therapy. J Biomed Inform 2023; 142:104367. [PMID: 37105509 DOI: 10.1016/j.jbi.2023.104367] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/23/2022] [Revised: 04/11/2023] [Accepted: 04/17/2023] [Indexed: 04/29/2023]
Abstract
Cytokine release syndrome (CRS), also known as cytokine storm, is one of the most consequential adverse effects of chimeric antigen receptor therapies that have shown otherwise promising results in cancer treatment. When emerging, CRS could be identified by the analysis of specific cytokine and chemokine profiles that tend to exhibit similarities across patients. In this paper, we exploit these similarities using machine learning algorithms and set out to pioneer a meta-review informed method for the identification of CRS based on specific cytokine peak concentrations and evidence from previous clinical studies. To this end we also address a widespread challenge of the applicability of machine learning in general: reduced training data availability. We do so by augmenting available (but often insufficient) patient cytokine concentrations with statistical knowledge extracted from domain literature. We argue that such methods could support clinicians in analyzing suspect cytokine profiles by matching them against the said CRS knowledge from past clinical studies, with the ultimate aim of swift CRS diagnosis. We evaluate our proposed methods under several design choices, achieving performance of more than 90% in terms of CRS identification accuracy, and showing that many of our choices outperform a purely data-driven alternative. During evaluation with real-world CRS clinical data, we emphasize the potential of our proposed method of producing interpretable results, in addition to being effective in identifying the onset of cytokine storm.
Affiliation(s)
- Alex Bogatu
- Department of Computer Science, The University of Manchester, United Kingdom; Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, United Kingdom.
| | - Magdalena Wysocka
- Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, United Kingdom
| | - Oskar Wysocki
- Department of Computer Science, The University of Manchester, United Kingdom; Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, United Kingdom
| | | | - Manon Pillai
- Division of Cancer Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, UK
| | - Jennifer Allison
- Division of Cancer Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, UK
| | - Dónal Landers
- Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, United Kingdom
| | - Elaine Kilgour
- Cancer Biomarker Centre, CRUK Manchester Institute, United Kingdom
| | - Fiona Thistlethwaite
- Division of Cancer Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, UK; The Christie NHS Foundation Trust, Manchester, UK
| | - André Freitas
- Department of Computer Science, The University of Manchester, United Kingdom; Digital Experimental Cancer Medicine Team, Cancer Biomarker Centre, CRUK Manchester Institute, United Kingdom; Idiap Research Institute, Switzerland
17
|
Liu CY, Cheng CY, Yang SY, Chai JW, Chen WH, Chang PY. Mortality Evaluation and Life Expectancy Prediction of Patients with Hepatocellular Carcinoma with Data Mining. Healthcare (Basel) 2023; 11:healthcare11060925. [PMID: 36981582 PMCID: PMC10048888 DOI: 10.3390/healthcare11060925] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/31/2022] [Revised: 03/11/2023] [Accepted: 03/15/2023] [Indexed: 03/30/2023] Open
Abstract
BACKGROUND The complexity of systemic variables and comorbidities makes it difficult to determine the best treatment for patients with hepatocellular carcinoma (HCC). It is impossible to perform a multidimensional evaluation of every patient, but the development of guidelines based on analyses of said complexities would be the next best option. Whereas conventional statistics are often inadequate for developing multivariate predictive models, data mining has proven more capable. PATIENTS, METHODS AND FINDINGS Clinical profiles and treatment responses of 537 patients diagnosed with Barcelona Clinic Liver Cancer stages B and C from 2009 to 2019 were retrospectively analyzed using 4 decision tree algorithms. A combination of 19 treatments, 7 biomarkers, and 4 states of hepatitis was tested to determine which combinations would result in survival times greater than a year in duration. Just 2 of the algorithms produced complete models through single trees, which made them the only ones suitable for clinical judgement. A combination of alpha fetoprotein ≤210.5 mcg/L, glutamic oxaloacetic transaminase ≤1.13 µkat/L, and total bilirubin ≤ 0.0283 mmol/L was shown to be a good predictor of survival >1 year, and the most effective treatments for such patients were radio-frequency ablation (RFA) and transarterial chemoembolization (TACE) with radiation therapy (RT). In patients without this combination, the best treatments were RFA, TACE with RT and targeted drug therapy, and TACE with targeted drug therapy and immunotherapy. The main limitation of this study was its small sample. With a small sample size, we may have developed a less reliable model system, failing to produce any clinically important results or outcomes. CONCLUSION Data mining can produce models to help clinicians predict survival time at the time of initial HCC diagnosis and then choose the most suitable treatment.
Affiliation(s)
- Che-Yu Liu
- Department of Radiology, Taichung Veterans General Hospital, Taichung 407, Taiwan
| | - Chen-Yang Cheng
- Department of Industrial Engineering and Management, National Taipei University of Technology, Taipei 106, Taiwan
| | - Szu-Ying Yang
- Nursing Department, Taichung Veterans General Hospital, Taichung 407, Taiwan
| | - Jyh-Wen Chai
- Department of Radiology, Taichung Veterans General Hospital, Taichung 407, Taiwan
- Section of Radiology, College of Medicine, China Medical University, Taichung 404, Taiwan
- College of Medicine, National Chung Hsing University, Taichung 402, Taiwan
| | - Wei-Hao Chen
- Institute of Business & Management, National Yang Ming Chiao Tung University, Taipei 100, Taiwan
| | - Pi-Yi Chang
- Department of Radiology, Taichung Veterans General Hospital, Taichung 407, Taiwan
- Department of Industrial Engineering and Enterprise Information, Tunghai University, Taichung 407, Taiwan
18
|
Wilson LJ, Kiffer FC, Berrios DC, Bryce-Atkinson A, Costes SV, Gevaert O, Matarèse BFE, Miller J, Mukherjee P, Peach K, Schofield PN, Slater LT, Langen B. Machine intelligence for radiation science: summary of the Radiation Research Society 67th annual meeting symposium. Int J Radiat Biol 2023:1-10. [PMID: 36735963 DOI: 10.1080/09553002.2023.2173823] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
Abstract
The era of high-throughput techniques created big data in the medical field and research disciplines. Machine intelligence (MI) approaches can overcome critical limitations on how those large-scale data sets are processed, analyzed, and interpreted. The 67th Annual Meeting of the Radiation Research Society featured a symposium on MI approaches to highlight recent advancements in the radiation sciences and their clinical applications. This article summarizes three of those presentations regarding recent developments for metadata processing and ontological formalization, data mining for radiation outcomes in pediatric oncology, and imaging in lung cancer.
Affiliation(s)
- Lydia J Wilson
- Department of Radiation Oncology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Frederico C Kiffer
- Department of Anesthesia and Critical Care Medicine, The Children's Hospital of Philadelphia Research Institute, Philadelphia, PA, USA
| | | | - Abigail Bryce-Atkinson
- Division of Cancer Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
| | | | - Olivier Gevaert
- Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University, Palo Alto, CA, USA
| | - Bruno F E Matarèse
- The Cavendish Laboratory, University of Cambridge, Cambridge, UK
- Department of Haematology, School of Clinical Medicine, University of Cambridge, Cambridge, UK
| | - Jack Miller
- NASA Ames Research Center, Moffett Field, CA, USA
- KBR, NASA Ames Research Center, Moffett Field, CA, USA
| | - Pritam Mukherjee
- Stanford Center for Biomedical Informatics Research, Department of Medicine, Stanford, CA, USA
- Radiology and Imaging Sciences, NIH Clinical Center, Bethesda, MD, USA
| | - Kristen Peach
- Department of Bionetics, NASA Ames Research Center, Moffett Field, CA, USA
| | - Paul N Schofield
- Department of Physiology Development and Neuroscience, University of Cambridge, Cambridge, UK
| | - Luke T Slater
- Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
- Institute of Translational Medicine, University Hospitals Birmingham, NHS Foundation Trust, Birmingham, UK
- MRC Health Data Research UK (HDR UK), Midlands, UK
| | - Britta Langen
- Department of Radiation Oncology, Section of Molecular Radiation Biology, UT Southwestern Medical Center, Dallas, TX, USA
19
|
Bergen DJM, Maurizi A, Formosa MM, McDonald GLK, El-Gazzar A, Hassan N, Brandi ML, Riancho JA, Rivadeneira F, Ntzani E, Duncan EL, Gregson CL, Kiel DP, Zillikens MC, Sangiorgi L, Högler W, Duran I, Mäkitie O, Van Hul W, Hendrickx G. High Bone Mass Disorders: New Insights From Connecting the Clinic and the Bench. J Bone Miner Res 2023; 38:229-247. [PMID: 36161343 PMCID: PMC10092806 DOI: 10.1002/jbmr.4715] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/25/2022] [Revised: 09/05/2022] [Accepted: 09/22/2022] [Indexed: 02/04/2023]
Abstract
Monogenic high bone mass (HBM) disorders are characterized by an increased amount of bone in general, or at specific sites in the skeleton. Here, we describe 59 HBM disorders with 50 known disease-causing genes from the literature, and we provide an overview of the signaling pathways and mechanisms involved in the pathogenesis of these disorders. Based on this, we classify the known HBM genes into HBM (sub)groups according to uniform Gene Ontology (GO) terminology. This classification system may aid in hypothesis generation, for both wet lab experimental design and clinical genetic screening strategies. We discuss how functional genomics can shape discovery of novel HBM genes and/or mechanisms in the future, through implementation of omics assessments in existing and future model systems. Finally, we address strategies to improve gene identification in unsolved HBM cases and highlight the importance for cross-laboratory collaborations encompassing multidisciplinary efforts to transfer knowledge generated at the bench to the clinic. © 2022 The Authors. Journal of Bone and Mineral Research published by Wiley Periodicals LLC on behalf of American Society for Bone and Mineral Research (ASBMR).
Affiliation(s)
- Dylan J M Bergen
- School of Physiology, Pharmacology, and Neuroscience, Faculty of Life Sciences, University of Bristol, Bristol, UK; Musculoskeletal Research Unit, Translational Health Sciences, Bristol Medical School, Faculty of Health Sciences, University of Bristol, Bristol, UK
| | - Antonio Maurizi
- Department of Biotechnological and Applied Clinical Sciences, University of L'Aquila, L'Aquila, Italy
| | - Melissa M Formosa
- Department of Applied Biomedical Science, Faculty of Health Sciences, University of Malta, Msida, Malta; Center for Molecular Medicine and Biobanking, University of Malta, Msida, Malta
| | - Georgina L K McDonald
- School of Physiology, Pharmacology, and Neuroscience, Faculty of Life Sciences, University of Bristol, Bristol, UK
| | - Ahmed El-Gazzar
- Department of Paediatrics and Adolescent Medicine, Johannes Kepler University Linz, Linz, Austria
| | - Neelam Hassan
- Musculoskeletal Research Unit, Translational Health Sciences, Bristol Medical School, Faculty of Health Sciences, University of Bristol, Bristol, UK
| | | | - José A Riancho
- Department of Internal Medicine, Hospital U M Valdecilla, University of Cantabria, IDIVAL, Santander, Spain
| | - Fernando Rivadeneira
- Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands
| | - Evangelia Ntzani
- Department of Hygiene and Epidemiology, Medical School, University of Ioannina, Ioannina, Greece; Center for Evidence Synthesis in Health, Policy and Practice, Center for Research Synthesis in Health, School of Public Health, Brown University, Providence, RI, USA; Institute of Biosciences, University Research Center of Ioannina, University of Ioannina, Ioannina, Greece
| | - Emma L Duncan
- Department of Twin Research & Genetic Epidemiology, School of Life Course Sciences, Faculty of Life Sciences and Medicine, King's College London, London, UK; Department of Endocrinology, Guy's and St Thomas' NHS Foundation Trust, London, UK
| | - Celia L Gregson
- Musculoskeletal Research Unit, Translational Health Sciences, Bristol Medical School, Faculty of Health Sciences, University of Bristol, Bristol, UK
| | - Douglas P Kiel
- Marcus Institute for Aging Research, Hebrew SeniorLife and Department of Medicine Beth Israel Deaconess Medical Center and Harvard Medical School, Broad Institute of MIT & Harvard, Cambridge, MA, USA
| | - M Carola Zillikens
- Department of Internal Medicine, Erasmus University Medical Center, Rotterdam, The Netherlands
| | - Luca Sangiorgi
- Department of Rare Skeletal Diseases, IRCCS Rizzoli Orthopaedic Institute, Bologna, Italy
| | - Wolfgang Högler
- Department of Paediatrics and Adolescent Medicine, Johannes Kepler University Linz, Linz, Austria; Institute of Metabolism and Systems Research, University of Birmingham, Birmingham, UK
| | | | - Outi Mäkitie
- Children's Hospital, University of Helsinki and Helsinki University Hospital, Helsinki, Finland; Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Folkhälsan Research Centre, Folkhälsan Institute of Genetics, Helsinki, Finland
| | - Wim Van Hul
- Department of Medical Genetics, University of Antwerp, Antwerp, Belgium
| | | |
|
20
|
Nelson AE, Arbeeva L. Narrative Review of Machine Learning in Rheumatic and Musculoskeletal Diseases for Clinicians and Researchers: Biases, Goals, and Future Directions. J Rheumatol 2022; 49:1191-1200. [PMID: 35840150 PMCID: PMC9633365 DOI: 10.3899/jrheum.220326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/21/2022] [Indexed: 11/22/2022]
Abstract
There has been rapid growth in the use of artificial intelligence (AI) analytics in medicine in recent years, including in rheumatic and musculoskeletal diseases (RMDs). Such methods represent a challenge to clinicians, patients, and researchers, given the "black box" nature of most algorithms, the unfamiliarity of the terms, and the lack of awareness of potential issues around these analyses. Therefore, this review aims to introduce this subject area in a way that is relevant and meaningful to clinicians and researchers. We hope to provide some insights into relevant strengths and limitations, reporting guidelines, as well as recent examples of such analyses in key areas, with a focus on lessons learned and future directions in diagnosis, phenotyping, prognosis, and precision medicine in RMDs.
Affiliation(s)
- Amanda E Nelson
- A.E. Nelson, MD, MSCR, Department of Medicine, Division of Rheumatology, Allergy, and Immunology, University of North Carolina at Chapel Hill;
| | - Liubov Arbeeva
- L. Arbeeva, MS, Thurston Arthritis Research Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| |
|
21
|
Jiang D, Chen ZX, Ma FX, Gong YY, Pu T, Chen JM, Liu XQ, Zhao YJ, Xie K, Hou H, Wang C, Geng XP, Liu FB. Online calculator for predicting the risk of malignancy in patients with pancreatic cystic neoplasms: A multicenter, retrospective study. World J Gastroenterol 2022; 28:5469-5482. [PMID: 36312834 PMCID: PMC9611704 DOI: 10.3748/wjg.v28.i37.5469] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/16/2022] [Revised: 07/25/2022] [Accepted: 09/07/2022] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Efficient and practical methods for predicting the risk of malignancy in patients with pancreatic cystic neoplasms (PCNs) are lacking.
AIM To establish a nomogram-based online calculator for predicting the risk of malignancy in patients with PCNs.
METHODS In this study, the clinicopathological data of target patients in three medical centers were analyzed. The independent sample t-test, Mann–Whitney U test or chi-squared test were used as appropriate for statistical analysis. After univariable and multivariable logistic regression analysis, five independent factors were screened and incorporated to develop a calculator for predicting the risk of malignancy. Finally, the concordance index (C-index), calibration, area under the curve, decision curve analysis and clinical impact curves were used to evaluate the performance of the calculator.
RESULTS Enhanced mural nodules [odds ratio (OR): 4.314; 95% confidence interval (CI): 1.618–11.503, P = 0.003], tumor diameter ≥ 40 mm (OR: 3.514; 95%CI: 1.138–10.849, P = 0.029), main pancreatic duct dilatation (OR: 3.267; 95%CI: 1.230–8.678, P = 0.018), preoperative neutrophil-to-lymphocyte ratio ≥ 2.288 (OR: 2.702; 95%CI: 1.008–7.244, P = 0.048), and preoperative serum CA19-9 concentration ≥ 34 U/mL (OR: 3.267; 95%CI: 1.274–13.007, P = 0.018) were independent risk factors for a high risk of malignancy in patients with PCNs. In the training cohort, the nomogram achieved a C-index of 0.824 for predicting the risk of malignancy. The predictive ability of the model was then validated in an external cohort (C-index: 0.893). Compared with the risk factors identified in the relevant guidelines, the current model showed better predictive performance and clinical utility.
CONCLUSION The calculator demonstrates optimal predictive performance for identifying the risk of malignancy, potentially yielding a personalized method for patient selection and decision-making in clinical practice.
Affiliation(s)
- Dong Jiang
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Zi-Xiang Chen
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Fu-Xiao Ma
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Yu-Yong Gong
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Tian Pu
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Jiang-Ming Chen
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Xue-Qian Liu
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Yi-Jun Zhao
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Kun Xie
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Hui Hou
- Department of General Surgery, The Second Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Cheng Wang
- Department of General Surgery, The First Affiliated Hospital of University of Science and Technology of China, Hefei 230000, Anhui Province, China
| | - Xiao-Ping Geng
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| | - Fu-Bao Liu
- Department of General Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei 230000, Anhui Province, China
| |
|
22
|
Amorim M, Silva S, Machado H, Teles EL, Baptista MJ, Maia T, Nwebonyi N, de Freitas C. Benefits and Risks of Sharing Genomic Data for Research: Comparing the Views of Rare Disease Patients, Informal Carers and Healthcare Professionals. Int J Environ Res Public Health 2022; 19:8788. [PMID: 35886636 PMCID: PMC9319916 DOI: 10.3390/ijerph19148788] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/20/2022] [Revised: 07/10/2022] [Accepted: 07/11/2022] [Indexed: 01/25/2023]
Abstract
Assessing public and patients' expectations and concerns about genomic data sharing is essential to promote adequate data governance and engagement in rare diseases genomics research. This cross-sectional study compared the views of 159 rare disease patients, 478 informal carers and 63 healthcare professionals in Northern Portugal about the benefits and risks of sharing genomic data for research, and its associated factors. The three participant groups expressed significantly different views. The majority of patients (84.3%) and informal carers (87.4%) selected the discovery of a cure for untreatable diseases as the most important benefit. In contrast, most healthcare professionals revealed a preference for the development of new drugs and treatments (71.4%), which was the second most selected benefit by carers (48.3%), especially by the more educated (OR (95% CI): 1.58 (1.07-2.34)). Lack of security and control over information access and the extraction of information exceeding research objectives were the two most often selected risks by patients (72.6% and 50.3%, respectively) and carers (60.0% and 60.6%, respectively). Conversely, professionals were concerned with genomic data being used to discriminate citizens (68.3%), followed by the extraction of information exceeding research objectives (54.0%). The latter risk was more frequently expressed by more educated carers (OR (95% CI): 1.60 (1.06-2.41)) and less by those with blue-collar (OR (95% CI): 0.44 (0.25-0.77)) and other occupations (OR (95% CI): 0.44 (0.26-0.74)). Developing communication strategies and consent approaches tailored to participants' expectations and needs can benefit the inclusiveness of genomics research that is key for patient-centred care.
Affiliation(s)
- Mariana Amorim
- Laboratório para a Investigação Integrativa e Translacional em Saúde Populacional (ITR), 4050-600 Porto, Portugal; (M.A.); (T.M.); (N.N.)
- EPIUnit—Instituto de Saúde Pública, Universidade do Porto, 4050-600 Porto, Portugal
| | - Susana Silva
- Centro em Rede de Investigação em Antropologia, Universidade do Minho, 4710-057 Braga, Portugal;
- Instituto de Ciências Sociais, Universidade do Minho, 4710-057 Braga, Portugal;
| | - Helena Machado
- Instituto de Ciências Sociais, Universidade do Minho, 4710-057 Braga, Portugal;
| | - Elisa Leão Teles
- Centro de Referência de Doenças Hereditárias do Metabolismo, Centro Hospitalar Universitário São João, 4200-319 Porto, Portugal;
| | - Maria João Baptista
- Centro de Referência de Cardiopatias Congénitas, Centro Hospitalar Universitário São João, 4200-319 Porto, Portugal;
- Departamento de Ginecologia, Obstetrícia e Pediatria, Faculdade de Medicina, Universidade do Porto, 4200-319 Porto, Portugal
| | - Tiago Maia
- Laboratório para a Investigação Integrativa e Translacional em Saúde Populacional (ITR), 4050-600 Porto, Portugal; (M.A.); (T.M.); (N.N.)
- EPIUnit—Instituto de Saúde Pública, Universidade do Porto, 4050-600 Porto, Portugal
| | - Ngozi Nwebonyi
- Laboratório para a Investigação Integrativa e Translacional em Saúde Populacional (ITR), 4050-600 Porto, Portugal; (M.A.); (T.M.); (N.N.)
- EPIUnit—Instituto de Saúde Pública, Universidade do Porto, 4050-600 Porto, Portugal
| | - Cláudia de Freitas
- Laboratório para a Investigação Integrativa e Translacional em Saúde Populacional (ITR), 4050-600 Porto, Portugal; (M.A.); (T.M.); (N.N.)
- EPIUnit—Instituto de Saúde Pública, Universidade do Porto, 4050-600 Porto, Portugal
- Departamento de Ciências da Saúde Pública e Forenses e Educação Médica, Faculdade de Medicina, Universidade do Porto, 4200-319 Porto, Portugal
| |
|
23
|
A Formative Study of the Implementation of Whole Genome Sequencing in Northern Ireland. Genes (Basel) 2022; 13:1104. [PMID: 35885887 PMCID: PMC9316942 DOI: 10.3390/genes13071104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 06/13/2022] [Accepted: 06/14/2022] [Indexed: 02/05/2023] Open
Abstract
Background: The UK 100,000 Genomes Project was a transformational research project which facilitated whole genome sequencing (WGS) diagnostics for rare diseases. We evaluated experiences of introducing WGS in Northern Ireland, providing recommendations for future projects. Methods: This formative evaluation included (1) an appraisal of the logistics of implementing and delivering WGS, (2) a survey of participant self-reported views and experiences, (3) semi-structured interviews with healthcare staff as key informants who were involved in the delivery of WGS and (4) a workshop discussion about interprofessional collaboration with respect to molecular diagnostics. Results: We engaged with >400 participants, with detailed reflections obtained from 74 participants including patients, caregivers, key National Health Service (NHS) informants, and researchers (patient survey n = 42; semi-structured interviews n = 19; attendees of the discussion workshop n = 13). Overarching themes included the need to improve rare disease awareness, education, and support services, as well as interprofessional collaboration being central to an effective, mainstreamed molecular diagnostic service. Conclusions: Recommendations for streamlining precision medicine for patients with rare diseases include administrative improvements (e.g., streamlining of the consent process), educational improvements (e.g., rare disease training provided from undergraduate to postgraduate education alongside genomics training for non-genetic specialists) and analytical improvements (e.g., multidisciplinary collaboration and improved computational infrastructure).
|
24
|
New Developments and Possibilities in Reanalysis and Reinterpretation of Whole Exome Sequencing Datasets for Unsolved Rare Diseases Using Machine Learning Approaches. Int J Mol Sci 2022; 23:6792. [PMID: 35743235 PMCID: PMC9224427 DOI: 10.3390/ijms23126792] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Revised: 06/13/2022] [Accepted: 06/15/2022] [Indexed: 11/21/2022] Open
Abstract
Rare diseases impact the lives of 300 million people in the world. Rapid advances in bioinformatics and genomic technologies have enabled the discovery of causes of 20–30% of rare diseases. However, most rare diseases have remained as unsolved enigmas to date. Newer tools and availability of high throughput sequencing data have enabled the reanalysis of previously undiagnosed patients. In this review, we have systematically compiled the latest developments in the discovery of the genetic causes of rare diseases using machine learning methods. Importantly, we have detailed methods available to reanalyze existing whole exome sequencing data of unsolved rare diseases. We have identified different reanalysis methodologies to solve problems associated with sequence alterations/mutations, variation re-annotation, protein stability, splice isoform malfunctions and oligogenic analysis. In addition, we give an overview of new developments in the field of rare disease research using whole genome sequencing data and other omics.
|
25
|
Walkowiak D, Bokayeva K, Miraleyeva A, Domaradzki J. The Awareness of Rare Diseases Among Medical Students and Practicing Physicians in the Republic of Kazakhstan. An Exploratory Study. Front Public Health 2022; 10:872648. [PMID: 35462837 PMCID: PMC9031913 DOI: 10.3389/fpubh.2022.872648] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/09/2022] [Accepted: 03/18/2022] [Indexed: 11/18/2022] Open
Abstract
Although national plans or strategies for rare diseases (RDs) have been implemented in many jurisdictions, research shows that one of the main barriers RD patients face during medical encounters is medical professionals' low level of knowledge and experience in the diagnosis, treatment and rehabilitation of RD patients. Consequently, there is a need to increase the standards of medical education in the field of RDs and to revise the undergraduate and postgraduate training programs. However, while studies on medical education in the field of RDs have been conducted in various countries across the Americas, Asia and the European Union, little is known about the awareness of RDs among healthcare professionals in the Republic of Kazakhstan. Thus, we conducted a survey among 207 medical students and 101 medical doctors from the West Kazakhstan Marat Ospanov Medical University, Aktobe, Kazakhstan. The study was conducted between March and May 2021. The questionnaire assessed their knowledge about the number, examples, etiology and estimated frequency of RDs. It also evaluated respondents' self-assessment of competence in RDs. Although the majority of respondents agreed that RDs constitute a serious public health issue, both medical students and medical doctors showed insufficient knowledge of the etiology, epidemiology and prevalence of RDs, and many had problems separating RDs from more common disorders. Moreover, they also lacked knowledge about the central register of RD patients and the reimbursement of orphan drugs in Kazakhstan. Finally, while almost half of the respondents declared having had classes about RDs during their studies, most perceived their knowledge about RDs as insufficient or poor and felt unprepared to care for RD patients. Additionally, although the majority of respondents in both groups believed that all physicians, regardless of their specialization, should possess knowledge of RDs, many respondents did not look for such information at all.
Affiliation(s)
- Dariusz Walkowiak
- Department of Organization and Management in Health Care, Poznan University of Medical Sciences, Poznań, Poland
| | - Kamila Bokayeva
- Department of Pediatric Gastroenterology and Metabolic Diseases, Poznan University of Medical Sciences, Poznań, Poland
| | - Alua Miraleyeva
- Department of Psychology, West Kazakhstan Marat Ospanov Medical University, Aktobe, Kazakhstan
| | - Jan Domaradzki
- Department of Social Sciences and Humanities, Poznan University of Medical Sciences, Poznań, Poland
| |
|
26
|
Walkowiak D, Bokayeva K, Miraleyeva A, Domaradzki J. The Awareness of Rare Diseases Among Medical Students and Practicing Physicians in the Republic of Kazakhstan. An Exploratory Study. Front Public Health 2022; 10:872648. [DOI: 10.3389/fpubh.2022.872648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2023] Open
|
27
|
Deep Learning-Based Microscopic Diagnosis of Odontogenic Keratocysts and Non-Keratocysts in Haematoxylin and Eosin-Stained Incisional Biopsies. Diagnostics (Basel) 2021; 11:2184. [PMID: 34943424 PMCID: PMC8700488 DOI: 10.3390/diagnostics11122184] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/09/2021] [Revised: 11/21/2021] [Accepted: 11/22/2021] [Indexed: 12/23/2022] Open
Abstract
Background: The goal of the study was to create an automated histopathology image classification system that could identify odontogenic keratocysts (OKCs) in hematoxylin and eosin-stained jaw cyst sections. Methods: From 54 odontogenic keratocysts, 23 dentigerous cysts, and 20 radicular cysts, about 2657 microscopic pictures with 400× magnification were obtained. The images were annotated by a pathologist and categorized into epithelium, cystic lumen, and stroma of keratocysts and non-keratocysts. Preprocessing was performed in two steps. The first was data augmentation, as deep learning techniques (DLT) improve their performance with increased data size. In the second, the epithelial region was selected as the region of interest. Results: Four experiments were conducted using the DLT. In the first, a pre-trained VGG16 was employed to classify the images after augmentation. In the second, DenseNet-169 was implemented for image classification on the augmented images. In the third, DenseNet-169 was trained on the two-step preprocessed images. In the last experiment, the results of experiments two and three were averaged to obtain an accuracy of 93% on OKC and non-OKC images. Conclusions: The proposed algorithm may fit into the automation system of OKC and non-OKC diagnosis. Utmost care was taken in the manual process of image acquisition (minimum 28–30 images/slide at 40× magnification covering the entire stretch of epithelium and stromal component). Further, there is scope to improve the accuracy rate and make it free of human bias by using a whole slide imaging scanner for image acquisition from slides.
|