1
Aversano L, Bernardi ML, Cimitile M, Maiellaro A, Pecori R. A systematic review on artificial intelligence techniques for detecting thyroid diseases. PeerJ Comput Sci 2023; 9:e1394. [PMID: 37346658 PMCID: PMC10280452 DOI: 10.7717/peerj-cs.1394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 04/21/2023] [Indexed: 06/23/2023]
Abstract
The use of artificial intelligence approaches in health-care systems has grown rapidly over the last few years. In this context, early detection of diseases is the most common area of application. Thyroid diseases are an example of illnesses that can be treated effectively if discovered early. Detecting them is crucial in order to treat patients effectively and promptly, saving lives and reducing healthcare costs. This work aims at systematically reviewing and analyzing the literature on various artificial intelligence-related techniques applied to the detection and identification of various diseases related to the thyroid gland. The contributions we reviewed are classified according to different viewpoints and taxonomies in order to highlight pros and cons of the most recent research in the field. After a careful selection process, we selected and reviewed 72 papers, analyzing them according to three main research questions, i.e., which diseases of the thyroid gland are detected by different artificial intelligence techniques, which datasets are used to perform the aforementioned detection, and what types of data are used to perform the detection. The review demonstrates that the majority of the considered papers deal with supervised methods to detect hypo- and hyperthyroidism. The average accuracy of detection is high (96.84%), but the usage of private and outdated datasets with a majority of clinical data is very common. Finally, we discuss the outcomes of the systematic review, pointing out advantages, disadvantages, and future developments in the application of artificial intelligence for thyroid disease detection.
Affiliation(s)
- Lerina Aversano
  - Department of Engineering, University of Sannio, Benevento, Italy
- Marta Cimitile
  - Dept. of Law and Digital Society, UnitelmaSapienza University, Rome, Italy
- Andrea Maiellaro
  - Department of Engineering, University of Sannio, Benevento, Italy
- Riccardo Pecori
  - Institute of Materials for Electronics and Magnetism, National Research Council, Parma, Italy
  - SMARTEST Research Centre, eCampus University, Novedrate (CO), Italy
2
Zaunseder E, Mütze U, Garbade SF, Haupt S, Feyh P, Hoffmann GF, Heuveline V, Kölker S. Machine Learning Methods Improve Specificity in Newborn Screening for Isovaleric Aciduria. Metabolites 2023; 13:metabo13020304. [PMID: 36837923 PMCID: PMC9962193 DOI: 10.3390/metabo13020304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 02/10/2023] [Accepted: 02/14/2023] [Indexed: 02/22/2023] Open
Abstract
Isovaleric aciduria (IVA) is a rare disorder of leucine metabolism and part of newborn screening (NBS) programs worldwide. However, NBS for IVA is hampered by, first, the increased birth prevalence due to the identification of individuals with an attenuated disease variant (so-called "mild" IVA) and, second, an increasing number of false positive screening results due to the use of pivmecillinam contained in the medication. Recently, machine learning (ML) methods have been analyzed, analogous to new biomarkers or second-tier methods, in the context of NBS. In this study, we investigated the application of machine learning classification methods to improve IVA classification using an NBS data set containing 2,106,090 newborns screened in Heidelberg, Germany. Therefore, we propose to combine two methods, linear discriminant analysis, and ridge logistic regression as an additional step, a digital-tier, to traditional NBS. Our results show that this reduces the false positive rate by 69.9% from 103 to 31 while maintaining 100% sensitivity in cross-validation. The ML methods were able to classify mild and classic IVA from normal newborns solely based on the NBS data and revealed that besides isovalerylcarnitine (C5), the metabolite concentration of tryptophan (Trp) is important for improved classification. Overall, applying ML methods to improve the specificity of IVA could have a major impact on newborns, as it could reduce the newborns' and families' burden of false positives or over-treatment.
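The 69.9% figure quoted in the abstract follows from the two false positive counts: (103 − 31) / 103 ≈ 0.699. A minimal sketch of that arithmetic is given here; the function name `relative_reduction` is an illustrative assumption and does not come from the study.

```python
def relative_reduction(before: int, after: int) -> float:
    """Return the fractional drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Counts quoted in the abstract: 103 false positives before, 31 after.
reduction = relative_reduction(103, 31)
print(f"{reduction:.1%}")  # prints 69.9%
```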
Affiliation(s)
- Elaine Zaunseder
  - Engineering Mathematics and Computing Lab (EMCL), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, 69120 Heidelberg, Germany
  - Data Mining and Uncertainty Quantification (DMQ), Heidelberg Institute for Theoretical Studies (HITS), 69118 Heidelberg, Germany
  - Correspondence:
- Ulrike Mütze
  - Division of Child Neurology and Metabolic Medicine, Center for Child and Adolescent Medicine, Heidelberg University Hospital, 69120 Heidelberg, Germany
- Sven F. Garbade
  - Division of Child Neurology and Metabolic Medicine, Center for Child and Adolescent Medicine, Heidelberg University Hospital, 69120 Heidelberg, Germany
- Saskia Haupt
  - Engineering Mathematics and Computing Lab (EMCL), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, 69120 Heidelberg, Germany
  - Data Mining and Uncertainty Quantification (DMQ), Heidelberg Institute for Theoretical Studies (HITS), 69118 Heidelberg, Germany
- Patrik Feyh
  - Division of Child Neurology and Metabolic Medicine, Center for Child and Adolescent Medicine, Heidelberg University Hospital, 69120 Heidelberg, Germany
- Georg F. Hoffmann
  - Division of Child Neurology and Metabolic Medicine, Center for Child and Adolescent Medicine, Heidelberg University Hospital, 69120 Heidelberg, Germany
- Vincent Heuveline
  - Engineering Mathematics and Computing Lab (EMCL), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, 69120 Heidelberg, Germany
  - Data Mining and Uncertainty Quantification (DMQ), Heidelberg Institute for Theoretical Studies (HITS), 69118 Heidelberg, Germany
- Stefan Kölker
  - Division of Child Neurology and Metabolic Medicine, Center for Child and Adolescent Medicine, Heidelberg University Hospital, 69120 Heidelberg, Germany
3
Zaunseder E, Haupt S, Mütze U, Garbade SF, Kölker S, Heuveline V. Opportunities and challenges in machine learning-based newborn screening-A systematic literature review. JIMD Rep 2022; 63:250-261. [PMID: 35433168 PMCID: PMC8995842 DOI: 10.1002/jmd2.12285] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/14/2022] [Accepted: 03/17/2022] [Indexed: 01/06/2023] Open
Abstract
The development and continuous optimization of newborn screening (NBS) programs remain an important and challenging task due to the low prevalence of screened diseases and high sensitivity requirements for screening methods. Recently, different machine learning (ML) methods have been applied to support NBS. However, most studies only focus on single diseases or specific ML techniques, making it difficult to draw conclusions on which methods are best to implement. Therefore, we performed a systematic literature review of peer-reviewed publications on ML-based NBS methods. Overall, 125 related papers, published in the past two decades, were collected for the study, and 17 met the inclusion criteria. We analyzed the opportunities and challenges of ML methods for NBS including data preprocessing, classification models and pattern recognition methods based on their underlying approaches, data requirements, interpretability on a modular level, and performance. In general, ML methods have the potential to reduce the false positive rate and identify so far unknown metabolic patterns within NBS data. Our analysis revealed that, among the methods presented, logistic regression analysis and support vector machines seem to be valuable candidates for NBS. However, due to the variety of diseases and methods, a general recommendation for a single method in NBS is not possible. Instead, these methods should be further investigated and compared to other approaches in comprehensive studies as they show promising results in NBS applications.
Affiliation(s)
- Elaine Zaunseder
  - Engineering Mathematics and Computing Lab (EMCL), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
  - Data Mining and Uncertainty Quantification (DMQ), Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- Saskia Haupt
  - Engineering Mathematics and Computing Lab (EMCL), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
  - Data Mining and Uncertainty Quantification (DMQ), Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- Ulrike Mütze
  - Division of Child Neurology and Metabolic Medicine, Center for Child and Adolescent Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Sven F. Garbade
  - Division of Child Neurology and Metabolic Medicine, Center for Child and Adolescent Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Stefan Kölker
  - Division of Child Neurology and Metabolic Medicine, Center for Child and Adolescent Medicine, Heidelberg University Hospital, Heidelberg, Germany
- Vincent Heuveline
  - Engineering Mathematics and Computing Lab (EMCL), Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
  - Data Mining and Uncertainty Quantification (DMQ), Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
4
Beluzo CE, Silva E, Alves LC, Bresan RC, Arruda NM, Sovat R, Carvalho T. Towards neonatal mortality risk classification: A data-driven approach using neonatal, maternal, and social factors. Informatics in Medicine Unlocked 2020; 20:100398. [PMID: 33102685 PMCID: PMC7568208 DOI: 10.1016/j.imu.2020.100398] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/06/2020] [Revised: 07/13/2020] [Accepted: 07/14/2020] [Indexed: 11/16/2022] Open
Abstract
Infant mortality is an important health measure in a population as a crude indicator of poverty and socioeconomic level. It also shows the availability and quality of health services and medical technology in a specific region. Although improvements have been observed in the last decades, the implementation of actions to reduce infant mortality is still a concern in many countries. To address such an important problem, this paper proposes a new decision support approach to classify newborns according to their neonatal mortality risk. Using features related to the mother, the newborn, and socio-demographic factors, we model the problem using a data-driven classification model able to provide the probability of a newborn dying before the 28th day of life. More than a theoretical study, decision support tools such as the one proposed here are relevant in developing countries such as Brazil, because they aim at identifying at-risk neonates so as to raise the attention of medical practitioners and help them reduce the overall neonatal mortality. Exceeding an AUC of 96%, the proposed method is able to provide not just the probability of death risk but also an explicable interpretation of the most important features for the model decision, which is paramount in public health applications. Furthermore, we provide an extensive analysis across different rounds of experiments, including an analysis of the influence of pre- and postpartum features on the data-driven model. Finally, different from previously conducted studies, which rely on databases with fewer than 100,000 samples, our model takes advantage of a newly proposed database, constructed using more than 1,400,000 samples comprising births and deaths extracted from public records in São Paulo-Brazil from 2012 to 2018.
Affiliation(s)
- Everton Silva
  - Federal Institute of São Paulo, Campinas, SP, Brazil
- Ricardo Sovat
  - Federal Institute of São Paulo, Campinas, SP, Brazil