1. Ajuwon BI, Richardson A, Roper K, Lidbury BA. Clinical Validity of a Machine Learning Decision Support System for Early Detection of Hepatitis B Virus: A Binational External Validation Study. Viruses 2023; 15:1735. [PMID: 37632077 PMCID: PMC10458613 DOI: 10.3390/v15081735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2023] [Revised: 08/04/2023] [Accepted: 08/10/2023] [Indexed: 08/27/2023] Open
Abstract
HepB LiveTest is a machine learning decision support system developed for the early detection of hepatitis B virus (HBV). However, there is a lack of evidence on its generalisability. In this study, we aimed to externally assess the clinical validity and portability of HepB LiveTest in predicting HBV infection among independent patient cohorts from Nigeria and Australia. The performance of HepB LiveTest was evaluated by constructing receiver operating characteristic curves and estimating the area under the curve. DeLong's method was used to estimate the 95% confidence interval (CI) of the area under the receiver-operating characteristic curve (AUROC). Compared to the Australian cohort, patients in the derivation cohort of HepB LiveTest and the hospital-based Nigerian cohort were younger (mean age, 45.5 years vs. 38.8 years vs. 40.8 years, respectively; p < 0.001) and had a higher incidence of HBV infection (1.9% vs. 69.4% vs. 57.3%). In the hospital-based Nigerian cohort, HepB LiveTest performed optimally with an AUROC of 0.94 (95% CI, 0.91-0.97). The model provided tailored predictions that ensured most cases of HBV infection did not go undetected. However, its discriminatory measure dropped to 0.60 (95% CI, 0.56-0.64) in the Australian cohort. These findings indicate that HepB LiveTest exhibits adequate cross-site transportability and clinical validity in the hospital-based Nigerian patient cohort but shows limited performance in the Australian cohort. Whilst HepB LiveTest holds promise for reducing HBV prevalence in underserved populations, caution is warranted when implementing the model in older populations, particularly in regions with low incidence of HBV infection.
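As a rough illustration of the AUROC reported in this abstract, the sketch below computes the statistic directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study; the DeLong confidence interval is not implemented here.

```python
def auroc(labels, scores):
    """Return the AUROC as the fraction of (positive, negative) pairs
    in which the positive case has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = infected, 0 = not infected.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))
```

An AUROC near 0.5 means the scores barely separate the two groups, which is the sense in which a value of 0.60 indicates limited discrimination.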
Affiliation(s)
- Busayo I. Ajuwon
  - National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Canberra, ACT 2601, Australia
  - Department of Biosciences and Biotechnology, Faculty of Pure and Applied Sciences, Kwara State University, Malete 241103, Nigeria
- Alice Richardson
  - Statistical Support Network, The Australian National University, Acton, Canberra, ACT 2601, Australia
- Katrina Roper
  - National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Canberra, ACT 2601, Australia
- Brett A. Lidbury
  - National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Canberra, ACT 2601, Australia
2. Ajuwon BI, Richardson A, Roper K, Sheel M, Audu R, Salako BL, Bojuwoye MO, Katibi IA, Lidbury BA. The development of a machine learning algorithm for early detection of viral hepatitis B infection in Nigerian patients. Sci Rep 2023; 13:3244. [PMID: 36829040 PMCID: PMC9958122 DOI: 10.1038/s41598-023-30440-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/05/2022] [Accepted: 02/23/2023] [Indexed: 02/26/2023] Open
Abstract
Access to Hepatitis B Virus (HBV) testing for people in low-resource settings has long been challenging because the gold standard, enzyme immunoassay, is prohibitively expensive and requires specialised skills and facilities that are not readily available, particularly in remote and isolated laboratories. Routine pathology data in tandem with cutting-edge machine learning shows promising diagnostic potential. In this study, recursive partitioning ("trees") and Support Vector Machines (SVMs) were applied to interrogate a patient dataset (n = 916) that comprised results for Hepatitis B Surface Antigen (HBsAg) and routine clinical chemistry and haematology blood tests. These algorithms were used to develop a predictive diagnostic model of HBV infection. Our SVM-based diagnostic model of infection (accuracy = 85.4%, sensitivity = 91%, specificity = 72.6%, precision = 88.2%, F1-score = 0.89, Area Under the Receiver Operating Curve, AUC = 0.90) proved to be highly accurate for discriminating HBsAg positive from negative patients, and thus rivals immunoassay. Therefore, we propose a predictive model based on routine blood tests as a novel diagnostic for early detection of HBV infection. Early prediction of HBV infection via routine pathology markers and pattern recognition algorithms will offer decision-support to clinicians and enhance early diagnosis, which is critical for optimal clinical management and improved patient outcomes.
Affiliation(s)
- Busayo I Ajuwon
  - National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Australian Capital Territory, Australia
  - Department of Microbiology, Faculty of Pure and Applied Sciences, Kwara State University, Malete, Nigeria
- Alice Richardson
  - Statistical Support Network, The Australian National University, Acton, Australian Capital Territory, Australia
- Katrina Roper
  - National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Australian Capital Territory, Australia
- Meru Sheel
  - Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, New South Wales, Australia
- Rosemary Audu
  - Microbiology Department, Centre for Human Virology and Genomics, The Nigerian Institute of Medical Research, Yaba, Lagos State, Nigeria
- Babatunde L Salako
  - Director-General's Office, The Nigerian Institute of Medical Research, Yaba, Lagos State, Nigeria
- Matthew O Bojuwoye
  - Department of Medicine, University of Ilorin Teaching Hospital, Ilorin, Kwara State, Nigeria
- Ibraheem A Katibi
  - Department of Medicine, University of Ilorin Teaching Hospital, Ilorin, Kwara State, Nigeria
- Brett A Lidbury
  - National Centre for Epidemiology and Population Health, ANU College of Health and Medicine, The Australian National University, Acton, Australian Capital Territory, Australia
3. Cardozo G, Tirloni SF, Pereira Moro AR, Marques JLB. Use of Artificial Intelligence in the Search for New Information Through Routine Laboratory Tests: Systematic Review. JMIR BIOINFORMATICS AND BIOTECHNOLOGY 2022; 3:e40473. [PMID: 36644762 PMCID: PMC9828303 DOI: 10.2196/40473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Revised: 08/28/2022] [Accepted: 10/31/2022] [Indexed: 11/05/2022]
Abstract
Background: In recent decades, the use of artificial intelligence has been widely explored in health care. Similarly, the amount of data generated in the most varied medical processes has practically doubled every year, requiring new methods of analysis and treatment of these data. Mainly aimed at aiding in the diagnosis and prevention of diseases, this precision medicine has shown great potential in different medical disciplines. Laboratory tests, for example, almost always present their results separately as individual values. However, physicians need to analyze a set of results to propose a supposed diagnosis, which leads us to think that sets of laboratory tests may contain more information than those presented separately for each result. In this way, the processes of medical laboratories can be strongly affected by these techniques.
Objective: In this sense, we sought to identify scientific research that used laboratory tests and machine learning techniques to predict hidden information and diagnose diseases.
Methods: The methodology adopted used the population, intervention, comparison, and outcomes principle, searching the main engineering and health sciences databases. The search terms were defined based on the list of terms used in the Medical Subject Heading database. Data from this study were presented descriptively and followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses; 2020) statement flow diagram and the National Institutes of Health tool for quality assessment of articles. During the analysis, the inclusion and exclusion criteria were independently applied by 2 authors, with a third author being consulted in cases of disagreement.
Results: Following the defined requirements, 40 studies presenting good quality in the analysis process were selected and evaluated. We found that, in recent years, there has been a significant increase in the number of works that have used this methodology, mainly because of COVID-19. In general, the studies used machine learning classification models to predict new information, and the most used parameters were data from routine laboratory tests such as the complete blood count.
Conclusions: Finally, we conclude that laboratory tests, together with machine learning techniques, can predict new tests, thus helping the search for new diagnoses. This process has proved to be advantageous and innovative for medical laboratories. It is making it possible to discover hidden information and propose additional tests, reducing the number of false negatives and helping in the early discovery of unknown diseases.
Affiliation(s)
- Glauco Cardozo
  - Federal Institute of Santa Catarina, Florianópolis, Brazil
4. Cardozo G, Pintarelli GB, Andreis GR, Lopes ACW, Marques JLB. Use of Machine Learning and Routine Laboratory Tests for Diabetes Mellitus Screening. BIOMED RESEARCH INTERNATIONAL 2022; 2022:8114049. [PMID: 35392258 PMCID: PMC8983182 DOI: 10.1155/2022/8114049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/21/2021] [Revised: 02/18/2022] [Accepted: 03/10/2022] [Indexed: 12/28/2022]
Abstract
Most patients with diabetes mellitus are asymptomatic, which leads to delayed and more complex treatment. At the same time, most individuals are routinely subjected to standard clinical laboratory examinations, which create large health datasets over a lifetime. Computer processing has been used to search for health anomalies and predict diseases using clinical examinations. This work studied machine learning models to support the screening of diabetes through routine laboratory tests using data from laboratory tests of 62,496 patients. The classification and regression models used were K-nearest neighbor, support vector machines, naïve Bayes, random forest models, and artificial neural networks. Glycated hemoglobin, a test used for diabetes diagnosis, was used as the target. Regression models calculated glycated hemoglobin directly, and the predicted values were later classified. The performance of the classification models was studied under various subdataset partitions and combinations (e.g., healthy, prediabetic, and diabetes, as well as non-healthy and non-diabetes). The best single performance was achieved with the artificial neural network model when detecting prediabetes or diabetes. The artificial neural network classification model scored 78.1%, 78.7%, and 78.4% for sensitivity, precision, and F1 scores, respectively, when identifying the non-healthy group. Other models also had good results, depending on what is desired. Machine learning-based models can predict glycated hemoglobin values from routine laboratory tests and can be used as a screening tool to refer a patient for further testing.
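The three figures reported for the artificial neural network are linked by the definition of the F1 score as the harmonic mean of precision and sensitivity (recall). The following minimal sketch, with a hypothetical helper named `f1_score`, checks that the reported sensitivity and precision are consistent with the reported F1 value; it is an arithmetic illustration only, not the authors' evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (recall = sensitivity)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract: sensitivity 78.1%, precision 78.7%.
f1 = f1_score(precision=0.787, recall=0.781)
print(round(f1 * 100, 1))  # 78.4
```

Because the harmonic mean is pulled toward the smaller of its two inputs, F1 only stays high when precision and sensitivity are both high, which is why it is often reported for screening tasks with unbalanced classes.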
Affiliation(s)
- Glauco Cardozo
  - Academic Department of Health and Services, Federal Institute of Santa Catarina, Florianopolis, SC 88020-300, Brazil
  - Institute of Biomedical Engineering, Federal University of Santa Catarina, Florianopolis, SC 88040-900, Brazil
- Guilherme Brasil Pintarelli
  - Institute of Biomedical Engineering, Federal University of Santa Catarina, Florianopolis, SC 88040-900, Brazil
- Guilherme Rettore Andreis
  - Institute of Biomedical Engineering, Federal University of Santa Catarina, Florianopolis, SC 88040-900, Brazil
- Jefferson Luiz Brum Marques
  - Institute of Biomedical Engineering, Federal University of Santa Catarina, Florianopolis, SC 88040-900, Brazil
5. Lidbury BA, Koerbin G, Richardson AM, Badrick T. Gamma-Glutamyl Transferase (GGT) Is the Leading External Quality Assurance Predictor of ISO15189 Compliance for Pathology Laboratories. Diagnostics (Basel) 2021; 11:692. [PMID: 33924582 PMCID: PMC8069573 DOI: 10.3390/diagnostics11040692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2021] [Revised: 04/06/2021] [Accepted: 04/10/2021] [Indexed: 11/16/2022] Open
Abstract
Pathology results are central to modern medical practice, informing diagnosis and patient management. To ensure high standards from pathology laboratories, regulators require compliance with international and local standards. In Australia, the monitoring and regulation of medical laboratories are achieved by conformance to ISO15189-National Pathology Accreditation Advisory Council standards, as assessed by the National Association of Testing Authorities (NATA), and an external quality assurance (EQA) assessment via the Royal College of Pathologists of Australasia Quality Assurance Program (RCPAQAP). While effective individually, integration of data collected by NATA and EQA testing promises advantages for the early detection of technical or management problems in the laboratory, and enhanced ongoing quality assessment. Random forest (RF) machine learning (ML) previously identified gamma-glutamyl transferase (GGT) as a leading predictor of NATA compliance condition reporting. In addition to further RF investigations, this study also deployed single decision trees and support vector machines (SVM) models that included creatinine, electrolytes and liver function test (LFT) EQA results. Across all analyses, GGT was consistently the top-ranked predictor variable, validating previous observations from Australian laboratories. SVM revealed broad patterns of predictive EQA marker interactions with NATA outcomes, and the distribution of GGT relative deviation suggested patterns by which to identify other strong EQA predictors of NATA outcomes. An integrated model of pathology quality assessment was successfully developed, via the prediction of NATA outcomes by EQA results. GGT consistently ranked as the best predictor variable, identified by combining recursive partitioning and SVM ML strategies.
Affiliation(s)
- Brett A. Lidbury
  - The National Centre for Epidemiology and Population Health, Research School of Population Health, The Australian National University, Canberra, ACT 2601, Australia
- Gus Koerbin
  - Faculty of Health, University of Canberra, Canberra, ACT 2617, Australia
- Alice M. Richardson
  - The National Centre for Epidemiology and Population Health, Research School of Population Health, The Australian National University, Canberra, ACT 2601, Australia
  - Statistical Consulting Unit, Australian National University, Canberra, ACT 2601, Australia
- Tony Badrick
  - The National Centre for Epidemiology and Population Health, Research School of Population Health, The Australian National University, Canberra, ACT 2601, Australia
  - Australasia Quality Assurance Programs, Royal College of Pathologists, St. Leonards Sydney, NSW 2065, Australia
- Correspondence: ; Tel.: +61-2-(02)-6125-7875
6. Ronzio L, Cabitza F, Barbaro A, Banfi G. Has the Flood Entered the Basement? A Systematic Literature Review about Machine Learning in Laboratory Medicine. Diagnostics (Basel) 2021; 11:372. [PMID: 33671623 PMCID: PMC7926482 DOI: 10.3390/diagnostics11020372] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/30/2020] [Revised: 02/08/2021] [Accepted: 02/18/2021] [Indexed: 02/08/2023] Open
Abstract
This article presents a systematic literature review that expands and updates a previous review on the application of machine learning to laboratory medicine. We used Scopus and PubMed to collect, select and analyse the papers published from 2017 to the present in order to highlight the main studies that have applied machine learning techniques to haematochemical parameters and to review their diagnostic and prognostic performance. In doing so, we aim to address the question we asked three years ago about the potential of these techniques in laboratory medicine and the need to leverage a tool that was still under-utilised at that time.
Affiliation(s)
- Luca Ronzio
  - Department of Informatics, University of Milano-Bicocca, 20126 Milan, Italy
- Federico Cabitza
  - Department of Informatics, University of Milano-Bicocca, 20126 Milan, Italy
- Alessandro Barbaro
  - IRCCS Istituto Ortopedico Galeazzi, Via Riccardo Galeazzi, 4, 20161 Milan, Italy
- Giuseppe Banfi
  - IRCCS Istituto Ortopedico Galeazzi, Via Riccardo Galeazzi, 4, 20161 Milan, Italy
  - School of Medicine, University Vita-Salute San Raffaele, Via Olgettina, 58, 20132 Milan, Italy
7. Du G, Zhang J, Luo Z, Ma F, Ma L, Li S. Joint imbalanced classification and feature selection for hospital readmissions. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106020] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Indexed: 12/14/2022]
8. Application of Support Vector Machines in Viral Biology. GLOBAL VIROLOGY III: VIROLOGY IN THE 21ST CENTURY 2019. [PMCID: PMC7114997 DOI: 10.1007/978-3-030-29022-1_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Novel experimental and sequencing techniques have led to an exponential explosion and spiraling of data in viral genomics. To analyse such data, rapidly gain information, and transform this information to knowledge, interdisciplinary approaches involving several different types of expertise are necessary. Machine learning has been in the forefront of providing models with increasing accuracy due to development of newer paradigms with strong fundamental bases. Support Vector Machines (SVM) is one such robust tool, based rigorously on statistical learning theory. SVM provides very high quality and robust solutions to classification and regression problems. Several studies in virology employ high performance tools including SVM for identification of potentially important gene and protein functions. This is mainly due to the highly beneficial aspects of SVM. In this chapter we briefly provide lucid and easy to understand details of SVM algorithms along with applications in virology.