1
Gurumurthy G, Gurumurthy J, Gurumurthy S. Machine learning in paediatric haematological malignancies: a systematic review of prognosis, toxicity and treatment response models. Pediatr Res 2024:10.1038/s41390-024-03494-9. [PMID: 39215200 DOI: 10.1038/s41390-024-03494-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2024] [Revised: 06/22/2024] [Accepted: 08/05/2024] [Indexed: 09/04/2024]
Abstract
BACKGROUND Machine Learning (ML) has demonstrated potential in enhancing care in adult oncology. However, its application in paediatric haematological malignancies is still emerging, necessitating a comprehensive review of its capabilities and limitations in this area. METHODS A literature search was conducted through Ovid. Included studies focused on ML models in paediatric patients with haematological malignancies. Studies were categorised into thematic groups for analysis. RESULTS Twenty studies, primarily on leukaemia, were included in this review. Studies were organised into thematic categories such as prognoses, treatment responses and toxicity predictions. Prognostic studies showed AUC scores between 0.685 and 0.929, indicating moderate-high predictive accuracy. Treatment response studies demonstrated AUC scores between 0.840 and 0.875, reflecting moderate accuracy. Toxicity prediction studies reported high accuracy with AUC scores from 0.870 to 0.927. Only five studies (25%) performed external validation. Significant heterogeneity was noted in ML tasks, reporting formats, and effect measures across studies, highlighting a lack of standardised reporting and challenges in data comparability. CONCLUSION The clinical applicability of these ML models remains limited by the lack of external validation and methodological heterogeneity. Addressing these challenges through standardised reporting and rigorous external validation is needed to translate ML from a promising research tool into a reliable clinical practice component. IMPACT Key message: Machine Learning (ML) significantly enhances predictive models in paediatric haematological cancers, offering new avenues for personalised treatment strategies. Future research should focus on developing ML models that can integrate with real-time clinical workflows. Addition to literature: Provides a comprehensive overview of current ML applications and trends. It identifies limitations to its applicability, including the limited diversity in datasets, which may affect the generalisability of ML models across different populations. Impact: Encourages standardisation and external validation in ML studies, aiming to improve patient outcomes through precision medicine in paediatric haematological oncology.
Affiliation(s)
- Juditha Gurumurthy
- School of Cancer and Pharmaceutical Sciences, King's College London, London, UK
- Samantha Gurumurthy
- Department of Infectious Diseases & Immunology, Imperial College London, London, UK
2
Tang M, Antić Ž, Fardzadeh P, Pietzsch S, Schröder C, Eberhardt A, van Bömmel A, Escherich G, Hofmann W, Horstmann MA, Illig T, McCrary JM, Lentes J, Metzler M, Nejdl W, Schlegelberger B, Schrappe M, Zimmermann M, Miarka-Walczyk K, Pastorczak A, Cario G, Renard BY, Stanulla M, Bergmann AK. An artificial intelligence-assisted clinical framework to facilitate diagnostics and translational discovery in hematologic neoplasia. EBioMedicine 2024; 104:105171. [PMID: 38810562 PMCID: PMC11154115 DOI: 10.1016/j.ebiom.2024.105171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 05/10/2024] [Accepted: 05/15/2024] [Indexed: 05/31/2024] Open
Abstract
BACKGROUND The increasing volume and intricacy of sequencing data, along with other clinical and diagnostic data, like drug responses and measurable residual disease, creates challenges for efficient clinical comprehension and interpretation. Using paediatric B-cell precursor acute lymphoblastic leukaemia (BCP-ALL) as a use case, we present an artificial intelligence (AI)-assisted clinical framework clinALL that integrates genomic and clinical data into a user-friendly interface to support routine diagnostics and reveal translational insights for hematologic neoplasia. METHODS We performed targeted RNA sequencing in 1365 cases with haematological neoplasms, primarily paediatric B-cell precursor acute lymphoblastic leukaemia (BCP-ALL) from the AIEOP-BFM ALL study. We carried out fluorescence in situ hybridization (FISH), karyotyping and arrayCGH as part of the routine diagnostics. The analysis results of these assays as well as additional clinical information were integrated into an interactive web interface using Bokeh, where the main graph is based on Uniform Manifold Approximation and Projection (UMAP) analysis of the gene expression data. At the backend of the clinALL, we built both shallow machine learning models and a deep neural network using Scikit-learn and PyTorch respectively. FINDINGS By applying clinALL, 78% of undetermined patients under the current diagnostic protocol were stratified, and ambiguous cases were investigated. Translational insights were discovered, including IKZF1plus status dependent subpopulations of BCR::ABL1 positive patients, and a subpopulation within ETV6::RUNX1 positive patients that has a high relapse frequency. Our best machine learning models, LDA and PASNET-like neural network models, achieve F1 scores above 97% in predicting patients' subgroups. 
INTERPRETATION An AI-assisted clinical framework that integrates both genomic and clinical data can take full advantage of the available data, improve point-of-care decision-making and reveal clinically relevant insights promptly. Such a lightweight and easily transferable framework works for both whole transcriptome data as well as the cost-effective targeted RNA-seq, enabling efficient and equitable delivery of personalized medicine in small clinics in developing countries. FUNDING German Ministry of Education and Research (BMBF), German Research Foundation (DFG) and Foundation for Polish Science.
Affiliation(s)
- Ming Tang
- Department of Human Genetics, Hannover Medical School, Hannover, Germany; L3S Research Centre, Leibniz University Hannover, Germany
- Željko Antić
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Stefan Pietzsch
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Charlotte Schröder
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Alena van Bömmel
- Leibniz Institute on Aging - Fritz Lipmann Institute (FLI), Jena, Germany
- Gabriele Escherich
- Clinic of Paediatric Haematology and Oncology, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
- Winfried Hofmann
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Martin A Horstmann
- Clinic of Paediatric Haematology and Oncology, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany; Research Institute Children's Cancer Centre Hamburg, Hamburg, Germany
- Thomas Illig
- Hannover Unified Bio Bank, Hannover Medical School, Hannover, Germany
- J Matt McCrary
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Jana Lentes
- Department of Human Genetics, Hannover Medical School, Hannover, Germany
- Markus Metzler
- Department of Paediatrics, University Hospital Erlangen, Erlangen, Germany
- Wolfgang Nejdl
- L3S Research Centre, Leibniz University Hannover, Germany
- Martin Schrappe
- Department of Paediatrics, University Medical Centre Schleswig-Holstein, Campus Kiel, Kiel, Germany
- Martin Zimmermann
- Department of Paediatric Haematology and Oncology, Hannover Medical School, Hannover, Germany
- Karolina Miarka-Walczyk
- Department of Paediatrics, Oncology and Haematology, Medical University of Lodz, Lodz, Poland
- Agata Pastorczak
- Department of Paediatrics, Oncology and Haematology, Medical University of Lodz, Lodz, Poland
- Gunnar Cario
- Department of Paediatrics, University Medical Centre Schleswig-Holstein, Campus Kiel, Kiel, Germany
- Bernhard Y Renard
- Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany
- Martin Stanulla
- Department of Paediatric Haematology and Oncology, Hannover Medical School, Hannover, Germany
3
Al-Hussaini I, White B, Varmeziar A, Mehra N, Sanchez M, Lee J, DeGroote NP, Miller TP, Mitchell CS. An Interpretable Machine Learning Framework for Rare Disease: A Case Study to Stratify Infection Risk in Pediatric Leukemia. J Clin Med 2024; 13:1788. [PMID: 38542012 PMCID: PMC10970787 DOI: 10.3390/jcm13061788] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Revised: 03/08/2024] [Accepted: 03/12/2024] [Indexed: 04/18/2024] Open
Abstract
Background: Datasets on rare diseases, like pediatric acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL), have small sample sizes that hinder machine learning (ML). The objective was to develop an interpretable ML framework to elucidate actionable insights from small tabular rare disease datasets. Methods: The comprehensive framework employed optimized data imputation and sampling, supervised and unsupervised learning, and literature-based discovery (LBD). The framework was deployed to assess treatment-related infection in pediatric AML and ALL. Results: An interpretable decision tree classified the risk of infection as either "high risk" or "low risk" in pediatric ALL (n = 580) and AML (n = 132) with accuracy of ∼79%. Interpretable regression models predicted the discrete number of developed infections with a mean absolute error (MAE) of 2.26 for bacterial infections and an MAE of 1.29 for viral infections. Features that best explained the development of infection were the chemotherapy regimen, cancer cells in the central nervous system at initial diagnosis, chemotherapy course, leukemia type, Down syndrome, race, and National Cancer Institute risk classification. Finally, SemNet 2.0, an open-source LBD software that links relationships from 33+ million PubMed articles, identified additional features for the prediction of infection, like glucose, iron, neutropenia-reducing growth factors, and systemic lupus erythematosus (SLE). Conclusions: The developed ML framework enabled state-of-the-art, interpretable predictions using rare disease tabular datasets. ML model performance baselines were successfully produced to predict infection in pediatric AML and ALL.
Affiliation(s)
- Irfan Al-Hussaini
- Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Brandon White
- Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Armon Varmeziar
- Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Nidhi Mehra
- Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Milagro Sanchez
- Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Judy Lee
- Aflac Cancer and Blood Disorders Center, Children’s Healthcare of Atlanta, Atlanta, GA 30322, USA
- Nicholas P. DeGroote
- Aflac Cancer and Blood Disorders Center, Children’s Healthcare of Atlanta, Atlanta, GA 30322, USA
- Tamara P. Miller
- Aflac Cancer and Blood Disorders Center, Children’s Healthcare of Atlanta, Atlanta, GA 30322, USA
- Department of Pediatrics, Division of Pediatric Hematology/Oncology, Emory University, Atlanta, GA 30332, USA
- Cassie S. Mitchell
- Laboratory for Pathology Dynamics, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA 30332, USA
- Machine Learning Center at Georgia Tech, Georgia Institute of Technology, Atlanta, GA 30332, USA
4
Castro GA, Almeida JM, Machado-Neto JA, Almeida TA. A decision support system to recommend appropriate therapy protocol for AML patients. Front Artif Intell 2024; 7:1343447. [PMID: 38510471 PMCID: PMC10950921 DOI: 10.3389/frai.2024.1343447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Accepted: 02/19/2024] [Indexed: 03/22/2024] Open
Abstract
Introduction Acute Myeloid Leukemia (AML) is one of the most aggressive hematological neoplasms, emphasizing the critical need for early detection and strategic treatment planning. The association between prompt intervention and enhanced patient survival rates underscores the pivotal role of therapy decisions. To determine the treatment protocol, specialists heavily rely on prognostic predictions that consider the response to treatment and clinical outcomes. The existing risk classification system categorizes patients into favorable, intermediate, and adverse groups, forming the basis for personalized therapeutic choices. However, accurately assessing the intermediate-risk group poses significant challenges, potentially resulting in treatment delays and deterioration of patient conditions. Methods This study introduces a decision support system leveraging cutting-edge machine learning techniques to address these issues. The system automatically recommends tailored oncology therapy protocols based on outcome predictions. Results The proposed approach achieved a high performance close to 0.9 in F1-Score and AUC. The model generated with gene expression data exhibited superior performance. Discussion Our system can effectively support specialists in making well-informed decisions regarding the most suitable and safe therapy for individual patients. The proposed decision support system has the potential to not only streamline treatment initiation but also contribute to prolonged survival and improved quality of life for individuals diagnosed with AML. This marks a significant stride toward optimizing therapeutic interventions and patient outcomes.
Affiliation(s)
- Giovanna A. Castro
- Department of Computer Science, Federal University of São Carlos (UFSCar) Sorocaba, São Paulo, Brazil
- Jade M. Almeida
- Department of Computer Science, Federal University of São Carlos (UFSCar) Sorocaba, São Paulo, Brazil
- João A. Machado-Neto
- Institute of Biomedical Sciences, The University of São Paulo (USP), São Paulo, Brazil
- Tiago A. Almeida
- Department of Computer Science, Federal University of São Carlos (UFSCar) Sorocaba, São Paulo, Brazil
5
Gómez‐Rojas S, Segura GP, Ollé J, Carreño Gómez‐Tarragona G, Medina JG, Aguado JM, Guerrero EV, Santaella MP, Martínez‐López J. A machine learning tool for the diagnosis of SARS-CoV-2 infection from hemogram parameters. J Cell Mol Med 2023; 27:3423-3430. [PMID: 37882471 PMCID: PMC10660618 DOI: 10.1111/jcmm.17864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Revised: 06/20/2023] [Accepted: 07/05/2023] [Indexed: 10/27/2023] Open
Abstract
Monocytes and neutrophils play key roles in the cytokine storm triggered by SARS-CoV-2 infection, which changes their conformation and function. These changes are detectable at the cellular and molecular level and may differ from those observed in other respiratory infections. Here, we applied machine learning (ML) to develop and validate an algorithm to diagnose COVID-19 using blood parameters. In this retrospective single-center study, 49 hemogram parameters from 12,321 patients with clinical suspicion of COVID-19 who were tested by RT-PCR (4239 positive and 8082 negative) were analysed. The dataset was randomly divided into training and validation sets. Blood cell parameters and patient age were used to construct the predictive model with the support vector machine (SVM) tool. The model constructed from the training set (5936 patients) achieved an accuracy for diagnosis of SARS-CoV-2 infection of 0.952 (95% CI: 0.875-0.892). Test sensitivity and specificity were 0.868 and 0.899, respectively, with positive (PPV) and negative (NPV) predictive values of 0.896 and 0.872, respectively (prevalence 0.50). The validation set model (4964 patients) achieved an accuracy of 0.894 (95% CI: 0.883-0.903). Test sensitivity and specificity were 0.8922 and 0.8951, respectively, with positive (PPV) and negative (NPV) predictive values of 0.817 and 0.94, respectively (prevalence 0.34). The area under the receiver operating characteristic curve was 0.952 for the algorithm performance. This algorithm may allow a COVID-19 diagnosis to be ruled out with 94% probability. This represents a great advance for early diagnostic orientation and guiding clinical decisions.
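The predictive values quoted in this abstract depend on the prevalence in each set as well as on sensitivity and specificity. The following is a minimal sketch, assuming the standard Bayes relationship between these quantities; it is not the authors' code, and the function name `predictive_values` is purely illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence.

    Uses Bayes' theorem on population fractions; all inputs are
    proportions in the range 0..1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Inputs taken from the abstract: training set, then validation set.
train_ppv, train_npv = predictive_values(0.868, 0.899, 0.50)
valid_ppv, valid_npv = predictive_values(0.8922, 0.8951, 0.34)

print(f"training:   PPV={train_ppv:.3f}  NPV={train_npv:.3f}")
print(f"validation: PPV={valid_ppv:.3f}  NPV={valid_npv:.3f}")
```

With the abstract's inputs, the sketch lands close to the reported predictive values; small differences are expected because the published sensitivity and specificity are rounded. The lower prevalence in the validation set (0.34 versus 0.50) is what pushes the NPV up and the PPV down.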
Affiliation(s)
- S. Gómez‐Rojas
- Department of Hematology, Hospital Universitario 12 octubre, Madrid, Spain
- G. Pérez Segura
- Department of Hematology, Hospital Universitario 12 octubre, Madrid, Spain
- J. Ollé
- Conceptos Claros Co, Barcelona, Spain
- J. González Medina
- Department of Hematology, Hospital Universitario Fundación Jiménez Díaz, Madrid, Spain
- J. M. Aguado
- Unit of Infectious Diseases, Hospital Universitario "12 de Octubre", Instituto de Investigación Sanitaria Hospital "12 de Octubre" (i+12), CIBERINFEC, ISCIII, Madrid, Spain
- Department of Medicine, School of Medicine, Universidad Complutense, Madrid, Spain
- E. Vera Guerrero
- Department of Hematology, Hospital Universitario 12 octubre, Madrid, Spain
- M. Poza Santaella
- Department of Hematology, Hospital Universitario 12 octubre, Madrid, Spain
- J. Martínez‐López
- Department of Hematology, Hospital Universitario 12 octubre, Madrid, Spain
- Department of Medicine, School of Medicine, Universidad Complutense, Madrid, Spain
6
Ferrato MH, Marsh AG, Franke KR, Huang BJ, Kolb EA, DeRyckere D, Grahm DK, Chandrasekaran S, Crowgey EL. Machine learning classifier approaches for predicting response to RTK-type-III inhibitors demonstrate high accuracy using transcriptomic signatures and ex vivo data. Bioinformatics Advances 2023; 3:vbad034. [PMID: 37250111 PMCID: PMC10209528 DOI: 10.1093/bioadv/vbad034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Revised: 02/16/2023] [Accepted: 03/21/2023] [Indexed: 05/31/2023]
Abstract
Motivation The application of machine learning (ML) techniques in the medical field has demonstrated both successes and challenges in the precision medicine era. The ability to accurately classify a subject as a potential responder versus a nonresponder to a given therapy is still an active area of research, pushing the field to create new approaches for applying machine-learning techniques. In this study, we leveraged publicly available data through the BeatAML initiative. Specifically, we used gene count data, generated via RNA-seq, from 451 individuals matched with ex vivo data generated from treatment with RTK-type-III inhibitors. Three feature selection techniques were tested (principal component analysis, the Shapley Additive Explanation (SHAP) technique and differential gene expression analysis) with three different classifiers (XGBoost, LightGBM and random forest (RF)). Sensitivity versus specificity was analyzed using the area under the receiver operating characteristic curve (AUC-ROC) for every model developed. Results Our work demonstrated that the feature selection technique, rather than the classifier, had the greatest impact on model performance. The SHAP technique outperformed the other feature selection techniques and predicted outcome response with high accuracy; the highest performing model, for Foretinib, reached 89% AUC using the SHAP technique and RF classifier. Our ML pipelines demonstrate that at the time of diagnosis, a transcriptomics signature exists that can potentially predict response to treatment, demonstrating the potential of using ML applications in precision medicine efforts. Availability and implementation https://github.com/UD-CRPL/RCDML. Supplementary information Supplementary data are available at Bioinformatics Advances online.
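Since every model in this abstract is compared by AUC, it may help to see what that number measures. The following is a minimal sketch, assuming the rank-based (Mann-Whitney) definition of ROC AUC: the probability that a randomly chosen responder receives a higher score than a randomly chosen nonresponder. The labels and scores are made-up toy values, not BeatAML data, and `roc_auc` is an illustrative helper rather than anything from the authors' repository.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 (1 = responder); scores: predicted scores.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = responder, 0 = nonresponder.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.3, 0.6, 0.4, 0.5, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC = {toy_auc:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; for real cohorts a sorted-rank implementation or an established library routine would be the usual choice.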
Affiliation(s)
- Karl R Franke
- Nemours Children Health System, Wilmington, DE 19803, USA
- Benjamin J Huang
- Department of Pediatrics, University of California San Francisco, San Francisco, CA 94143, USA
- Helen Diller Family Comprehensive Cancer Center, University of California San Francisco, San Francisco, CA 94143, USA
- E Anders Kolb
- Nemours Children Health System, Wilmington, DE 19803, USA
- Deborah DeRyckere
- Department of Pediatrics, Emory University School of Medicine, Atlanta, GA 30322, USA
- Douglas K Grahm
- Department of Pediatrics, Emory University School of Medicine, Atlanta, GA 30322, USA
7
Eckardt JN, Röllig C, Metzeler K, Kramer M, Stasik S, Georgi JA, Heisig P, Spiekermann K, Krug U, Braess J, Görlich D, Sauerland CM, Woermann B, Herold T, Berdel WE, Hiddemann W, Kroschinsky F, Schetelig J, Platzbecker U, Müller-Tidow C, Sauer T, Serve H, Baldus C, Schäfer-Eckart K, Kaufmann M, Krause S, Hänel M, Schliemann C, Hanoun M, Thiede C, Bornhäuser M, Wendt K, Middeke JM. Prediction of complete remission and survival in acute myeloid leukemia using supervised machine learning. Haematologica 2023; 108:690-704. [PMID: 35708137 PMCID: PMC9973482 DOI: 10.3324/haematol.2021.280027] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 09/15/2021] [Indexed: 11/09/2022] Open
Abstract
Achievement of complete remission signifies a crucial milestone in the therapy of acute myeloid leukemia (AML) while refractory disease is associated with dismal outcomes. Hence, accurately identifying patients at risk is essential to tailor treatment concepts individually to disease biology. We used nine machine learning (ML) models to predict complete remission and 2-year overall survival in a large multicenter cohort of 1,383 AML patients who received intensive induction therapy. Clinical, laboratory, cytogenetic and molecular genetic data were incorporated and our results were validated on an external multicenter cohort. Our ML models autonomously selected predictive features including established markers of favorable or adverse risk as well as identifying markers of so-far controversial relevance. De novo AML, extramedullary AML, double-mutated CEBPA, mutations of CEBPA-bZIP, NPM1, FLT3-ITD, ASXL1, RUNX1, SF3B1, IKZF1, TP53, and U2AF1, t(8;21), inv(16)/t(16;16), del(5)/del(5q), del(17)/del(17p), normal or complex karyotypes, age and hemoglobin concentration at initial diagnosis were statistically significant markers predictive of complete remission, while t(8;21), del(5)/del(5q), inv(16)/t(16;16), del(17)/del(17p), double-mutated CEBPA, CEBPA-bZIP, NPM1, FLT3-ITD, DNMT3A, SF3B1, U2AF1, and TP53 mutations, age, white blood cell count, peripheral blast count, serum lactate dehydrogenase level and hemoglobin concentration at initial diagnosis as well as extramedullary manifestations were predictive for 2-year overall survival. For prediction of complete remission and 2-year overall survival areas under the receiver operating characteristic curves ranged between 0.77-0.86 and between 0.63-0.74, respectively in our test set, and between 0.71-0.80 and 0.65-0.75 in the external validation cohort. We demonstrated the feasibility of ML for risk stratification in AML as a model disease for hematologic neoplasms, using a scalable and reusable ML framework. 
Our study illustrates the clinical applicability of ML as a decision support system in hematology.
Affiliation(s)
- Jan-Niklas Eckardt
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Christoph Röllig
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Klaus Metzeler
- Medical Clinic and Policlinic I Hematology and Cell Therapy, University Hospital, Leipzig
- Michael Kramer
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Sebastian Stasik
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Peter Heisig
- Institute of Software and Multimedia Technology, Technical University Dresden, Dresden
- Karsten Spiekermann
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich
- Utz Krug
- Medical Clinic III, Hospital Leverkusen, Leverkusen
- Jan Braess
- Hospital Barmherzige Brueder Regensburg, Regensburg
- Dennis Görlich
- Institute for Biometrics and Clinical Research, University Muenster, Muenster
- Bernhard Woermann
- Department of Hematology, Oncology and Tumor Immunology, Charité, Berlin
- Tobias Herold
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich
- Wolfgang E Berdel
- Department of Internal Medicine A, University Hospital Muenster, Muenster
- Wolfgang Hiddemann
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich
- Frank Kroschinsky
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Johannes Schetelig
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
- Uwe Platzbecker
- Medical Clinic and Policlinic I Hematology and Cell Therapy, University Hospital, Leipzig
- Carsten Müller-Tidow
- Department of Medicine V, University Hospital Heidelberg, Heidelberg, Germany; German Consortium for Translational Cancer Research DKFZ, Heidelberg
- Tim Sauer
- Department of Medicine V, University Hospital Heidelberg, Heidelberg
- Hubert Serve
- Department of Medicine 2, Hematology and Oncology, Goethe University Frankfurt, Frankfurt
- Claudia Baldus
- Department of Hematology and Oncology, University Hospital Schleswig Holstein, Kiel
- Kerstin Schäfer-Eckart
- Department of Internal Medicine 5, Paracelsus Medical Private University Nuremberg, Nuremberg
- Martin Kaufmann
- Department of Hematology, Oncology and Palliative Care, Robert-Bosch Hospital, Stuttgart
- Stefan Krause
- Department of Internal Medicine 5, University Hospital Erlangen, Erlangen
- Mathias Hänel
- Department of Internal Medicine 3, Klinikum Chemnitz GmbH, Chemnitz, Germany; Department of Hematology and Stem Cell Transplantation, University Hospital Essen, Essen
- Maher Hanoun
- Department of Internal Medicine 3, Klinikum Chemnitz GmbH, Chemnitz, Germany; Department of Hematology and Stem Cell Transplantation, University Hospital Essen, Essen
- Christian Thiede
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany; German Consortium for Translational Cancer Research DKFZ, Heidelberg
- Martin Bornhäuser
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany; German Consortium for Translational Cancer Research DKFZ, Heidelberg, Germany; National Center for Tumor Diseases (NCT), Dresden
- Karsten Wendt
- Medical Clinic and Policlinic I Hematology and Cell Therapy, University Hospital, Leipzig
- Jan Moritz Middeke
- Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden
8
Falini B. AML risk models: where do we stand? Am J Hematol 2022; 97:1124-1126. [PMID: 35856388 DOI: 10.1002/ajh.26666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Accepted: 07/02/2022] [Indexed: 11/08/2022]
Affiliation(s)
- Brunangelo Falini
- Institute of Hematology and CREO, University and Hospital of Perugia, Perugia, Italy
9
El Alaoui Y, Elomri A, Qaraqe M, Padmanabhan R, Yasin Taha R, El Omri H, El Omri A, Aboumarzouk O. A Review of Artificial Intelligence Applications in Hematology Management: Current Practices and Future Prospects. J Med Internet Res 2022; 24:e36490. [PMID: 35819826 PMCID: PMC9328784 DOI: 10.2196/36490] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/16/2022] [Revised: 05/14/2022] [Accepted: 05/29/2022] [Indexed: 12/23/2022] Open
Abstract
Background Machine learning (ML) and deep learning (DL) methods have recently garnered a great deal of attention in the field of cancer research by making a noticeable contribution to the growth of predictive medicine and modern oncological practices. Considerable focus has been particularly directed toward hematologic malignancies because of the complexity in detecting early symptoms. Many patients with blood cancer do not get properly diagnosed until their cancer has reached an advanced stage with limited treatment prospects. Hence, the state-of-the-art revolves around the latest artificial intelligence (AI) applications in hematology management. Objective This comprehensive review provides an in-depth analysis of the current AI practices in the field of hematology. Our objective is to explore the ML and DL applications in blood cancer research, with a special focus on the type of hematologic malignancies and the patient’s cancer stage to determine future research directions in blood cancer. Methods We searched a set of recognized databases (Scopus, Springer, and Web of Science) using a selected number of keywords. We included studies written in English and published between 2015 and 2021. For each study, we identified the ML and DL techniques used and highlighted the performance of each model. Results Using the aforementioned inclusion criteria, the search resulted in 567 papers, of which 144 were selected for review. Conclusions The current literature suggests that the application of AI in the field of hematology has generated impressive results in the screening, diagnosis, and treatment stages. Nevertheless, optimizing the patient’s pathway to treatment requires a prior prediction of the malignancy based on the patient’s symptoms or blood records, which is an area that has still not been properly investigated.
Affiliation(s)
- Yousra El Alaoui
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Adel Elomri
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Marwa Qaraqe
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Regina Padmanabhan
- College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Ruba Yasin Taha
- National Center for Cancer Care and Research, Hamad Medical Corporation, Doha, Qatar
- Halima El Omri
- National Center for Cancer Care and Research, Hamad Medical Corporation, Doha, Qatar
- Abdelfatteh El Omri
- Surgical Research Section, Department of Surgery, Hamad Medical Corporation, Doha, Qatar
- Omar Aboumarzouk
- Surgical Research Section, Department of Surgery, Hamad Medical Corporation, Doha, Qatar; College of Medicine, Qatar University, Doha, Qatar; College of Medicine, University of Glasgow, Glasgow, United Kingdom
10
|
Malakar S, Roy SD, Das S, Sen S, Velásquez JD, Sarkar R. Computer Based Diagnosis of Some Chronic Diseases: A Medical Journey of the Last Two Decades. ARCHIVES OF COMPUTATIONAL METHODS IN ENGINEERING : STATE OF THE ART REVIEWS 2022; 29:5525-5567. [PMID: 35729963 PMCID: PMC9199478 DOI: 10.1007/s11831-022-09776-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/16/2022] [Accepted: 05/22/2022] [Indexed: 06/15/2023]
Abstract
Disease prediction from diagnostic reports and pathological images using artificial intelligence (AI) and machine learning (ML) is one of the fastest emerging applications in recent days. Researchers are striving to achieve near-perfect results using advanced hardware technologies in amalgamation with AI and ML based approaches. As a result, a large number of AI and ML based methods are found in the literature. A systematic survey describing the state-of-the-art disease prediction methods, specifically chronic disease prediction algorithms, will provide a clear idea about the recent models developed in this field. This will also help the researchers to identify the research gaps present there. To this end, this paper looks over the approaches in the literature designed for predicting chronic diseases like Breast Cancer, Lung Cancer, Leukemia, Heart Disease, Diabetes, Chronic Kidney Disease and Liver Disease. The advantages and disadvantages of various techniques are thoroughly explained. This paper also presents a detailed performance comparison of different methods. Finally, it concludes the survey by highlighting some future research directions in this field that can be addressed through the forthcoming research attempts.
Affiliation(s)
- Samir Malakar
  - Department of Computer Science, Asutosh College, Kolkata, India
- Soumya Deep Roy
  - Department of Metallurgical and Material Engineering, Jadavpur University, Kolkata, India
- Soham Das
  - Department of Metallurgical and Material Engineering, Jadavpur University, Kolkata, India
- Swaraj Sen
  - Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
- Juan D. Velásquez
  - Department of Industrial Engineering, University of Chile, Santiago, Chile
  - Instituto Sistemas Complejos de Ingeniería (ISCI), Santiago, Chile
- Ram Sarkar
  - Department of Computer Science and Engineering, Jadavpur University, Kolkata, India
|
11
|
Blood cancer prediction using leukemia microarray gene data and hybrid logistic vector trees model. Sci Rep 2022; 12:1000. [PMID: 35046459 PMCID: PMC8770560 DOI: 10.1038/s41598-022-04835-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 06/21/2021] [Accepted: 12/09/2021] [Indexed: 01/21/2023]
Abstract
Blood cancer has been a growing concern during the last decade and requires early diagnosis so that proper treatment can begin. The diagnostic process is costly and time-consuming, involving medical experts and several tests. Thus, an automatic diagnosis system for accurate prediction is of significant importance. Diagnosis of blood cancer using leukemia microarray gene data and machine learning has become an important area of medical research. Despite these research efforts, accuracy and efficiency still need improvement. This study proposes an approach for blood cancer prediction using supervised machine learning. The leukemia microarray gene dataset, containing 22,283 genes, is used. ADASYN resampling and Chi-squared (Chi2) feature selection are used to address the imbalanced and high-dimensional dataset problems. ADASYN generates artificial data to balance the dataset across target classes, and Chi2 selects the best features out of 22,283 to train the learning models. For classification, a hybrid logistic vector trees classifier (LVTrees) is proposed, which combines logistic regression, a support vector classifier, and an extra tree classifier. Besides extensive experiments on the datasets, a performance comparison with state-of-the-art methods is made to determine the significance of the proposed approach. With ADASYN and Chi2, LVTrees outperforms all other models, reaching 100% accuracy. A statistical T-test is also performed to show the efficacy of the proposed approach, and results using k-fold cross-validation confirm the superiority of the proposed model.
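As a rough illustration of the feature-selection and ensemble steps this abstract describes, the sketch below scores categorical features with a Pearson chi-squared statistic on a contingency table and combines three classifiers' predictions by majority vote. It is a simplified stand-in, not the authors' implementation: the abstract does not say how LVTrees combines its three base models, so hard voting is an assumption; ADASYN resampling is omitted; and the function names and toy data are hypothetical.

```python
from collections import Counter


def chi2_score(feature, labels):
    """Pearson chi-squared statistic for one categorical feature vs. class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    score = 0.0
    for f_value, f_count in feature_totals.items():
        for l_value, l_count in label_totals.items():
            # Expected count under independence of feature and label.
            expected = f_count * l_count / n
            diff = observed[(f_value, l_value)] - expected
            score += diff * diff / expected
    return score


def select_top_k(rows, labels, k):
    """Return the indices of the k features with the highest chi-squared score."""
    columns = list(zip(*rows))
    scores = [chi2_score(column, labels) for column in columns]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def majority_vote(model_predictions):
    """Hard voting (an assumption): the most common label per sample wins."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*model_predictions)]


# Hypothetical toy data: 4 samples, 2 binary features, binary class labels.
rows = [[1, 1], [1, 0], [0, 1], [0, 0]]
labels = [1, 1, 0, 0]
selected = select_top_k(rows, labels, k=1)  # feature 0 tracks the label exactly

# Hypothetical per-sample outputs of three already-trained base classifiers.
ensemble = majority_vote([[1, 1, 0], [1, 0, 0], [0, 1, 0]])
```

On real microarray data the features are continuous expression values, so a library routine designed for non-negative continuous inputs would be the more natural choice; the contingency-table form is used here only because it is easy to verify by hand.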
|
12
|
Bohannan ZS, Coffman F, Mitrofanova A. Random survival forest model identifies novel biomarkers of event-free survival in high-risk pediatric acute lymphoblastic leukemia. Comput Struct Biotechnol J 2022; 20:583-597. [PMID: 35116134 PMCID: PMC8777142 DOI: 10.1016/j.csbj.2022.01.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/17/2021] [Revised: 12/30/2021] [Accepted: 01/01/2022] [Indexed: 12/16/2022]
Abstract
High-risk pediatric B-ALL patients experience 5-year negative event rates up to 25%. Although some biomarkers of relapse are utilized in the clinic, their ability to predict outcomes in high-risk patients is limited. Here, we propose a random survival forest (RSF) machine learning model utilizing interpretable genomic inputs to predict relapse/death in high-risk pediatric B-ALL patients. We utilized whole exome sequencing profiles from 156 patients in the TARGET-ALL study (with samples collected at presentation) further stratified into training and test cohorts (109 and 47 patients, respectively). To avoid overfitting and facilitate the interpretation of machine learning results, input genomic variables were engineered using a stepwise approach involving univariable Cox models to select variables directly associated with outcomes, genomic coordinate-based analysis to select mutational hotspots, and correlation analysis to eliminate feature co-linearity. Model training identified 7 genomic regions most predictive of relapse/death-free survival. The test cohort error rate was 12.47%, and a polygenic score based on the sum of the top 7 variables effectively stratified patients into two groups, with significant differences in time to relapse/death (log-rank P = 0.001, hazard ratio = 5.41). Our model outperformed other EFS modeling approaches including an RSF using gold-standard prognostic variables (error rate = 24.35%). Validation in 174 standard-risk patients and 3 patients who failed to respond to induction therapy confirmed that our RSF model and polygenic score were specific to high-risk disease. We propose that our feature selection/engineering approach can increase the clinical interpretability of RSF, and our polygenic score could be utilized to enhance clinical decision-making in high-risk B-ALL.
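To make the scoring and evaluation ideas in this abstract concrete, the following sketch computes a polygenic score as the sum of seven per-patient indicator variables, splits patients into two groups, and evaluates a risk ranking with Harrell's concordance index. Several points are assumptions rather than details from the abstract: the variables are treated as binary indicators, the cut-off of 2 is arbitrary, the reported "error rate" is taken to mean one minus the concordance index (a common convention for random survival forests), and all names and data are hypothetical.

```python
def polygenic_score(indicators):
    """Sum of the per-patient indicator variables (assumed binary here)."""
    return sum(indicators)


def stratify(scores, cutoff):
    """Assign 'high' when the score reaches the (hypothetical) cutoff, else 'low'."""
    return ["high" if score >= cutoff else "low" for score in scores]


def concordance_index(times, events, risks):
    """Harrell's C: fraction of comparable pairs ranked in the right order.

    A pair (i, j) is comparable when patient i had an observed event
    and did so strictly earlier than patient j's follow-up time.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # ties count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: 4 patients, 7 indicator variables each.
patients = [
    [1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
polygenic_scores = [polygenic_score(p) for p in patients]
groups = stratify(polygenic_scores, cutoff=2)

times = [5, 10, 15, 20]   # follow-up time per patient
events = [1, 1, 0, 1]     # 1 = relapse/death observed, 0 = censored
c_index = concordance_index(times, events, polygenic_scores)
error_rate = 1.0 - c_index  # assumed definition of the error rate
```

In this toy cohort the score orders patients exactly by time to event, so the concordance index is 1 and the assumed error rate is 0; real data would fall between the extremes.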
Affiliation(s)
- Zachary S. Bohannan
  - Rutgers, The State University of New Jersey, School of Health Professions, Department of Health Informatics, 65 Bergen Street, Suite 120, Newark, NJ 07107-1709, United States
- Frederick Coffman
  - Rutgers, The State University of New Jersey, School of Health Professions, Department of Health Informatics, 65 Bergen Street, Suite 120, Newark, NJ 07107-1709, United States
- Antonina Mitrofanova
  - Rutgers, The State University of New Jersey, School of Health Professions, Department of Health Informatics, 65 Bergen Street, Suite 120, Newark, NJ 07107-1709, United States
|
13
|
Zaccaria GM, Ferrero S, Hoster E, Passera R, Evangelista A, Genuardi E, Drandi D, Ghislieri M, Barbero D, Del Giudice I, Tani M, Moia R, Volpetti S, Cabras MG, Di Renzo N, Merli F, Vallisa D, Spina M, Pascarella A, Latte G, Patti C, Fabbri A, Guarini A, Vitolo U, Hermine O, Kluin-Nelemans HC, Cortelazzo S, Dreyling M, Ladetto M. A Clinical Prognostic Model Based on Machine Learning from the Fondazione Italiana Linfomi (FIL) MCL0208 Phase III Trial. Cancers (Basel) 2021; 14:188. [PMID: 35008361 PMCID: PMC8750124 DOI: 10.3390/cancers14010188] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/29/2021] [Accepted: 12/26/2021] [Indexed: 12/05/2022]
Abstract
BACKGROUND Multicenter clinical trials are producing growing amounts of clinical data. Machine Learning (ML) might facilitate the discovery of novel tools for prognostication and disease-stratification. Taking advantage of a systematic collection of multiple variables, we developed a model derived from data collected on 300 patients with mantle cell lymphoma (MCL) from the Fondazione Italiana Linfomi-MCL0208 phase III trial (NCT02354313). METHODS We developed a score with a clustering algorithm applied to clinical variables. The candidate score was correlated to overall survival (OS) and validated in two independent data series from the European MCL Network (NCT00209222, NCT00209209). RESULTS Three groups of patients were significantly discriminated: Low, Intermediate (Int), and High risk (High). Seven discriminants were identified by a feature reduction approach: albumin, Ki-67, lactate dehydrogenase, lymphocytes, platelets, bone marrow infiltration, and B-symptoms. Accordingly, patients in the Int and High groups had shorter OS than those in the Low and Int groups, respectively (Int→Low, HR: 3.1, 95% CI: 1.0-9.6; High→Int, HR: 2.3, 95% CI: 1.5-4.7). Based on the 7 markers, we defined the engineered MCL international prognostic index (eMIPI), which was validated and confirmed in two independent cohorts. CONCLUSIONS We developed and validated an ML-based prognostic model for MCL. Even when currently limited to baseline predictors, our approach has high scalability potential.
Affiliation(s)
- Gian Maria Zaccaria
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
  - Unit of Hematology and Cell Therapy, IRCCS-Istituto Tumori ‘Giovanni Paolo II’, 70124 Bari, Italy
- Simone Ferrero
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
- Eva Hoster
  - Institute of Medical Informatics, Biometry, and Epidemiology, Ludwig-Maximilians-University of Munich, 81377 Munich, Germany
- Roberto Passera
  - Division of Nuclear Medicine, University of Torino, 10126 Turin, Italy
- Andrea Evangelista
  - Unit of Clinical Epidemiology, CPO Piemonte, AOU Città della Salute e della Scienza di Torino, 10126 Turin, Italy
- Elisa Genuardi
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
- Daniela Drandi
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
- Marco Ghislieri
  - Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Turin, Italy
  - PoliToBIOMedLab of Politecnico di Torino, 10129 Turin, Italy
- Daniela Barbero
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
- Ilaria Del Giudice
  - Hematology, Department of Translational and Precision Medicine, Sapienza University of Rome, 00161 Rome, Italy
- Monica Tani
  - Hematology Unit, Santa Maria delle Croci Hospital, 48121 Ravenna, Italy
- Riccardo Moia
  - Division of Hematology, Department of Translational Medicine, University of Eastern Piedmont, 28100 Novara, Italy
- Stefano Volpetti
  - Unit of Hematology, Presidio Ospedaliero Universitario “Santa Maria della Misericordia”, Azienda Sanitaria Universitaria Friuli Centrale, 33100 Udine, Italy
- Nicola Di Renzo
  - Unit of Hematology and Bone Marrow Transplant, ‘V. Fazzi’ Hospital, 73100 Lecce, Italy
- Daniele Vallisa
  - Unit of Hematology, Department of Oncology and Hematology, Guglielmo da Saliceto Hospital, 29121 Piacenza, Italy
- Michele Spina
  - Division of Medical Oncology and Immune-Related Tumors, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, 33081 Aviano, Italy
- Anna Pascarella
  - Unit of Hematology, dell’ Angelo Mestre-Venezia Hospital, 30174 Mestre-Venezia, Italy
- Giancarlo Latte
  - Unit of Hematology and Bone Marrow Transplant, ‘San Francesco’ Hospital, 08100 Nuoro, Italy
- Caterina Patti
  - Unit of Hematology, Azienda Ospedali Riuniti Villa Sofia-Cervello, 90146 Palermo, Italy
- Alberto Fabbri
  - Unit of Hematology, Azienda Ospedaliera Universitaria Senese, 53100 Siena, Italy
- Attilio Guarini
  - Unit of Hematology and Cell Therapy, IRCCS-Istituto Tumori ‘Giovanni Paolo II’, 70124 Bari, Italy
- Umberto Vitolo
  - Division of Hematology, Azienda Ospedaliero Universitaria Città della Salute e della Scienza di Torino, 10126 Turin, Italy
- Olivier Hermine
  - Service D’hématologie, Hôpital Universitaire Necker, Université René Descartes, Assistance Publique Hôpitaux de Paris, 75015 Paris, France
- Hanneke C Kluin-Nelemans
  - Department of Haematology, University Medical Center Groningen, University of Groningen, 9713 Groningen, The Netherlands
- Martin Dreyling
  - Department of Medicine III, University Hospital, LMU Munich, 81377 Munich, Germany
- Marco Ladetto
  - Division of Hematology, Department of Translational Medicine, University of Eastern Piedmont, 28100 Novara, Italy
  - Division of Hematology, Azienda Ospedaliera SS Antonio e Biagio e Cesare Arrigo, 15121 Alessandria, Italy
|
14
|
Muhsen IN, Shyr D, Sung AD, Hashmi SK. Machine Learning Applications in the Diagnosis of Benign and Malignant Hematological Diseases. Clin Hematol Int 2021; 3:13-20. [PMID: 34595462 PMCID: PMC8432325 DOI: 10.2991/chi.k.201130.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/02/2020] [Accepted: 11/05/2020] [Indexed: 12/23/2022]
Abstract
The use of machine learning (ML) and deep learning (DL) methods in hematology includes diagnostic, prognostic, and therapeutic applications. This increase is due to the improved access to ML and DL tools and the expansion of medical data. The utilization of ML remains limited in clinical practice, with some disciplines further along in their adoption, such as radiology and histopathology. In this review, we discuss the current uses of ML in diagnosis in the field of hematology, including image-recognition, laboratory, and genomics-based diagnosis. Additionally, we provide an introduction to the fields of ML and DL, highlighting current trends, limitations, and possible areas of improvement.
Affiliation(s)
- Ibrahim N Muhsen
  - Department of Medicine, Houston Methodist Hospital, Houston, TX, USA
- David Shyr
  - Division of Stem Cell Transplantation and Regenerative Medicine, Stanford School of Medicine, Palo Alto, CA, USA
- Anthony D Sung
  - Department of Medicine, Division of Hematologic Malignancies and Cellular Therapy, Duke University School of Medicine, NC, USA
- Shahrukh K Hashmi
  - Department of Medicine, Mayo Clinic, Rochester, MN, USA
  - Department of Medicine, Sheikh Shakbout Medical City, Abu Dhabi, UAE
|
15
|
Cancer cells population control in a delayed-model of a leukemic patient using the combination of the eligibility traces algorithm and neural networks. INT J MACH LEARN CYB 2021. [DOI: 10.1007/s13042-021-01287-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
16
|
Auslander N, Gussow AB, Koonin EV. Incorporating Machine Learning into Established Bioinformatics Frameworks. Int J Mol Sci 2021; 22:2903. [PMID: 33809353 PMCID: PMC8000113 DOI: 10.3390/ijms22062903] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 02/15/2021] [Revised: 03/08/2021] [Accepted: 03/10/2021] [Indexed: 12/23/2022]
Abstract
The exponential growth of biomedical data in recent years has urged the application of numerous machine learning techniques to address emerging problems in biology and clinical research. By enabling the automatic feature extraction, selection, and generation of predictive models, these methods can be used to efficiently study complex biological systems. Machine learning techniques are frequently integrated with bioinformatic methods, as well as curated databases and biological networks, to enhance training and validation, identify the best interpretable features, and enable feature and model investigation. Here, we review recently developed methods that incorporate machine learning within the same framework with techniques from molecular evolution, protein structure analysis, systems biology, and disease genomics. We outline the challenges posed for machine learning, and, in particular, deep learning in biomedicine, and suggest unique opportunities for machine learning techniques integrated with established bioinformatics approaches to overcome some of these challenges.
Affiliation(s)
- Eugene V. Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
17
|
Machine learning and augmented human intelligence use in histomorphology for haematolymphoid disorders. Pathology 2021; 53:400-407. [PMID: 33642096 DOI: 10.1016/j.pathol.2020.12.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/10/2020] [Accepted: 12/21/2020] [Indexed: 02/06/2023]
Abstract
Advances in digital pathology have allowed a number of opportunities such as decision support using artificial intelligence (AI). The application of AI to digital pathology data shows promise as an aid for pathologists in the diagnosis of haematological disorders. AI-based applications have embraced benign haematology, diagnosing leukaemia and lymphoma, as well as ancillary testing modalities including flow cytometry. In this review, we highlight the progress made to date in machine learning applications in haematopathology, summarise important studies in this field, and highlight key limitations. We further present our outlook on the future direction and trends for AI to support diagnostic decisions in haematopathology.
|
18
|
Eckardt JN, Bornhäuser M, Wendt K, Middeke JM. Application of machine learning in the management of acute myeloid leukemia: current practice and future prospects. Blood Adv 2020; 4:6077-6085. [PMID: 33290546 PMCID: PMC7724910 DOI: 10.1182/bloodadvances.2020002997] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 07/20/2020] [Accepted: 10/26/2020] [Indexed: 12/19/2022]
Abstract
Machine learning (ML) is rapidly emerging in several fields of cancer research. ML algorithms can deal with vast amounts of medical data and provide a better understanding of malignant disease. Its ability to process information from different diagnostic modalities and functions to predict prognosis and suggest therapeutic strategies indicates that ML is a promising tool for the future management of hematologic malignancies; acute myeloid leukemia (AML) is a model disease of various recent studies. An integration of these ML techniques into various applications in AML management can assure fast and accurate diagnosis as well as precise risk stratification and optimal therapy. Nevertheless, these techniques come with various pitfalls and need a strict regulatory framework to ensure safe use of ML. This comprehensive review highlights and discusses recent advances in ML techniques in the management of AML as a model disease of hematologic neoplasms, enabling researchers and clinicians alike to critically evaluate this upcoming, potentially practice-changing technology.
Affiliation(s)
- Jan-Niklas Eckardt
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
- Martin Bornhäuser
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
  - National Center for Tumor Diseases, Dresden (NCT/UCC), Dresden, Germany
  - German Consortium for Translational Cancer Research, DKFZ, Heidelberg, Germany
- Karsten Wendt
  - Institute of Circuits and Systems, Technical University Dresden, Dresden, Germany
- Jan Moritz Middeke
  - Department of Internal Medicine I, University Hospital Carl Gustav Carus, Dresden, Germany
|
19
|
Machine learning in haematological malignancies. LANCET HAEMATOLOGY 2020; 7:e541-e550. [PMID: 32589980 DOI: 10.1016/s2352-3026(20)30121-6] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 01/17/2020] [Revised: 04/02/2020] [Accepted: 04/14/2020] [Indexed: 02/06/2023]
Abstract
Machine learning is a branch of computer science and statistics that generates predictive or descriptive models by learning from training data rather than by being rigidly programmed. It has attracted substantial attention for its many applications in medicine, both as a catalyst for research and as a means of improving clinical care across the cycle of diagnosis, prognosis, and treatment of disease. These applications include the management of haematological malignancy, in which machine learning has created inroads in pathology, radiology, genomics, and the analysis of electronic health record data. As computational power becomes cheaper and the tools for implementing machine learning become increasingly democratised, it is likely to become increasingly integrated into the research and practice landscape of haematology. As such, machine learning merits understanding and attention from researchers and clinicians alike. This narrative Review describes important concepts in machine learning for unfamiliar readers, details machine learning's current applications in haematological malignancy, and summarises important concepts for clinicians to be aware of when appraising research that uses machine learning.
|
20
|
Shouval R, Fein JA, Savani B, Mohty M, Nagler A. Machine learning and artificial intelligence in haematology. Br J Haematol 2020; 192:239-250. [PMID: 32602593 DOI: 10.1111/bjh.16915] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Indexed: 12/23/2022]
Abstract
Digitalization of the medical record and integration of genomic methods into clinical practice have resulted in an unprecedented wealth of data. Machine learning is a subdomain of artificial intelligence that attempts to computationally extract meaningful insights from complex data structures. Applications of machine learning in haematological scenarios are steadily increasing. However, basic concepts are often unfamiliar to clinicians and investigators. The purpose of this review is to provide readers with tools to interpret and critically appraise machine learning literature. We begin with the elucidation of standard terminology and then review examples in haematology. Guidelines for designing and evaluating machine-learning studies are provided. Finally, we discuss limitations of the machine-learning approach.
Affiliation(s)
- Roni Shouval
  - Adult Bone Marrow Transplant Service, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
  - Hematology and Bone Marrow Transplantation Division, Chaim Sheba Medical Center, Tel-Hashomer, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Joshua A Fein
  - University of Connecticut Medical Center, Farmington, CT, USA
- Bipin Savani
  - Division of Hematology-Oncology, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Mohamad Mohty
  - European Society for Blood and Marrow Transplantation Paris Study Office/CEREST-TC, Paris, France
  - Service d'Hématologie Clinique et de Thérapie Cellulaire, Hôpital Saint Antoine, AP-HP, Paris, France
- Arnon Nagler
  - Hematology and Bone Marrow Transplantation Division, Chaim Sheba Medical Center, Tel-Hashomer, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
|
21
|
Sorokin M, Kholodenko I, Kalinovsky D, Shamanskaya T, Doronin I, Konovalov D, Mironov A, Kuzmin D, Nikitin D, Deyev S, Buzdin A, Kholodenko R. RNA Sequencing-Based Identification of Ganglioside GD2-Positive Cancer Phenotype. Biomedicines 2020; 8:E142. [PMID: 32486168 PMCID: PMC7344710 DOI: 10.3390/biomedicines8060142] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 04/27/2020] [Revised: 05/20/2020] [Accepted: 05/27/2020] [Indexed: 12/15/2022]
Abstract
The tumor-associated ganglioside GD2 represents an attractive target for cancer immunotherapy. GD2-positive tumors are more responsive to such targeted therapy, and new methods are needed for the screening of GD2 molecular tumor phenotypes. In this work, we built a gene expression-based binary classifier predicting the GD2-positive tumor phenotypes. To this end, we compared RNA sequencing data from human tumor biopsy material from experimental samples and public databases as well as from GD2-positive and GD2-negative cancer cell lines, for expression levels of genes encoding enzymes involved in ganglioside biosynthesis. We identified a 2-gene expression signature combining ganglioside synthase genes ST8SIA1 and B4GALNT1 that serves as a more efficient predictor of GD2-positive phenotype (Matthews Correlation Coefficient (MCC) 0.32, 0.88, and 0.98 in three independent comparisons) compared to the individual ganglioside biosynthesis genes (MCC 0.02-0.32, 0.1-0.75, and 0.04-1 for the same independent comparisons). No individual gene showed a higher MCC score than the expression signature MCC score in two or more comparisons. Our diagnostic approach can hopefully be applied for pan-cancer prediction of GD2 phenotypes using gene expression data.
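For readers unfamiliar with the Matthews Correlation Coefficient reported above, the sketch below computes it from binary labels using the standard confusion-matrix formula, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function name and the toy labels are hypothetical and are not taken from the study; how the two genes are combined into a single prediction is not stated in the abstract and is not modelled here.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels (1 = GD2-positive, 0 = GD2-negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column of the table is empty
    return (tp * tn - fp * fn) / denominator


# Hypothetical labels: one GD2-positive sample is missed by the classifier.
truth = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
mcc = matthews_corrcoef(truth, predicted)
```

MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to 1 (perfect agreement), which is why it is often preferred to accuracy when the two classes are imbalanced.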
Affiliation(s)
- Maxim Sorokin
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya St., 117997 Moscow, Russia
  - Sechenov First Moscow State Medical University, 8-2, Trubetskaya St., 119992 Moscow, Russia
  - Omicsway Corp., 340 S Lemon Ave, 6040, Walnut, CA 91789, USA
- Irina Kholodenko
  - Orekhovich Institute of Biomedical Chemistry, 10, Pogodinskaya St., 119121 Moscow, Russia
- Daniel Kalinovsky
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya St., 117997 Moscow, Russia
- Tatyana Shamanskaya
  - D. Rogachev Federal Research Center of Pediatric Hematology, Oncology and Immunology, 1, Samory Mashela St., 117997 Moscow, Russia
- Igor Doronin
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya St., 117997 Moscow, Russia
  - Real Target LLC, 108841 Moscow, Russia
- Dmitry Konovalov
  - D. Rogachev Federal Research Center of Pediatric Hematology, Oncology and Immunology, 1, Samory Mashela St., 117997 Moscow, Russia
- Aleksei Mironov
  - Skolkovo Institute of Science and Technology, 3, Nobelya St., 121205 Moscow, Russia
- Denis Kuzmin
  - Moscow Institute of Physics and Technology (National Research University), 141700 Moscow, Russia
- Daniil Nikitin
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya St., 117997 Moscow, Russia
- Sergey Deyev
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya St., 117997 Moscow, Russia
  - Sechenov First Moscow State Medical University, 8-2, Trubetskaya St., 119992 Moscow, Russia
- Anton Buzdin
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya St., 117997 Moscow, Russia
  - Sechenov First Moscow State Medical University, 8-2, Trubetskaya St., 119992 Moscow, Russia
  - Moscow Institute of Physics and Technology (National Research University), 141700 Moscow, Russia
  - Oncobox ltd., 121205 Moscow, Russia
- Roman Kholodenko
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya St., 117997 Moscow, Russia
  - Real Target LLC, 108841 Moscow, Russia
|