1
Zaccaria GM, Ferrero S, Hoster E, Passera R, Evangelista A, Genuardi E, Drandi D, Ghislieri M, Barbero D, Del Giudice I, Tani M, Moia R, Volpetti S, Cabras MG, Di Renzo N, Merli F, Vallisa D, Spina M, Pascarella A, Latte G, Patti C, Fabbri A, Guarini A, Vitolo U, Hermine O, Kluin-Nelemans HC, Cortelazzo S, Dreyling M, Ladetto M. A Clinical Prognostic Model Based on Machine Learning from the Fondazione Italiana Linfomi (FIL) MCL0208 Phase III Trial. Cancers (Basel) 2021; 14:188. [PMID: 35008361 PMCID: PMC8750124 DOI: 10.3390/cancers14010188] [Received: 10/29/2021] [Accepted: 12/26/2021] [Indexed: 12/05/2022]
Abstract
BACKGROUND Multicenter clinical trials are producing growing amounts of clinical data. Machine learning (ML) might facilitate the discovery of novel tools for prognostication and disease stratification. Taking advantage of a systematic collection of multiple variables, we developed a model derived from data collected on 300 patients with mantle cell lymphoma (MCL) from the Fondazione Italiana Linfomi MCL0208 phase III trial (NCT02354313). METHODS We developed a score with a clustering algorithm applied to clinical variables. The candidate score was correlated with overall survival (OS) and validated in two independent data series from the European MCL Network (NCT00209222, NCT00209209). RESULTS Three groups of patients were significantly discriminated: Low, Intermediate (Int), and High risk (High). Seven discriminants were identified by a feature reduction approach: albumin, Ki-67, lactate dehydrogenase, lymphocytes, platelets, bone marrow infiltration, and B-symptoms. Accordingly, patients in the Int and High groups had shorter OS than those in the Low and Int groups, respectively (Int→Low, HR: 3.1, 95% CI: 1.0-9.6; High→Int, HR: 2.3, 95% CI: 1.5-4.7). Based on the 7 markers, we defined the engineered MCL international prognostic index (eMIPI), which was validated and confirmed in two independent cohorts. CONCLUSIONS We developed and validated an ML-based prognostic model for MCL. Although currently limited to baseline predictors, our approach has high potential for scalability.
Affiliation(s)
- Gian Maria Zaccaria
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
  - Unit of Hematology and Cell Therapy, IRCCS-Istituto Tumori ‘Giovanni Paolo II’, 70124 Bari, Italy
- Simone Ferrero
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
- Eva Hoster
  - Institute of Medical Informatics, Biometry, and Epidemiology, Ludwig-Maximilians-University of Munich, 81377 Munich, Germany
- Roberto Passera
  - Division of Nuclear Medicine, University of Torino, 10126 Turin, Italy
- Andrea Evangelista
  - Unit of Clinical Epidemiology, CPO Piemonte, AOU Città della Salute e della Scienza di Torino, 10126 Turin, Italy
- Elisa Genuardi
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
- Daniela Drandi
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
- Marco Ghislieri
  - Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Turin, Italy
  - PoliToBIOMedLab of Politecnico di Torino, 10129 Turin, Italy
- Daniela Barbero
  - Unit of Hematology, Department of Molecular Biotechnology and Health Sciences, University of Torino, 10126 Torino, Italy
- Ilaria Del Giudice
  - Hematology, Department of Translational and Precision Medicine, Sapienza University of Rome, 00161 Rome, Italy
- Monica Tani
  - Hematology Unit, Santa Maria delle Croci Hospital, 48121 Ravenna, Italy
- Riccardo Moia
  - Division of Hematology, Department of Translational Medicine, University of Eastern Piedmont, 28100 Novara, Italy
- Stefano Volpetti
  - Unit of Hematology, Presidio Ospedaliero Universitario “Santa Maria della Misericordia”, Azienda Sanitaria Universitaria Friuli Centrale, 33100 Udine, Italy
- Nicola Di Renzo
  - Unit of Hematology and Bone Marrow Transplant, ‘V. Fazzi’ Hospital, 73100 Lecce, Italy
- Daniele Vallisa
  - Unit of Hematology, Department of Oncology and Hematology, Guglielmo da Saliceto Hospital, 29121 Piacenza, Italy
- Michele Spina
  - Division of Medical Oncology and Immune-Related Tumors, Centro di Riferimento Oncologico di Aviano (CRO) IRCCS, 33081 Aviano, Italy
- Anna Pascarella
  - Unit of Hematology, dell’Angelo Mestre-Venezia Hospital, 30174 Mestre-Venezia, Italy
- Giancarlo Latte
  - Unit of Hematology and Bone Marrow Transplant, ‘San Francesco’ Hospital, 08100 Nuoro, Italy
- Caterina Patti
  - Unit of Hematology, Azienda Ospedali Riuniti Villa Sofia-Cervello, 90146 Palermo, Italy
- Alberto Fabbri
  - Unit of Hematology, Azienda Ospedaliera Universitaria Senese, 53100 Siena, Italy
- Attilio Guarini
  - Unit of Hematology and Cell Therapy, IRCCS-Istituto Tumori ‘Giovanni Paolo II’, 70124 Bari, Italy
- Umberto Vitolo
  - Division of Hematology, Azienda Ospedaliero Universitaria Città della Salute e della Scienza di Torino, 10126 Turin, Italy
- Olivier Hermine
  - Service D’hématologie, Hôpital Universitaire Necker, Université René Descartes, Assistance Publique Hôpitaux de Paris, 75015 Paris, France
- Hanneke C Kluin-Nelemans
  - Department of Haematology, University Medical Center Groningen, University of Groningen, 9713 Groningen, The Netherlands
- Martin Dreyling
  - Department of Medicine III, University Hospital, LMU Munich, 81377 Munich, Germany
- Marco Ladetto
  - Division of Hematology, Department of Translational Medicine, University of Eastern Piedmont, 28100 Novara, Italy
  - Division of Hematology, Azienda Ospedaliera SS Antonio e Biagio e Cesare Arrigo, 15121 Alessandria, Italy
2
Lockery JE, Collyer TA, Reid CM, Ernst ME, Gilbertson D, Hay N, Kirpach B, McNeil JJ, Nelson MR, Orchard SG, Pruksawongsin K, Shah RC, Wolfe R, Woods RL. Overcoming challenges to data quality in the ASPREE clinical trial. Trials 2019; 20:686. [PMID: 31815652 PMCID: PMC6902598 DOI: 10.1186/s13063-019-3789-2] [Received: 03/04/2019] [Accepted: 10/05/2019] [Indexed: 11/28/2022]
Abstract
BACKGROUND Large-scale studies risk generating inaccurate and missing data due to the complexity of data collection. Technology has the potential to improve data quality by providing operational support to data collectors. However, this potential is under-explored in community-based trials. The ASPirin in Reducing Events in the Elderly (ASPREE) trial developed a data suite that was specifically designed to support data collectors: the ASPREE Web Accessible Relational Database (AWARD). This paper describes AWARD and the impact of system design on data quality. METHODS AWARD's operational requirements, conceptual design, key challenges and design solutions for data quality are presented. Impact of design features is assessed through comparison of baseline data collected prior to implementation of key functionality (n = 1000) with data collected post implementation (n = 18,114). Overall data quality is assessed according to data category. RESULTS At baseline, implementation of user-driven functionality reduced staff error (from 0.3% to 0.01%), out-of-range data entry (from 0.14% to 0.04%) and protocol deviations (from 0.4% to 0.08%). In the longitudinal data set, which contained more than 39 million data values collected within AWARD, 96.6% of data values were entered within specified query range or found to be accurate upon querying. The remaining data were missing (3.4%). Participant non-attendance at scheduled study activity was the most common cause of missing data. Costs associated with cleaning data in ASPREE were lower than expected compared with reports from other trials. CONCLUSIONS Clinical trials undertake complex operational activity in order to collect data, but technology rarely provides sufficient support. We find the AWARD suite provides proof of principle that designing technology to support data collectors can mitigate known causes of poor data quality and produce higher-quality data. Health information technology (IT) products that support the conduct of scheduled activity in addition to traditional data entry will enhance community-based clinical trials. A standardised framework for reporting data quality would aid comparisons across clinical trials. TRIAL REGISTRATION International Standard Randomized Controlled Trial Number Register, ISRCTN83772183. Registered on 3 March 2005.
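The abstract reports data values as within query range, queried, or missing. As a rough illustration of that kind of entry-time classification, here is a minimal Python sketch. It is not AWARD's implementation: the field names, the ranges, and the functions `check_entry` and `summarize` are all hypothetical.

```python
from typing import Optional

# Hypothetical query ranges per field (illustrative values, not from ASPREE).
QUERY_RANGES = {
    "systolic_bp_mmHg": (70.0, 250.0),
    "height_cm": (120.0, 220.0),
}

def check_entry(field: str, value: Optional[float]) -> str:
    """Classify one data value as 'missing', 'query', or 'ok'."""
    if value is None:
        return "missing"
    low, high = QUERY_RANGES[field]
    if value < low or value > high:
        return "query"  # out of range: ask the data collector to confirm
    return "ok"

def summarize(entries):
    """Return the share of each status across (field, value) pairs."""
    counts = {"ok": 0, "query": 0, "missing": 0}
    for field, value in entries:
        counts[check_entry(field, value)] += 1
    total = len(entries)
    if total == 0:
        return {status: 0.0 for status in counts}
    return {status: n / total for status, n in counts.items()}
```

Running the check at the moment of entry, rather than during later cleaning, is the design choice the abstract credits with reducing out-of-range values; the sketch only shows the classification step.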
Affiliation(s)
- Jessica E. Lockery
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
- Taya A. Collyer
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
- Christopher M. Reid
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
  - School of Public Health, Curtin University, Perth, WA Australia
- Michael E. Ernst
  - Department of Pharmacy Practice and Science, College of Pharmacy and Department of Family Medicine, Carver College of Medicine, The University of Iowa, Iowa City, USA
- David Gilbertson
  - Chronic Disease Research Group, Minneapolis Medical Research Foundation, Minneapolis, Minnesota USA
- Nino Hay
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
- Brenda Kirpach
  - Berman Center for Outcomes and Clinical Research, Hennepin Healthcare Research Institute (HHRI), Hennepin Healthcare, Minneapolis, MN USA
- John J. McNeil
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
- Mark R. Nelson
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
  - Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS Australia
- Suzanne G. Orchard
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
- Kunnapoj Pruksawongsin
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
- Raj C. Shah
  - Department of Family Medicine and Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL USA
- Rory Wolfe
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
- Robyn L. Woods
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
- on behalf of the ASPREE Investigator Group
  - Department of Epidemiology & Preventive Medicine, Monash University, ASPREE Co-ordinating Centre, 99 Commercial Road, Melbourne, VIC 3004 Australia
  - School of Public Health, Curtin University, Perth, WA Australia
  - Department of Pharmacy Practice and Science, College of Pharmacy and Department of Family Medicine, Carver College of Medicine, The University of Iowa, Iowa City, USA
  - Chronic Disease Research Group, Minneapolis Medical Research Foundation, Minneapolis, Minnesota USA
  - Berman Center for Outcomes and Clinical Research, Hennepin Healthcare Research Institute (HHRI), Hennepin Healthcare, Minneapolis, MN USA
  - Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS Australia
  - Department of Family Medicine and Rush Alzheimer’s Disease Center, Rush University Medical Center, Chicago, IL USA
3
Zaccaria GM, Ferrero S, Rosati S, Ghislieri M, Genuardi E, Evangelista A, Sandrone R, Castagneri C, Barbero D, Lo Schirico M, Arcaini L, Molinari AL, Ballerini F, Ferreri A, Omedè P, Zamò A, Balestra G, Boccadoro M, Cortelazzo S, Ladetto M. Applying Data Warehousing to a Phase III Clinical Trial From the Fondazione Italiana Linfomi Ensures Superior Data Quality and Improved Assessment of Clinical Outcomes. JCO Clin Cancer Inform 2019; 3:1-15. [PMID: 31633999 PMCID: PMC6873907 DOI: 10.1200/cci.19.00049] [Accepted: 08/30/2019] [Indexed: 12/28/2022]
Abstract
PURPOSE Data collection in clinical trials is becoming complex, with a huge number of variables that need to be recorded, verified, and analyzed to effectively measure clinical outcomes. In this study, we used data warehouse (DW) concepts to achieve this goal. A DW was developed to accommodate data from a large clinical trial, including all the characteristics collected. We present the results related to baseline variables with the following objectives: developing a data quality (DQ) control strategy and improving outcome analysis according to the clinical trial primary end points. METHODS Data were retrieved from the electronic case reporting forms (eCRFs) of the phase III, multicenter MCL0208 trial (ClinicalTrials.gov identifier: NCT02354313) of the Fondazione Italiana Linfomi for younger patients with untreated mantle cell lymphoma (MCL). The DW was created with a relational database management system. Recommended DQ dimensions were observed to monitor the activity of each site to handle DQ management during patient follow-up. The DQ management was applied to clinically relevant parameters that predicted progression-free survival to assess its impact. RESULTS The DW encompassed 16 tables, which included 226 variables for 300 patients and 199,500 items of data. The tool allowed cross-comparison analysis and detected some incongruities in eCRFs, prompting queries to clinical centers. This had an impact on clinical end points, as the DQ control strategy was able to improve the prognostic stratification according to single parameters, such as tumor infiltration by flow cytometry, and even using established prognosticators, such as the MCL International Prognostic Index. CONCLUSION The DW is a powerful tool to organize results from large phase III clinical trials and to effectively improve DQ through the application of effective engineered tools.
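The abstract describes a relational data warehouse whose cross-comparison analysis detected incongruities in the eCRFs. A minimal sketch of one such cross-table check follows, using Python's built-in `sqlite3`. The table names, columns, the rule itself, and the three demo rows are hypothetical and synthetic; they are not the trial's schema or data.

```python
import sqlite3

# Hypothetical two-table schema (the actual DW had 16 tables; these names are invented).
SCHEMA = """
CREATE TABLE baseline (patient_id INTEGER PRIMARY KEY, bm_involved INTEGER);
CREATE TABLE flow_cytometry (patient_id INTEGER PRIMARY KEY, infiltration_pct REAL);
"""

# Hypothetical incongruity rule: marrow recorded as not involved,
# yet flow cytometry reports a positive infiltration percentage.
CHECK = """
SELECT b.patient_id
FROM baseline AS b
JOIN flow_cytometry AS f ON f.patient_id = b.patient_id
WHERE b.bm_involved = 0 AND f.infiltration_pct > 0
ORDER BY b.patient_id;
"""

def build_demo_db():
    """Create an in-memory database holding three synthetic patients."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO baseline VALUES (?, ?)",
                     [(1, 1), (2, 0), (3, 0)])
    conn.executemany("INSERT INTO flow_cytometry VALUES (?, ?)",
                     [(1, 12.5), (2, 0.0), (3, 4.0)])
    conn.commit()
    return conn

def find_incongruities(conn):
    """Return ids of patients whose two records disagree under CHECK."""
    return [row[0] for row in conn.execute(CHECK)]
```

In this synthetic example only patient 3 is flagged; in the workflow the abstract describes, such a flag would prompt a query to the clinical center rather than an automatic correction.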
Affiliation(s)
- Andrea Evangelista
  - Unit of Clinical Epidemiology, Centro di Prevenzione Oncologica (CPO), Città della Salute e della Scienza di Torino, Hospital of Turin, Turin, Italy
- Luca Arcaini
  - Fondazione Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Policlinico San Matteo, University of Pavia, Pavia, Italy
- Filippo Ballerini
  - University of Genoa, Ospedale Policlinico San Martino, IRCCS per l’Oncologia, Genoa, Italy
- Marco Ladetto
  - Division of Hematology, Azienda Ospedaliera SS Antonio e Biagio e Cesare Arrigo, Alessandria, Italy
4
Rosati S, Balestra G, Knaflitz M. Comparison of Different Sets of Features for Human Activity Recognition by Wearable Sensors. Sensors (Basel) 2018; 18:4189. [PMID: 30501111 PMCID: PMC6308535 DOI: 10.3390/s18124189] [Received: 10/29/2018] [Revised: 11/22/2018] [Accepted: 11/27/2018] [Indexed: 11/16/2022]
Abstract
Human Activity Recognition (HAR) is an emerging area of interest for medical, military, and security applications. However, the identification of the features to be used for activity classification and recognition is still an open point. The aim of this study was to compare two different feature sets for HAR. In particular, we compared a set including time, frequency, and time-frequency domain features widely used in the literature (FeatSet_A) with a set of time-domain features derived by considering the physical meaning of the acquired signals (FeatSet_B). The comparison of the two sets was based on the performance obtained using four machine learning classifiers. Sixty-one healthy subjects were asked to perform seven different daily activities wearing a MIMU-based device. Each signal was segmented using a 5-s window and, for each window, 222 and 221 variables were extracted for FeatSet_A and FeatSet_B, respectively. Each set was reduced using a Genetic Algorithm (GA) simultaneously performing feature selection and classifier optimization. Our results showed that the Support Vector Machine achieved the highest performance using both sets (97.1% and 96.7% for FeatSet_A and FeatSet_B, respectively). However, FeatSet_B allows a better understanding of alterations of the biomechanical behavior in more complex situations, such as when applied to pathological subjects.
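The windowing step mentioned in the abstract (5-s windows, with features extracted per window) can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: the abstract does not state the sampling rate or whether windows overlap, so `fs_hz` and the non-overlapping layout are assumptions, and the three features shown are generic examples rather than members of FeatSet_A or FeatSet_B.

```python
from statistics import mean, pstdev

def segment(signal, fs_hz, window_s=5.0):
    """Split a 1-D signal into consecutive, non-overlapping windows.

    fs_hz is the sampling rate in Hz (an assumption; not given in the abstract).
    A trailing partial window is discarded.
    """
    size = int(round(fs_hz * window_s))
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def time_domain_features(window):
    """Compute a few generic time-domain features for one window."""
    return {
        "mean": mean(window),
        "std": pstdev(window),               # population standard deviation
        "range": max(window) - min(window),
    }

def extract(signal, fs_hz, window_s=5.0):
    """Return one feature dictionary per window."""
    return [time_domain_features(w) for w in segment(signal, fs_hz, window_s)]
```

For example, a 25-sample signal at a hypothetical 2 Hz yields two full 10-sample windows, and the remaining 5 samples are dropped.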
Affiliation(s)
- Samanta Rosati
  - Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Torino, Italy
- Gabriella Balestra
  - Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Torino, Italy
- Marco Knaflitz
  - Department of Electronics and Telecommunications, Politecnico di Torino, 10129 Torino, Italy