1
Wang D, Wang SJ, Lababidi S. The impact of misclassification errors on the performance of biomarkers based on next-generation sequencing, a simulation study. J Biopharm Stat 2024; 34:700-718. [PMID: 37819021 DOI: 10.1080/10543406.2023.2269251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Accepted: 09/29/2023] [Indexed: 10/13/2023]
Abstract
The development of next-generation sequencing (NGS) opens opportunities for new applications such as liquid biopsy, in which tumor mutation genotypes can be determined by sequencing circulating tumor DNA after blood draws. However, with highly diluted samples like those obtained with liquid biopsy, NGS invariably introduces a certain level of misclassification, even with improved technology. Recently, there has been a high demand to use mutation genotypes as biomarkers for predicting prognosis and treatment selection. Many methods have also been proposed to build classifiers based on multiple loci with machine learning algorithms as biomarkers. How the higher misclassification rate introduced by liquid biopsy will affect the performance of these biomarkers has not been thoroughly investigated. In this paper, we report the results from a simulation study focused on the clinical utility of biomarkers when misclassification is present due to the current technological limit of NGS in the liquid biopsy setting. The simulation covers a range of performance profiles for current NGS platforms with different machine learning algorithms and uses actual patient genotypes. Our results show that, at the high end of the performance spectrum, the misclassification introduced by NGS had very little effect on the clinical utility of the biomarker. However, in more challenging applications with lower accuracy, misclassification could have a notable effect on clinical utility. The pattern of this effect can be complex, especially for machine learning-based classifiers. Our results show that simulation can be an effective tool for assessing different scenarios of misclassification.
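As a rough illustration of the kind of question this abstract raises, the sketch below simulates how assay misclassification can dilute the apparent value of a single-locus biomarker. It is not the authors' simulation: the function names, the prevalence, the response rates and the sensitivity/specificity pairs are all hypothetical values chosen for illustration, and the "response rate gap" is only a crude stand-in for clinical utility.

```python
import random


def simulate_calls(true_status, sensitivity, specificity, rng):
    """Return observed mutation calls after hypothetical assay misclassification."""
    observed = []
    for mutated in true_status:
        if mutated:
            # A true mutation is detected with probability `sensitivity`.
            observed.append(rng.random() < sensitivity)
        else:
            # A wild-type sample is falsely called mutated with probability 1 - specificity.
            observed.append(rng.random() >= specificity)
    return observed


def response_rate_gap(calls, responded):
    """Response rate in biomarker-positive minus biomarker-negative patients."""
    pos = [r for c, r in zip(calls, responded) if c]
    neg = [r for c, r in zip(calls, responded) if not c]
    if not pos or not neg:
        return float("nan")
    return sum(pos) / len(pos) - sum(neg) / len(neg)


def run(n=20000, prevalence=0.3, p_resp_mut=0.6, p_resp_wt=0.2, seed=0):
    """Compare the gap under perfect and imperfect calls (all parameters are assumptions)."""
    rng = random.Random(seed)
    truth = [rng.random() < prevalence for _ in range(n)]
    responded = [rng.random() < (p_resp_mut if m else p_resp_wt) for m in truth]
    results = {}
    for sens, spec in [(1.0, 1.0), (0.95, 0.99), (0.70, 0.95)]:
        calls = simulate_calls(truth, sens, spec, rng)
        results[(sens, spec)] = response_rate_gap(calls, responded)
    return results


if __name__ == "__main__":
    for (sens, spec), gap in run().items():
        print(f"sensitivity={sens:.2f} specificity={spec:.2f} gap={gap:.3f}")
```

Under these assumed settings the gap shrinks as sensitivity and specificity fall, which is in line with the qualitative pattern the abstract describes for lower-accuracy applications; multi-locus machine learning classifiers would need a richer simulation than this.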
Affiliation(s)
- Dong Wang
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
- Sue-Jane Wang
- Office of Biostatistics, Center for Drug Evaluation Research, FDA, Maryland, USA
- Samir Lababidi
- Office of Data, Analytics and Research, Office of Digital Transformation, Office of Commissioner, FDA, Maryland, USA
2
Kunonga TP, Kenny RPW, Astin M, Bryant A, Kontogiannis V, Coughlan D, Richmond C, Eastaugh CH, Beyer FR, Pearson F, Craig D, Lovat P, Vale L, Ellis R. Predictive accuracy of risk prediction models for recurrence, metastasis and survival for early-stage cutaneous melanoma: a systematic review. BMJ Open 2023; 13:e073306. [PMID: 37770261 PMCID: PMC10546114 DOI: 10.1136/bmjopen-2023-073306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 09/03/2023] [Indexed: 09/30/2023] [Open Access]
Abstract
OBJECTIVES To identify prognostic models for melanoma survival, recurrence and metastasis among American Joint Committee on Cancer stage I and II patients postsurgery; and evaluate model performance, including overall survival (OS) prediction. DESIGN Systematic review and narrative synthesis. DATA SOURCES Searched MEDLINE, Embase, CINAHL, Cochrane Library, Science Citation Index and grey literature sources including cancer and guideline websites from 2000 to September 2021. ELIGIBILITY CRITERIA Included studies on risk prediction models for stage I and II melanoma in adults ≥18 years. Outcomes included OS, recurrence, metastases and model performance. No language or country of publication restrictions were applied. DATA EXTRACTION AND SYNTHESIS Two pairs of reviewers independently screened studies, extracted data and assessed the risk of bias using the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies checklist and the Prediction study Risk of Bias Assessment Tool. Heterogeneous predictors prevented statistical synthesis. RESULTS From 28 967 records, 15 studies reporting 20 models were included; 8 (stage I), 2 (stage II), 7 (stages I-II) and 7 (stages not reported), but were clearly applicable to early stages. Clinicopathological predictors per model ranged from 3-10. The most common were: ulceration, Breslow thickness/depth, sociodemographic status and site. Where reported, discriminatory values were ≥0.7. Calibration measures showed good matches between predicted and observed rates. None of the studies assessed clinical usefulness of the models. Risk of bias was high in eight models, unclear in nine and low in three. Seven models were internally and externally cross-validated, six models were externally validated and eight models were internally validated. 
CONCLUSIONS All models are effective in their predictive performance; however, the low quality of the evidence raises concern as to whether current follow-up recommendations following surgical treatment are adequate. Future models should incorporate biomarkers for improved accuracy. PROSPERO REGISTRATION NUMBER CRD42018086784.
Affiliation(s)
- Tafadzwa Patience Kunonga
- Evidence Synthesis Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- NIHR Innovation Observatory, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- R P W Kenny
- Evidence Synthesis Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- NIHR Innovation Observatory, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Margaret Astin
- Evidence Synthesis Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Andrew Bryant
- Biostatistics Research Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Vasileios Kontogiannis
- Health Economics Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Diarmuid Coughlan
- Health Economics Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Catherine Richmond
- Evidence Synthesis Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- NIHR Innovation Observatory, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Claire H Eastaugh
- Evidence Synthesis Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- NIHR Innovation Observatory, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Fiona R Beyer
- Evidence Synthesis Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- NIHR Innovation Observatory, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Fiona Pearson
- Evidence Synthesis Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- NIHR Innovation Observatory, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Dawn Craig
- Evidence Synthesis Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- NIHR Innovation Observatory, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Health Economics Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Penny Lovat
- Dermatological Sciences, Translation and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, UK
- AMLo Biosciences, The Biosphere, Newcastle Helix, Newcastle upon Tyne, UK
- Luke Vale
- Health Economics Group, Population Health Sciences Institute, Newcastle University, Newcastle upon Tyne, UK
- Robert Ellis
- Dermatological Sciences, Translation and Clinical Research Institute, Newcastle University, Newcastle upon Tyne, UK
- AMLo Biosciences, The Biosphere, Newcastle Helix, Newcastle upon Tyne, UK
- Department of Dermatology, South Tees Hospitals NHS FT, Middlesbrough, UK
3
Vale L, Kunonga P, Coughlan D, Kontogiannis V, Astin M, Beyer F, Richmond C, Wilson D, Bajwa D, Javanbakht M, Bryant A, Akor W, Craig D, Lovat P, Labus M, Nasr B, Cunliffe T, Hinde H, Shawgi M, Saleh D, Royle P, Steward P, Lucas R, Ellis R. Optimal surveillance strategies for patients with stage 1 cutaneous melanoma post primary tumour excision: three systematic reviews and an economic model. Health Technol Assess 2021; 25:1-178. [PMID: 34792018 DOI: 10.3310/hta25640] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/23/2022] [Open Access]
Abstract
BACKGROUND Malignant melanoma is the fifth most common cancer in the UK, with rates continuing to rise, resulting in considerable burden to patients and the NHS. OBJECTIVES The objectives were to evaluate the effectiveness and cost-effectiveness of current and alternative follow-up strategies for stage IA and IB melanoma. REVIEW METHODS Three systematic reviews were conducted. (1) The effectiveness of surveillance strategies. Outcomes were detection of new primaries, recurrences, metastases and survival. Risk of bias was assessed using the Cochrane Collaboration's Risk-of-Bias 2.0 tool. (2) Prediction models to stratify by risk of recurrence, metastases and survival. Model performance was assessed by study-reported measures of discrimination (e.g. D-statistic, Harrell's c-statistic), calibration (e.g. the Hosmer-Lemeshow 'goodness-of-fit' test) or overall performance (e.g. Brier score, R²). Risk of bias was assessed using the Prediction model Risk Of Bias ASsessment Tool (PROBAST). (3) Diagnostic test accuracy of fine-needle biopsy and ultrasonography. Outcomes were detection of new primaries, recurrences, metastases and overall survival. Risk of bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) tool. Review data and data from elsewhere were used to model the cost-effectiveness of alternative surveillance strategies and the value of further research. RESULTS (1) The surveillance review included one randomised controlled trial. There was no evidence of a difference in new primary or recurrence detected (risk ratio 0.75, 95% confidence interval 0.43 to 1.31). Risk of bias was considered to be of some concern. Certainty of the evidence was low. (2) Eleven risk prediction models were identified. Discrimination measures were reported for six models, with the area under the operating curve ranging from 0.59 to 0.88. Three models reported calibration measures, with coefficients of ≥ 0.88. Overall performance was reported by two models. In one, the Brier score was slightly better than the American Joint Committee on Cancer scheme score. The other reported an R² of 0.47 (95% confidence interval 0.45 to 0.49). All studies were judged to have a high risk of bias. (3) The diagnostic test accuracy review identified two studies. One study considered fine-needle biopsy and the other considered ultrasonography. The sensitivity and specificity for fine-needle biopsy were 0.94 (95% confidence interval 0.90 to 0.97) and 0.95 (95% confidence interval 0.90 to 0.97), respectively. For ultrasonography, sensitivity and specificity were 1.00 (95% confidence interval 0.03 to 1.00) and 0.99 (95% confidence interval 0.96 to 0.99), respectively. For the reference standards and flow and timing domains, the risk of bias was rated as being high for both studies. The cost-effectiveness results suggest that, over a lifetime, less intensive surveillance than recommended by the National Institute for Health and Care Excellence might be worthwhile. There was considerable uncertainty. Improving the diagnostic performance of cancer nurse specialists and introducing a risk prediction tool could be promising. Further research on transition probabilities between different stages of melanoma and on improving diagnostic accuracy would be of most value. LIMITATIONS Overall, few data of limited quality were available, and these related to earlier versions of the American Joint Committee on Cancer staging. Consequently, there was considerable uncertainty in the economic evaluation. CONCLUSIONS Despite adoption of rigorous methods, too few data are available to justify changes to the National Institute for Health and Care Excellence recommendations on surveillance. However, alternative strategies warrant further research, specifically on improving estimates of incidence, progression of recurrent disease; diagnostic accuracy and health-related quality of life; developing and evaluating risk stratification tools; and understanding patient preferences. STUDY REGISTRATION This study is registered as PROSPERO CRD42018086784. FUNDING This project was funded by the National Institute for Health Research Health Technology Assessment programme and will be published in full in Health Technology Assessment; Vol 25, No. 64. See the NIHR Journals Library website for further project information.
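For readers unfamiliar with the performance measures named in this abstract, the following is a minimal sketch of two of them for a binary outcome: the Brier score (overall performance) and the c-statistic (discrimination). The predicted risks and outcomes are made-up toy values, not data from the review, and the c-statistic here ignores censoring, so it is a simplification of Harrell's version for survival data.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted risk and observed 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def c_statistic(probs, outcomes):
    """Share of (event, non-event) pairs in which the event has the higher predicted risk.

    Ties count as half a concordant pair. No censoring is handled here.
    """
    events = [p for p, y in zip(probs, outcomes) if y == 1]
    nonevents = [p for p, y in zip(probs, outcomes) if y == 0]
    concordant = 0.0
    for pe in events:
        for pn in nonevents:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


# Toy example: four hypothetical patients with predicted recurrence risk and outcome.
toy_probs = [0.9, 0.7, 0.4, 0.2]
toy_outcomes = [1, 0, 1, 0]

if __name__ == "__main__":
    print("Brier score:", round(brier_score(toy_probs, toy_outcomes), 3))  # 0.225
    print("c-statistic:", c_statistic(toy_probs, toy_outcomes))  # 0.75
```

A lower Brier score indicates better overall accuracy, while a c-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking.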
Affiliation(s)
- Luke Vale
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Patience Kunonga
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Diarmuid Coughlan
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Margaret Astin
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Fiona Beyer
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Catherine Richmond
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Dor Wilson
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Dalvir Bajwa
- Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, UK
- Mehdi Javanbakht
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Andrew Bryant
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Wanwuri Akor
- Northumbria Healthcare NHS Foundation Trust, North Shields, UK
- Dawn Craig
- Institute of Health & Society, Newcastle University, Newcastle upon Tyne, UK
- Penny Lovat
- Institute of Translation and Clinical Studies, Newcastle University, Newcastle upon Tyne, UK
- Marie Labus
- Business Development and Enterprise, Newcastle University, Newcastle upon Tyne, UK
- Batoul Nasr
- Dermatological Sciences, Institute of Cellular Medicine, Newcastle University, Newcastle upon Tyne, UK
- Timothy Cunliffe
- Dermatology Department, James Cook University Hospital, Middlesbrough, UK
- Helena Hinde
- Dermatology Department, James Cook University Hospital, Middlesbrough, UK
- Mohamed Shawgi
- Radiology Department, James Cook University Hospital, Middlesbrough, UK
- Daniel Saleh
- Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK; Princess Alexandra Hospital Southside Clinical Unit, Faculty of Medicine, University of Queensland, Brisbane, QLD, Australia
- Pam Royle
- Patient representative, ITV Tyne Tees, Gateshead, UK
- Paul Steward
- Patient representative, Dermatology Department, James Cook University Hospital, Middlesbrough, UK
- Rachel Lucas
- Patient representative, Dermatology Department, James Cook University Hospital, Middlesbrough, UK
- Robert Ellis
- Institute of Translation and Clinical Studies, Newcastle University, Newcastle upon Tyne, UK; South Tees Hospitals NHS Foundation Trust, Middlesbrough, UK
4
Ture M, Kurt Omurlu I. Determining of complexity parameter for recursive partitioning trees by simulation of survival data and an application on breast cancer data. Journal of Statistics & Management Systems 2018. [DOI: 10.1080/09720510.2017.1386878] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/18/2022]
Affiliation(s)
- Mevlut Ture
- Medical Faculty, Department of Biostatistics, Adnan Menderes University, Aydın, Turkey
- Imran Kurt Omurlu
- Medical Faculty, Department of Biostatistics, Adnan Menderes University, Aydın, Turkey
5
Moon H, Zhao Y, Pluta D, Ahn H. Subgroup analysis based on prognostic and predictive gene signatures for adjuvant chemotherapy in early-stage non-small-cell lung cancer patients. J Biopharm Stat 2017; 28:750-762. [PMID: 29157115 DOI: 10.1080/10543406.2017.1397006] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 10/18/2022]
Abstract
In treating patients diagnosed with early Stage I non-small-cell lung cancer (NSCLC), doctors must choose surgery alone, Adjuvant Cisplatin-Based Chemotherapy (ACT) alone or both. For patients with resected stages IB to IIIA, clinical trials have shown a survival advantage of 4-15% with the adoption of ACT. However, due to the inherent toxicity of chemotherapy, it is necessary for doctors to identify patients whose chance of success with ACT is sufficient to justify the risks. This research seeks to use gene expression profiling in the development of a statistical decision-making algorithm to identify patients whose survival rates will improve from ACT treatment. Using the data from the National Cancer Institute, the lasso method in the Cox proportional hazards regression model is used as the main method to determine a feasible number of genes that are strongly associated with treatment-related patient survival. Considering treatment groups separately, the patients are assigned a risk category based on the estimation of survival times. These risk categories are used to develop a Random Forests classification model to identify patients who are likely to benefit from chemotherapy treatment. This model allows the prediction of a new patient's prognosis and the likelihood of survival benefit from ACT treatment based on a feasible number of genomic biomarkers. The proposed methods are evaluated using a simulation study.
Affiliation(s)
- Hojin Moon
- Department of Mathematics and Statistics, California State University, Long Beach, Long Beach, CA, USA
- Yuan Zhao
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
- Dustin Pluta
- Department of Statistics, University of California, Irvine, CA, USA
- Hongshik Ahn
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY, USA
6
Chen Y, Chen JJ. Ensemble survival trees for identifying subpopulations in personalized medicine. Biom J 2016; 58:1151-63. [DOI: 10.1002/bimj.201500075] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 05/06/2015] [Revised: 12/03/2015] [Accepted: 01/18/2016] [Indexed: 11/08/2022]
Affiliation(s)
- Yu‐Chuan Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- James J. Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
7
8
Wiener M, Acland KM, Shaw HM, Soong SJ, Lin HY, Chen DT, Scolyer RA, Winstanley JB, Thompson JF. Sentinel node positive melanoma patients: prediction and prognostic significance of nonsentinel node metastases and development of a survival tree model. Ann Surg Oncol 2010; 17:1995-2005. [PMID: 20490699 DOI: 10.1245/s10434-010-1049-5] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Received: 08/03/2009] [Indexed: 11/18/2022]
Abstract
BACKGROUND Completion lymph node dissection (CLND) following positive sentinel node biopsy (SNB) for melanoma detects additional nonsentinel node (NSN) metastases in approximately 20% of cases. This study aimed to establish whether NSN status can be predicted, to determine its effect on survival, and to develop survival tree models for the sentinel node (SN) positive population. MATERIALS AND METHODS Sydney Melanoma Unit (SMU) patients with at least 1 positive SN, meeting inclusion criteria and treated between October 1992 and June 2005, were identified from the Unit database. Survival characteristics, potential predictors of survival, and NSN status were assessed using the Kaplan-Meier method, Cox regression model, and logistic regression analyses, respectively. Classification tree analysis was performed to identify groups with distinctly different survival characteristics. RESULTS A total of 323 SN-positive melanoma patients met the inclusion criteria. On multivariate analysis, age, gender, primary tumor thickness, mitotic rate, number of positive NSNs, or total number of positive nodes were statistically significant predictors of survival. NSN metastasis, found at CLND in 19% of patients, was only predicted to a statistically significant degree by ulceration. Multivariate analyses demonstrated that survival was more closely related to number of positive NSNs than total number of positive nodes. Classification tree analysis revealed 4 prognostically distinct survival groups. CONCLUSIONS Patients with NSN metastases could not be reliably identified prior to CLND. Prognosis following CLND was more closely related to number of positive NSNs than total number of positive nodes. Classification tree analysis defined distinctly different survival groups more accurately than use of single-factor analysis.
Affiliation(s)
- Martin Wiener
- Melanoma Institute Australia (formerly Sydney Melanoma Unit), Sydney, NSW, Australia
9
Abstract
Forty years ago, a clinical and histological classification scheme and prognostic factors were described for cutaneous melanoma. This scheme included the subtypes superficial spreading, nodular and lentigo maligna, and prognostic factors including tumor thickness, ulceration, and mitotic activity. There have been some tweaks to the classification scheme, but these basic findings form the foundation for melanoma diagnosis and staging today. Currently, no molecular marker or target has proved reliably useful in the staging or treatment of melanoma. Measurement with a simple ruler serves as the basis for the staging of primary cutaneous melanoma, while the recognition of primary tumor mitotic activity and ulceration also remain significant factors. Recently, mutational analysis has revealed a correlation of activating mutations with the morphological descriptors from decades ago. Future classification schemes may have more power in predicting response to therapy by integrating specific genomic and intra-tumoral expression profiles with histologic findings.
Affiliation(s)
- Lyn McDivitt Duncan
- Department of Pathology, Harvard Medical School, MGH Dermatopathology Unit WRN827, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA.
10
Koziol JA, Feng AC, Jia Z, Wang Y, Goodison S, McClelland M, Mercola D. The wisdom of the commons: ensemble tree classifiers for prostate cancer prognosis. Bioinformatics 2009; 25:54-60. [PMID: 18628288 PMCID: PMC2638928 DOI: 10.1093/bioinformatics/btn354] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 04/03/2008] [Revised: 07/09/2008] [Accepted: 07/10/2008] [Indexed: 11/13/2022] [Open Access]
Abstract
MOTIVATION Classification and regression trees have long been used for cancer diagnosis and prognosis. Nevertheless, instability and variable selection bias, as well as overfitting, are well-known problems of tree-based methods. In this article, we investigate whether ensemble tree classifiers can ameliorate these difficulties, using data from two recent studies of radical prostatectomy in prostate cancer. RESULTS Using time to progression following prostatectomy as the relevant clinical endpoint, we found that ensemble tree classifiers robustly and reproducibly identified three subgroups of patients in the two clinical datasets: non-progressors, early progressors and late progressors. Moreover, the consensus classifications were independent predictors of time to progression compared to known clinical prognostic factors.
Affiliation(s)
- James A Koziol
- The Scripps Research Institute, La Jolla, San Diego, CA, USA