1
|
Jagtap N, Kalapala R, Rughwani H, Singh AP, Inavolu P, Ramchandani M, Lakhtakia S, Manohar Reddy P, Sekaran A, Tandan M, Nabi Z, Basha J, Gupta R, Memon SF, Venkat Rao G, Sharma P, Nageshwar Reddy D. Application of machine-learning model to optimize colonic adenoma detection in India. Indian J Gastroenterol 2024:10.1007/s12664-024-01530-4. [PMID: 38758433 DOI: 10.1007/s12664-024-01530-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 01/05/2024] [Indexed: 05/18/2024]
Abstract
AIMS Data on the prevalence and risk factors of colonic adenoma from the Indian subcontinent are limited. We aimed to develop a machine-learning model to optimize colonic adenoma detection in a prospective cohort. METHODS All consecutive adult patients undergoing diagnostic colonoscopy between October 2020 and November 2022 were enrolled. Patients with a high risk of colonic adenoma were excluded. The predictive model was developed using the gradient-boosting machine (GBM) learning method. The GBM model was further optimized by adjusting the learning rate and the number of trees and by applying 10-fold cross-validation. RESULTS A total of 10,320 patients (mean age 45.18 ± 14.82 years; 69% men) were included in the study. In the overall population, 1152 (11.2%) patients had at least one adenoma. In patients aged > 50 years, the hospital-based adenoma prevalence was 19.5% (808/4144). The area under the receiver operating characteristic curve (AUC) (SD) of the logistic regression model was 72.55% (4.91%), while the AUCs for the deep learning, decision tree, random forest and gradient-boosted tree models were 76.25% (4.22%), 65.95% (4.01%), 79.38% (4.91%) and 84.76% (2.86%), respectively. After model optimization and cross-validation, the AUC of the gradient-boosted tree model increased to 92.2% (1.1%). CONCLUSIONS Machine-learning models may predict colorectal adenoma more accurately than logistic regression. A machine-learning model may help optimize the use of colonoscopy to prevent colorectal cancers. TRIAL REGISTRATION ClinicalTrials.gov (ID: NCT04512729).
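The 10-fold cross-validation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `k_fold_indices` and `summarize` are hypothetical, the shuffling seed is arbitrary, and it is only an assumption that the reported SD reflects spread across folds.

```python
import random
from statistics import mean, stdev


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Every fold except the i-th forms the training set.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def summarize(fold_scores):
    """Return the mean and sample standard deviation of per-fold scores."""
    return mean(fold_scores), stdev(fold_scores)
```

With the cohort size from the abstract (10,320 patients), each of the 10 test folds would hold 1,032 patients, and each patient would appear in exactly one test fold.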
Affiliation(s)
- Nitin Jagtap
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Rakesh Kalapala
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Hardik Rughwani
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Aniruddha Pratap Singh
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Pradev Inavolu
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Mohan Ramchandani
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Sundeep Lakhtakia
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- P Manohar Reddy
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Anuradha Sekaran
  - Department of Pathology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Manu Tandan
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Zaheer Nabi
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Jahangeer Basha
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Rajesh Gupta
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Sana Fathima Memon
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- G Venkat Rao
  - Department of Surgical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
- Prateek Sharma
  - The University of Kansas Medical Center, Kansas City, KS, USA
- D Nageshwar Reddy
  - Department of Medical Gastroenterology, Asian Institute of Gastroenterology, Hyderabad, 500 082, India
2
|
Zhang M, Zhang Y, Zhang W, Zhao L, Jing H, Wu X, Guo L, Zhang H, Zhang Y, Zhu S, Zhang S, Zhang X. Reply: Request for clarification on symptom assessment methodology in high-risk population colonoscopy study. Cancer Med 2023; 12:15629-15631. [PMID: 37264753 PMCID: PMC10417086 DOI: 10.1002/cam4.6164] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2023] [Accepted: 05/16/2023] [Indexed: 06/03/2023]
Affiliation(s)
- Mingqing Zhang
  - Nankai University School of Medicine, Nankai University, Tianjin, China
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
  - The Institute of Translational Medicine, Tianjin Union Medical Center of Nankai University, Tianjin, China
- Yongdan Zhang
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
- Wen Zhang
  - Center for Applied Mathematics, Tianjin University, Tianjin, China
- Lizhong Zhao
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
- Haoren Jing
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
- Xiaojing Wu
  - The Institute of Translational Medicine, Tianjin Union Medical Center of Nankai University, Tianjin, China
- Lu Guo
  - Center for Applied Mathematics, Tianjin University, Tianjin, China
- Haixiang Zhang
  - Center for Applied Mathematics, Tianjin University, Tianjin, China
- Yong Zhang
  - Center for Applied Mathematics, Tianjin University, Tianjin, China
- Siwei Zhu
  - Nankai University School of Medicine, Nankai University, Tianjin, China
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
  - The Institute of Translational Medicine, Tianjin Union Medical Center of Nankai University, Tianjin, China
- Shiwu Zhang
  - The Institute of Translational Medicine, Tianjin Union Medical Center of Nankai University, Tianjin, China
  - Department of Pathology, Tianjin Union Medical Center, Tianjin, China
- Xipeng Zhang
  - Nankai University School of Medicine, Nankai University, Tianjin, China
  - Department of Colorectal Surgery, Tianjin Union Medical Center, Tianjin, China
  - Tianjin Institute of Coloproctology, Tianjin, China
  - The Institute of Translational Medicine, Tianjin Union Medical Center of Nankai University, Tianjin, China
3
|
Kastrinos F, Kupfer SS, Gupta S. Colorectal Cancer Risk Assessment and Precision Approaches to Screening: Brave New World or Worlds Apart? Gastroenterology 2023; 164:812-827. [PMID: 36841490 PMCID: PMC10370261 DOI: 10.1053/j.gastro.2023.02.021] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/17/2023] [Revised: 02/12/2023] [Accepted: 02/17/2023] [Indexed: 02/27/2023]
Abstract
Current colorectal cancer (CRC) screening recommendations take a "one-size-fits-all" approach using age as the major criterion to initiate screening. Precision screening that incorporates factors beyond age to risk stratify individuals could improve on current approaches and optimally use available resources with benefits for patients, providers, and health care systems. Prediction models could identify high-risk groups who would benefit from more intensive screening, while low-risk groups could be recommended less intensive screening incorporating noninvasive screening modalities. In addition to age, prediction models incorporate well-established risk factors such as genetics (eg, family CRC history, germline, and polygenic risk scores), lifestyle (eg, smoking, alcohol, diet, and physical inactivity), sex, and race and ethnicity among others. Although several risk prediction models have been validated, few have been systematically studied for risk-adapted population CRC screening. In order to envisage clinical implementation of precision screening in the future, it will be critical to develop reliable and accurate prediction models that apply to all individuals in a population; prospectively study risk-adapted CRC screening on the population level; garner acceptance from patients and providers; and assess feasibility, resources, cost, and cost-effectiveness of these new paradigms. This review evaluates the current state of risk prediction modeling and provides a roadmap for future implementation of precision CRC screening.
Affiliation(s)
- Fay Kastrinos
  - Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, New York; Division of Digestive and Liver Diseases, Columbia University Medical Center and Vagelos College of Physicians and Surgeons, New York, New York
- Sonia S Kupfer
  - University of Chicago, Section of Gastroenterology, Hepatology and Nutrition, Chicago, Illinois
- Samir Gupta
  - Division of Gastroenterology, Department of Internal Medicine, University of California, San Diego, La Jolla, California; Veterans Affairs San Diego Healthcare System, San Diego, California
4
|
Yuan Z, Wang S, Liu Z, Liu Y, Wang Y, Han Y, Gao W, Liu X, Li H, Zhang Q, Ma H, Wang J, Wei X, Zhang X, Cui W, Zhang C. A risk scoring system for advanced colorectal neoplasia in high-risk participants to improve current colorectal cancer screening in Tianjin, China. BMC Gastroenterol 2022; 22:466. [PMCID: PMC9670427 DOI: 10.1186/s12876-022-02563-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2022] [Accepted: 11/02/2022] [Indexed: 11/18/2022]
Abstract
Background
Given the limited effectiveness of the current Chinese colorectal cancer (CRC) screening procedure, adherence to colonoscopy remains low. We aim to develop and validate a scoring system based on individuals who were identified as having a high risk in initial CRC screening to achieve more efficient risk stratification and improve adherence to colonoscopy.
Methods
A total of 29,504 screening participants with a positive High-Risk Factor Questionnaire (HRFQ) or faecal immunochemical test (FIT) result who underwent colonoscopy in Tianjin from 2012 to 2020 were enrolled in this study. Binary regression analysis was used to evaluate the association between risk factors and advanced colorectal neoplasia. Internal validation was also used to assess the performance of the scoring system.
Results
Male sex, older age (age ≥ 50 years), high body mass index (BMI ≥ 28 kg/m²), current or past smoking and weekly alcohol intake were identified as risk factors for advanced colorectal neoplasia. The odds ratios (ORs) for significant variables were applied to construct a risk score ranging from 0 to 11: LR, low risk (score 0–3); MR, moderate risk (score 4–6); and HR, high risk (score 7–11). Compared with subjects with LR, those with MR and HR had ORs of 2.47 (95% confidence interval, 2.09–2.93) and 4.59 (95% confidence interval, 3.86–5.44), respectively. The scoring model showed an outstanding discriminatory capacity with a c-statistic of 0.64 (95% confidence interval, 0.63–0.65).
Conclusions
Our results showed that the established scoring system could identify very high-risk populations with colorectal neoplasia. Combining this risk score with current Chinese screening methods may improve the effectiveness of CRC screening and adherence to colonoscopy.
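The score bands reported in the Results (0–3 low, 4–6 moderate, 7–11 high) can be written as a small lookup. This is only a sketch of the banding step: the abstract does not give the points assigned to each risk factor, so the total score is taken as an input, and the function name `risk_category` is hypothetical.

```python
def risk_category(score: int) -> str:
    """Map a total risk score (0-11) to the band named in the abstract."""
    if not 0 <= score <= 11:
        raise ValueError("score must be between 0 and 11")
    if score <= 3:
        return "LR"  # low risk, score 0-3
    if score <= 6:
        return "MR"  # moderate risk, score 4-6
    return "HR"      # high risk, score 7-11
```

For example, `risk_category(5)` returns `"MR"`.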
5
|
Wu W, Chen X, Fu C, Wong MC, Bao P, Huang J, Gong Y, Xu W, Gu K. Risk Scoring Systems for Predicting the Presence of Colorectal Neoplasia by Fecal Immunochemical Test Results in Chinese Population. Clin Transl Gastroenterol 2022; 13:e00525. [PMID: 36007185 PMCID: PMC9624592 DOI: 10.14309/ctg.0000000000000525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2022] [Accepted: 08/10/2022] [Indexed: 11/17/2022]
Abstract
INTRODUCTION Adherence to colonoscopy screening for colorectal cancer (CRC) is low in general populations, including among those who tested positive in the fecal immunochemical test (FIT). Developing tailored risk scoring systems by FIT results may allow for more accurate identification of individuals for colonoscopy. METHODS Among 807,109 participants who completed the primary tests in the first-round Shanghai CRC screening program, 71,023 attended the recommended colonoscopy. Predictors for colorectal neoplasia were used to develop separate scoring systems for FIT-positive and FIT-negative populations using logistic regression and artificial neural network methods. RESULTS Age, sex, area of residence, history of mucus or bloody stool, and CRC in first-degree relatives were identified as predictors for CRC in FIT-positive subjects, while a history of chronic diarrhea and prior cancer were additionally included for FIT-negative subjects. With an area under the receiver operating characteristic curve of more than 0.800 in predicting CRC, the logistic regression-based systems outperformed the artificial neural network-based ones and had a sensitivity of 68.9%, a specificity of 82.6%, and a detection rate of 0.24% by identifying 17.6% of subjects as high risk. We also reported an area under the receiver operating characteristic curve of about 0.660 for the systems predicting CRC and adenoma, with a sensitivity of 57.8%, a specificity of 64.6%, and a detection rate of 6.87% through classifying 38.1% of subjects as high-risk individuals. The performance of the scoring systems for CRC was superior to the currently used method in mainland China, and comparable with the scoring systems incorporating the FIT results. DISCUSSION The tailored risk scoring systems may better identify individuals at high risk of colorectal neoplasia and facilitate colonoscopy follow-up. External validation is warranted for widespread use of the scoring systems.
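Sensitivity, specificity, and the share of subjects classified as high risk, as quoted in the abstract above, follow from a 2×2 table of predicted risk group against colonoscopy outcome. The sketch below uses the standard definitions. The function name `screening_metrics` is hypothetical, and the abstract's "detection rate" is left out because its exact definition is not stated there.

```python
def screening_metrics(tp, fp, fn, tn):
    """Compute standard screening metrics from 2x2 confusion counts.

    tp: high-risk subjects with the outcome, fp: high-risk without it,
    fn: low-risk subjects with the outcome, tn: low-risk without it.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),           # cases correctly flagged
        "specificity": tn / (tn + fp),           # non-cases correctly cleared
        "flagged_high_risk": (tp + fp) / total,  # share referred onward
    }
```

With illustrative counts (not from the study) of `tp=8, fp=20, fn=2, tn=70`, sensitivity is 0.80, specificity is about 0.78, and 28% of subjects are flagged as high risk.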
Affiliation(s)
- Weimiao Wu
  - Global Health Institute, School of Public Health, Fudan University, Shanghai, China
- Xin Chen
  - Global Health Institute, School of Public Health, Fudan University, Shanghai, China
- Chen Fu
  - Shanghai Municipal Center for Disease Control & Prevention, Shanghai, China
- Martin C.S. Wong
  - The Jockey Club School of Public Health and Primary Care, Faculty of Medicine, Chinese University of Hong Kong, Hong Kong SAR, China
- Pingping Bao
  - Shanghai Municipal Center for Disease Control & Prevention, Shanghai, China
- Junjie Huang
  - The Jockey Club School of Public Health and Primary Care, Faculty of Medicine, Chinese University of Hong Kong, Hong Kong SAR, China
- Yangming Gong
  - Shanghai Municipal Center for Disease Control & Prevention, Shanghai, China
- Wanghong Xu
  - Global Health Institute, School of Public Health, Fudan University, Shanghai, China
- Kai Gu
  - Shanghai Municipal Center for Disease Control & Prevention, Shanghai, China
6
|
Abhari RE, Thomson B, Yang L, Millwood I, Guo Y, Yang X, Lv J, Avery D, Pei P, Wen P, Yu C, Chen Y, Chen J, Li L, Chen Z, Kartsonaki C. External validation of models for predicting risk of colorectal cancer using the China Kadoorie Biobank. BMC Med 2022; 20:302. [PMID: 36071519 PMCID: PMC9454206 DOI: 10.1186/s12916-022-02488-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Accepted: 07/17/2022] [Indexed: 12/24/2022]
Abstract
BACKGROUND In China, colorectal cancer (CRC) incidence and mortality have been steadily increasing over the last decades. Risk models to predict incident CRC have been developed in various populations, but they have not been systematically externally validated in a Chinese population. This study aimed to assess the performance of risk scores in predicting CRC using the China Kadoorie Biobank (CKB), one of the largest and most geographically diverse prospective cohort studies in China. METHODS Nine models were externally validated in 512,415 CKB participants, among whom there were 2976 cases of CRC. Model discrimination was assessed, overall and by sex, age, site, and geographic location, using the area under the receiver operating characteristic curve (AUC). Model discrimination of these nine models was compared to a model using age alone. Calibration was assessed for five models, and they were re-calibrated in CKB. RESULTS The three models with the highest discrimination (Ma (Cox model) AUC 0.70 [95% CI 0.69-0.71]; Aleksandrova 0.70 [0.69-0.71]; Hong 0.69 [0.67-0.71]) included the variables age, smoking, and alcohol. These models performed significantly better than a model based on age alone (AUC of 0.65 [95% CI 0.64-0.66]). Model discrimination was generally higher in younger participants, males, urban environments, and for colon cancer. The two models (Guo and Chen) developed in Chinese populations did not perform better than the others. Among the 10% of participants with the highest risk, the three best performing models identified 24-26% of participants who went on to develop CRC. CONCLUSIONS Several risk models based on easily obtainable demographic and modifiable lifestyle factors have good discrimination in a Chinese population. The three best performing models have higher discrimination than a model based on age alone.
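The AUC used above to compare model discrimination has a direct interpretation: it is the probability that a randomly chosen case receives a higher risk score than a randomly chosen non-case, with ties counted as one half. A minimal sketch of that pairwise definition follows. The function name `auc` is hypothetical, and the quadratic loop is for illustration only, not for cohort-scale data.

```python
def auc(case_scores, control_scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = 0.0
    for c in case_scores:
        for d in control_scores:
            if c > d:
                wins += 1.0   # case ranked above control
            elif c == d:
                wins += 0.5   # tie counts as half
    return wins / (len(case_scores) * len(control_scores))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.65 (age alone) and 0.70 (best models) sit.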
Affiliation(s)
- Roxanna E Abhari
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, Big Data Institute Building, Roosevelt Drive, University of Oxford, Oxford, UK
- Blake Thomson
  - Department of Surveillance and Health Equity Science, American Cancer Society, Atlanta, GA, USA
- Ling Yang
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, Big Data Institute Building, Roosevelt Drive, University of Oxford, Oxford, UK
  - Medical Research Council Population Health Research Unit (MRC PHRU), Nuffield Department of Population Health, Big Data Institute Building, University of Oxford, Old Road Campus, Oxford, OX3 7LF, UK
- Iona Millwood
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, Big Data Institute Building, Roosevelt Drive, University of Oxford, Oxford, UK
  - Medical Research Council Population Health Research Unit (MRC PHRU), Nuffield Department of Population Health, Big Data Institute Building, University of Oxford, Old Road Campus, Oxford, OX3 7LF, UK
- Yu Guo
  - Fuwai Hospital, Chinese Academy of Medical Sciences, Beijing, 102308, China
- Xiaoming Yang
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, Big Data Institute Building, Roosevelt Drive, University of Oxford, Oxford, UK
- Jun Lv
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, 38 Xueyuan Road, Beijing, 100191, China
- Daniel Avery
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, Big Data Institute Building, Roosevelt Drive, University of Oxford, Oxford, UK
- Pei Pei
  - Chinese Academy of Medical Sciences, Building C, NCCD, Shilongxi Rd., Mentougou District, Beijing, 102308, China
- Peng Wen
  - Maiji CDC, No. 29 Shangbu Road, Maiji, Tianshui, 741020, Gansu, China
- Canqing Yu
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, 38 Xueyuan Road, Beijing, 100191, China
- Yiping Chen
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, Big Data Institute Building, Roosevelt Drive, University of Oxford, Oxford, UK
  - Medical Research Council Population Health Research Unit (MRC PHRU), Nuffield Department of Population Health, Big Data Institute Building, University of Oxford, Old Road Campus, Oxford, OX3 7LF, UK
- Junshi Chen
  - National Center for Food Safety Risk Assessment, 37 Guangqu Road, Beijing, 100021, China
- Liming Li
  - Department of Epidemiology and Biostatistics, School of Public Health, Peking University, 38 Xueyuan Road, Beijing, 100191, China
- Zhengming Chen
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, Big Data Institute Building, Roosevelt Drive, University of Oxford, Oxford, UK
  - Medical Research Council Population Health Research Unit (MRC PHRU), Nuffield Department of Population Health, Big Data Institute Building, University of Oxford, Old Road Campus, Oxford, OX3 7LF, UK
- Christiana Kartsonaki
  - Clinical Trial Service Unit & Epidemiological Studies Unit (CTSU), Nuffield Department of Population Health, Big Data Institute Building, Roosevelt Drive, University of Oxford, Oxford, UK
  - Medical Research Council Population Health Research Unit (MRC PHRU), Nuffield Department of Population Health, Big Data Institute Building, University of Oxford, Old Road Campus, Oxford, OX3 7LF, UK
7
|
Cairns JM, Greenley S, Bamidele O, Weller D. A scoping review of risk-stratified bowel screening: current evidence, future directions. Cancer Causes Control 2022; 33:653-685. [PMID: 35306592 PMCID: PMC8934381 DOI: 10.1007/s10552-022-01568-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2021] [Accepted: 03/02/2022] [Indexed: 12/21/2022]
Abstract
PURPOSE In this scoping review, we examined the international literature on risk-stratified bowel screening to develop recommendations for future research, practice and policy. METHODS Six electronic databases were searched from inception to 18 October 2021: Medline, Embase, PsycINFO, CINAHL, Cochrane Database of Systematic Reviews and Cochrane Central Register of Controlled Trials. Forward and backward citation searches were also undertaken. All relevant literature was included. RESULTS After deduplication, 3,629 records remained. Of these, 3,416 were excluded at the title/abstract screening stage, and a further 111 were excluded at the full-text screening stage. In total, 102 unique studies were included. Results showed that risk-stratified bowel screening programmes can potentially improve diagnostic performance, but there is a lack of information on longer-term outcomes. Risk models do appear to show promise in refining existing risk stratification guidelines, but most were not externally validated and fewer than half achieved good discriminatory power. Risk assessment tools in primary care have the potential for high levels of acceptability and uptake and could therefore form an important component of future risk-stratified bowel screening programmes, but sometimes the screening recommendations were not adhered to by the patient or healthcare provider. The review identified important knowledge gaps, most notably in the organisation of screening services, owing to the small number of pilots, and in what risk stratification might mean for inequalities. CONCLUSION We recommend that future research focus on the organisational challenges that risk-stratified bowel screening may face and consider inequalities in any changes to organised bowel screening programmes.
Affiliation(s)
- J M Cairns
  - Hull York Medical School, University of Hull, Cottingham Road, Hull, HU6 7HR, UK
- S Greenley
  - Hull York Medical School, University of Hull, Cottingham Road, Hull, HU6 7HR, UK
- O Bamidele
  - Hull York Medical School, University of Hull, Cottingham Road, Hull, HU6 7HR, UK
- D Weller
  - Centre for Population Health Sciences, University of Edinburgh, Teviot Place, Edinburgh, EH8 9AG, UK
8
|
Wu WM, Gu K, Yang YH, Bao PP, Gong YM, Shi Y, Xu WH, Fu C. Improved risk scoring systems for colorectal cancer screening in Shanghai, China. Cancer Med 2022; 11:1972-1983. [PMID: 35274820 PMCID: PMC9089226 DOI: 10.1002/cam4.4576] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/14/2021] [Revised: 12/13/2021] [Accepted: 01/03/2022] [Indexed: 12/16/2022]
Abstract
Background An optimal risk-scoring system enables more targeted offers for colonoscopy in colorectal cancer (CRC) screening. This analysis aims to develop and validate scoring systems using parametric and non-parametric methods for average-risk populations. Methods Screening data of 807,695 subjects and 2806 detected cases in the first-round CRC screening program in Shanghai were used to develop risk-predictive models and scoring systems using logistic-regression (LR) and artificial-neural-network (ANN) methods. Performance of the established scoring systems was evaluated using area under the receiver operating characteristic curve (AUC), calibration, sensitivity, specificity, number of high-risk individuals and potential detection rates of CRC. Results Age, sex, CRC in first-degree relatives, chronic diarrhoea, mucus or bloody stool, history of any cancer and faecal-immunochemical-test (FIT) results were identified as predictors for the presence of CRC. The AUC of the LR-based system was 0.642 when using risk factors only in the derivation set, and increased to 0.774 by further incorporating one-sample FIT results, and to 0.808 by including two-sample FIT results, while those for the ANN-based systems were 0.639, 0.763 and 0.805, respectively. Better calibration was observed for the LR-based systems than for the ANN-based ones. Compared with the currently used initial tests, parallel use of FIT with LR-based systems resulted in improved specificities, lower demand for colonoscopy and higher detection rates of CRC, while parallel use of FIT with ANN-based systems had higher sensitivities; incorporating FIT in the scoring systems further increased specificities, decreased colonoscopy demand and improved detection rates of CRC. Conclusions Our results indicate the potential of LR-based scoring systems incorporating one- or two-sample FIT results for CRC mass screening. External validation is warranted for scaling-up implementation in the Chinese population.
The established scoring systems derived from the logistic regression (LR) models, incorporating one‐ or two‐sample faecal immunochemical test (FIT) results as a predictor, have the potential to triage high‐risk individuals for colonoscopy in mass screening of colorectal cancer (CRC). More importantly, the cut‐off points of the scoring systems can be adjusted flexibly, facilitating the choices of cut‐off values for populations with abundant or limited resources.
Affiliation(s)
- Wei-Miao Wu
  - Global Health Institute, School of Public Health, Fudan University, Shanghai, China
- Kai Gu
  - Shanghai Municipal Center for Disease Control & Prevention, Shanghai, China
- Yi-Hui Yang
  - Global Health Institute, School of Public Health, Fudan University, Shanghai, China
- Ping-Ping Bao
  - Shanghai Municipal Center for Disease Control & Prevention, Shanghai, China
- Yang-Ming Gong
  - Shanghai Municipal Center for Disease Control & Prevention, Shanghai, China
- Yan Shi
  - Shanghai Municipal Center for Disease Control & Prevention, Shanghai, China
- Wang-Hong Xu
  - Global Health Institute, School of Public Health, Fudan University, Shanghai, China
- Chen Fu
  - Shanghai Municipal Center for Disease Control & Prevention, Shanghai, China
9
|
Hussan H, Zhao J, Badu-Tawiah AK, Stanich P, Tabung F, Gray D, Ma Q, Kalady M, Clinton SK. Utility of machine learning in developing a predictive model for early-age-onset colorectal neoplasia using electronic health records. PLoS One 2022; 17:e0265209. [PMID: 35271664 PMCID: PMC9064446 DOI: 10.1371/journal.pone.0265209] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2021] [Accepted: 02/24/2022] [Indexed: 11/19/2022]
Abstract
BACKGROUND AND AIMS The incidence of colorectal cancer (CRC) is increasing in adults younger than 50, and early screening remains challenging due to cost and under-utilization. To identify individuals aged 35-50 years who may benefit from early screening, we developed a prediction model using machine learning and electronic health record (EHR)-derived factors. METHODS We enrolled 3,116 adults aged 35-50 at average risk for CRC who underwent colonoscopy between 2017 and 2020 at a single center. Prediction outcomes were (1) CRC and (2) CRC or high-risk polyps. We derived our predictors from EHRs (e.g., demographics, obesity, laboratory values, medications, and zip code-derived factors). We constructed four machine learning-based models using a training set (random sample of 70% of participants): regularized discriminant analysis, random forest, neural network, and gradient boosting decision tree. In the testing set (remaining 30% of participants), we measured predictive performance by comparing C-statistics to a reference model (logistic regression). RESULTS The study sample was 55.1% female, 32.8% non-white, and included 16 (0.5%) CRC cases and 478 (15.3%) cases of CRC or high-risk polyps. All machine learning models predicted CRC with higher discriminative ability compared to the reference model [e.g., C-statistics (95%CI); neural network: 0.75 (0.48-1.00) vs. reference: 0.43 (0.18-0.67); P = 0.07]. Furthermore, all machine learning approaches, except for gradient boosting, predicted CRC or high-risk polyps significantly better than the reference model [e.g., C-statistics (95%CI); regularized discriminant analysis: 0.64 (0.59-0.69) vs. reference: 0.55 (0.50-0.59); P<0.0015]. The most important predictive variables in the regularized discriminant analysis model for CRC or high-risk polyps were income per zip code, the colonoscopy indication, and body mass index quartiles.
DISCUSSION Machine learning can predict CRC risk in adults aged 35-50 using EHR with improved discrimination. Further development of our model is needed, followed by validation in a primary-care setting, before clinical application.
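The random 70%/30% training and testing partition described in the Methods above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `train_test_split`, the fixed seed, and the rounding rule for the cut point are all hypothetical choices.

```python
import random


def train_test_split(items, train_fraction=0.7, seed=0):
    """Randomly partition items into a training set and a testing set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Cut point: the first train_fraction of the shuffled items train the model.
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Applied to 3,116 participant identifiers, this rounding rule would give 2,181 for training and 935 for testing. The study's actual set sizes are not stated in the abstract.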
Affiliation(s)
- Hisham Hussan
  - Division of Gastroenterology, Hepatology, and Nutrition, Department of Internal Medicine, The Ohio State University, Columbus, Ohio, United States of America
  - Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, United States of America
- Jing Zhao
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
- Abraham K. Badu-Tawiah
  - Division of Gastroenterology, Hepatology, and Nutrition, Department of Internal Medicine, The Ohio State University, Columbus, Ohio, United States of America
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio, United States of America
  - Department of Microbial Infection and Immunity, The Ohio State University, Columbus, Ohio, United States of America
- Peter Stanich
  - Division of Gastroenterology, Hepatology, and Nutrition, Department of Internal Medicine, The Ohio State University, Columbus, Ohio, United States of America
- Fred Tabung
  - Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, United States of America
  - Division of Medical Oncology, Department of Internal Medicine, College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
- Darrell Gray
  - Division of Gastroenterology, Hepatology, and Nutrition, Department of Internal Medicine, The Ohio State University, Columbus, Ohio, United States of America
  - Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, United States of America
- Qin Ma
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, Ohio, United States of America
- Matthew Kalady
  - Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, United States of America
  - Division of Colon and Rectal Surgery, Department of Surgery, The Ohio State University, Columbus, Ohio, United States of America
- Steven K. Clinton
  - Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio, United States of America
  - Division of Medical Oncology, Department of Internal Medicine, College of Medicine, The Ohio State University, Columbus, Ohio, United States of America