1
Balasubramanian JB, Choudhury PP, Mukhopadhyay S, Ahearn T, Chatterjee N, García-Closas M, Almeida JS. Wasm-iCARE: a portable and privacy-preserving web module to build, validate, and apply absolute risk models. JAMIA Open 2024; 7:ooae055. [PMID: 38938691 PMCID: PMC11208928 DOI: 10.1093/jamiaopen/ooae055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 04/24/2024] [Accepted: 06/14/2024] [Indexed: 06/29/2024] Open
Abstract
Objectives: Absolute risk models estimate an individual's future disease risk over a specified time interval. Applications utilizing server-side risk tooling, the R-based iCARE (R-iCARE), to build, validate, and apply absolute risk models, face limitations in portability and privacy due to their need for circulating user data in remote servers for operation. We overcome this by porting iCARE to the web platform. Materials and Methods: We refactored R-iCARE into a Python package (Py-iCARE) and then compiled it to WebAssembly (Wasm-iCARE), a portable web module that operates within the privacy of the user's device. Results: We showcase the portability and privacy of Wasm-iCARE through 2 applications: for researchers to statistically validate risk models and to deliver them to end-users. Both applications run entirely on the client side, requiring no downloads or installations, and keep user data on-device during risk calculation. Conclusions: Wasm-iCARE fosters accessible and privacy-preserving risk tools, accelerating their validation and delivery.
Affiliation(s)
- Jeya Balaji Balasubramanian
  - Trans-Divisional Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20850, United States
- Parichoy Pal Choudhury
  - Trans-Divisional Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20850, United States
  - Surveillance & Health Equity Science, American Cancer Society, Atlanta, GA 30303, United States
- Srijon Mukhopadhyay
  - Trans-Divisional Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20850, United States
- Thomas Ahearn
  - Trans-Divisional Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20850, United States
- Nilanjan Chatterjee
  - Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, MD 21205, United States
- Montserrat García-Closas
  - Trans-Divisional Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20850, United States
  - The Cancer Epidemiology and Prevention Research Unit, The Institute of Cancer Research, London, SM2 5NG, United Kingdom
- Jonas S Almeida
  - Trans-Divisional Research Program, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD 20850, United States
2
Choi J, Ha TW, Choi HM, Lee HB, Shin HC, Chung W, Han W. Development of a Breast Cancer Risk Prediction Model Incorporating Polygenic Risk Scores and Nongenetic Risk Factors for Korean Women. Cancer Epidemiol Biomarkers Prev 2023; 32:1182-1189. [PMID: 37310812 PMCID: PMC10472098 DOI: 10.1158/1055-9965.epi-23-0064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 04/19/2023] [Accepted: 06/09/2023] [Indexed: 06/15/2023] Open
Abstract
BACKGROUND: To develop a breast cancer prediction model for Korean women using published polygenic risk scores (PRS) combined with nongenetic risk factors (NGRF). METHODS: Thirteen PRS models generated from single or multiple combinations of the Asian and European PRSs were evaluated among 20,434 Korean women. The AUC and increase in OR per SD were compared for each PRS. The PRSs with the highest predictive power were combined with NGRFs; then, an integrated prediction model was established using the Individualized Coherent Absolute Risk Estimation (iCARE) tool. The absolute breast cancer risk was stratified for 18,142 women with available follow-up data. RESULTS: PRS38_ASN+PRS190_EB, a combination of Asian and European PRSs, had the highest AUC (0.621) among PRSs, with an OR per SD increase of 1.45 (95% confidence interval: 1.31-1.61). Compared with the average risk group (35%-65%), women in the top 5% had a 2.5-fold higher risk of breast cancer. Incorporating NGRFs yielded a modest increase in the AUC of women aged >50 years. For PRS38_ASN+PRS190_EB+NGRF, the average absolute risk was 5.06%. The lifetime absolute risk at age 80 years for women in the top 5% was 9.93%, whereas that of women in the lowest 5% was 2.22%. Women at higher risks were more sensitive to NGRF incorporation. CONCLUSIONS: Combined Asian and European PRSs were predictive of breast cancer in Korean women. Our findings support the use of these models for personalized screening and prevention of breast cancer. IMPACT: Our study provides insights into genetic susceptibility and NGRFs for predicting breast cancer in Korean women.
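As a rough illustration of the AUC reported in this abstract, the sketch below computes it as the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. The function name `auc` and the toy scores are assumptions made for illustration; they are not taken from the study or from the iCARE tool.

```python
def auc(case_scores, control_scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    Hypothetical helper for illustration only: counts the fraction of
    case/control pairs in which the case has the higher score, with
    ties contributing one half.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy, made-up risk scores (not study data).
toy_cases = [0.3, 0.7]
toy_controls = [0.5]
toy_auc = auc(toy_cases, toy_controls)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of cases from controls; the quadratic loop is adequate for a sketch but a rank-based computation would be preferable for large cohorts.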
Affiliation(s)
- Jihye Choi
  - Department of General Surgery, National Medical Center, Seoul, Republic of Korea
  - Department of Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea
- Han-Byoel Lee
  - Department of Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea
  - DCGen, Co., Ltd., Seoul, Republic of Korea
  - Biomedical Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
  - Cancer Research Institute, Seoul National University, Seoul, Republic of Korea
- Hee-Chul Shin
  - DCGen, Co., Ltd., Seoul, Republic of Korea
  - Department of Surgery, Seoul National University Bundang Hospital, Seongnam, Republic of Korea
- Wonshik Han
  - Department of Surgery, Seoul National University College of Medicine, Seoul, Republic of Korea
  - DCGen, Co., Ltd., Seoul, Republic of Korea
  - Biomedical Research Institute, Seoul National University Hospital, Seoul, Republic of Korea
  - Cancer Research Institute, Seoul National University, Seoul, Republic of Korea
3
Jee YH, Ho WK, Park S, Easton DF, Teo SH, Jung KJ, Kraft P. Polygenic risk scores for prediction of breast cancer in Korean women. Int J Epidemiol 2023; 52:796-805. [PMID: 36343017 PMCID: PMC10244045 DOI: 10.1093/ije/dyac206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2021] [Accepted: 10/31/2022] [Indexed: 11/09/2022] Open
Abstract
BACKGROUND: Polygenic risk scores (PRSs) for breast cancer, developed using European and Asian genome-wide association studies (GWAS), have been shown to have good discrimination in Asian women. However, prospective calibration of absolute risk prediction models, based on a PRS or PRS combined with lifestyle, clinical and environmental factors, in Asian women is limited. METHODS: We consider several PRSs trained using European and/or Asian GWAS. For each PRS, we evaluate the discrimination and calibration of three absolute risk models among 41 031 women from the Korean Cancer Prevention Study (KCPS)-II Biobank: (i) a model using incidence, mortality and risk factor distributions (reference inputs) among US women and European relative risks; (ii) a recalibrated model, using Korean reference but European relative risks; and (iii) a fully Korean-based model using Korean reference and relative risk estimates from KCPS. RESULTS: All Asian and European PRS improved discrimination over lifestyle, clinical and environmental (Qx) factors in Korean women. US-based absolute risk models overestimated the risks for women aged ≥50 years, and this overestimation was larger for models that only included PRS (expected-to-observed ratio E/O = 1.2 for women <50, E/O = 2.7 for women ≥50). Recalibrated and Korean-based risk models had better calibration in the large, although the risk in the highest decile was consistently overestimated. Absolute risk projections suggest that risk-reducing lifestyle changes would lead to larger absolute risk reductions among women at higher PRS. CONCLUSIONS: Absolute risk models incorporating PRS trained in European and Asian GWAS and population-appropriate average age-specific incidences may be useful for risk-stratified interventions in Korean women.
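The expected-to-observed (E/O) ratio used in this abstract as a measure of calibration in the large can be sketched as follows: the expected count is the sum of the model's predicted absolute risks over the cohort, and the observed count is the number of women who actually developed the disease. The function name `expected_to_observed` and the toy inputs are assumptions for illustration and are not drawn from the study.

```python
def expected_to_observed(predicted_risks, observed_events):
    """Calibration-in-the-large summary: E/O ratio.

    Hypothetical helper for illustration only.
    predicted_risks: per-person predicted absolute risks in [0, 1].
    observed_events: per-person outcomes, 1 if the event occurred, else 0.
    """
    if len(predicted_risks) != len(observed_events):
        raise ValueError("inputs must have the same length")
    expected = sum(predicted_risks)   # expected number of events
    observed = sum(observed_events)   # observed number of events
    if observed == 0:
        raise ValueError("E/O is undefined when no events are observed")
    return expected / observed


# Toy, made-up cohort of four people (not study data).
toy_risks = [0.5, 0.5, 0.5, 0.5]
toy_events = [1, 0, 0, 0]
toy_ratio = expected_to_observed(toy_risks, toy_events)
```

A ratio above 1 means the model predicts more events than were observed (overestimation), and a ratio below 1 means underestimation; a ratio near 1 says nothing about calibration within individual risk deciles.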
Affiliation(s)
- Yon Ho Jee
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Weang-Kee Ho
  - School of Mathematical Sciences, Faculty of Science and Engineering, University of Nottingham Malaysia, Semenyih, Selangor, Malaysia
  - Cancer Research Malaysia, Subang Jaya, Selangor, Malaysia
- Sohee Park
  - Department of Biostatistics, Yonsei University Graduate School of Public Health, Seoul, Republic of Korea
- Douglas F Easton
  - Centre for Cancer Genetic Epidemiology, Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
  - Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Cambridge, UK
- Soo-Hwang Teo
  - Cancer Research Malaysia, Subang Jaya, Selangor, Malaysia
  - Sime Darby Medical Centre, Subang Jaya, Selangor, Malaysia
- Keum Ji Jung
  - Institute for Health Promotion, Graduate School of Public Health, Yonsei University, Seoul, Republic of Korea
  - Nuffield Department Population Health, University of Oxford, Oxford, UK
- Peter Kraft
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
4
Ding Y, Hou K, Xu Z, Pimplaskar A, Petter E, Boulier K, Privé F, Vilhjálmsson BJ, Olde Loohuis LM, Pasaniuc B. Polygenic scoring accuracy varies across the genetic ancestry continuum. Nature 2023; 618:774-781. [PMID: 37198491 PMCID: PMC10284707 DOI: 10.1038/s41586-023-06079-4] [Citation(s) in RCA: 65] [Impact Index Per Article: 65.0] [Received: 09/28/2022] [Accepted: 04/12/2023] [Indexed: 05/19/2023]
Abstract
Polygenic scores (PGSs) have limited portability across different groupings of individuals (for example, by genetic ancestries and/or social determinants of health), preventing their equitable use [1-3]. PGS portability has typically been assessed using a single aggregate population-level statistic (for example, R²) [4], ignoring inter-individual variation within the population. Here, using a large and diverse Los Angeles biobank [5] (ATLAS, n = 36,778) along with the UK Biobank [6] (UKBB, n = 487,409), we show that PGS accuracy decreases individual-to-individual along the continuum of genetic ancestries [7] in all considered populations, even within traditionally labelled 'homogeneous' genetic ancestries. The decreasing trend is well captured by a continuous measure of genetic distance (GD) from the PGS training data: Pearson correlation of -0.95 between GD and PGS accuracy averaged across 84 traits. When applying PGS models trained on individuals labelled as white British in the UKBB to individuals with European ancestries in ATLAS, individuals in the furthest GD decile have 14% lower accuracy relative to the closest decile; notably, the closest GD decile of individuals with Hispanic Latino American ancestries show similar PGS performance to the furthest GD decile of individuals with European ancestries. GD is significantly correlated with PGS estimates themselves for 82 of 84 traits, further emphasizing the importance of incorporating the continuum of genetic ancestries in PGS interpretation. Our results highlight the need to move away from discrete genetic ancestry clusters towards the continuum of genetic ancestries when considering PGSs.
Affiliation(s)
- Yi Ding
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Kangcheng Hou
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Ziqi Xu
  - Department of Computer Science, UCLA, Los Angeles, CA, USA
- Aditya Pimplaskar
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Ella Petter
  - Department of Computer Science, UCLA, Los Angeles, CA, USA
- Kristin Boulier
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Florian Privé
  - National Centre for Register-based Research, Aarhus University, Aarhus, Denmark
- Bjarni J Vilhjálmsson
  - National Centre for Register-based Research, Aarhus University, Aarhus, Denmark
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
  - Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute, Cambridge, MA, USA
- Loes M Olde Loohuis
  - Center for Neurobehavioral Genetics, Semel Institute for Neuroscience and Human Behavior, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
- Bogdan Pasaniuc
  - Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
  - Department of Pathology and Laboratory Medicine, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
  - Institute for Precision Health, UCLA, Los Angeles, CA, USA
5
Breast Cancer Risk Assessment Tools for Stratifying Women into Risk Groups: A Systematic Review. Cancers (Basel) 2023; 15:cancers15041124. [PMID: 36831466 PMCID: PMC9953796 DOI: 10.3390/cancers15041124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Revised: 01/31/2023] [Accepted: 02/01/2023] [Indexed: 02/12/2023] Open
Abstract
BACKGROUND: The benefits and harms of breast screening may be better balanced through a risk-stratified approach. We conducted a systematic review assessing the accuracy of questionnaire-based risk assessment tools for this purpose. METHODS: Population: asymptomatic women aged ≥40 years; Intervention: questionnaire-based risk assessment tool (incorporating breast density and polygenic risk where available); Comparison: different tool applied to the same population; Primary outcome: breast cancer incidence; Scope: external validation studies identified from databases including Medline and Embase (period 1 January 2008-20 July 2021). We assessed calibration (goodness-of-fit) between expected and observed cancers and compared observed cancer rates by risk group. Risk of bias was assessed with PROBAST. RESULTS: Of 5124 records, 13 were included examining 11 tools across 15 cohorts. The Gail tool was most represented (n = 11), followed by Tyrer-Cuzick (n = 5), BRCAPRO and iCARE-Lit (n = 3). No tool was consistently well-calibrated across multiple studies and breast density or polygenic risk scores did not improve calibration. Most tools identified a risk group with higher rates of observed cancers, but few tools identified lower-risk groups across different settings. All tools demonstrated a high risk of bias. CONCLUSION: Some risk tools can identify groups of women at higher or lower breast cancer risk, but this is highly dependent on the setting and population.
6
Lou SJ, Hou MF, Chang HT, Chiu CC, Lee HH, Yeh SCJ, Shi HY. Machine Learning Algorithms to Predict Recurrence within 10 Years after Breast Cancer Surgery: A Prospective Cohort Study. Cancers (Basel) 2020; 12:cancers12123817. [PMID: 33348826 PMCID: PMC7765963 DOI: 10.3390/cancers12123817] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/17/2020] [Revised: 12/11/2020] [Accepted: 12/14/2020] [Indexed: 02/07/2023] Open
Abstract
No studies have discussed machine learning algorithms to predict recurrence within 10 years after breast cancer surgery. This study aimed to compare the accuracy of forecasting models in predicting recurrence within 10 years after breast cancer surgery and to identify significant predictors of recurrence. Registry data for breast cancer surgery patients were allocated to a training dataset (n = 798) for model development, a testing dataset (n = 171) for internal validation, and a validating dataset (n = 171) for external validation. Global sensitivity analysis was then performed to evaluate the significance of the selected predictors. Demographic characteristics, clinical characteristics, quality of care, and preoperative quality of life were significantly associated with recurrence within 10 years after breast cancer surgery (p < 0.05). Artificial neural networks had the highest prediction performance indices. Additionally, surgeon volume was the best predictor of recurrence within 10 years after breast cancer surgery, followed by hospital volume and tumor stage. Accurate prediction of recurrence within 10 years by machine learning algorithms may improve precision in managing patients after breast cancer surgery and improve understanding of risk factors for recurrence.
Affiliation(s)
- Shi-Jer Lou
  - Graduate Institute of Technological and Vocational Education, National Pingtung University of Science and Technology, Pingtung 91201, Taiwan
- Ming-Feng Hou
  - College of Medicine, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
  - Department of Surgery, Kaohsiung Medical University Hospital, Kaohsiung 80756, Taiwan
- Hong-Tai Chang
  - Department of Surgery, Kaohsiung Municipal United Hospital, Kaohsiung 80457, Taiwan
- Chong-Chi Chiu
  - Department of General Surgery, E-Da Cancer Hospital, I-Shou University, Kaohsiung 82445, Taiwan
  - School of Medicine, College of Medicine, I-Shou University, Kaohsiung 82445, Taiwan
- Hao-Hsien Lee
  - Department of General Surgery, Chi Mei Medical Center, Liouying, Tainan 73657, Taiwan
- Shu-Chuan Jennifer Yeh
  - Department of Healthcare Administration and Medical Informatics, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
  - Department of Business Management, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
- Hon-Yi Shi
  - Department of Healthcare Administration and Medical Informatics, Kaohsiung Medical University, Kaohsiung 80708, Taiwan
  - Department of Business Management, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
  - Department of Medical Research, Kaohsiung Medical University Hospital, Kaohsiung 80708, Taiwan
  - Department of Medical Research, China Medical University Hospital, China Medical University, Taichung 40402, Taiwan
  - Correspondence: Tel.: +886-7-321-1101 (ext. 2648); Fax: +886-7-313-7487