1
Alganmi N, Bashanfar A, Alotaibi R, Banjar H, Karim S, Mirza Z, Abusamra H, Al-Attas M, Turkistany S, Abuzenadah A. Uncovering hidden genetic risk factors for breast and ovarian cancers in BRCA-negative women: a machine learning approach in the Saudi population. PeerJ Comput Sci 2024; 10:e1942. [PMID: 38660159; PMCID: PMC11042021; DOI: 10.7717/peerj-cs.1942] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 02/26/2024] [Indexed: 04/26/2024]
Abstract
Breast and ovarian cancers are prevalent worldwide, with genetic factors such as BRCA1 and BRCA2 mutations playing a significant role. However, not all patients carry these mutations, which makes it challenging to identify their risk factors. Researchers have turned to whole exome sequencing (WES) as a tool to identify genetic risk factors in BRCA-negative women. WES sequences all protein-coding regions of an individual's genome, providing a more comprehensive analysis than traditional gene-by-gene sequencing methods. This technology offers efficiency, cost-effectiveness and the potential to identify new genetic variants that contribute to susceptibility to these diseases. However, interpreting WES data to find disease-causing variants is challenging because of the complexity of the data. Machine learning techniques can uncover hidden genetic-variant patterns associated with cancer susceptibility. In this study, we used the extreme gradient boosting (XGBoost) and random forest (RF) algorithms to identify BRCA-related cancer high-risk genes specifically in the Saudi population. The experimental results showed that the RF method performed better, with an accuracy of 88.16% and an area under the receiver operating characteristic curve of 0.95. Using bioinformatics analysis tools, we explored the top features of the high-accuracy machine learning model that we built to enhance our knowledge of genetic interactions and find complex genetic patterns connected to the development of BRCA-related cancers. We identified the significance of HLA gene variations in these WES datasets for patients with BRCA-related cancers. We found that immune response mechanisms play a major role in the development of BRCA-related cancer. The analysis specifically highlights genes associated with antigen processing and presentation, such as HLA-B, HLA-A and HLA-DRB1, and their possible effects on tumour progression and immune evasion.
In summary, by utilizing machine learning approaches, we have the potential to aid in the development of precision medicine approaches for early detection and personalized treatment strategies.
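The abstract above reports two evaluation metrics, accuracy and area under the ROC curve. As a minimal sketch of how such metrics can be computed from a classifier's output, the following uses only the Python standard library. The function names, the 0.5 decision threshold and the toy labels and scores are all hypothetical and are not taken from the study.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via its rank interpretation.

    The AUC equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative one, with ties
    counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = case, 0 = control; scores are model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
print(f"ROC AUC  = {roc_auc(y_true, scores):.4f}")
```

Accuracy depends on the chosen threshold, whereas the AUC summarises ranking quality over all thresholds, which is one reason the two are commonly reported together.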
Affiliation(s)
- Nofe Alganmi
  - Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
  - Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia
  - Centre of Artificial Intelligence in Precision Medicines, King Abdulaziz University, Jeddah, Saudi Arabia
- Arwa Bashanfar
  - Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Reem Alotaibi
  - Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Haneen Banjar
  - Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
  - Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia
  - Centre of Artificial Intelligence in Precision Medicines, King Abdulaziz University, Jeddah, Saudi Arabia
- Sajjad Karim
  - Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Lab Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Zeenat Mirza
  - Department of Medical Lab Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
  - King Fahd Medical Research Center, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
- Heba Abusamra
  - Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia
- Manal Al-Attas
  - Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia
- Shereen Turkistany
  - Center of Innovation Personalized Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Adel Abuzenadah
  - Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Lab Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
2
Germline variants associated with breast cancer in Khakass women of North Asia. Mol Biol Rep 2023; 50:2335-2341. [PMID: 36577833; DOI: 10.1007/s11033-022-08215-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/04/2022] [Accepted: 12/15/2022] [Indexed: 12/30/2022]
Abstract
INTRODUCTION Variants in the BRCA1/2 genes are responsible for familial breast cancer. Numerous studies have shown that the spectrum of BRCA variants differs among breast cancer patients of different ethnic origins. In the available literature, no previous research has focused on breast cancer-associated variants among the Khakass people (an indigenous people of the Russian Federation). METHODS Twenty-six Khakass breast cancer patients were enrolled in the study. Genomic DNA was isolated from blood samples and used to prepare libraries using a Hereditary Cancer Solution kit. Next-generation sequencing (NGS) was performed using the MiSeq System (Illumina, USA). RESULTS In our study, 12% of patients (3/26) carried a single pathogenic variant; 54% of patients (14/26) carried variants of uncertain significance (VUS) or conflicting variants; and 35% of patients (9/26) did not carry any clinically significant variants. A germline pathogenic variant in the ATM gene (rs780619951, NC_000011.10:g.108259022C>T) was identified in two unrelated patients with a family history of cancer (7.7%, 2/26). This pathogenic truncating variant in the ATM gene (p.R805* or c.2413C>T) leads to a nonfunctional version of the protein. The variant has previously been reported in individuals with a family history of breast cancer. CONCLUSIONS Our pilot study describes a germline variant in the ATM gene associated with breast cancer in Khakass women of North Asia.
3
Lee NY, Hum M, Amali AA, Lim WK, Wong M, Myint MK, Tay RJ, Ong PY, Samol J, Lim CW, Ang P, Tan MH, Lee SC, Lee ASG. Whole-exome sequencing of BRCA-negative breast cancer patients and case-control analyses identify variants associated with breast cancer susceptibility. Hum Genomics 2022; 16:61. [PMID: 36424660; PMCID: PMC9685974; DOI: 10.1186/s40246-022-00435-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/15/2022] [Accepted: 11/14/2022] [Indexed: 11/25/2022]
Abstract
BACKGROUND For the majority of individuals with early-onset or familial breast cancer referred for genetic testing, the genetic basis of their familial breast cancer remains unexplained. To identify novel germline variants associated with breast cancer predisposition, whole-exome sequencing (WES) was performed. METHODS WES was performed on 290 BRCA1/BRCA2-negative Singaporeans with early-onset breast cancer and/or a family history of breast cancer. Case-control analysis against the East-Asian subpopulation (EAS) from the Genome Aggregation Database (gnomAD) identified variants enriched in cases, which were then filtered by their occurrence in cancer gene databases. Variants were further evaluated in repeated case-control analyses using a second case cohort from the database of Genotypes and Phenotypes (dbGaP) comprising 466 early-onset breast cancer patients from the United States, and a Singapore SG10K_Health control cohort. RESULTS Forty-nine breast cancer-associated germline pathogenic variants in 37 genes were identified in Singapore cases versus gnomAD (EAS). Compared against SG10K_Health controls, 13 of the 49 variants remained significantly enriched (False Discovery Rate (FDR)-adjusted p < 0.05). Comparing these 49 variants in dbGaP cases against gnomAD (EAS) and SG10K_Health controls revealed 23 concordant variants that were significantly enriched (FDR-adjusted p < 0.05). Fourteen variants were consistently enriched in breast cancer cases across all comparisons (FDR-adjusted p < 0.05). Seven variants in GPRIN2, NRG1, MYO5A, CLIP1, CUX1, GNAS and MGA were confirmed by Sanger sequencing. CONCLUSIONS We have identified pathogenic variants in genes associated with breast cancer predisposition. Importantly, many of these variants were also significant in a second case cohort from dbGaP, suggesting that using case-control analysis to select variants could be a useful strategy for identifying variants associated with cancer susceptibility.
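The abstract above refers to variants enriched in cases at an FDR-adjusted p < 0.05. As a rough illustration of that kind of analysis, the sketch below computes a one-sided enrichment p-value per variant and then applies the Benjamini-Hochberg adjustment, using only the Python standard library. The abstract does not state which statistical test or FDR procedure the authors used, so the choice of Fisher's exact test and Benjamini-Hochberg is an assumption, and the function names and carrier counts are hypothetical.

```python
from math import comb
from typing import List, Sequence


def fisher_enrichment_p(case_alt: int, case_total: int,
                        ctrl_alt: int, ctrl_total: int) -> float:
    """One-sided Fisher exact p-value for enrichment of carriers in cases.

    Sums the hypergeometric probabilities of observing `case_alt` or more
    carriers among the cases, given the overall number of carriers.
    """
    total = case_total + ctrl_total
    alt = case_alt + ctrl_alt
    upper = min(alt, case_total)
    numerator = sum(
        comb(alt, k) * comb(total - alt, case_total - k)
        for k in range(case_alt, upper + 1)
    )
    return numerator / comb(total, case_total)


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical carrier counts: (case carriers, cases, control carriers, controls).
counts = [(12, 290, 5, 1000), (3, 290, 9, 1000), (8, 290, 2, 1000)]
raw = [fisher_enrichment_p(*c) for c in counts]
for c, p, q in zip(counts, raw, benjamini_hochberg(raw)):
    print(c, f"p = {p:.3g}", f"FDR-adjusted p = {q:.3g}")
```

A variant would then be called enriched when its adjusted value falls below the chosen cut-off, 0.05 in the abstract.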
Affiliation(s)
- Ning Yuan Lee
  - Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 11 Hospital Crescent, Singapore, 169610 Singapore
- Melissa Hum
  - Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 11 Hospital Crescent, Singapore, 169610 Singapore
- Aseervatham Anusha Amali
  - Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 11 Hospital Crescent, Singapore, 169610 Singapore
- Wei Kiat Lim
  - Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 11 Hospital Crescent, Singapore, 169610 Singapore
- Matthew Wong
  - Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 11 Hospital Crescent, Singapore, 169610 Singapore
- Matthew Khine Myint
  - Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 11 Hospital Crescent, Singapore, 169610 Singapore
- Ru Jin Tay
  - Lucence Diagnostics Pte Ltd, 211 Henderson Road, Singapore, 159552 Singapore
- Pei-Yi Ong
  - Department of Hematology-Oncology, National University Cancer Institute, Singapore (NCIS), National University Health System, 5 Lower Kent Ridge Road, Singapore, 119074 Singapore
- Jens Samol
  - Medical Oncology Department, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore, 308433 Singapore
  - Johns Hopkins University, Baltimore, MD 21218 USA
- Chia Wei Lim
  - Department of Personalised Medicine, Tan Tock Seng Hospital, 11 Jalan Tan Tock Seng, Singapore, 308433 Singapore
- Peter Ang
  - Oncocare Cancer Centre, Gleneagles Medical Centre, 6 Napier Road, Singapore, 258499 Singapore
- Min-Han Tan
  - Lucence Diagnostics Pte Ltd, 211 Henderson Road, Singapore, 159552 Singapore
- Soo-Chin Lee
  - Department of Hematology-Oncology, National University Cancer Institute, Singapore (NCIS), National University Health System, 5 Lower Kent Ridge Road, Singapore, 119074 Singapore
  - Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, 10 Medical Dr, Singapore, 117597 Singapore
  - Cancer Science Institute, Singapore (CSI), National University of Singapore, 14 Medical Dr, Singapore, 117599 Singapore
- Ann S. G. Lee
  - Division of Cellular and Molecular Research, Humphrey Oei Institute of Cancer Research, National Cancer Centre Singapore, 11 Hospital Crescent, Singapore, 169610 Singapore
  - Department of Physiology, Yong Loo Lin School of Medicine, National University of Singapore, 2 Medical Drive, Singapore, 117593 Singapore
  - SingHealth Duke-NUS Oncology Academic Clinical Programme (ONCO ACP), Duke-NUS Graduate Medical School, 8 College Road, Singapore, 169857 Singapore