1
Brempou D, Montibus B, Izatt L, Andoniadou CL, Oakey RJ. Using parenclitic networks on phaeochromocytoma and paraganglioma tumours provides novel insights on global DNA methylation. Sci Rep 2024; 14:29958. [PMID: 39622952 PMCID: PMC11612305 DOI: 10.1038/s41598-024-81486-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2024] [Accepted: 11/26/2024] [Indexed: 12/06/2024]
Abstract
Despite the prevalence of sequencing data in biomedical research, the methylome remains underrepresented. Given the importance of DNA methylation in gene regulation and disease, it is crucial to address the need for reliable differential methylation methods. This work presents a novel, transferable approach for extracting information from DNA methylation data. Our agnostic, graph-based pipeline overcomes the limitations of commonly used differential methylation techniques and addresses the "small n, big k" problem. Pheochromocytoma and Paraganglioma (PPGL) tumours with known genetic aetiologies experience extreme hypermethylation genome-wide. To highlight the effectiveness of our method in candidate discovery, we present the first phenotypic classifier of PPGLs based on DNA methylation, achieving 0.7 ROC-AUC. Each sample is represented by an optimised parenclitic network, a graph representing the deviation of the sample's DNA methylation from the expected non-aggressive patterns. By extracting meaningful topological features, the dimensionality and, hence, the risk of overfitting are reduced, and the samples can be classified effectively. By using an explainable classification method, in this case logistic regression, the key CG loci influencing the decision can be identified. Our work provides insights into the molecular signature of aggressive PPGLs and we propose candidates for further research. Our optimised parenclitic network implementation improves the potential utility of DNA methylation data and offers an effective and complete pipeline for studying such datasets.
Affiliation(s)
- Dimitria Brempou
  - Department of Medical and Molecular Genetics, King's College London, London, SE1 9RT, UK
- Bertille Montibus
  - Department of Medical and Molecular Genetics, King's College London, London, SE1 9RT, UK
- Louise Izatt
  - Department of Clinical Genetics, Guy's and St Thomas' NHS Foundation Trust, London, SE1 9RT, UK
- Cynthia L Andoniadou
  - Centre for Craniofacial and Regenerative Biology, King's College London, London, SE1 9RT, UK
- Rebecca J Oakey
  - Department of Medical and Molecular Genetics, King's College London, London, SE1 9RT, UK
2
Sokolov AV, Schiöth HB. Decoding depression: a comprehensive multi-cohort exploration of blood DNA methylation using machine learning and deep learning approaches. Transl Psychiatry 2024; 14:287. [PMID: 39009577 PMCID: PMC11250806 DOI: 10.1038/s41398-024-02992-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 06/24/2024] [Accepted: 06/27/2024] [Indexed: 07/17/2024]
Abstract
The causes of depression are complex, and the current diagnosis methods rely solely on psychiatric evaluations with no incorporation of laboratory biomarkers in clinical practice. We investigated the stability of blood DNA methylation depression signatures in six different populations using six public and two domestic cohorts (n = 1942) conducting mega-analysis and meta-analysis of the individual studies. We evaluated 12 machine learning and deep learning strategies for depression classification both in cross-validation (CV) and in hold-out tests using merged data from 8 separate batches, constructing models with both biased and unbiased feature selection. We found 1987 CpG sites related to depression in both mega- and meta-analysis at the nominal level, and the associated genes were nominally related to axon guidance and immune pathways based on enrichment analysis and eQTM data. Random forest classifiers achieved the highest performance (AUC 0.73 and 0.76) in CV and hold-out tests respectively on the batch-level processed data. In contrast, the methylation showed low predictive power (all AUCs < 0.57) for all classifiers in CV and no predictive power in hold-out tests when used with harmonized data. All models achieved significantly better performance (>14% gain in AUCs) with pre-selected features (selection bias), with some of the models (joint autoencoder-classifier) reaching AUCs of up to 0.91 in the final testing regardless of data preparation. Different algorithmic feature selection approaches may outperform limma; however, random forest models perform well regardless of the strategy. The results provide an overview of potential future biomarkers for depression and highlight many important methodological aspects for DNA methylation-based depression profiling, including the use of machine learning strategies.
Affiliation(s)
- Aleksandr V Sokolov
  - Department of Surgical Sciences, Functional Pharmacology and Neuroscience, Uppsala University, Uppsala, Sweden
- Helgi B Schiöth
  - Department of Surgical Sciences, Functional Pharmacology and Neuroscience, Uppsala University, Uppsala, Sweden
3
Lu S, Yang J, Gu Y, He D, Wu H, Sun W, Xu D, Li C, Guo C. Advances in Machine Learning Processing of Big Data from Disease Diagnosis Sensors. ACS Sens 2024; 9:1134-1148. [PMID: 38363978 DOI: 10.1021/acssensors.3c02670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2024]
Abstract
Exploring accurate, noninvasive, and inexpensive disease diagnostic sensors is a critical task in the fields of chemistry, biology, and medicine. The complexity of biological systems and the explosive growth of biomarker data have driven machine learning to become a powerful tool for mining and processing big data from disease diagnosis sensors. With the development of bioinformatics and artificial intelligence (AI), machine learning models formed by data mining have been able to guide more sensitive and accurate molecular computing. This review presents an overview of big data collection approaches and fundamental machine learning algorithms and discusses recent advances in machine learning and molecular computational disease diagnostic sensors. More specifically, we highlight existing modular workflows and key opportunities and challenges for machine learning to achieve disease diagnosis through big data mining.
Affiliation(s)
- Shasha Lu
  - School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
- Jianyu Yang
  - School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
- Yu Gu
  - School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
- Dongyuan He
  - School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
- Haocheng Wu
  - School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
- Wei Sun
  - College of Chemistry and Chemical Engineering, Hainan Normal University, Haikou 571158, China
- Dong Xu
  - Department of Diagnostic Ultrasound Imaging & Interventional Therapy, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences, Hangzhou 310022, China
- Changming Li
  - School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
- Chunxian Guo
  - School of Materials Science and Engineering, Suzhou University of Science and Technology, Suzhou 215011, China
4
Yuan T, Edelmann D, Fan Z, Alwers E, Kather JN, Brenner H, Hoffmeister M. Machine learning in the identification of prognostic DNA methylation biomarkers among patients with cancer: A systematic review of epigenome-wide studies. Artif Intell Med 2023; 143:102589. [PMID: 37673571 DOI: 10.1016/j.artmed.2023.102589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 04/19/2023] [Accepted: 04/30/2023] [Indexed: 09/08/2023]
Abstract
BACKGROUND: DNA methylation biomarkers have great potential in improving prognostic classification systems for patients with cancer. Machine learning (ML)-based analytic techniques might help overcome the challenges of analyzing high-dimensional data in relatively small sample sizes. This systematic review summarizes the current use of ML-based methods in epigenome-wide studies for the identification of DNA methylation signatures associated with cancer prognosis. METHODS: We searched three electronic databases including PubMed, EMBASE, and Web of Science for articles published until 2 January 2023. ML-based methods and workflows used to identify DNA methylation signatures associated with cancer prognosis were extracted and summarized. Two authors independently assessed the methodological quality of included studies by a seven-item checklist adapted from 'A Tool to Assess Risk of Bias and Applicability of Prediction Model Studies (PROBAST)' and from the 'Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK)'. Different ML methods and workflows used in included studies were summarized and visualized by a sunburst chart, a bubble chart, and Sankey diagrams, respectively. RESULTS: Eighty-three studies were included in this review. Three major types of ML-based workflows were identified: 1) unsupervised clustering, 2) supervised feature selection, and 3) deep learning-based feature transformation. For the three workflows, the most frequently used ML techniques were consensus clustering, least absolute shrinkage and selection operator (LASSO), and autoencoder, respectively. The systematic review revealed that the performance of these approaches has not been adequately evaluated yet and that methodological and reporting flaws were common in the identified studies using ML techniques.
CONCLUSIONS: There is great heterogeneity in ML-based methodological strategies used by epigenome-wide studies to identify DNA methylation markers associated with cancer prognosis. In theory, most existing workflows could not handle the high multi-collinearity and potentially non-linear interactions in epigenome-wide DNA methylation data. Benchmarking studies are needed to compare the relative performance of various approaches for specific cancer types. Adherence to relevant methodological and reporting guidelines is urgently needed.
Affiliation(s)
- Tanwei Yuan
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany; Medical Faculty Heidelberg, Heidelberg University, Heidelberg, Germany
- Dominic Edelmann
  - Division of Biostatistics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Ziwen Fan
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Elizabeth Alwers
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Jakob Nikolas Kather
  - Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany; Medical Oncology, National Center of Tumour Diseases (NCT), University Hospital Heidelberg, Heidelberg, Germany
- Hermann Brenner
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany; Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Heidelberg, Germany; German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), Heidelberg, Germany
- Michael Hoffmeister
  - Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), Heidelberg, Germany
5
Long G, Zhao L, Tang B, Zhou L, Mi X, Su W, Xiao L. A robust panel based on genomic methylation sites for recurrence-free survival in early hepatocellular carcinoma. Heliyon 2023; 9:e19434. [PMID: 37809660 PMCID: PMC10558510 DOI: 10.1016/j.heliyon.2023.e19434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 08/17/2023] [Accepted: 08/22/2023] [Indexed: 10/10/2023]
Abstract
Purpose: Altered gene methylation precedes altered gene expression and the onset of disease. This study aimed to develop a potential model for predicting recurrence of early to mid-stage hepatocellular carcinoma (HCC) using methylation loci. Methods: We used data from early to mid-stage HCC patients (TNM I-II) in the TCGA-LIHC dataset and a LASSO-Cox regression model to identify an 18-DNA methylation site panel from which to calculate the riskScore of patients. The correlation of high/low riskScore with recurrence-free survival (RFS) and immune microenvironment in HCC patients was analyzed by bioinformatics. It was also validated in the GSE56588 dataset and the final dynamic nomogram was constructed. Results: The results showed that riskScore was significantly correlated with RFS in HCC patients. The differentially mutated genes between the two groups of HCC patients with high/low riskScore were mainly enriched in the TP53 signaling pathway. The immune microenvironment was better in HCC patients in the low-riskScore group compared to the high-riskScore group. This was validated in the GSE56588 dataset. Based on the subgroup stratification analysis of the relationship between high/low riskScore and RFS, as well as univariate and multivariate Cox analyses, the riskScore was found to be independent of clinical indicators. We found that riskScore, vascular invasion and cirrhosis status could effectively differentiate RFS in HCC patients, and we also constructed a prediction model based on these three factors. The model we constructed was validated in the TCGA-LIHC database and a web calculator was built for clinical use. Conclusion: The methylation riskScore is a predictor of RFS independent of clinical factors and can be used as a marker to predict recurrence in HCC patients.
Affiliation(s)
- Guo Long
  - Department of General Surgery, Xiangya Hospital, Central South University, Changsha, China
- Lihua Zhao
  - Department of Translational Medicine, Genecast Biotechnology Co., Ltd, Wuxi City, Jiangsu, China
- Biao Tang
  - Hepatobiliary and Pancreatic Surgery Department, The Central Hospital of Yongzhou, Yongzhou, China
- Ledu Zhou
  - Department of General Surgery, Xiangya Hospital, Central South University, Changsha, China
- Xingyu Mi
  - Department of General Surgery, Xiangya Hospital, Central South University, Changsha, China
- Wenxin Su
  - Department of General Surgery, Xiangya Hospital, Central South University, Changsha, China
- Liang Xiao
  - Department of General Surgery, Xiangya Hospital, Central South University, Changsha, China
6
Machine learning to analyse omic-data for COVID-19 diagnosis and prognosis. BMC Bioinformatics 2023; 24:7. [PMID: 36609221 PMCID: PMC9817417 DOI: 10.1186/s12859-022-05127-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/25/2022] [Accepted: 12/23/2022] [Indexed: 01/07/2023]
Abstract
BACKGROUND: With the global spread of COVID-19, the world has seen many patients, including many severe cases. The rapid development of machine learning (ML) has led to significant achievements in disease diagnosis and prediction. Current studies have confirmed that omics data at the host level can reflect the development process and prognosis of the disease. Since early diagnosis and effective treatment of severe COVID-19 patients remain challenging, this research aims to use omics data in different ML models for COVID-19 diagnosis and prognosis. We used several ML models on omics data of a large number of individuals to first predict whether patients are COVID-19 positive or negative, followed by the severity of the disease. RESULTS: On the COVID-19 diagnosis task, we got the best AUC of 0.99 with our multilayer perceptron model and the highest F1-score of 0.95 with our logistic regression (LR) model. For the severity prediction task, we achieved the highest accuracy of 0.76 with an LR model. Beyond classification and predictive modeling, our study found that ML models performed better on integrated multi-omics data than on single omics. By comparing top features from different omics datasets, we also found our model to be robust, with a wider range of applicability in diverse datasets related to COVID-19. Additionally, we found that omics-based models performed better than image or physiological feature-based models, proving the importance of the omics-based dataset for future model development. CONCLUSIONS: This study diagnoses COVID-19 positive cases and predicts accurate severity levels. It lowers the dependence on clinical data and professional judgment by leveraging state-of-the-art models. Our model showed wider applicability across different omics datasets and is highly transferable to other respiratory or similar diseases. Hospital and public health care mechanisms can optimize the distribution of medical resources and improve the robustness of the medical system.
7
P D, C G. A systematic review on machine learning and deep learning techniques in cancer survival prediction. Progress in Biophysics and Molecular Biology 2022; 174:62-71. [PMID: 35933043 DOI: 10.1016/j.pbiomolbio.2022.07.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/07/2022] [Revised: 07/13/2022] [Accepted: 07/19/2022] [Indexed: 06/15/2023]
Abstract
Cancer is a disease characterised by the unusual and uncontrollable growth of body cells. This usually happens asymptomatically and spreads to other parts of the body. The major problem in treating cancer is that its progress is not monitored once it is diagnosed. The progress or the prognosis can be assessed through survival analysis. Survival analysis is the branch of statistics that deals with predicting the time until an event occurs. In the case of cancer prognosis, the event is the survival time of the patient from the onset of the disease, or it can be the recurrence of the disease after undergoing a treatment. This study aims to bring out the machine learning and deep learning models involved in providing the prognosis to cancer patients.
Affiliation(s)
- Deepa P
  - School of Information Technology and Engineering, Vellore Institute of Technology, Vellore, India
- Gunavathi C
  - School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India
8
Massi MC, Dominoni L, Ieva F, Fiorito G. A Deep Survival EWAS approach estimating risk profile based on pre-diagnostic DNA methylation: An application to breast cancer time to diagnosis. PLoS Comput Biol 2022; 18:e1009959. [PMID: 36155971 PMCID: PMC9536632 DOI: 10.1371/journal.pcbi.1009959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Revised: 10/06/2022] [Accepted: 09/10/2022] [Indexed: 11/19/2022]
Abstract
Previous studies for cancer biomarker discovery based on pre-diagnostic blood DNA methylation (DNAm) profiles either ignore the explicit modeling of the Time To Diagnosis (TTD) or provide inconsistent results. This lack of consistency is likely due to the limitations of standard EWAS approaches that model the effect of DNAm at CpG sites on TTD independently. In this work, we aim to identify blood DNAm profiles associated with TTD, with the goal of improving the reliability of the results, as well as their biological meaningfulness. We argue that a global approach to estimate the CpG sites' effect profile should capture the complex (potentially non-linear) relationships interplaying between sites. To prove our concept, we develop a new Deep Learning-based approach assessing the relevance of individual CpG Islands (i.e., assigning a weight to each site) in determining TTD while modeling their combined effect in a survival analysis scenario. The algorithm combines a tailored sampling procedure with DNAm sites agglomeration, deep non-linear survival modeling and SHapley Additive exPlanations (SHAP) values estimation to aid robustness of the derived effects profile. The proposed approach deals with the common complexities arising from epidemiological studies, such as small sample size, noise, and low signal-to-noise ratio of blood-derived DNAm. We apply our approach to a prospective case-control study on breast cancer nested in the EPIC Italy cohort and we perform weighted gene-set enrichment analyses to demonstrate the biological meaningfulness of the obtained results. We compared the results of Deep Survival EWAS with those of a traditional EWAS approach, demonstrating that our method performs better than the standard approach in identifying biologically relevant pathways. Blood-derived DNAm profiles could be exploited as new biomarkers for cancer risk stratification and, possibly, early detection.
This is of particular interest since blood is a convenient tissue to assay for constitutional methylation and its collection is non-invasive. Exploiting pre-diagnostic blood DNAm data opens the further opportunity to investigate the association of DNAm at baseline on cancer risk, modeling the relationship between sites’ methylation and the Time to Diagnosis. Previous studies mostly provide inconsistent results likely due to the limitations of standard EWAS approaches, that model the effect of DNAm at CpG sites on TTD independently. In this work we argue that an approach to estimate single CpG sites’ effect while modeling their combined effect on the survival outcome is needed, and we claim that such approach should capture the complex (potentially non-linear) relationships interplaying between sites. We prove this concept by developing a novel approach to analyze a prospective case-control study on breast cancer nested in the EPIC Italy cohort. A weighted gene set enrichment analysis confirms that our approach outperforms standard EWAS in identifying biologically meaningful pathways.
Affiliation(s)
- Michela Carlotta Massi
  - Health Data Science Centre, Human Technopole Foundation, Milan, Italy
  - MOX Laboratory for Modeling and Scientific Computing, Dept. of Mathematics, Politecnico di Milano, Milan, Italy
- Lorenzo Dominoni
  - MOX Laboratory for Modeling and Scientific Computing, Dept. of Mathematics, Politecnico di Milano, Milan, Italy
- Francesca Ieva
  - Health Data Science Centre, Human Technopole Foundation, Milan, Italy
  - MOX Laboratory for Modeling and Scientific Computing, Dept. of Mathematics, Politecnico di Milano, Milan, Italy
- Giovanni Fiorito
  - Laboratory of Biostatistics, Dept. of Biomedical Sciences, University of Sassari, Sassari, Italy
9
Alfonso Perez G, Caballero Villarraso J. Neural Network Aided Detection of Huntington Disease. J Clin Med 2022; 11:2110. [PMID: 35456203 PMCID: PMC9032851 DOI: 10.3390/jcm11082110] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/02/2022] [Revised: 04/07/2022] [Accepted: 04/08/2022] [Indexed: 02/06/2023]
Abstract
Huntington Disease (HD) is a degenerative neurological disease that causes a significant impact on the quality of life of the patient and eventually death. In this paper we present an approach to create a biomarker using as an input DNA CpG methylation data to identify HD patients. DNA CpG methylation is a well-known epigenetic marker for disease state. Technological advances have made it possible to quickly analyze hundreds of thousands of CpGs. This large amount of information might introduce noise as potentially not all DNA CpG methylation levels will be related to the presence of the illness. In this paper, we were able to reduce the number of CpGs considered from hundreds of thousands to 237 using a non-linear approach. It will be shown that using only these 237 CpGs and non-linear techniques such as artificial neural networks makes it possible to accurately differentiate between control and HD patients. An underlying assumption in this paper is that there are no indications suggesting that the process is linear and therefore non-linear techniques, such as artificial neural networks, are a valid tool to analyze this complex disease. The proposed approach is able to accurately distinguish between control and HD patients using DNA CpG methylation data as an input and non-linear forecasting techniques. It should be noted that the dataset analyzed is relatively small. However, the results seem relatively consistent and the analysis can be repeated with larger data-sets as they become available.
Affiliation(s)
- Gerardo Alfonso Perez
  - Department of Biochemistry and Molecular Biology, University of Cordoba, 14071 Cordoba, Spain
- Javier Caballero Villarraso
  - Department of Biochemistry and Molecular Biology, University of Cordoba, 14071 Cordoba, Spain
  - Biochemical Laboratory, Reina Sofia University Hospital, 14004 Cordoba, Spain
10
Automated Breast Cancer Detection Models Based on Transfer Learning. Sensors 2022; 22:876. [PMID: 35161622 PMCID: PMC8838322 DOI: 10.3390/s22030876] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/30/2021] [Revised: 12/28/2021] [Accepted: 01/19/2022] [Indexed: 02/06/2023]
Abstract
Breast cancer is among the leading causes of mortality for females across the planet. It is essential for the well-being of women to develop early detection and diagnosis techniques. In mammography, focus has contributed to the use of deep learning (DL) models, which have been utilized by radiologists to enhance the needed processes to overcome the shortcomings of human observers. The transfer learning method is being used to distinguish malignant and benign breast cancer by fine-tuning multiple pre-trained models. In this study, we introduce a framework focused on the principle of transfer learning. In addition, a mixture of augmentation strategies, including several rotation combinations, scaling, and shifting, was used to prevent overfitting and produce stable outcomes by increasing the number of mammographic images. On the Mammographic Image Analysis Society (MIAS) dataset, the proposed system was evaluated and achieved an accuracy of 89.5% using ResNet50 (residual network-50) and an accuracy of 70% using the Nasnet-Mobile network. The proposed system demonstrated that pre-trained classification networks are significantly more effective and efficient, making them more acceptable for medical imaging, particularly for small training datasets.
11
Sundus KI, Hammo BH, Al-Zoubi MB, Al-Omari A. Solving the multicollinearity problem to improve the stability of machine learning algorithms applied to a fully annotated breast cancer dataset. Informatics in Medicine Unlocked 2022. [DOI: 10.1016/j.imu.2022.101088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
12
Keelan S, Flanagan M, Hill ADK. Evolving Trends in Surgical Management of Breast Cancer: An Analysis of 30 Years of Practice Changing Papers. Front Oncol 2021; 11:622621. [PMID: 34422626 PMCID: PMC8371403 DOI: 10.3389/fonc.2021.622621] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 10/28/2020] [Accepted: 04/19/2021] [Indexed: 01/12/2023]
Abstract
The management of breast cancer has evolved into a multidisciplinary evidence-based surgical speciality, with emphasis on conservative surgery. A number of landmark trials have established lumpectomy followed by radiation as the standard of care for many patients. The aim of this study is to construct a narrative review of recent developments in the surgical management of breast cancer and how such developments have impacted surgical practice. A comprehensive literature search of Pubmed was conducted. The latest search was performed on October 31st, 2020. Search terms “breast cancer” were used in combinations with specific key words and Boolean operators relating to surgical management. The reference lists of retrieved articles were comprehensively screened for additional eligible publications. Articles were selected and reviewed based on relevance. We selected publications in the past 10 years but did not exclude commonly referenced and highly regarded previous publications. Review articles and book chapters were also cited to provide reference on details not discussed in the academic literature. This article reviews the current evidence in surgical management of early-stage breast cancer, discusses recent trends in surgical practice for therapeutic and prophylactic procedures and provides commentary on implications and factors associated with these trends.
Affiliation(s)
- Stephen Keelan
  - The Department of Surgery, The Royal College of Surgeons in Ireland, Dublin, Ireland; The Department of Surgery, Beaumont Hospital, Dublin, Ireland
- Michael Flanagan
  - The Department of Surgery, The Royal College of Surgeons in Ireland, Dublin, Ireland; The Department of Surgery, Beaumont Hospital, Dublin, Ireland
- Arnold D K Hill
  - The Department of Surgery, The Royal College of Surgeons in Ireland, Dublin, Ireland; The Department of Surgery, Beaumont Hospital, Dublin, Ireland
13
14
Asada K, Kaneko S, Takasawa K, Machino H, Takahashi S, Shinkai N, Shimoyama R, Komatsu M, Hamamoto R. Integrated Analysis of Whole Genome and Epigenome Data Using Machine Learning Technology: Toward the Establishment of Precision Oncology. Front Oncol 2021; 11:666937. [PMID: 34055633 PMCID: PMC8149908 DOI: 10.3389/fonc.2021.666937] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 02/11/2021] [Accepted: 04/26/2021] [Indexed: 12/17/2022]
Abstract
With the completion of the International Human Genome Project, we have entered what is known as the post-genome era, and efforts to apply genomic information to medicine have become more active. In particular, with the announcement of the Precision Medicine Initiative by U.S. President Barack Obama in his State of the Union address at the beginning of 2015, "precision medicine," which aims to divide patients and potential patients into subgroups with respect to disease susceptibility, has become the focus of worldwide attention. The field of oncology is also actively adopting the precision oncology approach, which is based on molecular profiling, such as genomic information, to select the appropriate treatment. However, the current precision oncology is dominated by a method called targeted-gene panel (TGP), which uses next-generation sequencing (NGS) to analyze a limited number of specific cancer-related genes and suggest optimal treatments, but this method causes the problem that the number of patients who benefit from it is limited. In order to steadily develop precision oncology, it is necessary to integrate and analyze more detailed omics data, such as whole genome data and epigenome data. On the other hand, with the advancement of analysis technologies such as NGS, the amount of data obtained by omics analysis has become enormous, and artificial intelligence (AI) technologies, mainly machine learning (ML) technologies, are being actively used to make more efficient and accurate predictions. In this review, we will focus on whole genome sequencing (WGS) analysis and epigenome analysis, introduce the latest results of omics analysis using ML technologies for the development of precision oncology, and discuss the future prospects.
Affiliation(s)
- Ken Asada
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan
- Syuzo Kaneko
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan
- Ken Takasawa
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan
- Hidenori Machino
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan
- Satoshi Takahashi
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan
- Norio Shinkai
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan
  - Department of NCC Cancer Science, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
- Ryo Shimoyama
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan
- Masaaki Komatsu
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan
- Ryuji Hamamoto
  - Cancer Translational Research Team, RIKEN Center for Advanced Intelligence Project, Tokyo, Japan
  - Division of Medical AI Research and Development, National Cancer Center Research Institute, Tokyo, Japan
  - Department of NCC Cancer Science, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
|
15
|
Cardoso MJ, Houssami N, Pozzi G, Séroussi B. Artificial intelligence (AI) in breast cancer care - Leveraging multidisciplinary skills to improve care. Breast 2020; 56:110-113. [PMID: 33308879 PMCID: PMC7982546 DOI: 10.1016/j.breast.2020.11.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/02/2023]
Affiliation(s)
- Maria Joao Cardoso
  - Breast Unit, Champalimaud Clinical Center, Champalimaud Foundation, Lisbon, Portugal; Faculdade de Medicina da Universidade Nova de Lisboa, Nova Medical School, Lisbon, Portugal
- Nehmat Houssami
  - Sydney School of Public Health, Faculty of Medicine and Health, Fisher Road, The University of Sydney, New South Wales, Australia
- Giuseppe Pozzi
  - DEIB, Politecnico di Milano, P.za L. da Vinci 32, I-20133, Milano, Italy
- Brigitte Séroussi
  - Sorbonne Université, Université Sorbonne Paris Nord, INSERM, LIMICS UMR_S 1142, F-75006, Paris, France; Assistance Publique - Hôpitaux de Paris, Département de Santé Publique, F-75020, Paris, France
|