1
Dhillon SK, Ganggayah MD, Sinnadurai S, Lio P, Taib NA. Theory and Practice of Integrating Machine Learning and Conventional Statistics in Medical Data Analysis. Diagnostics (Basel) 2022; 12:2526. [PMID: 36292218 PMCID: PMC9601117 DOI: 10.3390/diagnostics12102526] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/19/2022] [Revised: 09/26/2022] [Accepted: 10/04/2022] [Indexed: 11/16/2022]
Abstract
The practice of medical decision making is changing rapidly with the development of innovative computing technologies. The growing interest in data analysis, together with improvements in big data computer processing methods, raises the question of whether machine learning can be integrated with conventional statistics in health research. To help address this knowledge gap, this paper presents a review of the conceptual integration between conventional statistics and machine learning, focusing on health research. The similarities and differences between the two are compared using mathematical concepts and algorithms. The comparison indicates that conventional statistics is the fundamental basis of machine learning: the black box algorithms are derived from basic mathematics but are more advanced in terms of automated analysis, handling big data, and providing interactive visualizations. While the two approaches differ in nature, they are conceptually similar. Based on our review, we conclude that conventional statistics and machine learning are best integrated to develop automated data analysis tools. We also strongly believe that health researchers could explore machine learning to enhance conventional statistics in decision making for additional reliable validation measures.
Affiliation(s)
- Sarinder Kaur Dhillon
  - Data Science & Bioinformatics Laboratory, Institute of Biological Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur 50603, Malaysia
- Mogana Darshini Ganggayah
  - Department of Econometrics and Business Statistics, School of Business, Monash University Malaysia, Kuala Lumpur 47500, Malaysia
- Siamala Sinnadurai
  - Department of Population Medicine and Lifestyle Disease Prevention, Medical University of Bialystok, 15-269 Bialystok, Poland
- Pietro Lio
  - Department of Computer Science and Technology, University of Cambridge, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK
- Nur Aishah Taib
  - Department of Surgery, Faculty of Medicine, Universiti Malaya, Kuala Lumpur 50603, Malaysia
2
Wang C, He T, Wang Z, Zheng D, Shen C. Relative Risk of Cardiovascular Mortality in Breast Cancer Patients: A Population-Based Study. Rev Cardiovasc Med 2022; 23:120. [PMID: 39076231 PMCID: PMC11273964 DOI: 10.31083/j.rcm2304120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Revised: 02/25/2022] [Accepted: 03/11/2022] [Indexed: 07/31/2024]
Abstract
Aims: To investigate the risk of cardiovascular disease (CVD) mortality in breast cancer patients compared with the general female population. Methods: Data was retrieved from the Surveillance, Epidemiology, and End Results database. 924,439 female breast cancer patients who were at the age of follow-up ≥ 30 years and diagnosed during 1990-2016 as well as the aggregated general female population in the US were included. Using multivariable Poisson regression, we calculated incidence rate ratios (IRRs) of CVD mortality among female breast cancer patients compared with the female population. Results: The risk of CVD mortality was mildly increased among breast cancer patients at the age of follow-up 30-64 years (IRR 1.06, 95% confidence interval [CI] 1.03-1.10) compared with the general population. This growth of risk reached its peak within the first month after diagnosis (IRR 3.33, 95% CI 2.84-3.91) and was mainly activated by diseases of the heart (IRR 1.11, 95% CI 1.07-1.15). The elevation was greatest in survivors at the age of follow up 30-34 years (IRR 3.50, 95% CI 1.75-7.01). Conclusions: Clinicians should provide risk mitigation strategies with early monitoring of CVD mortality for breast cancer survivors, especially those who were young or with aggressive tumor stage.
Affiliation(s)
- Chengshi Wang
  - Department of Breast Surgery, Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, School of Medicine, University of Electronic Science and Technology of China, 610044 Chengdu, Sichuan, China
  - Laboratory of Molecular Diagnosis of Cancer, Clinical Research Center for Breast, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Tao He
  - Department of Breast Surgery, West China School of Medicine/West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Zhu Wang
  - Laboratory of Molecular Diagnosis of Cancer, Clinical Research Center for Breast, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Dan Zheng
  - Laboratory of Molecular Diagnosis of Cancer, Clinical Research Center for Breast, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
  - Department of Head, Neck and Mammary Gland Oncology, Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Chaoyong Shen
  - Department of Gastrointestinal Surgery, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
3
Wang C, Hu K, Luo C, Deng L, Fall K, Tamimi RM, Valdimarsdóttir UA, Fang F, Lu D. Cardiovascular mortality among cancer survivors who developed breast cancer as a second primary malignancy. Br J Cancer 2021; 125:1450-1458. [PMID: 34580431 PMCID: PMC8575780 DOI: 10.1038/s41416-021-01549-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/06/2021] [Revised: 08/17/2021] [Accepted: 09/14/2021] [Indexed: 02/06/2023]
Abstract
BACKGROUND: To assess the risk of cardiovascular mortality among cancer survivors who developed breast cancer as a second malignancy (BCa-2) compared with patients with first primary breast cancer (BCa-1) and the general population. METHODS: Using the Surveillance, Epidemiology, and End Results database, we conducted a population-based cohort study including 1,024,047 BCa-1 and 41,744 BCa-2 patients diagnosed from the age 30 between 1975 and 2016, and the corresponding US female population (994,415,911 person-years; 5,403,551 cardiovascular deaths). Compared with the general population and BCa-1 patients, we calculated incidence rate ratios (IRRs) of cardiovascular deaths among BCa-2 patients using Poisson regression. To adjust for unmeasured confounders, we performed a nested, case-crossover analysis among BCa-2 patients who died from cardiovascular disease. RESULTS: Although BCa-2 patients had a mildly increased risk of cardiovascular mortality compared with the population (IRR 1.08) and BCa-1 patients (IRR 1.15), the association was pronounced among individuals aged 30-49 years (BCa-2 vs. population: IRR 6.61; BCa-2 vs. BCa-1: IRR 3.03). The risk elevation was greatest within the first month after diagnosis, compared with the population, but comparable with BCa-1 patients. The case-crossover analysis confirmed these results. CONCLUSION: Our findings suggest that patients with BCa-2 are at increased risk of cardiovascular mortality.
Affiliation(s)
- Chengshi Wang
  - Laboratory of Molecular Diagnosis of Cancer, and Department of Medical Oncology, Clinical Research Center for Breast Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan, P. R. China
  - Sichuan Cancer Hospital & Institute, Sichuan Cancer Center, School of Medicine, University of Electronic Science and Technology of China, Chengdu, China
- Kejia Hu
  - Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden
- Chuanxu Luo
  - Laboratory of Molecular Diagnosis of Cancer, and Department of Medical Oncology, Clinical Research Center for Breast Diseases, West China Hospital, Sichuan University, Chengdu, Sichuan, P. R. China
- Lei Deng
  - Department of Medicine, Roswell Park Cancer Institute, Buffalo, NY, USA
- Katja Fall
  - Clinical Epidemiology and Biostatistics, School of Medical Sciences, Örebro University, 701 85 Örebro, Sweden
- Rulla M. Tamimi
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, USA
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
- Unnur A. Valdimarsdóttir
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, USA
  - Center of Public Health Sciences, Faculty of Medicine, University of Iceland, Sturlugata 8, 101 Reykjavik, Iceland
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Nobels väg 12a, 171 77 Solna, Sweden
- Fang Fang
  - Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden
- Donghao Lu
  - Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden
  - Department of Epidemiology, Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA, USA
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu, China