1
Zhang Z, Lan H, Zhao S. Analysis of the Value of Quantitative Features in Multimodal MRI Images to Construct a Radio-Omics Model for Breast Cancer Diagnosis. BREAST CANCER (DOVE MEDICAL PRESS) 2024; 16:305-318. [PMID: 38895649 PMCID: PMC11182731 DOI: 10.2147/bctt.s458036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Accepted: 05/24/2024] [Indexed: 06/21/2024]
Abstract
Objective: To analyze the diagnostic value of quantitative features in multimodal magnetic resonance imaging (MRI) images for constructing a radio-omics model for breast cancer. Methods: Ninety-five patients with breast-related diseases seen from January 2020 to January 2021 were grouped into a benign group (n=57) and a malignant group (n=38) according to the pathological findings. All cases were randomized into a training group (n=66) and a validation group (n=29) in a 7:3 ratio based on the examination time. All subjects underwent multimodality MRI comprising T1-weighted imaging (T1WI), T2-weighted imaging (T2WI), diffusion-weighted imaging (DWI), dynamic contrast enhancement (DCE), and apparent diffusion coefficient (ADC). The MRI findings were analyzed against pathological findings. A diagnostic breast cancer radiomics model was constructed, and its diagnostic efficacy in the validation group was analyzed via the receiver operating characteristic (ROC) curve. Results: Fibroadenoma accounted for 49.12% of benign breast diseases, and invasive ductal carcinoma accounted for 73.68% of malignant breast diseases. The sensitivity of T1WI, T2WI, DWI, ADC, and DCE in diagnosing breast cancer was 61.14%, 66.67%, 73.30%, 78.95%, and 85.96%, respectively, using the four-fold table method. The areas under the curve (AUCs) of T1WI, T2WI, DWI, ADC, and DCE for diagnosing breast cancer were 0.715, 0.769, 0.785, 0.835, and 0.792, respectively. The AUCs of plain scan, diffuse, enhanced, plain scan + diffuse, plain scan + enhanced, enhanced + diffuse, and plain scan + enhanced + diffuse for diagnosing breast cancer were 0.746, 0.798, 0.816, 0.839, 0.890, 0.906, and 0.927, respectively. Conclusion: Constructing a radio-omics model from quantitative features in multimodal MRI images was valuable in the diagnosis of breast cancer. The value of radio-omics models such as plain scan + enhanced + diffuse was higher than that of the other models in diagnosing breast cancer, and such models could be widely applied in clinical practice.
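As a rough illustration of the two measures reported in this abstract, the sketch below computes sensitivity and specificity from a four-fold (2x2) table and an AUC from per-case scores. The function names and all numbers are hypothetical toy values, not data from the study, and the study's own software and procedure are not described in the abstract.

```python
def four_fold_metrics(tp, fp, fn, tn):
    """Sensitivity and specificity from a four-fold (2x2) table."""
    sensitivity = tp / (tp + fn)  # malignant cases correctly called malignant
    specificity = tn / (tn + fp)  # benign cases correctly called benign
    return sensitivity, specificity


def auc_from_scores(labels, scores):
    """AUC as the fraction of (malignant, benign) pairs ranked correctly.

    Ties count as half a correct pair, which matches the trapezoidal
    area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy values chosen for illustration only (assumption, not study data).
sens, spec = four_fold_metrics(tp=40, fp=5, fn=10, tn=45)
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(sens, spec, round(toy_auc, 3))
```

The pairwise formulation is quadratic in the number of cases, which is acceptable for a cohort of this size; a rank-based version would be the usual choice for larger data.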
Affiliation(s)
- Zhitao Zhang
  - Department of Galactophore, Fujian Maternity and Child Health Hospital, Fuzhou, Fujian Province, 350001, People’s Republic of China
- Huan Lan
  - Department of Galactophore, Fujian Maternity and Child Health Hospital, Fuzhou, Fujian Province, 350001, People’s Republic of China
- Shuai Zhao
  - Department of Galactophore, Fujian Maternity and Child Health Hospital, Fuzhou, Fujian Province, 350001, People’s Republic of China
2
Liu X, Tao Y, Cai Z, Bao P, Ma H, Li K, Li M, Zhu Y, Lu ZJ. Pathformer: a biological pathway informed transformer for disease diagnosis and prognosis using multi-omics data. Bioinformatics 2024; 40:btae316. [PMID: 38741230 PMCID: PMC11139513 DOI: 10.1093/bioinformatics/btae316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Revised: 03/29/2024] [Accepted: 05/11/2024] [Indexed: 05/16/2024] Open
Abstract
MOTIVATION Multi-omics data provide a comprehensive view of gene regulation at multiple levels, which is helpful in achieving accurate diagnosis of complex diseases like cancer. However, conventional integration methods rarely utilize prior biological knowledge and lack interpretability. RESULTS To integrate various multi-omics data of tissue and liquid biopsies for disease diagnosis and prognosis, we developed a biological pathway informed Transformer, Pathformer. It embeds multi-omics input with a compacted multi-modal vector and a pathway-based sparse neural network. Pathformer also leverages a criss-cross attention mechanism to capture the crosstalk between different pathways and modalities. We first benchmarked Pathformer against 18 comparable methods on multiple cancer datasets, where Pathformer outperformed all the other methods, with an average improvement of 6.3%-14.7% in F1 score for cancer survival prediction, 5.1%-12% for cancer stage prediction, and 8.1%-13.6% for cancer drug response prediction. Subsequently, for cancer prognosis prediction based on tissue multi-omics data, we used a case study to demonstrate the biological interpretability of Pathformer by identifying key pathways and their biological crosstalk. Then, for early cancer diagnosis based on liquid biopsy data, we used plasma and platelet datasets to demonstrate Pathformer's potential for clinical applications in cancer screening. Moreover, we revealed deregulation of interesting pathways (e.g. scavenger receptor pathway) and their crosstalk in cancer patients' blood, providing potential candidate targets for cancer microenvironment study. AVAILABILITY AND IMPLEMENTATION Pathformer is implemented and freely available at https://github.com/lulab/Pathformer.
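The idea of a pathway-based sparse layer mentioned in this abstract can be sketched generically: a binary gene-to-pathway membership mask zeroes every weight that does not connect a gene to a pathway containing it. This is a minimal illustration under stated assumptions and not Pathformer's actual architecture; the gene names, pathway names, weights, and function names below are all hypothetical.

```python
def pathway_mask(genes, pathways):
    """Binary mask: mask[p][g] is 1 if gene g belongs to pathway p."""
    return [[1 if g in members else 0 for g in genes]
            for members in pathways.values()]


def sparse_pathway_layer(x, weights, mask):
    """One score per pathway; weights outside the mask contribute nothing."""
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]


# Hypothetical genes and pathways, for illustration only.
genes = ["G1", "G2", "G3", "G4"]
pathways = {"P_a": {"G1", "G2"}, "P_b": {"G3", "G4"}}
mask = pathway_mask(genes, pathways)

# A dense weight matrix; the mask removes the biologically unsupported entries.
weights = [[0.5, -1.0, 2.0, 2.0],
           [1.0, 1.0, 0.25, 0.75]]
x = [1.0, 2.0, 3.0, 4.0]
scores = sparse_pathway_layer(x, weights, mask)
print(mask, scores)
```

Because each output unit corresponds to a named pathway, the learned weights can be read per pathway, which is the usual interpretability argument for this kind of layer.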
Affiliation(s)
- Xiaofan Liu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Yuhuan Tao
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Zilin Cai
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Pengfei Bao
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Hongli Ma
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
- Kexing Li
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
- Mengtao Li
  - Department of Rheumatology and Clinical Immunology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases (NCRC-DID), MST State Key Laboratory of Complex Severe and Rare Diseases, MOE Key Laboratory of Rheumatology and Clinical Immunology, Beijing 100730, China
- Yunping Zhu
  - State Key Laboratory of Medical Proteomics, Beijing Proteome Research Center, National Center for Protein Sciences (Beijing), Beijing Institute of Lifeomics, Beijing 102206, China
- Zhi John Lu
  - MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China
  - Institute for Precision Medicine, Tsinghua University, Beijing 100084, China
3
Chandrashekar PB, Alatkar S, Wang J, Hoffman GE, He C, Jin T, Khullar S, Bendl J, Fullard JF, Roussos P, Wang D. DeepGAMI: deep biologically guided auxiliary learning for multimodal integration and imputation to improve genotype-phenotype prediction. Genome Med 2023; 15:88. [PMID: 37904203 PMCID: PMC10617196 DOI: 10.1186/s13073-023-01248-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Accepted: 10/16/2023] [Indexed: 11/01/2023] Open
Abstract
BACKGROUND Genotypes are strongly associated with disease phenotypes, particularly in brain disorders. However, the molecular and cellular mechanisms behind this association remain elusive. With emerging multimodal data for these mechanisms, machine learning methods can be applied for phenotype prediction at different scales, but due to the black-box nature of machine learning, integrating these modalities and interpreting biological mechanisms can be challenging. Additionally, the partial availability of these multimodal data presents a challenge in developing these predictive models. METHOD To address these challenges, we developed DeepGAMI, an interpretable neural network model to improve genotype-phenotype prediction from multimodal data. DeepGAMI leverages functional genomic information, such as eQTLs and gene regulation, to guide neural network connections. Additionally, it includes an auxiliary learning layer for cross-modal imputation allowing the imputation of latent features of missing modalities and thus predicting phenotypes from a single modality. Finally, DeepGAMI uses integrated gradients to prioritize multimodal features for various phenotypes. RESULTS We applied DeepGAMI to several multimodal datasets including genotype and bulk and cell-type gene expression data in brain diseases, and gene expression and electrophysiology data of mouse neuronal cells. Using cross-validation and independent validation, DeepGAMI outperformed existing methods for classifying disease types, and cellular and clinical phenotypes, even using single modalities (e.g., AUC score of 0.79 for Schizophrenia and 0.73 for cognitive impairment in Alzheimer's disease). CONCLUSION We demonstrated that DeepGAMI improves phenotype prediction and prioritizes phenotypic features and networks in multiple multimodal datasets in complex brains and brain diseases. Also, it prioritized disease-associated variants, genes, and regulatory networks linked to different phenotypes, providing novel insights into the interpretation of gene regulatory mechanisms. DeepGAMI is open-source and available for general use.
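Integrated gradients, the attribution method named in this abstract, can be illustrated on a toy model. The sketch below is an assumption-laden stand-in, not DeepGAMI's code: a three-input logistic function replaces the trained network, its gradient is written analytically, and the path integral from a zero baseline is approximated with a midpoint Riemann sum. All names and values are hypothetical.

```python
import math


def model(x, w, b):
    """Toy logistic model standing in for a trained network (assumption)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def model_grad(x, w, b):
    """Analytic gradient of the logistic output with respect to the inputs."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    avg_grad = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        for i, g in enumerate(model_grad(point, w, b)):
            avg_grad[i] += g / steps
    # Scale the path-averaged gradient by the input difference.
    return [(xi - bi) * g for xi, bi, g in zip(x, baseline, avg_grad)]


w, b = [1.5, -2.0, 0.0], 0.1
x, baseline = [1.0, 0.5, 3.0], [0.0, 0.0, 0.0]
attr = integrated_gradients(x, baseline, w, b)
print(attr)
```

A useful sanity check is the completeness property: the attributions should sum, up to numerical error, to the difference between the model output at the input and at the baseline. The third feature has zero weight here, so its attribution is zero.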
Affiliation(s)
- Pramod Bharadwaj Chandrashekar
  - Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Sayali Alatkar
  - Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
  - Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Jiebiao Wang
  - Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA, 15261, USA
- Gabriel E Hoffman
  - Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Chenfeng He
  - Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Ting Jin
  - Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Saniya Khullar
  - Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA
- Jaroslav Bendl
  - Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- John F Fullard
  - Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Panos Roussos
  - Center for Disease Neurogenomics, Department of Psychiatry and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Mental Illness Research, Education and Clinical Centers, James J. Peters VA Medical Center, Bronx, NY, 10468, USA
  - Center for Dementia Research, Nathan Kline Institute for Psychiatric Research, Orangeburg, NY, 10962, USA
- Daifeng Wang
  - Waisman Center, University of Wisconsin-Madison, Madison, WI, 53705, USA.
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, 53076, USA.
  - Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI, 53076, USA.
4
Hatamikia S, Nougaret S, Panico C, Avesani G, Nero C, Boldrini L, Sala E, Woitek R. Ovarian cancer beyond imaging: integration of AI and multiomics biomarkers. Eur Radiol Exp 2023; 7:50. [PMID: 37700218 PMCID: PMC10497482 DOI: 10.1186/s41747-023-00364-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/04/2023] [Accepted: 06/19/2023] [Indexed: 09/14/2023] Open
Abstract
High-grade serous ovarian cancer is the most lethal gynaecological malignancy. Detailed molecular studies have revealed marked intra-patient heterogeneity at the tumour microenvironment level, likely contributing to poor prognosis. Despite large quantities of clinical, molecular and imaging data on ovarian cancer being accumulated worldwide and the rise of high-throughput computing, data frequently remain siloed and are thus inaccessible for integrated analyses. Only a minority of studies on ovarian cancer have set out to harness artificial intelligence (AI) for the integration of multiomics data and for developing powerful algorithms that capture the characteristics of ovarian cancer at multiple scales and levels. Clinical data, serum markers, and imaging data were most frequently used, followed by genomics and transcriptomics. The current literature proves that integrative multiomics approaches outperform models based on single data types and indicates that imaging can be used for the longitudinal tracking of tumour heterogeneity in space and potentially over time. This review presents an overview of studies that integrated two or more data types to develop AI-based classifiers or prediction models.
Relevance statement: Integrative multiomics models for ovarian cancer outperform models using single data types for classification, prognostication, and predictive tasks.
Key points:
• This review presents studies using multiomics and artificial intelligence in ovarian cancer.
• Current literature proves that integrative multiomics outperform models using single data types.
• Around 60% of studies used a combination of imaging with clinical data.
• The combination of genomics and transcriptomics with imaging data was infrequently used.
Affiliation(s)
- Sepideh Hatamikia
  - Research Center for Medical Image Analysis and AI (MIAAI), Danube Private University, Krems, Austria.
  - Austrian Center for Medical Innovation and Technology (ACMIT), Wiener Neustadt, Austria.
- Stephanie Nougaret
  - Department of Radiology, Montpellier Cancer Institute, University of Montpellier, Montpellier, France
- Camilla Panico
  - Dipartimento di Diagnostica Per Immagini, Radioterapia Oncologica Ed Ematologia, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Giacomo Avesani
  - Dipartimento di Diagnostica Per Immagini, Radioterapia Oncologica Ed Ematologia, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Camilla Nero
  - Scienze Della Salute Della Donna, del bambino e Di Sanità Pubblica, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Luca Boldrini
  - Dipartimento di Diagnostica Per Immagini, Radioterapia Oncologica Ed Ematologia, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Evis Sala
  - Dipartimento di Diagnostica Per Immagini, Radioterapia Oncologica Ed Ematologia, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Ramona Woitek
  - Research Center for Medical Image Analysis and AI (MIAAI), Danube Private University, Krems, Austria
  - Department of Radiology, University of Cambridge, Cambridge, UK
  - Cancer Research UK Cambridge Centre, University of Cambridge, Cambridge, UK
5
Salimy S, Lanjanian H, Abbasi K, Salimi M, Najafi A, Tapak L, Masoudi-Nejad A. A deep learning-based framework for predicting survival-associated groups in colon cancer by integrating multi-omics and clinical data. Heliyon 2023; 9:e17653. [PMID: 37455955 PMCID: PMC10344710 DOI: 10.1016/j.heliyon.2023.e17653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 05/30/2023] [Accepted: 06/25/2023] [Indexed: 07/18/2023] Open
Abstract
Precise prognostic classification of patients and identification of survival subgroups and their associated genes can be important clinical references when designing treatment strategies for cancer patients. Multi-omics and data integration techniques are powerful tools to achieve this goal. This study aimed to introduce a machine learning method to integrate three types of biological data, and to investigate the performance of two other methods, in identifying the survival dependency of patients. The data included TCGA RNA-seq gene expression, DNA methylation, and clinical data from 368 patients with colon cancer; an independent external validation data set containing 232 samples was also used. Three methods, namely hyper-parameter optimized autoencoders (HPOAE), a normal autoencoder, and penalized principal component analysis (PPCA), were used for simultaneous data integration and estimation under a Cox hazards model. The HPOAE was thought to outperform other methods. The HPOAE had a log-rank (Mantel-Cox) value of 14.27 ± 2 and a Breslow (generalized Wilcoxon) value of 13.13 ± 1. Ten miRNAs, 11 methylated genes, and 28 mRNAs, all with an importance of marginal cutoff > 0.95, were identified. The study demonstrated that hsa-miR-485-5p targets both ZMYM1 and TP53, the latter of which has been previously associated with cancer in numerous studies. Furthermore, compared to other methods, the HPOAE exhibited a greater capacity for identifying survival subgroups and the genes associated with them in patients with colon cancer. However, all of the results were obtained by computational methods, and clinical and experimental studies are needed to validate these results.
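The log-rank (Mantel-Cox) value quoted in this abstract is a chi-square statistic comparing survival between groups. The sketch below shows the textbook two-group computation on toy follow-up data; it is an illustration under stated assumptions, not the study's pipeline, and the function name and all inputs are hypothetical.

```python
def logrank_chi2(times_a, events_a, times_b, events_b):
    """Two-group log-rank chi-square statistic (1 degree of freedom).

    times_*: follow-up times; events_*: 1 for an observed event, 0 for censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for ti, _, _ in data if ti >= t)               # at risk, both groups
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        d = sum(1 for ti, e, _ in data if ti == t and e == 1)    # events at time t
        d_a = sum(1 for ti, e, g in data if ti == t and e == 1 and g == 0)
        # Observed minus expected events in group A at this time point.
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    return obs_minus_exp ** 2 / variance if variance > 0 else 0.0


# Toy data (assumption): group A fails early, group B fails late, no censoring.
chi2 = logrank_chi2([1, 2], [1, 1], [3, 4], [1, 1])
print(round(chi2, 4))
```

Larger values indicate stronger separation between the two survival curves; the p-value would come from a chi-square distribution with one degree of freedom.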
Affiliation(s)
- Siamak Salimy
  - Laboratory of System Biology and Bioinformatics (LBB), Department of Bioinformatics, University of Tehran, Kish International Campus, Kish, Iran
- Hossein Lanjanian
  - Cellular and Molecular Endocrine Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Karim Abbasi
  - Laboratory of System Biology, Bioinformatics & Artificial Intelligent in Medicine (LBBai), Faculty of Mathematics and Computer Science, Kharazmi University, Tehran, Iran
- Mahdieh Salimi
  - Department of Medical Genetics, Institute of Medical Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Ali Najafi
  - Molecular Biology Research Center, Systems Biology and Poisonings Institute, Tehran, Iran
- Leili Tapak
  - Department of Biostatistics, School of Public Health and Modeling of Noncommunicable Diseases Research Center, Hamadan University of Medical Sciences, Hamadan, Iran
- Ali Masoudi-Nejad
  - Laboratory of System Biology and Bioinformatics (LBB), Department of Bioinformatics, University of Tehran, Kish International Campus, Kish, Iran
6
Hao Y, Jing XY, Sun Q. Cancer survival prediction by learning comprehensive deep feature representation for multiple types of genetic data. BMC Bioinformatics 2023; 24:267. [PMID: 37380946 DOI: 10.1186/s12859-023-05392-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2023] [Accepted: 06/19/2023] [Indexed: 06/30/2023] Open
Abstract
BACKGROUND Cancer is one of the leading causes of death around the world. Accurate prediction of patient survival time is important because it can help clinicians choose appropriate therapeutic schemes. Cancer data can be characterized by varied molecular features, clinical behaviors and morphological appearances. However, cancer heterogeneity usually makes patient samples with different risks (i.e., short and long survival time) inseparable, thereby causing unsatisfactory prediction results. Clinical studies have shown that genetic data tend to contain more molecular biomarkers associated with cancer, and hence integrating multi-type genetic data may be a feasible way to deal with cancer heterogeneity. Although multi-type gene data have been used in existing work, how to learn more effective features for cancer survival prediction has not been well studied. RESULTS To this end, we propose a deep learning approach to reduce the negative impact of cancer heterogeneity and improve cancer survival prediction. It represents each type of genetic data as shared and specific features, which can capture the consensus and complementary information among all types of data. We collect mRNA expression, DNA methylation and microRNA expression data for four cancers to conduct experiments. CONCLUSIONS Experimental results demonstrate that our approach substantially outperforms established integrative methods and is effective for cancer survival prediction. AVAILABILITY AND IMPLEMENTATION https://github.com/githyr/ComprehensiveSurvival.
Affiliation(s)
- Yaru Hao
  - School of Computer Science, Wuhan University, Wuhan, China.
- Xiao-Yuan Jing
  - School of Computer Science, Wuhan University, Wuhan, China.
  - School of Computer, Guangdong University of Petrochemical Technology, Maoming, China.
  - State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, China.
- Qixing Sun
  - School of Computer Science, Wuhan University, Wuhan, China
7
Du X, Zhao Y. Multimodal adversarial representation learning for breast cancer prognosis prediction. Comput Biol Med 2023; 157:106765. [PMID: 36963355 DOI: 10.1016/j.compbiomed.2023.106765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Revised: 02/27/2023] [Accepted: 03/07/2023] [Indexed: 03/17/2023]
Abstract
With the increasing incidence of breast cancer, accurate prognosis prediction for breast cancer patients is a key issue in current cancer research, and it is also of great significance for patients' psychological rehabilitation and for assisting clinical decision-making. Many studies that integrate data from heterogeneous modalities, such as gene expression profiles, clinical data, and copy number alterations, have achieved greater success in prognostic prediction than those using only one modality. However, many existing approaches fail to dramatically reduce the modality gap by aligning multimodal distributions. Therefore, it is crucial to develop a method that fully considers a modality-invariant embedding space to effectively integrate multimodal data. In this study, to reduce the modality gap, we propose a multimodal data adversarial representation framework (MDAR) that reduces modal heterogeneity by translating source modalities into distributions for the target modality. Additionally, we apply reconstruction and classification losses to the embedding space to further constrain it. Then, we design a multi-scale bilinear convolutional neural network (MS-B-CNN) for each single modality to improve the feature expression ability. In addition, the embedding space generates predictions that serve as stacked feature inputs to the extremely randomized trees classifier. With 10-fold cross-validation, our results show that the proposed adversarial representation learning improves prognostic performance. A comparative study of this method and other existing methods on the METABRIC dataset (1980 patients) showed that the Matthews correlation coefficient (MCC) was significantly enhanced by 7.4% in the prognosis prediction of breast cancer patients.
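For readers unfamiliar with the Matthews correlation coefficient used as the headline metric here, the standard binary definition can be computed directly from a confusion table. The sketch below is a generic illustration with a hypothetical function name and toy counts; it does not reproduce the study's evaluation.

```python
import math


def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient for a binary confusion table."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the table is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy counts (assumption, not METABRIC results).
print(matthews_corrcoef(tp=10, fp=0, fn=0, tn=10))  # perfect agreement
print(matthews_corrcoef(tp=5, fp=5, fn=5, tn=5))    # chance-level agreement
```

The coefficient ranges from -1 to 1 and, unlike accuracy, stays informative when the two outcome classes are imbalanced, which is common in survival-based prognosis labels.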
Affiliation(s)
- Xiuquan Du
  - Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei, China
  - School of Computer Science and Technology, Anhui University, Hefei, China.
- Yuefan Zhao
  - School of Computer Science and Technology, Anhui University, Hefei, China
8
Sheehy J, Rutledge H, Acharya UR, Loh HW, Gururajan R, Tao X, Zhou X, Li Y, Gurney T, Kondalsamy-Chennakesavan S. Gynecological cancer prognosis using machine learning techniques: A systematic review of last three decades (1990–2022). Artif Intell Med 2023; 139:102536. [PMID: 37100507 DOI: 10.1016/j.artmed.2023.102536] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 10/27/2021] [Revised: 03/19/2023] [Accepted: 03/23/2023] [Indexed: 03/30/2023]
Abstract
OBJECTIVE Many Computer Aided Prognostic (CAP) systems based on machine learning techniques have been proposed in the field of oncology. The objective of this systematic review was to assess and critically appraise the methodologies and approaches used in predicting the prognosis of gynecological cancers using CAPs. METHODS Electronic databases were used to systematically search for studies utilizing machine learning methods in gynecological cancers. Study risk of bias (ROB) and applicability were assessed using the PROBAST tool. 139 studies met the inclusion criteria, of which 71 predicted outcomes for ovarian cancer patients, 41 predicted outcomes for cervical cancer patients, 28 predicted outcomes for uterine cancer patients, and 2 predicted outcomes for gynecological malignancies broadly. RESULTS Random forest (22.30 %) and support vector machine (21.58 %) classifiers were used most commonly. Use of clinicopathological, genomic and radiomic data as predictors was observed in 48.20 %, 51.08 % and 17.27 % of studies, respectively, with some studies using multiple modalities. 21.58 % of studies were externally validated. Twenty-three individual studies compared ML and non-ML methods. Study quality was highly variable and methodologies, statistical reporting and outcome measures were inconsistent, preventing generalized commentary or meta-analysis of performance outcomes. CONCLUSION There is significant variability in model development when prognosticating gynecological malignancies with respect to variable selection, machine learning (ML) methods and endpoint selection. This heterogeneity prevents meta-analysis and conclusions regarding the superiority of ML methods. Furthermore, PROBAST-mediated ROB and applicability analysis demonstrates concern for the translatability of existing models. This review identifies ways that this can be improved upon in future works to develop robust, clinically translatable models within this promising field.
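The percentages in this abstract can be turned back into approximate study counts. The sketch below assumes that every percentage uses the 139 included studies as its denominator, which the abstract does not state explicitly; the resulting counts are therefore implied values, not figures reported by the authors.

```python
TOTAL_STUDIES = 139  # studies meeting the inclusion criteria, per the abstract

# Percentages as reported in the abstract.
reported_percent = {
    "random forest": 22.30,
    "support vector machine": 21.58,
    "clinicopathological predictors": 48.20,
    "genomic predictors": 51.08,
    "radiomic predictors": 17.27,
    "externally validated": 21.58,
}


def implied_count(percent, total=TOTAL_STUDIES):
    """Nearest whole number of studies matching a reported percentage."""
    return round(percent / 100 * total)


implied = {name: implied_count(p) for name, p in reported_percent.items()}
for name, count in implied.items():
    # Recompute the percentage to confirm the count reproduces the reported value.
    print(f"{name}: {count} studies ({count / TOTAL_STUDIES * 100:.2f} %)")
```

The predictor categories sum to more than 100 %, which is consistent with the abstract's remark that some studies used multiple modalities.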
9
Kartsova LA, Bessonova EA, Deev VA, Kolobova EA. Current Role of Modern Chromatography with Mass Spectrometry and Nuclear Magnetic Resonance Spectroscopy in the Investigation of Biomarkers of Endometriosis. Crit Rev Anal Chem 2023:1-24. [PMID: 36625278 DOI: 10.1080/10408347.2022.2156770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2023]
Abstract
Endometriosis has a wide range of clinical manifestations, and the disease course is unpredictable, making diagnosis a challenging task. Despite significant advances in understanding the pathophysiology of endometriosis and the various theories proposed, the exact etiology is still not fully understood. The most commonly used biomarker of endometriosis is CA-125; however, it is nonspecific and is also used in cancer diagnosis. Therefore, the development of reliable noninvasive tests for the early diagnosis of endometriosis remains one of the top priorities. Omics technologies are very promising approaches for constructing diagnostic models and for biomarker discovery. Their use can greatly facilitate the study of a disease as complex as endometriosis. Powerful analytical platforms commonly used in omics, such as gas and liquid chromatography with mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy, have proven to be promising tools for biomarker discovery. The aim of this review is to summarize the various features of the analytical approaches, and the practical challenges, of gas and liquid chromatography with MS and NMR spectroscopy (including sample processing protocols, technological advancements, and methodology) used for profiling metabolites, lipids, peptides and proteins in physiological fluids and tissues from patients with endometriosis. In addition, this report devotes special attention to how comprehensive analyses of these profiles can effectively contribute to the study of endometriosis. The search covered reports published between 2012 and 2022 in PubMed, Web-of-Science, SCOPUS, and Science Direct.
Affiliation(s)
- Ekaterina Alekseevna Kolobova
  - Institute of Chemistry, St. Petersburg State University, St. Petersburg, Russia
  - The Federal State Institute of Public Health 'The Nikiforov Russian Center of Emergency and Radiation Medicine', The Ministry of Russian Federation for Civil Defence, Emergencies and Elimination of Consequences of Natural Disasters, St. Petersburg, Russia
10
Liao J, Li X, Gan Y, Han S, Rong P, Wang W, Li W, Zhou L. Artificial intelligence assists precision medicine in cancer treatment. Front Oncol 2023; 12:998222. [PMID: 36686757 PMCID: PMC9846804 DOI: 10.3389/fonc.2022.998222] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/19/2022] [Accepted: 11/22/2022] [Indexed: 01/06/2023] Open
Abstract
Cancer is a major medical problem worldwide. Due to its high heterogeneity, the same drugs or surgical methods may have different curative effects in patients with the same tumor, creating a need for more accurate treatment methods for tumors and personalized treatments for patients. Precise treatment of tumors is essential, which makes it urgent to obtain an in-depth understanding of the changes that tumors undergo, including changes in their genes, proteins and cancer cell phenotypes, in order to develop targeted treatment strategies for patients. Artificial intelligence (AI) based on big data can extract the hidden patterns, important information, and corresponding knowledge behind enormous amounts of data. For example, machine learning (ML) and deep learning, which are subsets of AI, can be used to mine the deep-level information in genomics, transcriptomics, proteomics, radiomics, digital pathological images, and other data, helping clinicians understand tumors comprehensively. In addition, AI can find new biomarkers in data to assist tumor screening, detection, diagnosis, treatment and prognosis prediction, so as to provide the best treatment for individual patients and improve their clinical outcomes.
Affiliation(s)
- Jinzhuang Liao
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Xiaoying Li
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Yu Gan
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Shuangze Han
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China
| | - Pengfei Rong
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China; Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China. *Correspondence: Pengfei Rong; Wei Wang; Wei Li; Li Zhou
| | - Wei Wang
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China; Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China. *Correspondence: Pengfei Rong; Wei Wang; Wei Li; Li Zhou
| | - Wei Li
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China; Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China. *Correspondence: Pengfei Rong; Wei Wang; Wei Li; Li Zhou
| | - Li Zhou
- Department of Radiology, The Third Xiangya Hospital of Central South University, Changsha, Hunan, China; Cell Transplantation and Gene Therapy Institute, The Third Xiangya Hospital, Central South University, Changsha, Hunan, China; Department of Pathology, The Xiangya Hospital of Central South University, Changsha, Hunan, China. *Correspondence: Pengfei Rong; Wei Wang; Wei Li; Li Zhou
| |
|
11
|
Omics Data and Data Representations for Deep Learning-Based Predictive Modeling. Int J Mol Sci 2022; 23:ijms232012272. [PMID: 36293133 PMCID: PMC9603455 DOI: 10.3390/ijms232012272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2022] [Revised: 10/03/2022] [Accepted: 10/12/2022] [Indexed: 11/25/2022]
Abstract
Medical discoveries mainly depend on the capability to process and analyze biological datasets, which inundate the scientific community and are still expanding as the cost of next-generation sequencing technologies decreases. Deep learning (DL) is a viable method to exploit this massive data stream, since it has advanced quickly through successive innovations. However, an obstacle to scientific progress emerges: DL is difficult to apply to biology because both fields are evolving at a breakneck pace, which makes it hard for an individual to stay at the forefront of both. This paper aims to bridge the gap and help computer scientists bring their valuable expertise into the life sciences. This work provides an overview of the most common types of biological data and data representations that are used to train DL models, with additional information on the models themselves and the various tasks that are being tackled. This is the essential information a DL expert with no background in biology needs in order to participate in DL-based research projects in biomedicine, biotechnology, and drug discovery. Alternatively, this study could also be useful to researchers in biology who want to understand and utilize the power of DL to gain better insights into, and extract important information from, omics data.
|
12
|
Guarino A, Lettieri N, Malandrino D, Zaccagnino R, Capo C. Adam or Eve? Automatic users’ gender classification via gestures analysis on touch devices. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07454-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/10/2023]
Abstract
Gender classification of mobile device users has drawn a great deal of attention for its applications in healthcare, smart spaces, biometric-based access control systems and customization of user interfaces (UIs). Previous works have shown that authentication systems can be more effective when considering soft biometric traits such as gender, while others have highlighted the significance of this trait for enhancing UIs. This paper presents a novel machine learning-based approach to gender classification that uses only touch-gesture information derived from smartphones’ APIs. To identify the most useful gesture, and combinations thereof, for gender classification, we have considered two strategies: single-view learning, which analyzes, one at a time, datasets relating to a single type of gesture, and multi-view learning, which analyzes together datasets describing different types of gestures. This is one of the first works to apply such a strategy for gender recognition via gesture analysis on mobile devices. The methods have been evaluated on a large dataset of gestures collected through a mobile application, which includes not only scrolls, swipes, and taps but also pinch-to-zooms and drag-and-drops, which are mostly overlooked in the literature. In contrast to the previous literature, we have also tested the solution in different scenarios, thus providing a more comprehensive evaluation. The experimental results show that scroll down is the most useful gesture and random forest is the most convenient classifier for gender classification. Depending on the (combination of) gestures taken into account, we have obtained F1-scores of up to 0.89 in the validation phase and 0.85 in the testing phase. Furthermore, the multi-view approach is recommended when dealing with unknown devices, and combinations of gestures can be effectively adopted according to the requirements of the system into which our solution is built. The proposed solutions turn out to be both an opportunity for gender-aware technologies and a potential risk deriving from unwanted gender classification.
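For readers unfamiliar with the F1-score quoted in this abstract, the following is a minimal sketch of how the metric is conventionally computed for a binary classifier, as the harmonic mean of precision and recall. The function name `f1_score` and the toy labels are illustrative assumptions; they are not taken from the cited study, which does not describe its evaluation code here.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1-score (harmonic mean of precision and recall)
    for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (hypothetical, not data from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
print(f1_score(y_true, y_pred))
```

In this toy case there are four true positives, one false positive and one false negative, so precision and recall are both 0.8 and so is the F1-score.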
|
13
|
Kaur I, Doja M, Ahmad T. Data Mining and Machine Learning in Cancer Survival Research: An Overview and Future Recommendations. J Biomed Inform 2022; 128:104026. [DOI: 10.1016/j.jbi.2022.104026] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/13/2021] [Revised: 02/07/2022] [Accepted: 02/09/2022] [Indexed: 12/29/2022]
|
14
|
Kang M, Ko E, Mersha TB. A roadmap for multi-omics data integration using deep learning. Brief Bioinform 2022; 23:bbab454. [PMID: 34791014 PMCID: PMC8769688 DOI: 10.1093/bib/bbab454] [Citation(s) in RCA: 79] [Impact Index Per Article: 39.5] [Received: 07/27/2021] [Revised: 09/30/2021] [Accepted: 10/05/2021] [Indexed: 12/18/2022]
Abstract
High-throughput next-generation sequencing now makes it possible to generate a vast amount of multi-omics data for various applications. These data have revolutionized biomedical research by providing a more comprehensive understanding of the biological systems and molecular mechanisms of disease development. Recently, deep learning (DL) algorithms have become one of the most promising methods in multi-omics data analysis, due to their predictive performance and capability of capturing nonlinear and hierarchical features. While integrating and translating multi-omics data into useful functional insights remains the biggest bottleneck, there is a clear trend towards incorporating multi-omics analysis in biomedical research to help explain the complex relationships between molecular layers. Multi-omics data have a role in improving prevention, early detection and prediction; monitoring progression; interpreting patterns and endotyping; and designing personalized treatments. In this review, we outline a roadmap of multi-omics integration using DL and offer a practical perspective on the advantages, challenges and barriers to the implementation of DL in multi-omics data.
Affiliation(s)
- Mingon Kang
- Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
| | - Euiseong Ko
- Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
| | - Tesfaye B Mersha
- Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
| |
|
15
|
Anklam E, Bahl MI, Ball R, Beger RD, Cohen J, Fitzpatrick S, Girard P, Halamoda-Kenzaoui B, Hinton D, Hirose A, Hoeveler A, Honma M, Hugas M, Ishida S, Kass GEN, Kojima H, Krefting I, Liachenko S, Liu Y, Masters S, Marx U, McCarthy T, Mercer T, Patri A, Pelaez C, Pirmohamed M, Platz S, Ribeiro AJS, Rodricks JV, Rusyn I, Salek RM, Schoonjans R, Silva P, Svendsen CN, Sumner S, Sung K, Tagle D, Tong L, Tong W, van den Eijnden-van-Raaij J, Vary N, Wang T, Waterton J, Wang M, Wen H, Wishart D, Yuan Y, Slikker Jr. W. Emerging technologies and their impact on regulatory science. Exp Biol Med (Maywood) 2022; 247:1-75. [PMID: 34783606 PMCID: PMC8749227 DOI: 10.1177/15353702211052280] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]
Abstract
There is an evolving and increasing need to use emerging cellular, molecular and in silico technologies and novel approaches for the safety assessment of food, drugs, and personal care products. Convergence of these emerging technologies is also enabling rapid advances and approaches that may impact regulatory decisions and approvals. Although the development of emerging technologies may allow rapid advances in regulatory decision making, there is concern that these new technologies have not been thoroughly evaluated to determine if they are ready for regulatory application, singly or in combination. The magnitude of these combined technical advances may outpace the ability to assess fitness for purpose and to allow routine application of these new methods for regulatory purposes. There is a need to develop strategies to evaluate the new technologies to determine which ones are ready for regulatory use. Applying these potentially faster, more accurate, and cost-effective approaches remains an important goal, and evaluation strategies are needed to facilitate their incorporation into regulatory use. However, without a clear strategy to evaluate emerging technologies rapidly and appropriately, the value of these efforts may go unrecognized or may take longer to be realized. It is important for the regulatory science field to keep up with the research in these technically advanced areas and to understand the science behind these new approaches. The regulatory field must understand the critical quality attributes of these novel approaches, and regulators must learn from each other's experience so that workforces can be trained to prepare for emerging global regulatory challenges. Moreover, it is essential that the regulatory community work with the technology developers to harness collective capabilities towards developing a strategy for evaluation of these novel assessment tools.
Affiliation(s)
- Reza M Salek
- International Agency for Research on Cancer, France
| | - Li Tong
- Universities of Georgia Tech and Emory, USA
| | - Neil Vary
- Canadian Food Inspection Agency, Canada
| | - Tao Wang
- National Medical Products Administration, China
| | - May Wang
- Universities of Georgia Tech and Emory, USA
| | - Hairuo Wen
- National Institutes for Food and Drug Control, China
| |
|
16
|
Arslan E, Schulz J, Rai K. Machine Learning in Epigenomics: Insights into Cancer Biology and Medicine. Biochim Biophys Acta Rev Cancer 2021; 1876:188588. [PMID: 34245839 PMCID: PMC8595561 DOI: 10.1016/j.bbcan.2021.188588] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/09/2021] [Revised: 05/29/2021] [Accepted: 07/02/2021] [Indexed: 02/01/2023]
Abstract
The recent deluge of genome-wide technologies for mapping the epigenome, and the resulting data from cancer samples, has provided the opportunity to gain insights into and understand the roles of epigenetic processes in cancer. However, the complexity, high dimensionality, sparsity, and noise associated with these data pose challenges for extensive integrative analyses. Machine Learning (ML) algorithms are particularly suited for epigenomic data analyses due to their flexibility and ability to learn underlying hidden structures. We will discuss four overlapping but distinct major categories under ML: dimensionality reduction, unsupervised methods, supervised methods, and deep learning (DL). We review the preferred use cases of these algorithms in analyses of cancer epigenomics data in the hope of providing an overview of how ML approaches can be used to explore fundamental questions on the roles of the epigenome in cancer biology and medicine.
Affiliation(s)
- Emre Arslan
- Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, United States of America
| | - Jonathan Schulz
- Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, United States of America
| | - Kunal Rai
- Department of Genomic Medicine, MD Anderson Cancer Center, Houston, TX 77030, United States of America.
| |
|
17
|
Reska D, Czajkowski M, Jurczuk K, Boldak C, Kwedlo W, Bauer W, Koszelew J, Kretowski M. Integration of solutions and services for multi-omics data analysis towards personalized medicine. Biocybern Biomed Eng 2021. [DOI: 10.1016/j.bbe.2021.10.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
18
|
SurvCNN: A Discrete Time-to-Event Cancer Survival Estimation Framework Using Image Representations of Omics Data. Cancers (Basel) 2021; 13:cancers13133106. [PMID: 34206288 PMCID: PMC8269306 DOI: 10.3390/cancers13133106] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2021] [Revised: 06/02/2021] [Accepted: 06/16/2021] [Indexed: 01/04/2023]
Abstract
The utility of multi-omics in personalized therapy and cancer survival analysis has been debated and demonstrated extensively in the recent past. Most of the current methods still suffer from data constraints such as high-dimensionality, unexplained interdependence, and subpar integration methods. Here, we propose SurvCNN, an alternative approach to process multi-omics data with robust computer vision architectures, to predict cancer prognosis for Lung Adenocarcinoma patients. Numerical multi-omics data were transformed into their image representations and fed into a Convolutional Neural network with a discrete-time model to predict survival probabilities. The framework also dichotomized patients into risk subgroups based on their survival probabilities over time. SurvCNN was evaluated on multiple performance metrics and outperformed existing methods with a high degree of confidence. Moreover, comprehensive insights into the relative performance of various combinations of omics datasets were probed. Critical biological processes, pathways and cell types identified from downstream processing of differentially expressed genes suggested that the framework could elucidate elements detrimental to a patient's survival. Such integrative models with high predictive power would have a significant impact and utility in precision oncology.
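The abstract above mentions a discrete-time model that outputs survival probabilities and splits patients into risk subgroups. As a rough illustration of the general idea only, the sketch below converts per-interval conditional hazards into a survival curve using the standard relation S(t_k) = (1 - h_1)(1 - h_2)...(1 - h_k), then assigns a risk group by thresholding survival at a chosen interval. The hazard parameterization, the function names `survival_curve` and `risk_group`, the 0.5 threshold and the example hazards are all assumptions made for illustration; the abstract does not specify how SurvCNN implements either step.

```python
def survival_curve(hazards):
    """Turn conditional hazards h_k (probability of the event in interval k,
    given survival up to k) into cumulative survival probabilities S(t_k)."""
    curve = []
    surviving = 1.0
    for h in hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError("each hazard must lie in [0, 1]")
        surviving *= 1.0 - h
        curve.append(surviving)
    return curve


def risk_group(hazards, interval, threshold=0.5):
    """Label a patient 'high' risk if predicted survival at the given
    interval index falls below `threshold` (a hypothetical cut-off)."""
    return "high" if survival_curve(hazards)[interval] < threshold else "low"


# Hypothetical model output for one patient over three time intervals.
hazards = [0.1, 0.2, 0.5]
print(survival_curve(hazards))
print(risk_group(hazards, interval=2))
```

A model of this kind would typically produce one hazard vector per patient; the grouping step is then a post-processing choice that is independent of the network architecture.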
|
19
|
Venugopalan J, Tong L, Hassanzadeh HR, Wang MD. Multimodal deep learning models for early detection of Alzheimer's disease stage. Sci Rep 2021; 11:3254. [PMID: 33547343 PMCID: PMC7864942 DOI: 10.1038/s41598-020-74399-w] [Citation(s) in RCA: 106] [Impact Index Per Article: 35.3] [Received: 08/28/2018] [Accepted: 01/22/2020] [Indexed: 02/06/2023]
Abstract
Most current Alzheimer's disease (AD) and mild cognitive disorders (MCI) studies use a single data modality to make predictions such as AD stages. The fusion of multiple data modalities can provide a holistic view of AD staging analysis. Thus, we use deep learning (DL) to integrally analyze imaging (magnetic resonance imaging (MRI)), genetic (single nucleotide polymorphisms (SNPs)), and clinical test data to classify patients into AD, MCI, and controls (CN). We use stacked denoising auto-encoders to extract features from clinical and genetic data, and use 3D convolutional neural networks (CNNs) for imaging data. We also develop a novel data interpretation method to identify top-performing features learned by the deep models with clustering and perturbation analysis. Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, we demonstrate that deep models outperform shallow models, including support vector machines, decision trees, random forests, and k-nearest neighbors. In addition, we demonstrate that integrating multi-modality data outperforms single-modality models in terms of accuracy, precision, recall, and mean F1 scores. Our models have identified the hippocampus and amygdala brain areas and the Rey Auditory Verbal Learning Test (RAVLT) as the top distinguishing features, which is consistent with the known AD literature.
Affiliation(s)
- Janani Venugopalan
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
| | - Li Tong
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
| | - Hamid Reza Hassanzadeh
- School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA
| | - May D Wang
- Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA.
- School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA.
- Winship Cancer Institute, Parker H. Petit Institute for Bioengineering and Biosciences, Institute of People and Technology, Georgia Institute of Technology and Emory University, Atlanta, GA, USA.
| |
|