1
Chen H, Lu D, Xiao Z, Li S, Zhang W, Luan X, Zhang W, Zheng G. Comprehensive applications of the artificial intelligence technology in new drug research and development. Health Inf Sci Syst 2024; 12:41. [PMID: 39130617 PMCID: PMC11310389 DOI: 10.1007/s13755-024-00300-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 07/27/2024] [Indexed: 08/13/2024]
Abstract
Purpose Target-based strategy is a prevalent means of drug research and development (R&D), since targets provide effector molecules of drug action and offer the foundation of pharmacological investigation. Recently, artificial intelligence (AI) technology has been utilized in various stages of drug R&D, where AI-assisted experimental methods show higher efficiency than purely experimental ones. A comprehensive review of AI applications in drug R&D is critically needed for the biopharmaceutical field. Methods Relevant literature on AI-assisted drug R&D was collected from public databases (including Google Scholar, Web of Science, PubMed, IEEE Xplore Digital Library, Springer, and ScienceDirect) through a keyword search strategy with the following terms: [("Artificial Intelligence" OR "Knowledge Graph" OR "Machine Learning") AND ("Drug Target Identification" OR "New Drug Development")]. Results In this review, we first introduce common strategies and novel trends in drug R&D, followed by a description of the characteristics of AI algorithms widely used in drug R&D. Subsequently, we describe detailed applications of AI algorithms in target identification, lead compound identification and optimization, drug repurposing, and drug analytical platform construction. Finally, we discuss the challenges and prospects of AI-assisted methods for drug discovery. Conclusion Collectively, this review provides a comprehensive overview of AI applications in drug R&D and presents future perspectives for the biopharmaceutical field, which may promote the development of the drug industry.
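The keyword search strategy in the Methods combines two groups of terms: alternatives within a group are joined with OR, and the two groups are joined with AND. A minimal sketch of how such a query string could be assembled is shown here; the function name `build_query` and the list names are illustrative assumptions, and the sketch does not reflect how any particular database parses its search syntax.

```python
def build_query(topic_terms, task_terms):
    """Join each term list with OR, then combine the two groups with AND."""
    def group(terms):
        # Quote every term so multi-word phrases are searched as phrases.
        return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"
    return "[" + group(topic_terms) + " AND " + group(task_terms) + "]"

# Term lists copied from the abstract; the variable names are hypothetical.
AI_TERMS = ["Artificial Intelligence", "Knowledge Graph", "Machine Learning"]
DRUG_TERMS = ["Drug Target Identification", "New Drug Development"]

QUERY = build_query(AI_TERMS, DRUG_TERMS)
print(QUERY)
```

Keeping the two term groups as separate lists makes it easy to add a synonym to one group without touching the other.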
Affiliation(s)
- Hongyu Chen
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Dong Lu
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Ziyi Xiao
- Johns Hopkins Bloomberg School of Public Health, Baltimore, MD USA
- Shensuo Li
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Wen Zhang
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Xin Luan
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Weidong Zhang
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
- Guangyong Zheng
- Shanghai Frontiers Science Center for Chinese Medicine Chemical Biology, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai, China
2
Mok ETY, Chitty JL, Cox TR. miRNAs in pancreatic cancer progression and metastasis. Clin Exp Metastasis 2024; 41:163-186. [PMID: 38240887 PMCID: PMC11213741 DOI: 10.1007/s10585-023-10256-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/22/2023] [Accepted: 12/06/2023] [Indexed: 06/30/2024]
Abstract
Small non-coding RNAs, or microRNAs (miRNAs), are critical regulators of eukaryotic cells. Dysregulation of miRNA expression and function has been linked to a variety of diseases including cancer. miRNAs play a complex role in cancers, having both tumour suppressor and promoter properties. In addition, a single miRNA can be involved in regulating several mRNAs, and many miRNAs can regulate a single mRNA; assessing these roles is therefore essential to a better understanding of cancer initiation and development. Pancreatic cancer is a leading cause of cancer death worldwide, in part due to the lack of diagnostic tools and limited treatment options. The most common form of pancreatic cancer, pancreatic ductal adenocarcinoma (PDAC), is characterised by major genetic mutations that drive cancer initiation and progression. The regulation or interaction of miRNAs with these cancer driving mutations suggests a strong link between the two. Understanding this link between miRNA and PDAC progression may give rise to novel treatments or diagnostic tools. This review summarises the role of miRNAs in PDAC, the downstream signalling pathways in which they play a role, and how they are being used and studied as therapeutic targets as well as prognostic/diagnostic tools to improve the clinical outcome of PDAC.
Affiliation(s)
- Ellie T Y Mok
- Matrix & Metastasis Lab, Cancer Ecosystems Program, The Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Darlinghurst, NSW, Australia
- School of Clinical Medicine, St Vincent's Healthcare Clinical Campus, UNSW Medicine and Health, UNSW Sydney, Sydney, NSW, Australia
- Jessica L Chitty
- Matrix & Metastasis Lab, Cancer Ecosystems Program, The Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Darlinghurst, NSW, Australia.
- School of Clinical Medicine, St Vincent's Healthcare Clinical Campus, UNSW Medicine and Health, UNSW Sydney, Sydney, NSW, Australia.
- Thomas R Cox
- Matrix & Metastasis Lab, Cancer Ecosystems Program, The Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Darlinghurst, NSW, Australia.
- School of Clinical Medicine, St Vincent's Healthcare Clinical Campus, UNSW Medicine and Health, UNSW Sydney, Sydney, NSW, Australia.
3
Chakraborty S, Sharma G, Karmakar S, Banerjee S. Multi-OMICS approaches in cancer biology: New era in cancer therapy. Biochim Biophys Acta Mol Basis Dis 2024; 1870:167120. [PMID: 38484941 DOI: 10.1016/j.bbadis.2024.167120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 03/06/2024] [Accepted: 03/06/2024] [Indexed: 04/01/2024]
Abstract
Innovative multi-omics frameworks integrate diverse datasets from the same patients to enhance our understanding of the molecular and clinical aspects of cancers. Advanced omics and multi-view clustering algorithms present unprecedented opportunities for classifying cancers into subtypes, refining survival predictions and treatment outcomes, and unravelling key pathophysiological processes across various molecular layers. However, with the increasing availability of cost-effective high-throughput technologies (HTT) that generate vast amounts of data, analyzing single layers often falls short of establishing causal relations. Integrating multi-omics data spanning genomes, epigenomes, transcriptomes, proteomes, metabolomes, and microbiomes offers unique prospects to comprehend the underlying biology of complex diseases like cancer. This discussion explores algorithmic frameworks designed to uncover cancer subtypes, disease mechanisms, and methods for identifying pivotal genomic alterations. It also underscores the significance of multi-omics in tumor classifications, diagnostics, and prognostications. Despite its unparalleled advantages, the integration of multi-omics data has been slow to find its way into everyday clinics. A major hurdle is the uneven maturity of different omics approaches and the widening gap between the generation of large datasets and the capacity to process this data. Initiatives promoting the standardization of sample processing and analytical pipelines, as well as multidisciplinary training for experts in data analysis and interpretation, are crucial for translating theoretical findings into practical applications.
Affiliation(s)
- Sohini Chakraborty
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Gaurav Sharma
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Sricheta Karmakar
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India
- Satarupa Banerjee
- Department of Biotechnology, School of Biosciences and Technology, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India.
4
Mondello A, Dal Bo M, Toffoli G, Polano M. Machine learning in onco-pharmacogenomics: a path to precision medicine with many challenges. Front Pharmacol 2024; 14:1260276. [PMID: 38264526 PMCID: PMC10803549 DOI: 10.3389/fphar.2023.1260276] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/17/2023] [Accepted: 12/26/2023] [Indexed: 01/25/2024]
Abstract
Over the past two decades, Next-Generation Sequencing (NGS) has revolutionized the approach to cancer research. Applications of NGS include the identification of tumor specific alterations that can influence tumor pathobiology and also impact diagnosis, prognosis and therapeutic options. Pharmacogenomics (PGx) studies the role of inheritance of individual genetic patterns in drug response and has taken advantage of NGS technology as it provides access to high-throughput data that can, however, be difficult to manage. Machine learning (ML) has recently been used in the life sciences to discover hidden patterns from complex NGS data and to solve various PGx problems. In this review, we provide a comprehensive overview of the NGS approaches that can be employed and the different PGx studies implicating the use of NGS data. We also provide an excursus of the ML algorithms that can exert a role as fundamental strategies in the PGx field to improve personalized medicine in cancer.
Affiliation(s)
- Maurizio Polano
- Experimental and Clinical Pharmacology Unit, Centro di Riferimento Oncologico di Aviano (CRO), Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Aviano, Italy
5
Wu Y, Seufert I, Al-Shaheri FN, Kurilov R, Bauer AS, Manoochehri M, Moskalev EA, Brors B, Tjaden C, Giese NA, Hackert T, Büchler MW, Hoheisel JD. DNA-methylation signature accurately differentiates pancreatic cancer from chronic pancreatitis in tissue and plasma. Gut 2023; 72:2344-2353. [PMID: 37709492 PMCID: PMC10715533 DOI: 10.1136/gutjnl-2023-330155] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/24/2023] [Accepted: 08/31/2023] [Indexed: 09/16/2023]
Abstract
OBJECTIVE Pancreatic ductal adenocarcinoma (PDAC) is a lethal malignancy. Differentiation from chronic pancreatitis (CP) is currently inaccurate in about one-third of cases. Misdiagnoses in both directions, however, have severe consequences for patients. We set out to identify molecular markers for a clear distinction between PDAC and CP. DESIGN Genome-wide variations of DNA-methylation, messenger RNA and microRNA level as well as combinations thereof were analysed in 345 tissue samples for marker identification. To improve diagnostic performance, we established a random-forest machine-learning approach. Results were validated on another 48 samples and further corroborated in 16 liquid biopsy samples. RESULTS Machine-learning succeeded in defining markers to differentiate between patients with PDAC and CP, while low-dimensional embedding and cluster analysis failed to do so. DNA-methylation yielded the best diagnostic accuracy by far, dwarfing the importance of transcript levels. Identified changes were confirmed with data taken from public repositories and validated in independent sample sets. A signature of six DNA-methylation sites in a CpG-island of the protein kinase C beta type gene achieved a validated diagnostic accuracy of 100% in tissue and in circulating free DNA isolated from patient plasma. CONCLUSION The success of machine-learning in identifying an effective marker signature documents the power of this approach. The high diagnostic accuracy in discriminating PDAC from CP could have tremendous consequences for treatment success, once the results from the still limited number of liquid biopsy samples are confirmed in a larger cohort of patients with suspected pancreatic cancer.
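The abstract names a random-forest approach but gives no implementation details. The sketch below illustrates only the general idea behind such ensembles (bootstrap resampling, random feature subsets, majority voting), using one-split decision stumps in place of full decision trees. It is a much-simplified stand-in, not the authors' method: the methylation beta values, the three hypothetical CpG sites, and all function names are invented for illustration.

```python
import random

def fit_stump(X, y, features):
    """Return the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive observed values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for polarity in (1, -1):
                errors = sum(
                    (1 if polarity * (row[f] - t) > 0 else 0) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, polarity)
    return best[1:] if best else None

def predict_stump(stump, row):
    f, t, polarity = stump
    return 1 if polarity * (row[f] - t) > 0 else 0

def fit_forest(X, y, n_trees=25, n_features=2, seed=0):
    """Bagging: each stump sees a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]          # bootstrap sample
        feats = rng.sample(range(len(X[0])), n_features)    # random feature subset
        stump = fit_stump([X[i] for i in idx], [y[i] for i in idx], feats)
        if stump is not None:
            forest.append(stump)
    return forest

def predict_forest(forest, row):
    """Majority vote over all stumps (1 = PDAC, 0 = CP in this toy example)."""
    votes = sum(predict_stump(s, row) for s in forest)
    return 1 if 2 * votes > len(forest) else 0

# Invented beta values at three hypothetical CpG sites; labels: 0 = CP, 1 = PDAC.
X = [
    [0.10, 0.15, 0.85], [0.12, 0.20, 0.90], [0.18, 0.11, 0.80], [0.15, 0.17, 0.88],
    [0.85, 0.80, 0.12], [0.90, 0.88, 0.15], [0.82, 0.86, 0.10], [0.88, 0.83, 0.19],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(X, y)
print(predict_forest(forest, [0.87, 0.84, 0.13]))
```

The toy classes are deliberately well separated, so the example says nothing about real diagnostic accuracy; a real analysis would use full decision trees and held-out validation samples, as the abstract describes.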
Affiliation(s)
- Yenan Wu
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
- Isabelle Seufert
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Faculty of Biosciences, Heidelberg University, Heidelberg, Germany
- Fawaz N Al-Shaheri
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Medical Faculty Heidelberg, University of Heidelberg, Heidelberg, Germany
- Roman Kurilov
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Andrea S Bauer
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Mehdi Manoochehri
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Evgeny A Moskalev
- Institute of Pathology, Universitätsklinikum Erlangen, Friedrich Alexander Universität Erlangen-Nürnberg, Erlangen, Germany
- Benedikt Brors
- Division of Applied Bioinformatics, German Cancer Research Center (DKFZ), Heidelberg, Germany
- Christin Tjaden
- Department of Surgery, Heidelberg University Hospital, Heidelberg, Germany
- Nathalia A Giese
- Department of Surgery, Heidelberg University Hospital, Heidelberg, Germany
- Thilo Hackert
- Department of Surgery, Heidelberg University Hospital, Heidelberg, Germany
- Markus W Büchler
- Department of Surgery, Heidelberg University Hospital, Heidelberg, Germany
- Jörg D Hoheisel
- Division of Functional Genome Analysis, German Cancer Research Center (DKFZ), Heidelberg, Germany
6
Fawaz A, Ferraresi A, Isidoro C. Systems Biology in Cancer Diagnosis Integrating Omics Technologies and Artificial Intelligence to Support Physician Decision Making. J Pers Med 2023; 13:1590. [PMID: 38003905 PMCID: PMC10672164 DOI: 10.3390/jpm13111590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 11/07/2023] [Accepted: 11/08/2023] [Indexed: 11/26/2023]
Abstract
Cancer is the second major cause of disease-related death worldwide, and its accurate early diagnosis and therapeutic intervention are fundamental for saving the patient's life. Cancer, as a complex and heterogeneous disorder, results from the disruption and alteration of a wide variety of biological entities, including genes, proteins, mRNAs, miRNAs, and metabolites, that eventually emerge as clinical symptoms. Traditionally, diagnosis is based on clinical examination, blood tests for biomarkers, the histopathology of a biopsy, and imaging (MRI, CT, PET, and US). Additionally, omics biotechnologies help to further characterize the genome, metabolome, microbiome traits of the patient that could have an impact on the prognosis and patient's response to the therapy. The integration of all these data relies on gathering of several experts and may require considerable time, and, unfortunately, it is not without the risk of error in the interpretation and therefore in the decision. Systems biology algorithms exploit Artificial Intelligence (AI) combined with omics technologies to perform a rapid and accurate analysis and integration of patient's big data, and support the physician in making diagnosis and tailoring the most appropriate therapeutic intervention. However, AI is not free from possible diagnostic and prognostic errors in the interpretation of images or biochemical-clinical data. Here, we first describe the methods used by systems biology for combining AI with omics and then discuss the potential, challenges, limitations, and critical issues in using AI in cancer research.
Affiliation(s)
- Ciro Isidoro
- Laboratory of Molecular Pathology, Department of Health Sciences, Università del Piemonte Orientale, 28100 Novara, Italy
7
Chen Y, Meng J, Lu X, Li X, Wang C. Clustering analysis revealed the autophagy classification and potential autophagy regulators' sensitivity of pancreatic cancer based on multi-omics data. Cancer Med 2023; 12:733-746. [PMID: 35684936 PMCID: PMC9844610 DOI: 10.1002/cam4.4932] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/07/2022] [Revised: 05/06/2022] [Accepted: 05/24/2022] [Indexed: 02/05/2023]
Abstract
BACKGROUND Pancreatic ductal adenocarcinoma (PDAC) is a lethal malignancy that is unresponsive to conventional therapeutic modalities because of its high heterogeneity, which underscores the need to search for effective biomarkers and drugs. Autophagy, an evolutionarily conserved biological process, is upregulated in PDAC, and its regulation is linked to a poor prognosis. Increased autophagy sequesters MHC-I on PDAC cells and weakens antigen presentation and the antitumor immune response, pointing to autophagy inhibitors as a potential therapeutic strategy. METHODS Using 10 state-of-the-art multi-omics clustering algorithms, we constructed a robust PDAC classification model to reveal the autophagy-related genes among different subgroups. OUTCOMES After building a more comprehensive regulatory network to explore potential autophagy regulators, we identified the top 20 autophagy-related hub genes (GAPDH, MAPK3, RHEB, SQSTM1, EIF2S1, RAB5A, CTSD, MAP1LC3B, RAB7A, RAB11A, FADD, CFKN2A, HSP90AB1, VEGFA, RELA, DDIT3, HSPA5, BCL2L1, BAG3, and ERBB2), six miRNAs, five transcription factors, and five infiltrating immune cell types as biomarkers. The drug sensitivity database was screened based on these biomarkers to predict possible drug-targeting signaling pathways, with the aim of yielding novel insights and advancing anticancer therapeutic strategies. CONCLUSION We successfully constructed an autophagy-related mRNA/miRNA/TF/immune cell network based on a multi-omics analysis with 10 state-of-the-art algorithms, and screened the drug sensitivity dataset to detect potential signaling pathways that might be targets of autophagy modulators.
Affiliation(s)
- Yonghao Chen
- Department of Gastroenterology, West China Hospital of Sichuan University, Chengdu, Sichuan, P.R. China
- Jialin Meng
- Department of Urology, The First Affiliated Hospital of Anhui Medical University, Hefei, P.R. China
- Institute of Urology, Anhui Medical University, Hefei, P.R. China
- Anhui Province Key Laboratory of Genitourinary Diseases, Anhui Medical University, Hefei, P.R. China
- Xiaofan Lu
- State Key Laboratory of Natural Medicines, Research Center of Biostatistics and Computational Pharmacy, China Pharmaceutical University, Nanjing, P.R. China
- Xiao Li
- Department of Gastroenterology, West China Hospital of Sichuan University, Chengdu, Sichuan, P.R. China
- Chunhui Wang
- Department of Gastroenterology, West China Hospital of Sichuan University, Chengdu, Sichuan, P.R. China
8
Ma C, Wu M, Ma S. Analysis of cancer omics data: a selective review of statistical techniques. Brief Bioinform 2022; 23:6510158. [PMID: 35039832 DOI: 10.1093/bib/bbab585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Revised: 12/19/2021] [Accepted: 12/20/2021] [Indexed: 11/13/2022]
Abstract
Cancer is an omics disease. The development in high-throughput profiling has fundamentally changed cancer research and clinical practice. Compared with clinical, demographic and environmental data, the analysis of omics data, which has higher dimensionality, weaker signals and more complex distributional properties, is much more challenging. Developments in the literature are often 'scattered', with individual studies focused on one or a few closely related methods. The goal of this review is to assist cancer researchers with limited statistical expertise in establishing the 'overall framework' of cancer omics data analysis. To facilitate understanding, we mainly focus on intuition, concepts and key steps, and refer readers to the original publications for mathematical details. This review broadly covers unsupervised and supervised analysis, as well as individual-gene-based, gene-set-based and gene-network-based analysis. We also briefly discuss 'special topics' including interaction analysis, multi-datasets analysis and multi-omics analysis.
Affiliation(s)
- Chenjin Ma
- College of Statistics and Data Science, Faculty of Science, Beijing University of Technology, Beijing, China
- Mengyun Wu
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Shuangge Ma
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
9
The Impact of Biomarkers in Pancreatic Ductal Adenocarcinoma on Diagnosis, Surveillance and Therapy. Cancers (Basel) 2022; 14:cancers14010217. [PMID: 35008381 PMCID: PMC8750069 DOI: 10.3390/cancers14010217] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 11/30/2021] [Revised: 12/21/2021] [Accepted: 12/22/2021] [Indexed: 12/24/2022]
Abstract
Simple Summary Pancreatic ductal adenocarcinoma is a leading cause of cancer death worldwide. Because of frequently late diagnosis, early metastasis and high therapy resistance, cure is rare and the prognosis remains poor overall. To provide early diagnostic and therapeutic predictors, various molecules from blood, tissue and other sources (e.g., saliva, urine and stool) have been identified as biomarkers. This review summarizes current trends in biomarkers for diagnosis and therapy of pancreatic ductal adenocarcinoma. Abstract Pancreatic ductal adenocarcinoma (PDAC) is still difficult to treat due to insufficient methods for early diagnosis and prediction of therapy response. Furthermore, surveillance after curatively intended surgery lacks adequate methods for timely detection of recurrence. Therefore, several molecules have been analyzed as predictors of recurrence or early detection of PDAC. Enhanced understanding of molecular tumorigenesis and treatment response triggered the identification of novel biomarkers as predictors for response to conventional chemotherapy or targeted therapy. In conclusion, progress has been made especially in the prediction of therapy response with biomarkers. The use of molecules for early detection and recurrence of PDAC is still at an early stage, but there are promising approaches in noninvasive biomarkers, composite panels and scores that can already improve current clinical practice. The present review summarizes the current state of research on biomarkers for diagnosis and therapy of pancreatic cancer.
10
Güven E. Gene Expression Characteristics of Tumor and Adjacent Non-Tumor Tissues of Pancreatic Ductal Adenocarcinoma (PDAC) In-Silico. Iranian Journal of Biotechnology 2022; 20:e3092. [PMID: 35891953 PMCID: PMC9284245 DOI: 10.30498/ijb.2021.292558.3092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
BACKGROUND Pancreatic ductal adenocarcinoma (PDAC) is one of the deadliest and most prevalent cancers. Microarray analysis has become an important tool in research on PDAC genes and targeted therapeutic drugs. OBJECTIVES This study aims to identify promising prognostic biomarkers and targets in PDAC using the GSE78229 and GSE62452 datasets, which are publicly accessible in the Gene Expression Omnibus database. MATERIALS AND METHODS Using the GEOquery, Biobase, gplots, and ggplot2 packages in R, this study detected 428 differentially expressed genes, which were then used to build a co-expression network by weighted correlation network analysis (WGCNA). The turquoise module showed a higher correlation with PDAC progression. Seventy-nine candidate genes were selected based on the co-expression and protein-protein interaction (PPI) networks. In addition, functional enrichment analysis was performed. RESULTS Five significant KEGG pathways linked to PDAC were detected, among which the endoplasmic reticulum protein processing pathway was noted as vital. Nineteen hub genes (HSPA4, PABPC1, HSP90B1, PPP1CC, USP9X, EIF2S3, MSN, RAB10, BMPR2, P4HB, UBC, B2M, SLC25A5, MMP7, SPTBN1, RALB, DNAJB1, CENPE, and PDIA6) were identified with the NetworkAnalyst web tool, based on the PPI network from STRING. These were identified as the most connected hub proteins. The expression levels of the real hub genes and their association with overall survival (OS) were analyzed with the Kaplan-Meier (KM) plotter using The Cancer Genome Atlas Program (TCGA) database.
CONCLUSIONS The protein-protein interactions and KEGG pathway enrichment analysis by DAVID indicated that several pathways were involved in PDAC, such as "pathways in cancer (hsa05200)", "protein processing in the endoplasmic reticulum (hsa04141)", "antigen processing and presentation (hsa04612)", "dopaminergic synapse (hsa04728)", and "measles (hsa05162)"; among these pathways, "protein processing in endoplasmic reticulum (hsa04141)" was studied further because of its close relationship with PDAC. The remaining hub genes reviewed in the study might be promising targets for diagnosing and treating PDAC and related diseases.
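The survival analysis mentioned in the Results relies on the Kaplan-Meier estimator, which multiplies, at each observed event time, the fraction of at-risk patients who survive that time: S(t) is the product of (1 - d_i / n_i) over event times t_i up to t. The following is a minimal sketch of that estimator on invented follow-up data; it is not the KM plotter tool used in the study, and the function and variable names are illustrative assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed death.

    times  : follow-up time per patient
    events : 1 if death was observed at that time, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group everyone who leaves the risk set at time t (deaths and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Invented follow-up times (months) and event indicators for six patients.
TIMES = [5, 8, 8, 12, 15, 20]
EVENTS = [1, 1, 0, 1, 0, 1]

CURVE = kaplan_meier(TIMES, EVENTS)
for t, s in CURVE:
    print(t, round(s, 4))
```

Patients censored at a given time are still counted as at risk for deaths at that same time, which is the usual convention. Comparing curves between high- and low-expression groups of a hub gene would additionally require a test such as the log-rank test, which is not shown here.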
Affiliation(s)
- Emine Güven
- Department of Biomedical Engineering, Engineering Faculty, Düzce University, Yörük/Düzce Merkez/Düzce, 81620, Turkey
11
Sindhu KJ, Venkatesan N, Karunagaran D. MicroRNA Interactome Multiomics Characterization for Cancer Research and Personalized Medicine: An Expert Review. OMICS: A Journal of Integrative Biology 2021; 25:545-566. [PMID: 34448651 DOI: 10.1089/omi.2021.0087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/12/2022]
Abstract
MicroRNAs (miRNAs) that are mutually modulated by their interacting partners (interactome) are being increasingly noted for their significant role in pathogenesis and treatment of various human cancers. Recently, the miRNA interactome dissected with multiomics approaches has been the subject of focus, since individual tools or methods failed to provide the necessary comprehensive clues on the complete interactome. Even though single-omics technologies such as proteomics can uncover part of the interactome, the biological and clinical understanding still remains incomplete. In this study, we present an expert review of studies involving multiomics approaches to identification of the miRNA interactome and its application in mechanistic characterization, classification, and therapeutic target identification in a variety of cancers, with a focus on proteomics. We also discuss individual or multiple miRNA-based interactome identification in various pathological conditions of relevance to clinical medicine. Various new single-omics methods that can be integrated into multiomics cancer research and the computational approaches to analyze and predict the miRNA interactome are also highlighted in this review. In all, we contextualize the power of multiomics approaches and the importance of the miRNA interactome to achieve the vision and practice of predictive, preventive, and personalized medicine in cancer research and clinical oncology.
Affiliation(s)
- K J Sindhu
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Nalini Venkatesan
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Devarajan Karunagaran
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
12
Ding Q, Sun Y, Shang J, Li F, Zhang Y, Liu JX. NMFNA: A Non-negative Matrix Factorization Network Analysis Method for Identifying Modules and Characteristic Genes of Pancreatic Cancer. Front Genet 2021; 12:678642. [PMID: 34367241 PMCID: PMC8340025 DOI: 10.3389/fgene.2021.678642] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/10/2021] [Accepted: 06/03/2021] [Indexed: 01/15/2023]
Abstract
Pancreatic cancer (PC) is a highly fatal disease, yet its causes remain unclear. Comprehensive analysis of different types of PC genetic data plays a crucial role in understanding its pathogenic mechanisms. Currently, non-negative matrix factorization (NMF)-based methods are widely used for genetic data analysis. Nevertheless, it is a challenge for them to integrate and decompose different types of genetic data simultaneously. In this paper, an NMF network analysis method, NMFNA, is proposed, which introduces a graph-regularized constraint to the NMF, for identifying modules and characteristic genes from two-type PC data of methylation (ME) and copy number variation (CNV). Firstly, three PC networks, i.e., ME network, CNV network, and ME-CNV network, are constructed using the Pearson correlation coefficient (PCC). Then, modules are detected from these three PC networks effectively due to the introduced graph-regularized constraint, which is the highlight of the NMFNA. Finally, both gene ontology (GO) and pathway enrichment analyses are performed, and characteristic genes are detected by the multimeasure score, to deeply understand biological functions of PC core modules. Experimental results demonstrated that the NMFNA facilitates the integration and decomposition of two types of PC data simultaneously and can further serve as an alternative method for detecting modules and characteristic genes from multiple genetic data of complex diseases.
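The first step described above, building a gene network from the Pearson correlation coefficient (PCC), can be sketched as follows. The abstract does not say how correlations are turned into edges, so the absolute-value cutoff of 0.8 is an assumption made only for this illustration, as are the gene labels and profile values; the graph-regularized NMF step itself is not shown.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlation_network(profiles, threshold=0.8):
    """Adjacency dict linking genes whose |PCC| meets the (assumed) threshold."""
    genes = list(profiles)
    edges = {g: set() for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(profiles[g], profiles[h])) >= threshold:
                edges[g].add(h)
                edges[h].add(g)
    return edges

# Invented per-sample measurements for four hypothetical genes.
PROFILES = {
    "GENE_A": [1, 2, 3, 4, 5],
    "GENE_B": [2, 4, 6, 8, 10],
    "GENE_C": [5, 4, 3, 2, 1],
    "GENE_D": [3, 1, 4, 1, 5],
}

NETWORK = correlation_network(PROFILES)
print(NETWORK)
```

Using the absolute value treats strong negative correlation as a link too; whether the original method does this is not stated in the abstract.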
Affiliation(s)
- Qian Ding
- School of Computer Science, Qufu Normal University, Rizhao, China
| | - Yan Sun
- School of Computer Science, Qufu Normal University, Rizhao, China
| | - Junliang Shang
- School of Computer Science, Qufu Normal University, Rizhao, China
| | - Feng Li
- School of Computer Science, Qufu Normal University, Rizhao, China
| | - Yuanyuan Zhang
- School of Information and Control Engineering, Qingdao University of Technology, Qingdao, China
| | - Jin-Xing Liu
- School of Computer Science, Qufu Normal University, Rizhao, China
| |
|
13
|
Reel PS, Reel S, Pearson E, Trucco E, Jefferson E. Using machine learning approaches for multi-omics data analysis: A review. Biotechnol Adv 2021; 49:107739. [PMID: 33794304 DOI: 10.1016/j.biotechadv.2021.107739] [Citation(s) in RCA: 284] [Impact Index Per Article: 94.7] [Received: 12/15/2020] [Revised: 03/01/2021] [Accepted: 03/25/2021] [Indexed: 02/06/2023]
Abstract
With the development of modern high-throughput omic measurement platforms, it has become essential for biomedical studies to undertake an integrative (combined) approach to fully utilise these data to gain insights into biological systems. Data from various omics sources such as genetics, proteomics, and metabolomics can be integrated to unravel the intricate working of systems biology using machine learning-based predictive algorithms. Machine learning methods offer novel techniques to integrate and analyse the various omics data enabling the discovery of new biomarkers. These biomarkers have the potential to help in accurate disease prediction, patient stratification and delivery of precision medicine. This review paper explores different integrative machine learning methods which have been used to provide an in-depth understanding of biological systems during normal physiological functioning and in the presence of a disease. It provides insight and recommendations for interdisciplinary professionals who envisage employing machine learning skills in multi-omics studies.
Affiliation(s)
- Parminder S Reel
- Division of Population Health and Genomics, School of Medicine, University of Dundee, Dundee, United Kingdom
| | - Smarti Reel
- Division of Population Health and Genomics, School of Medicine, University of Dundee, Dundee, United Kingdom
| | - Ewan Pearson
- Division of Population Health and Genomics, School of Medicine, University of Dundee, Dundee, United Kingdom
| | - Emanuele Trucco
- VAMPIRE project, Computing, School of Science and Engineering, University of Dundee, Dundee, United Kingdom
| | - Emily Jefferson
- Division of Population Health and Genomics, School of Medicine, University of Dundee, Dundee, United Kingdom.
| |
|
14
|
Hira MT, Razzaque MA, Angione C, Scrivens J, Sawan S, Sarker M. Integrated multi-omics analysis of ovarian cancer using variational autoencoders. Sci Rep 2021; 11:6265. [PMID: 33737557 PMCID: PMC7973750 DOI: 10.1038/s41598-021-85285-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 08/19/2020] [Accepted: 02/28/2021] [Indexed: 02/06/2023] Open
Abstract
Cancer is a complex disease that deregulates cellular functions at various molecular levels (e.g., DNA, RNA, and proteins). Integrated multi-omics analysis of data from these levels is necessary to understand the aberrant cellular functions accountable for cancer and its development. In recent years, Deep Learning (DL) approaches have become a useful tool in integrated multi-omics analysis of cancer data. However, high-dimensional multi-omics data are generally imbalanced, with too many molecular features and relatively few patient samples. This imbalance makes DL-based integrated multi-omics analysis difficult. DL-based dimensionality reduction techniques, including the variational autoencoder (VAE), are a potential solution for balancing high-dimensional multi-omics data. However, there are few VAE-based integrated multi-omics analyses, and they are limited to pan-cancer studies. In this work, we did an integrated multi-omics analysis of ovarian cancer using the compressed features learned through VAE and an improved version of VAE, namely Maximum Mean Discrepancy VAE (MMD-VAE). First, we designed and developed a DL architecture for VAE and MMD-VAE. Then we used the architecture for mono-omics, integrated di-omics and tri-omics data analysis of ovarian cancer through cancer sample identification, molecular subtype clustering and classification, and survival analysis. The results show that MMD-VAE and VAE-based compressed features can respectively classify the transcriptional subtypes of the TCGA datasets with an accuracy in the range of 93.2-95.5% and 87.1-95.7%. Also, survival analysis results show that VAE and MMD-VAE based compressed representations of omics data can be used in cancer prognosis. Based on the results, we can conclude that (i) VAE and MMD-VAE outperform existing dimensionality reduction techniques, (ii) integrated multi-omics analyses perform better than or similarly to their mono-omics counterparts, and (iii) MMD-VAE performs better than VAE in most omics datasets.
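The quantity that gives MMD-VAE its name can be sketched in a few lines. The example below computes the standard biased estimate of the squared maximum mean discrepancy between two samples with a Gaussian (RBF) kernel. In the usual MMD-VAE formulation this term compares encoded latent vectors against draws from the prior; the kernel choice, the `gamma` value, and the function names here are assumptions and are not taken from the paper.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian kernel exp(-gamma * ||x - y||^2) for two equal-length vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * d2)

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    def mean_k(a, b):
        return sum(rbf(p, q, gamma) for p in a for q in b) / (len(a) * len(b))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)

# Two toy point clouds in 2-D: identical samples give 0, shifted samples do not.
latent = [(0.0, 0.0), (1.0, 0.0)]
shifted = [(5.0, 5.0), (6.0, 5.0)]
same = mmd2(latent, latent)
different = mmd2(latent, shifted)
```

A value near zero means the two samples look alike under the kernel, which is why the term can serve as a penalty that pulls the latent distribution toward the prior.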
Affiliation(s)
- Muta Tah Hira
- School of Health and Life Sciences, Teesside University, Middlesbrough, TS4 3BX, UK
| | - M A Razzaque
- School of Computing, Eng. & Digital Tech., Teesside University, Middlesbrough, TS4 3BX, UK.
| | - Claudio Angione
- School of Computing, Eng. & Digital Tech., Teesside University, Middlesbrough, TS4 3BX, UK
| | - James Scrivens
- School of Health and Life Sciences, Teesside University, Middlesbrough, TS4 3BX, UK
| | - Saladin Sawan
- The James Cook University Hospital, Middlesbrough, TS4 3BW, UK
| | - Mosharraf Sarker
- School of Health and Life Sciences, Teesside University, Middlesbrough, TS4 3BX, UK
| |
|
15
|
Roy S, Singh AP, Gupta D. Unsupervised subtyping and methylation landscape of pancreatic ductal adenocarcinoma. Heliyon 2021; 7:e06000. [PMID: 33521362 PMCID: PMC7820567 DOI: 10.1016/j.heliyon.2021.e06000] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/07/2020] [Revised: 11/14/2020] [Accepted: 01/12/2021] [Indexed: 02/06/2023] Open
Abstract
Pancreatic Ductal Adenocarcinoma (PDAC) is an aggressive form of pancreatic cancer that typically manifests itself at an advanced stage and does not respond to most treatment modalities. The survival rate of a PDAC patient is less than 5%, with a median survival of just a couple of months. A better understanding of the molecular pathology of PDAC is needed to guide research for the development of better clinical treatment modalities for PDAC patients. Gene expression studies performed to date have identified different subtypes of PDAC with prognostic and clinical relevance. Subtypes identified to date are highly heterogeneous since pancreatic cancer is a heterogeneous cancer. Tumor microenvironment and stroma constitute a major part of PDAC and contribute to the heterogeneity. Better subtyping methods are urgently needed for better prognosis and classification of PDAC for future personalized treatment. In this work, we have performed an integrated analysis of DNA methylation and gene expression datasets to provide better mechanistic and molecular insights into pancreatic cancers, especially PDAC. The use of varied and diverse datasets has provided valuable insights into different cancer types and can play an integral role in revealing the complex nature of underlying biological mechanisms. We performed subtyping of TCGA-PAAD gene expression and methylation datasets into different subtypes using state-of-the-art normalization methods and unsupervised clustering methods that reveal latent hidden factors, leading to additional insights for subtyping. Differential expression and differential methylation analyses were performed for each of the subtypes obtained from clustering. Our analysis yielded a consensus five-cluster solution with relevant pathways such as MAPK and MET. The five subtypes corresponded to the tumor and stromal subtypes. This analysis helps in distinguishing and identifying different subtypes based on enriched putative genes. These results help propose novel experimentally verifiable PDAC subtyping and demonstrate that using varied data sets and integrated methods can contribute to disease prognostication and precision medicine in PDAC treatment.
Affiliation(s)
- Shikha Roy
- Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, India
| | - Amar Pratap Singh
- Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, India
| | - Dinesh Gupta
- Translational Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, New Delhi, India
| |
|
16
|
Baek B, Lee H. Prediction of survival and recurrence in patients with pancreatic cancer by integrating multi-omics data. Sci Rep 2020; 10:18951. [PMID: 33144687 PMCID: PMC7609582 DOI: 10.1038/s41598-020-76025-1] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 04/17/2020] [Accepted: 10/20/2020] [Indexed: 01/08/2023] Open
Abstract
Predicting the prognosis of pancreatic cancer is important because of the very low survival rates of patients with this particular cancer. Although several studies have used microRNA and gene expression profiles and clinical data, as well as images of tissues and cells, to predict cancer survival and recurrence, the accuracies of these approaches in the prediction of high-risk pancreatic adenocarcinoma (PAAD) still need to be improved. Accordingly, in this study, we proposed two biological features based on multi-omics datasets to predict survival and recurrence among patients with PAAD. First, the clonal expansion of cancer cells with somatic mutations was used to predict prognosis. Using whole-exome sequencing data from 134 patients with PAAD from The Cancer Genome Atlas (TCGA), we found five candidate genes that were mutated in the early stages of tumorigenesis with high cellular prevalence (CP). CDKN2A, TP53, TTN, KCNJ18, and KRAS had the highest CP values among the patients with PAAD, and survival and recurrence rates were significantly different between the patients harboring mutations in these candidate genes and those harboring mutations in other genes (p = 2.39E-03, p = 8.47E-04, respectively). Second, we generated an autoencoder to integrate the RNA sequencing, microRNA sequencing, and DNA methylation data from 134 patients with PAAD from TCGA. The autoencoder robustly reduced the dimensions of these multi-omics data, and the K-means clustering method was then used to cluster the patients into two subgroups. The subgroups of patients had significant differences in survival and recurrence (p = 1.41E-03, p = 4.43E-04, respectively). Finally, we developed a prediction model for prognosis using these two biological features and clinical data. When support vector machines, random forest, logistic regression, and L2 regularized logistic regression were used as prediction models, logistic regression analysis generally showed the best performance for both disease-free survival (DFS) and overall survival (OS) (accuracy [ACC] = 0.762 and area under the curve [AUC] = 0.795 for DFS; ACC = 0.776 and AUC = 0.769 for OS). Thus, we could classify patients with a high probability of recurrence and at a high risk of poor outcomes. Our study provides insights into new personalized therapies on the basis of mutation status and multi-omics data.
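The two metrics reported in this abstract, accuracy (ACC) and area under the ROC curve (AUC), can be computed as in the following sketch. It uses the rank-based definition of AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half). The labels, scores, and the 0.5 decision threshold are invented for illustration and are unrelated to the study's data.

```python
def accuracy(y_true, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(1 for t, p in zip(y_true, preds) if t == p) / len(y_true)

def auc(y_true, scores):
    """Rank-based AUC: P(score of a positive > score of a negative), ties = 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical recurrence labels (1 = recurrence) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
acc_value = accuracy(labels, scores)
auc_value = auc(labels, scores)
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why studies such as this one often report both.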
Affiliation(s)
- Bin Baek
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, Korea
| | - Hyunju Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, Korea.
- Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, Gwangju, 61005, Korea.
| |
|
17
|
Biswas N, Chakrabarti S. Artificial Intelligence (AI)-Based Systems Biology Approaches in Multi-Omics Data Analysis of Cancer. Front Oncol 2020; 10:588221. [PMID: 33154949 PMCID: PMC7591760 DOI: 10.3389/fonc.2020.588221] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 07/28/2020] [Accepted: 09/21/2020] [Indexed: 12/13/2022] Open
Abstract
Cancer is the manifestation of abnormalities of different physiological processes involving genes, DNAs, RNAs, proteins, and other biomolecules whose profiles are reflected in different omics data types. As these bio-entities are highly correlated, integrative analysis of different types of omics data, i.e., multi-omics data, is required to understand the disease from tumorigenesis to disease progression. Artificial intelligence (AI), specifically machine learning algorithms, has the ability to make decisive interpretations of "big"-sized complex data and, hence, appears to be the most effective tool for the analysis and understanding of multi-omics data for patient-specific observations. In this review, we have discussed the recent outcomes of employing AI in multi-omics data analysis of different types of cancer. Based on the research trends and significance in patient treatment, we have primarily focused on AI-based analysis for determining cancer subtypes, disease prognosis, and therapeutic targets. We have also discussed AI analysis of some non-canonical types of omics data, as they can play a determining role in cancer patient care. Additionally, we have briefly discussed the data repositories because of their pivotal role in multi-omics data storage, processing, and analysis.
Affiliation(s)
- Nupur Biswas
- Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, IICB TRUE Campus, Kolkata, India
| | - Saikat Chakrabarti
- Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, IICB TRUE Campus, Kolkata, India
| |
|
18
|
Martín-Blázquez A, Jiménez-Luna C, Díaz C, Martínez-Galán J, Prados J, Vicente F, Melguizo C, Genilloud O, Pérez del Palacio J, Caba O. Discovery of Pancreatic Adenocarcinoma Biomarkers by Untargeted Metabolomics. Cancers (Basel) 2020; 12:E1002. [PMID: 32325731 PMCID: PMC7225994 DOI: 10.3390/cancers12041002] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/25/2020] [Revised: 04/08/2020] [Accepted: 04/13/2020] [Indexed: 12/12/2022] Open
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is one of the most aggressive and lethal cancers, with a 5-year survival rate of less than 5%. In fact, complete surgical resection remains the only curative treatment. However, fewer than 20% of patients are candidates for surgery at the time of presentation. Hence, there is a critical need to identify diagnostic biomarkers with potential clinical utility in this pathology. In this context, metabolomics could be a powerful tool to search for new robust biomarkers. Comparative metabolomic profiling was performed in serum samples from 59 unresectable PDAC patients and 60 healthy controls. Samples were analyzed by using an untargeted metabolomics workflow based on liquid chromatography, coupled to high-resolution mass spectrometry in positive and negative electrospray ionization modes. Univariate and multivariate analysis allowed the identification of potential candidates that were significantly altered in PDAC patients. A panel of nine candidates yielded excellent diagnostic capacities. Pathway analysis revealed four altered pathways in our patients. This study shows the potential of liquid chromatography coupled to high-resolution mass spectrometry as a diagnostic tool for PDAC. Furthermore, it identified novel robust biomarkers with excellent diagnostic capacities.
Affiliation(s)
- Ariadna Martín-Blázquez
- Fundación MEDINA, Centro de Excelencia en Investigación de Medicamentos Innovadores en Andalucía, 18016 Granada, Spain; (A.M.-B.); (C.D.); (F.V.); (O.G.); (J.P.d.P.)
| | - Cristina Jiménez-Luna
- Department of Oncology, Ludwig Institute for Cancer Research, University of Lausanne, 1066 Epalinges, Switzerland;
- Institute of Biopathology and Regenerative Medicine (IBIMER), Center of Biomedical Research (CIBM), University of Granada, 18016 Granada, Spain; (C.M.); (O.C.)
| | - Caridad Díaz
- Fundación MEDINA, Centro de Excelencia en Investigación de Medicamentos Innovadores en Andalucía, 18016 Granada, Spain; (A.M.-B.); (C.D.); (F.V.); (O.G.); (J.P.d.P.)
| | - Joaquina Martínez-Galán
- Service of Medical Oncology, Hospital Universitario Virgen de las Nieves, 18014 Granada, Spain;
| | - Jose Prados
- Institute of Biopathology and Regenerative Medicine (IBIMER), Center of Biomedical Research (CIBM), University of Granada, 18016 Granada, Spain; (C.M.); (O.C.)
- Instituto Biosanitario de Granada (ibs. GRANADA), 18016 Granada, Spain
| | - Francisca Vicente
- Fundación MEDINA, Centro de Excelencia en Investigación de Medicamentos Innovadores en Andalucía, 18016 Granada, Spain; (A.M.-B.); (C.D.); (F.V.); (O.G.); (J.P.d.P.)
| | - Consolación Melguizo
- Institute of Biopathology and Regenerative Medicine (IBIMER), Center of Biomedical Research (CIBM), University of Granada, 18016 Granada, Spain; (C.M.); (O.C.)
- Instituto Biosanitario de Granada (ibs. GRANADA), 18016 Granada, Spain
| | - Olga Genilloud
- Fundación MEDINA, Centro de Excelencia en Investigación de Medicamentos Innovadores en Andalucía, 18016 Granada, Spain; (A.M.-B.); (C.D.); (F.V.); (O.G.); (J.P.d.P.)
| | - José Pérez del Palacio
- Fundación MEDINA, Centro de Excelencia en Investigación de Medicamentos Innovadores en Andalucía, 18016 Granada, Spain; (A.M.-B.); (C.D.); (F.V.); (O.G.); (J.P.d.P.)
| | - Octavio Caba
- Institute of Biopathology and Regenerative Medicine (IBIMER), Center of Biomedical Research (CIBM), University of Granada, 18016 Granada, Spain; (C.M.); (O.C.)
- Instituto Biosanitario de Granada (ibs. GRANADA), 18016 Granada, Spain
| |
|
19
|
de Anda-Jáuregui G, Hernández-Lemus E. Computational Oncology in the Multi-Omics Era: State of the Art. Front Oncol 2020; 10:423. [PMID: 32318338 PMCID: PMC7154096 DOI: 10.3389/fonc.2020.00423] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 12/01/2019] [Accepted: 03/10/2020] [Indexed: 12/24/2022] Open
Abstract
Cancer is the quintessential complex disease. As technologies evolve faster each day, we are able to quantify the different layers of biological elements that contribute to the emergence and development of malignancies. In this multi-omics context, the use of integrative approaches is mandatory in order to gain further insights into oncological phenomena and to move forward toward the precision medicine paradigm. In this review, we will focus on computational oncology as an integrative discipline that incorporates knowledge from the mathematical, physical, and computational fields to further the biomedical understanding of cancer. We will discuss the current roles of computation in oncology in the context of multi-omic technologies, which include: data acquisition and processing; data management in the clinical and research settings; classification, diagnosis, and prognosis; and the development of models in the research setting, including their use for therapeutic target identification. We will discuss machine learning and network approaches as two of the most promising emerging paradigms in computational oncology. These approaches provide a foundation for integrating different layers of biological description into coherent frameworks that allow advances in both the basic and clinical settings.
Affiliation(s)
- Guillermo de Anda-Jáuregui
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Cátedras Conacyt Para Jóvenes Investigadores, National Council on Science and Technology, Mexico City, Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
| |
|
20
|
Seal DB, Das V, Goswami S, De RK. Estimating gene expression from DNA methylation and copy number variation: A deep learning regression model for multi-omics integration. Genomics 2020; 112:2833-2841. [PMID: 32234433 DOI: 10.1016/j.ygeno.2020.03.021] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/28/2019] [Revised: 03/17/2020] [Accepted: 03/22/2020] [Indexed: 12/21/2022]
Abstract
Gene expression analysis plays a significant role in providing molecular insights into cancer. Various genetic and epigenetic factors (studied under multi-omics) affect gene expression, giving rise to cancer phenotypes. Recent growth in the understanding of multi-omics seems to provide a resource for integration in interdisciplinary biology, since together these data can draw a comprehensive picture of an organism's developmental and disease biology in cancers. Such large-scale multi-omics data can be obtained from public consortia such as The Cancer Genome Atlas (TCGA) and several other platforms. Integrating these multi-omics data from varied platforms is still challenging due to the high noise and sensitivity of the platforms used. Currently, a robust integrative predictive model to estimate gene expression from these genetic and epigenetic data is lacking. In this study, we have developed a deep learning-based predictive model using a Deep Denoising Auto-encoder (DDAE) and a Multi-layer Perceptron (MLP) that can quantitatively capture how genetic and epigenetic alterations correlate with the directionality of gene expression for liver hepatocellular carcinoma (LIHC). The DDAE used in the study has been trained to extract significant features from the input omics data to estimate the gene expression. These features have then been used for back-propagation learning by the multilayer perceptron for the tasks of regression and classification. We have benchmarked the proposed model against state-of-the-art regression models. Finally, the deep learning-based integration model has been evaluated for its disease classification capability, where an accuracy of 95.1% has been obtained.
Affiliation(s)
- Dibyendu Bikash Seal
- A. K. Choudhury School of Information Technology, University of Calcutta, JD-2, Sector III, Salt Lake City, Kolkata 700106, India
| | - Vivek Das
- Novo Nordisk Research Center Seattle, Inc., 530 Fairview Ave N # 5000, Seattle, WA 98109, United States
| | - Saptarsi Goswami
- Bangabasi Morning College, 35 Rajkumar Chakraborty Sarani, Scott Ln, Kolkata 700009, India
| | - Rajat K De
- Machine Intelligence Unit, Indian Statistical Institute, 203 Barrackpore Trunk Road, Kolkata 700108, India.
| |
|
21
|
Zhao L, Lee VHF, Ng MK, Yan H, Bijlsma MF. Molecular subtyping of cancer: current status and moving toward clinical applications. Brief Bioinform 2020; 20:572-584. [PMID: 29659698 DOI: 10.1093/bib/bby026] [Citation(s) in RCA: 79] [Impact Index Per Article: 19.8] [Received: 11/15/2017] [Revised: 03/01/2018] [Indexed: 12/14/2022] Open
Abstract
Cancer is a collection of genetic diseases, with large phenotypic differences and genetic heterogeneity between different types of cancers and even within the same cancer type. Recent advances in genome-wide profiling provide an opportunity to investigate global molecular changes during the development and progression of cancer. Meanwhile, numerous statistical and machine learning algorithms have been designed for the processing and interpretation of high-throughput molecular data. Molecular subtyping studies have allowed the allocation of cancer into homogeneous groups that are considered to harbor similar molecular and clinical characteristics. Furthermore, this has helped researchers to identify both actionable targets for drug design as well as biomarkers for response prediction. In this review, we introduce five frequently applied techniques for generating molecular data, which are microarray, RNA sequencing, quantitative polymerase chain reaction, NanoString and tissue microarray. Commonly used molecular data for cancer subtyping and clinical applications are discussed. Next, we summarize a workflow for molecular subtyping of cancer, including data preprocessing, cluster analysis, supervised classification and subtype characterizations. Finally, we identify and describe four major challenges in the molecular subtyping of cancer that may preclude clinical implementation. We suggest that standardized methods should be established to help identify intrinsic subgroup signatures and build robust classifiers that pave the way toward stratified treatment of cancer patients.
Affiliation(s)
- Lan Zhao
- Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong
| | - Victor H F Lee
- Department of Clinical Oncology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Pokfulam, Hong Kong
| | - Michael K Ng
- Centre for Mathematical Imaging and Vision and Department of Mathematics, Hong Kong Baptist University, Kowloon Tong, Hong Kong
| | - Hong Yan
- Department of Electronic Engineering, City University of Hong Kong, Kowloon, Hong Kong
| | - Maarten F Bijlsma
- Laboratory for Experimental Oncology and Radiobiology, Center for Experimental and Molecular Medicine, Cancer Center Amsterdam and Academic Medical Center, Amsterdam, The Netherlands
| |
|
22
|
FABP4 and MMP9 levels identified as predictive factors for poor prognosis in patients with nonalcoholic fatty liver using data mining approaches and gene expression analysis. Sci Rep 2019; 9:19785. [PMID: 31874999 PMCID: PMC6930227 DOI: 10.1038/s41598-019-56235-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 07/04/2019] [Accepted: 12/07/2019] [Indexed: 02/06/2023] Open
Abstract
Nonalcoholic fatty liver (NAFL) may progress to nonalcoholic steatohepatitis (NASH) and ultimately to cirrhosis and hepatocellular carcinoma (HCC). Prognostic markers for these conditions are poorly defined. The aim of this study was to identify predictive gene markers for the transition from NAFL to NASH and then to poorer conditions. Gene Expression Omnibus datasets associated with a prediction analysis algorithm were used to create a matrix composed of control subjects (n = 52), healthy obese subjects (n = 51), obese subjects with NAFL (n = 42) and NASH patients (n = 37) and 19,085 genes, in order to identify specific genes predictive of the transition from steatosis to NASH and from NASH to cirrhosis and HCC, and thus patients at high risk of complications. A validation cohort was used to validate these results. We identified two genes, fatty acid binding protein-4 (FABP4) and matrix metalloproteinase-9 (MMP9), which respectively allowed distinguishing patients at risk of progression from NAFL to NASH and from NASH to cirrhosis and HCC. Thus, NAFL patients expressing high hepatic levels of FABP4 and NASH patients expressing high hepatic levels of MMP9 are likely to experience disease progression. Therefore, using FABP4 and MMP9 as blood markers could help to predict poor outcomes and/or progression of NAFL during clinical trial follow-up.
|
23
|
Integrative Deep Learning for Identifying Differentially Expressed (DE) Biomarkers. Comput Math Methods Med 2019; 2019:8418760. [PMID: 31915462 PMCID: PMC6935456 DOI: 10.1155/2019/8418760] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/05/2019] [Revised: 06/19/2019] [Accepted: 08/04/2019] [Indexed: 11/17/2022]
Abstract
As large amounts of genetic data accumulate, effective analytical methods and meaningful interpretation are required. Recently, various machine learning methods have emerged to process genetic data. In addition, machine learning analysis tools using statistical models have been proposed. In this study, we propose adding an integrated layer to the deep learning structure, which would enable the effective analysis of genetic data and the discovery of significant biomarkers of diseases. We conducted a simulation study in order to compare the proposed method with meta-logistic regression and meta-SVM methods. The objective function with lasso penalty is used for parameter estimation, and the Youden J index is used for model comparison. The simulation results indicate that the proposed method is more robust to the variance of the data than the meta-logistic regression and meta-SVM methods. We also analyzed real data (TCGA breast cancer data). Based on the results of gene set enrichment analysis, we found that the TCGA multiple omics data involve significantly enriched pathways that contain information related to breast cancer. Therefore, it is expected that the proposed method will be helpful for discovering biomarkers.
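The Youden J index used for model comparison in this abstract is defined as sensitivity plus specificity minus one. A minimal sketch follows; the function name and the toy labels are illustrative and do not come from the study.

```python
def youden_j(y_true, y_pred):
    """Youden J index = sensitivity + specificity - 1 for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0

# Made-up predictions: 2 of 3 positives and 3 of 4 negatives are correct.
truth = [1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 0, 0, 0, 0, 1]
j_value = youden_j(truth, preds)
```

J ranges from -1 to 1; a classifier no better than chance scores 0 and a perfect one scores 1, which makes the index convenient for ranking competing models on the same data.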
|
24
|
Shengyun D, Yuqi W, Fei W, Xiaodan M, Jiayu Z. A proposed protocol based on integrative metabonomics analysis for the rapid detection and mechanistic understanding of sulfur fumigation of Chinese herbal medicines. RSC Adv 2019; 9:31150-31161. [PMID: 35529375 PMCID: PMC9072333 DOI: 10.1039/c9ra05032a] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/03/2019] [Accepted: 09/12/2019] [Indexed: 01/24/2023] Open
Abstract
In the current work, Lonicera japonica Flos (FLJ) was selected as a model Chinese herbal medicine (CHM) and a protocol was proposed for the rapid detection of sulfur-fumigated (SF) CHMs. A multiple metabonomics analysis was conducted using HPLC, NIR spectroscopy and a UHPLC-LTQ-Orbitrap mass spectrometer. First, the group discriminatory potential of each technique was respectively investigated based on PCA. Then, the effect of mid-level metabonomics data fusion on sample spatial distribution was evaluated based on data obtained using the above three technologies. Furthermore, based on the acquired HRMS data, 76 markers discriminating SF from non-sulfur-fumigated (NSF) CHMs were observed and 49 of them were eventually characterized. Moreover, NIR absorptions of 18 sulfur-containing markers were identified to be in close correlation with the discriminatory NIR wavebands. In conclusion, the proposed protocol based on integrative metabonomics analysis that we established for the rapid detection and mechanistic explanation of the sulfur fumigation of CHMs was able to achieve variable selection, enhance group separation and reveal the intrinsic mechanism of the sulfur fumigation of CHMs.
Affiliation(s)
- Dai Shengyun
  - School of Chinese Pharmacy, Beijing University of Chinese Medicine Beijing 102488 China
  - National Institute of Food and Drug Control Beijing 100050 China
- Wang Yuqi
  - School of Chinese Pharmacy, Beijing University of Chinese Medicine Beijing 102488 China
- Wang Fei
  - School of Chinese Pharmacy, Beijing University of Chinese Medicine Beijing 102488 China
  - Department of Pharmacy, People Hospital of Peking University Beijing 100044 China
- Mei Xiaodan
  - School of Chinese Pharmacy, Beijing University of Chinese Medicine Beijing 102488 China
- Zhang Jiayu
  - Beijing Research Institute of Chinese Medicine, Beijing University of Chinese Medicine Beijing 100029 China
  - School of Pharmacy, Binzhou Medical University Yantai 264003 China
25
Discovery of the Oncogenic Parp1, a Target of bcr-abl and a Potential Therapeutic, in mir-181a/PPFIA1 Signaling Pathway. MOLECULAR THERAPY. NUCLEIC ACIDS 2019; 16:1-14. [PMID: 30825668 PMCID: PMC6393709 DOI: 10.1016/j.omtn.2019.01.015] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 04/01/2018] [Revised: 01/26/2019] [Accepted: 01/30/2019] [Indexed: 02/06/2023]
Abstract
miR-181a is downregulated in leukemia and affects its progression, drug resistance, and prognosis. However, the exact mechanism of its targets in leukemia, particularly in chronic myelogenous leukemia (CML), has not previously been established. Here, we use a multi-omics approach to demonstrate that protein tyrosine phosphatase, receptor type, f polypeptide, leukocyte common antigen (LAR) interacting protein (liprin), alpha 1 (PPFIA1) is a direct target of miR-181a in CML. A phospho-array assay shows that multiple phosphorylated proteins, particularly KIT signaling molecules, were downregulated upon PPFIA1 inhibition. Additionally, PPFIA1 bound PARP1, a common molecule downstream of both PPFIA1 and BCR/ABL, to upregulate KIT protein through activation of nuclear factor kappa B (NF-κB)-P65 expression. Targeted inhibition of PPFIA1 and PARP1 downregulated the c-KIT level, inhibited CML cell growth, and prolonged mouse survival. Overall, we report a critical regulatory miR-181a/PPFIA1/PARP1/NF-κB-P65/KIT axis in CML, and our preclinical study supports targeting PPFIA1 and PARP1 as a potential CML therapy.
26
Mirza B, Wang W, Wang J, Choi H, Chung NC, Ping P. Machine Learning and Integrative Analysis of Biomedical Big Data. Genes (Basel) 2019; 10:E87. [PMID: 30696086 PMCID: PMC6410075 DOI: 10.3390/genes10020087] [Citation(s) in RCA: 162] [Impact Index Per Article: 32.4] [Received: 12/02/2018] [Revised: 01/08/2019] [Accepted: 01/21/2019] [Indexed: 12/11/2022] Open
Abstract
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Affiliation(s)
- Bilal Mirza
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Wei Wang
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Scalable Analytics Institute (ScAi), University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Jie Wang
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Howard Choi
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
- Neo Christopher Chung
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Banacha 2, 02-097 Warsaw, Poland.
- Peipei Ping
  - NIH BD2K Center of Excellence for Biomedical Computing, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Physiology, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Scalable Analytics Institute (ScAi), University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Bioinformatics, University of California Los Angeles, Los Angeles, CA 90095, USA.
  - Department of Medicine (Cardiology), University of California Los Angeles, Los Angeles, CA 90095, USA.
27
Abstract
Several challenges present themselves when discussing current approaches to the prevention or treatment of pancreatic cancer. Up to 45% of the risk of pancreatic cancer is attributed to unknown causes, making effective prevention programs difficult to design. The most common type of pancreatic cancer, pancreatic ductal adenocarcinoma (PDAC), is generally diagnosed at a late stage, leading to a poor prognosis and 5-year survival estimate. PDAC tumors are heterogeneous, leading to many identified cell subtypes within one patient’s primary tumor. This explains why there is a high frequency of tumors that are resistant to standard treatments, leading to high relapse rates. This review will discuss how epigenetic technologies and epigenome-wide association studies have been used to address some of these challenges and the future promises these approaches hold.
Affiliation(s)
- Rahul R Singh
  - Department of Biological Sciences, North Dakota State University, Fargo, ND 58102, USA
- Katie M Reindl
  - Department of Biological Sciences, North Dakota State University, Fargo, ND 58102, USA
- Rick J Jansen
  - Department of Public Health, North Dakota State University, Fargo, ND 58102, USA
  - Biostatistics Core Facility, North Dakota State University, Fargo, ND 58102, USA
  - Center for Immunization Research and Education, North Dakota State University, Fargo, ND 58102, USA
  - Genomics and Bioinformatics Program, North Dakota State University, Fargo, ND 58102, USA
28
Srivastava A, Creek DJ. Discovery and Validation of Clinical Biomarkers of Cancer: A Review Combining Metabolomics and Proteomics. Proteomics 2018; 19:e1700448. [PMID: 30353665 DOI: 10.1002/pmic.201700448] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 06/07/2018] [Revised: 10/11/2018] [Indexed: 12/19/2022]
Abstract
Early detection and diagnosis of cancer can allow timely medical intervention, which greatly improves chances of survival and enhances quality of life. Biomarkers play an important role in assisting clinicians and health care providers in cancer diagnosis and treatment follow-up. In spite of years of research and the discovery of thousands of candidate cancer biomarkers, only a few have transitioned to routine usage in the clinic. This review highlights advances in proteomics technologies that have enabled high rates of discovery of candidate cancer biomarkers and evaluates integration with other omics technologies to improve their progress through to validation and clinical translation. Furthermore, it gauges the role of metabolomics technology in cancer biomarker research and assesses it as a complementary tool in aiding cancer biomarker discovery and validation.
Affiliation(s)
- Anubhav Srivastava
  - Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, Victoria, 3052, Australia
- Darren John Creek
  - Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, Victoria, 3052, Australia
29
Lin J, Wu YJ, Liang X, Ji M, Ying HM, Wang XY, Sun X, Shao CH, Zhan LX, Zhang Y. Network-based integration of mRNA and miRNA profiles reveals new target genes involved in pancreatic cancer. Mol Carcinog 2018; 58:206-218. [PMID: 30294829 DOI: 10.1002/mc.22920] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 03/11/2018] [Revised: 08/31/2018] [Accepted: 10/03/2018] [Indexed: 12/30/2022]
Abstract
Pancreatic cancer is regarded as the most fatal and aggressive malignancy due to its low 5-year survival rate and poor prognosis. Approaches to early diagnosis and treatment are limited, which makes it urgent to identify the complex mechanism of pancreatic oncogenesis. In this study, we used RNA-seq to investigate the transcriptomic (mRNA and miRNA) profiles of pancreatic cancer in paired tumor and normal pancreatic samples from ten patients. More than 1000 differentially expressed genes were identified, nearly half of which were also found to be differentially expressed in the majority of examined patients. Functional enrichment analysis revealed that these genes were significantly enriched in multicellular organismal and metabolic processes, secretion, mineral transport, and intercellular communication. In addition, only 24 differentially expressed miRNAs were found, all of which have been reported to be associated with pancreatic cancer. Furthermore, an integrated miRNA-mRNA interaction network was generated using multiple resources. Based on the calculation of disease correlation scores developed here, several genes present in the largest connected subnetwork, such as albumin, ATPase H+/K+ exchanging alpha polypeptide and carcinoembryonic antigen-related cell adhesion molecule 1, were considered novel genes that play important roles in the development of pancreatic cancer. Overall, our data provide new insights into the key molecular mechanisms underlying pancreatic tumorigenesis.
Affiliation(s)
- Jie Lin
  - Shenzhen Key Laboratory of Marine Bioresources and Ecology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, Guangdong Province, P. R. China
  - Key Laboratory of Nutrition, Metabolism, and Food Safety, Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Yan-Jun Wu
  - Key Laboratory of Nutrition, Metabolism, and Food Safety, Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Xing Liang
  - Department of Pancreatic-Biliary Surgery, Changzheng Hospital, Second Military Medical University, Shanghai, P. R. China
- Meng Ji
  - Department of Pancreatic-Biliary Surgery, Changzheng Hospital, Second Military Medical University, Shanghai, P. R. China
- Hui-Min Ying
  - Department of Endocrinology, Hangzhou Xixi Hospital, Hangzhou, Zhejiang, P. R. China
- Xin-Yu Wang
  - Key Laboratory of Nutrition, Metabolism, and Food Safety, Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Xia Sun
  - Key Laboratory of Nutrition, Metabolism, and Food Safety, Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Cheng-Hao Shao
  - Department of Pancreatic-Biliary Surgery, Changzheng Hospital, Second Military Medical University, Shanghai, P. R. China
- Li-Xing Zhan
  - Key Laboratory of Nutrition, Metabolism, and Food Safety, Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, P. R. China
- Yan Zhang
  - Shenzhen Key Laboratory of Marine Bioresources and Ecology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, Guangdong Province, P. R. China
30
Kim Y, Lee S, Choi S, Jang JY, Park T. Hierarchical structural component modeling of microRNA-mRNA integration analysis. BMC Bioinformatics 2018; 19:75. [PMID: 29745843 PMCID: PMC5998903 DOI: 10.1186/s12859-018-2070-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/25/2022] Open
Abstract
BACKGROUND Identification of multi-markers is one of the most challenging issues in the personalized medicine era. Nowadays, many different types of omics data are generated from the same subject. Although many methods endeavor to identify candidate markers for each type of omics data, few or none can identify multi-markers across data types. RESULTS It is well known that microRNAs affect phenotypes only indirectly, through regulating mRNA expression and/or protein translation. Toward addressing this issue, we suggest a hierarchical structured component analysis of microRNA-mRNA integration ("HisCoM-mimi") model that accounts for this biological relationship, to efficiently study and identify such integrated markers. In simulation studies, HisCoM-mimi showed better performance than the other three methods. Also, in real data analysis, HisCoM-mimi identified more informative miRNA-mRNA integration sets for pancreatic ductal adenocarcinoma (PDAC) diagnosis than the other methods. CONCLUSION As exemplified by an application to pancreatic cancer data, our proposed model effectively identified integrated miRNA/target mRNA pairs as markers for early diagnosis, providing a much broader biological interpretation.
Affiliation(s)
- Yongkang Kim
  - Department of Statistics, Seoul National University, Seoul, Korea
- Sungyoung Lee
  - Interdisciplinary program in Bioinformatics, Seoul National University, Seoul, Korea
- Sungkyoung Choi
  - Interdisciplinary program in Bioinformatics, Seoul National University, Seoul, Korea
- Jin-Young Jang
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul, Korea
  - Interdisciplinary program in Bioinformatics, Seoul National University, Seoul, Korea
31
Park J, Choi Y, Namkung J, Yi SG, Kim H, Yu J, Kim Y, Kwon MS, Kwon W, Oh DY, Kim SW, Jeong SY, Han W, Lee KE, Heo JS, Park JO, Park JK, Kim SC, Kang CM, Lee WJ, Lee S, Han S, Park T, Jang JY, Kim Y. Diagnostic performance enhancement of pancreatic cancer using proteomic multimarker panel. Oncotarget 2017; 8:93117-93130. [PMID: 29190982 PMCID: PMC5696248 DOI: 10.18632/oncotarget.21861] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 07/10/2017] [Accepted: 08/29/2017] [Indexed: 12/15/2022] Open
Abstract
Due to its high mortality rate and asymptomatic nature, early detection rates of pancreatic ductal adenocarcinoma (PDAC) remain poor. We measured 1000 biomarker candidates in 134 clinical plasma samples by multiple reaction monitoring-mass spectrometry (MRM-MS). Differentially abundant proteins were assembled into a multimarker panel from a training set (n=684) and validated in an independent set (n=318) from five centers. The levels of panel proteins were also confirmed by immunoassays. The panel including leucine-rich alpha-2 glycoprotein (LRG1), transthyretin (TTR), and CA19-9 had a sensitivity of 82.5% and a specificity of 92.1%. The triple-marker panel exceeded the diagnostic performance of CA19-9 by more than 10% (AUC(CA19-9) = 0.826, AUC(panel) = 0.931, P < 0.01) in all PDAC samples and by more than 30% (AUC(CA19-9) = 0.520, AUC(panel) = 0.830, P < 0.001) in patients with a normal range of CA19-9 (<37 U/mL). Further, it differentiated PDAC from benign pancreatic disease (AUC(CA19-9) = 0.812, AUC(panel) = 0.892, P < 0.01) and other cancers (AUC(CA19-9) = 0.796, AUC(panel) = 0.899, P < 0.001). Overall, the multimarker panel that we have developed and validated in large-scale samples by MRM-MS and immunoassay has clinical applicability in the early detection of PDAC.
Affiliation(s)
- Jiyoung Park
  - Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea
  - Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Korea
- Yonghwan Choi
  - Immunodiagnostics R&D Team, IVD Business Unit 5, SK Telecom, Seoul, Korea
- Junghyun Namkung
  - Immunodiagnostics R&D Team, IVD Business Unit 5, SK Telecom, Seoul, Korea
- Sung Gon Yi
  - Immunodiagnostics R&D Team, IVD Business Unit 5, SK Telecom, Seoul, Korea
- Hyunsoo Kim
  - Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea
  - Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Korea
- Jiyoung Yu
  - Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea
  - Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Korea
- Yongkang Kim
  - Department of Statistics, Seoul National University, Seoul, Korea
- Min-Seok Kwon
  - Department of Statistics, Seoul National University, Seoul, Korea
- Wooil Kwon
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea
- Do-Youn Oh
  - Department of Internal Medicine and Cancer Research Institute, Seoul National University Hospital, Seoul, Korea
- Sun-Whe Kim
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea
- Seung-Yong Jeong
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea
- Wonshik Han
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea
- Kyu Eun Lee
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea
- Jin Seok Heo
  - Department of Surgery, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Joon Oh Park
  - Internal Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
- Joo Kyung Park
  - Department of Internal Medicine, Seoul National University Hospital Healthcare System Gangnam Center, Seoul, Korea
- Song Cheol Kim
  - Department of Surgery, University of Ulsan College of Medicine and Asan Medical Center, Seoul, Korea
- Chang Moo Kang
  - Department of Surgery, Severance Hospital, Yonsei University College of Medicine, Seoul, Korea
- Woo Jin Lee
  - Center for Liver Cancer, National Cancer Center, Seoul, Korea
- Seungyeoun Lee
  - Department of Mathematics and Statistics, Sejong University, Seoul, Korea
- Sangjo Han
  - Immunodiagnostics R&D Team, IVD Business Unit 5, SK Telecom, Seoul, Korea
- Taesung Park
  - Department of Statistics, Seoul National University, Seoul, Korea
- Jin-Young Jang
  - Department of Surgery and Cancer Research Institute, Seoul National University College of Medicine, Seoul, Korea
- Youngsoo Kim
  - Department of Biomedical Sciences, Seoul National University College of Medicine, Seoul, Korea
  - Department of Biomedical Engineering, Seoul National University College of Medicine, Seoul, Korea
32
Kim S, Jhong JH, Lee J, Koo JY. Meta-analytic support vector machine for integrating multiple omics data. BioData Min 2017; 10:2. [PMID: 28149325 PMCID: PMC5270233 DOI: 10.1186/s13040-017-0126-8] [Citation(s) in RCA: 82] [Impact Index Per Article: 11.7] [Received: 08/01/2016] [Accepted: 01/11/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Of late, high-throughput microarray and sequencing data have been extensively used to monitor biomarkers and biological processes related to many diseases. In this setting, the support vector machine (SVM) has been widely and successfully used for gene selection in many applications. Despite the considerable benefits of SVMs, analysis of a single small- or mid-sized data set inevitably runs into problems of low reproducibility and statistical power. To address this problem, we propose a meta-analytic support vector machine (Meta-SVM) that can accommodate multiple omics data, making it possible to detect consensus genes associated with diseases across studies. RESULTS Experimental studies show that the Meta-SVM is superior to the existing meta-analysis method in detecting true signal genes. In real data applications, diverse omics data of breast cancer (TCGA) and mRNA expression data of lung disease (idiopathic pulmonary fibrosis; IPF) were applied. As a result, we identified gene sets consistently associated with the diseases across studies. In particular, the ascertained gene set of TCGA omics data was found to be significantly enriched in the ABC transporter pathways, which are well known to be critical to the breast cancer mechanism. CONCLUSION The Meta-SVM effectively achieves the purpose of meta-analysis by jointly leveraging multiple omics data, and facilitates identifying potential biomarkers and elucidating the disease process.
Affiliation(s)
- SungHwan Kim
  - Department of Statistics, Korea University, Anam-dong, Seoul, 136-701 South Korea
  - Department of Statistics, Keimyung University, Dalseoku, Daegu, 42601 South Korea
- Jae-Hwan Jhong
  - Department of Statistics, Korea University, Anam-dong, Seoul, 136-701 South Korea
- JungJun Lee
  - Department of Statistics, Korea University, Anam-dong, Seoul, 136-701 South Korea
- Ja-Yong Koo
  - Department of Statistics, Korea University, Anam-dong, Seoul, 136-701 South Korea
33
Kwon MS, Kim Y, Lee S, Namkung J, Yun T, Yi SG, Han S, Kang M, Kim SW, Jang JY, Park T. Erratum to: Integrative analysis of multi-omics data for identifying multi-markers for diagnosing pancreatic cancer. BMC Genomics 2017; 18:88. [PMID: 28093064 PMCID: PMC5238516 DOI: 10.1186/s12864-016-3464-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/22/2016] [Accepted: 12/22/2016] [Indexed: 08/30/2023] Open
Affiliation(s)
- Min-Seok Kwon
  - Interdisciplinary program in Bioinformatics, Seoul National University, Seoul, Korea
- Yongkang Kim
  - Department of Statistics, Seoul National University, Seoul, Korea
- Seungyeoun Lee
  - Department of Mathematics and Statistics, Sejong University, Seoul, Korea
- Junghyun Namkung
  - Immunodiagnostics R&D Team, IVD Business Unit, New Business Division, SK telecom Co, Seongnam, Korea
- Taegyun Yun
  - Immunodiagnostics R&D Team, IVD Business Unit, New Business Division, SK telecom Co, Seongnam, Korea
- Sung Gon Yi
  - Immunodiagnostics R&D Team, IVD Business Unit, New Business Division, SK telecom Co, Seongnam, Korea
- Sangjo Han
  - Immunodiagnostics R&D Team, IVD Business Unit, New Business Division, SK telecom Co, Seongnam, Korea
- Meejoo Kang
  - Department of Surgery, Seoul National University Hospital, Seoul, Korea
- Sun Whe Kim
  - Department of Surgery, Seoul National University Hospital, Seoul, Korea
- Jin-Young Jang
  - Department of Surgery, Seoul National University Hospital, Seoul, Korea
- Taesung Park
  - Interdisciplinary program in Bioinformatics, Seoul National University, Seoul, Korea
  - Department of Statistics, Seoul National University, Seoul, Korea
34
Pei X, Zhu J, Yang R, Tan Z, An M, Shi J, Lubman DM. CD90 and CD24 Co-Expression Is Associated with Pancreatic Intraepithelial Neoplasias. PLoS One 2016; 11:e0158021. [PMID: 27332878 PMCID: PMC4917090 DOI: 10.1371/journal.pone.0158021] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 03/24/2016] [Accepted: 06/08/2016] [Indexed: 12/28/2022] Open
Abstract
Thy-1 (CD90) has been shown to be a potential marker for several different types of cancer. However, reports on CD90 expression in pancreatic intraepithelial neoplasia (PanIN) lesions are still limited, although PanINs are the most important precursor lesion of pancreatic ductal adenocarcinoma (PDAC). Herein, we investigate candidate markers for PanIN lesions by examining the distribution and trend of CD90 and CD24 expression as well as their co-expression in various stages of PanINs. Thirty cases of PanINs, which were confirmed histopathologically and clinically, were used to evaluate protein expression of CD90 and CD24 by immunofluorescence double staining. CD90 was found to be mainly expressed in stroma around lesion ducts while not observed in acini and islets in PanINs. CD90 also showed increased expression in PanIN III compared to PanIN III. CD24 was mainly present in the cytoplasm and membrane of pancreatic ductal epithelia, especially in the apical epithelium of the duct. CD24 had higher expression in PanIN III compared with PanIN IIIIII or PanIN III. CD90 was expressed around CD24 sites, but there was little overlap between cells that expressed each of these proteins. A correlation analysis showed that each of these two proteins has a moderate relationship with PanIN stage. These results suggest that co-expression of CD90 and CD24 may have an important role in the development and progression of PanINs, which is also conducive to early detection and treatment of PDAC.
Affiliation(s)
- Xiucong Pei
  - Department of Surgery, University of Michigan Medical Center, Ann Arbor, Michigan, 48109, United States of America
  - Department of Toxicology, School of Public Health, Shenyang Medical College, Liaoning, 110034, China
- Jianhui Zhu
  - Department of Surgery, University of Michigan Medical Center, Ann Arbor, Michigan, 48109, United States of America
- Rui Yang
  - Department of Surgery, University of Michigan Medical Center, Ann Arbor, Michigan, 48109, United States of America
- Zhijing Tan
  - Department of Surgery, University of Michigan Medical Center, Ann Arbor, Michigan, 48109, United States of America
- Mingrui An
  - Department of Surgery, University of Michigan Medical Center, Ann Arbor, Michigan, 48109, United States of America
- Jiaqi Shi
  - Department of Pathology, University of Michigan School of Medicine, Ann Arbor, Michigan, 48109, United States of America
- David M. Lubman
  - Department of Surgery, University of Michigan Medical Center, Ann Arbor, Michigan, 48109, United States of America
35
Identification of key regulators of pancreatic cancer progression through multidimensional systems-level analysis. Genome Med 2016; 8:38. [PMID: 27137215 PMCID: PMC4853852 DOI: 10.1186/s13073-016-0282-3] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Received: 09/11/2015] [Accepted: 02/19/2016] [Indexed: 01/28/2023] Open
Abstract
BACKGROUND Pancreatic cancer is an aggressive cancer with dismal prognosis, urgently necessitating better biomarkers to improve therapeutic options and early diagnosis. Traditional approaches of biomarker detection that consider only one aspect of the biological continuum like gene expression alone are limited in their scope and lack robustness in identifying the key regulators of the disease. We have adopted a multidimensional approach involving the cross-talk between the omics spaces to identify key regulators of disease progression. METHODS Multidimensional domain-specific disease signatures were obtained using rank-based meta-analysis of individual omics profiles (mRNA, miRNA, DNA methylation) related to pancreatic ductal adenocarcinoma (PDAC). These domain-specific PDAC signatures were integrated to identify genes that were affected across multiple dimensions of omics space in PDAC (genes under multiple regulatory controls, GMCs). To further pin down the regulators of PDAC pathophysiology, a systems-level network was generated from knowledge-based interaction information applied to the above identified GMCs. Key regulators were identified from the GMC network based on network statistics and their functional importance was validated using gene set enrichment analysis and survival analysis. RESULTS Rank-based meta-analysis identified 5391 genes, 109 miRNAs and 2081 methylation-sites significantly differentially expressed in PDAC (false discovery rate ≤ 0.05). Bimodal integration of meta-analysis signatures revealed 1150 and 715 genes regulated by miRNAs and methylation, respectively. Further analysis identified 189 altered genes that are commonly regulated by miRNA and methylation, hence considered GMCs. Systems-level analysis of the scale-free GMCs network identified eight potential key regulator hubs, namely E2F3, HMGA2, RASA1, IRS1, NUAK1, ACTN1, SKI and DLL1, associated with important pathways driving cancer progression. 
Survival analysis on individual key regulators revealed that higher expression of IRS1 and DLL1 and lower expression of HMGA2, ACTN1 and SKI were associated with better survival probabilities. CONCLUSIONS It is evident from the results that our hierarchical systems-level multidimensional analysis approach has been successful in isolating the converging regulatory modules and associated key regulatory molecules that are potential biomarkers for pancreatic cancer progression.
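The integration step in this abstract (genes regulated by both miRNAs and methylation, then hub selection from a network) can be pictured as a set intersection followed by a simple network statistic. The Python sketch below is illustrative only: the gene names and edges are placeholders, and the use of node degree as the hub criterion is an assumption, not the authors' pipeline.

```python
from collections import Counter

# Placeholder gene sets (hypothetical names, not data from the study).
mirna_regulated = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
methylation_regulated = {"GENE_B", "GENE_C", "GENE_D", "GENE_E"}

# Genes under multiple regulatory controls (GMCs): present in both sets.
gmcs = mirna_regulated & methylation_regulated

# Placeholder interaction edges; keep only those linking two GMCs.
edges = [
    ("GENE_B", "GENE_C"),
    ("GENE_B", "GENE_D"),
    ("GENE_B", "GENE_X"),
    ("GENE_C", "GENE_X"),
]
gmc_edges = [(u, v) for u, v in edges if u in gmcs and v in gmcs]

# Node degree within the GMC network, used here as an assumed hub score.
degree = Counter()
for u, v in gmc_edges:
    degree[u] += 1
    degree[v] += 1

# The highest-degree node is reported as the candidate hub.
hubs = [gene for gene, _ in degree.most_common(1)]
```

In practice other network statistics, such as betweenness centrality, could replace degree as the ranking criterion.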