201
|
Wei T, Fa B, Luo C, Johnston L, Zhang Y, Yu Z. An Efficient and Easy-to-Use Network-Based Integrative Method of Multi-Omics Data for Cancer Genes Discovery. Front Genet 2021; 11:613033. [PMID: 33488678 PMCID: PMC7820902 DOI: 10.3389/fgene.2020.613033] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/01/2020] [Accepted: 11/25/2020] [Indexed: 12/25/2022] Open
Abstract
Identifying personalized driver genes is essential for discovering critical biomarkers and developing effective personalized cancer therapies. However, few methods consider weights for different types of mutations or efficiently distinguish driver genes from a larger number of passenger genes. We propose MinNetRank (Minimum used for Network-based Ranking), a new method for prioritizing cancer genes that sets weights for different types of mutations, considers the incoming and outgoing degrees of the interaction network simultaneously, and uses a minimum strategy to integrate multi-omics data. MinNetRank prioritizes cancer genes among multi-omics data for each sample. The sample-specific rankings of genes are then integrated into a population-level ranking. When evaluating the accuracy and robustness of prioritizing driver genes, our method almost always significantly outperforms other methods in terms of precision, F1 score, and partial area under the curve (AUC) on six cancer datasets. Importantly, MinNetRank is efficient in discovering novel driver genes. SP1 is selected as a candidate driver gene only by our method (ranked top three), and SP1 RNA and protein differential expression between tumor and normal samples are statistically significant in liver hepatocellular carcinoma. The top seven genes stratify patients into two subtypes exhibiting statistically significant survival differences in five cancer types. These top seven genes are associated with overall survival, as reported by previous researchers. MinNetRank can be very useful for identifying cancer driver genes, and these biologically relevant marker genes are associated with clinical outcome. The R package of MinNetRank is available at https://github.com/weitinging/MinNetRank.
Affiliation(s)
- Ting Wei
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Botao Fa
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Chengwen Luo
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Luke Johnston
- SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Yue Zhang
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
- Zhangsheng Yu
- Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai, China
|
202
|
Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer. Nat Commun 2021; 12:124. [PMID: 33402734 PMCID: PMC7785750 DOI: 10.1038/s41467-020-20430-7] [Citation(s) in RCA: 73] [Impact Index Per Article: 24.3] [Received: 03/25/2020] [Accepted: 12/02/2020] [Indexed: 01/08/2023] Open
Abstract
High-dimensional multi-omics data are now standard in biology. They can greatly enhance our understanding of biological systems when effectively integrated. To achieve proper integration, joint Dimensionality Reduction (jDR) methods are among the most efficient approaches. However, several jDR methods are available, urging the need for a comprehensive benchmark with practical guidelines. We perform a systematic evaluation of nine representative jDR methods using three complementary benchmarks. First, we evaluate their performances in retrieving ground-truth sample clustering from simulated multi-omics datasets. Second, we use TCGA cancer data to assess their strengths in predicting survival, clinical annotations and known pathways/biological processes. Finally, we assess their classification of multi-omics single-cell data. From these in-depth comparisons, we observe that intNMF performs best in clustering, while MCIA offers an effective behavior across many contexts. The code developed for this benchmark study is implemented in a Jupyter notebook—multi-omics mix (momix)—to foster reproducibility, and support users and future developers. Advances in omics technology have resulted in the generation of multi-view data for cancer samples. Here, the authors compare dimensionality reduction techniques using simulated and TCGA data and identify the features of the methods with superior performance.
|
203
|
"Omics" in traumatic brain injury: novel approaches to a complex disease. Acta Neurochir (Wien) 2021; 163:2581-2594. [PMID: 34273044 PMCID: PMC8357753 DOI: 10.1007/s00701-021-04928-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/15/2021] [Accepted: 06/23/2021] [Indexed: 11/12/2022]
Abstract
BACKGROUND To date, there is neither any pharmacological treatment with efficacy in traumatic brain injury (TBI) nor any method to halt the disease progression. This is due to an incomplete understanding of the vast complexity of the biological cascades and failure to appreciate the diversity of secondary injury mechanisms in TBI. In recent years, techniques for high-throughput characterization and quantification of biological molecules that include genomics, proteomics, and metabolomics have evolved and are referred to as omics. METHODS In this narrative review, we highlight how omics technology can be applied to potentiate diagnostics and prognostication as well as to advance our understanding of injury mechanisms in TBI. RESULTS The omics platforms provide possibilities to study function, dynamics, and alterations of molecular pathways of normal and TBI disease states. Through advanced bioinformatics, large datasets of molecular information from small biological samples can be analyzed in detail and provide valuable knowledge of pathophysiological mechanisms, to include in prognostic modeling when connected to clinically relevant data. In such a complex disease as TBI, omics enables broad categories of studies from gene compositions associated with susceptibility to secondary injury or poor outcome, to potential alterations in metabolites following TBI. CONCLUSION The field of omics in TBI research is rapidly evolving. The recent data and novel methods reviewed herein may form the basis for improved precision medicine approaches, development of pharmacological approaches, and individualization of therapeutic efforts by implementing mathematical "big data" predictive modeling in the near future.
|
204
|
Schmitz U, Monteuuis G, Petrova V, Shah JS, Rasko JE. Computational Methods for Intron Retention Identification and Quantification. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11567-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022] Open
|
205
|
Qin G, Liu Z, Xie L. Multiple Omics Data Integration. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11508-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022] Open
|
206
|
Singh R, Singh PK, Kumar R, Kabir MT, Kamal MA, Rauf A, Albadrani GM, Sayed AA, Mousa SA, Abdel-Daim MM, Uddin MS. Multi-Omics Approach in the Identification of Potential Therapeutic Biomolecule for COVID-19. Front Pharmacol 2021. [PMID: 34054532 DOI: 10.3389/fphar.2021.652335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/06/2023] Open
Abstract
COVID-19 is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It has a disastrous effect on mankind due to the contagious and rapid nature of its spread. Although vaccines for SARS-CoV-2 have been successfully developed, proven, effective, and specific therapeutic molecules for treatment are yet to be identified. The repurposing of existing drugs and recognition of new medicines are continuously in progress. Efforts are being made to single out plant-based novel therapeutic compounds. As a result, some of these biomolecules are in their testing phase. During these efforts, the whole-genome sequencing of SARS-CoV-2 has given the direction to explore the omics systems and approaches to overcome this unprecedented health challenge globally. Genome, proteome, and metagenome sequence analyses have helped characterize the virus, thereby improving understanding of its molecular mechanisms, structure, and disease propagation. The multi-omics approaches offer various tools and strategies for identifying potential therapeutic biomolecules for COVID-19 and exploring the plants producing biomolecules that can be used as biopharmaceutical products. This review explores the available multi-omics approaches and their scope to investigate the therapeutic promises of plant-based biomolecules in treating SARS-CoV-2 infection.
Affiliation(s)
- Rachana Singh
- Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow Campus, Lucknow, India
- Pradhyumna Kumar Singh
- Plant Molecular Biology and Biotechnology Division, Council of Scientific and Industrial Research- National Botanical Research Institute (CSIR-NBRI), Lucknow, India
- Rajnish Kumar
- Amity Institute of Biotechnology, Amity University Uttar Pradesh, Lucknow Campus, Lucknow, India
- Mohammad Amjad Kamal
- West China School of Nursing/Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
- King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
- Enzymoics, Novel Global Community Educational Foundation, Hebersham, NSW, Australia
- Abdur Rauf
- Department of Chemistry, University of Swabi, Khyber Pakhtunkhwa, Pakistan
- Ghadeer M Albadrani
- Department of Biology, College of Science, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
- Amany A Sayed
- Zoology Department, Faculty of Science, Cairo University, Giza, Egypt
- Shaker A Mousa
- Pharmaceutical Research Institute, Albany College of Pharmacy and Health Sciences, Rensselaer, NY, United States
- Mohamed M Abdel-Daim
- Pharmacology Department, Faculty of Veterinary Medicine, Suez Canal University, Ismailia, Egypt
- Md Sahab Uddin
- Department of Pharmacy, Southeast University, Dhaka, Bangladesh
- Pharmakon Neuroscience Research Network, Dhaka, Bangladesh
|
207
|
Sha Q, Lyu J, Zhao M, Li H, Guo M, Sun Q. Multi-Omics Analysis of Diabetic Nephropathy Reveals Potential New Mechanisms and Drug Targets. Front Genet 2020; 11:616435. [PMID: 33362869 PMCID: PMC7759603 DOI: 10.3389/fgene.2020.616435] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/12/2020] [Accepted: 11/23/2020] [Indexed: 12/21/2022] Open
Abstract
Diabetic nephropathy (DN) is one of the most common diabetic complications and is the major cause of end-stage renal disease (ESRD). However, the systematic molecular characterization of DN pathogenesis and progression has not been well understood. To identify the fundamental mediators of the pathogenesis and progression of DN, we performed combined RNA-Seq, proteomics, and metabolomics analyses of both patient-derived kidney biopsy samples and kidneys from an in vivo DN model. As a result, molecular changes of DN include extracellular matrix accumulation, an abnormally activated inflammatory microenvironment, and metabolism disorders, bringing about glomerular sclerosis and tubular interstitial fibrosis. Specifically, further integration analyses identified that linoleic acid metabolism and fatty acid β-oxidation are significantly inhibited during DN pathogenesis and progression; the transporter protein ABCD3, the fatty acyl-CoA activated enzymes ACOX1, ACOX2, and ACOX3, and some corresponding metabolites such as 13′-HODE, stearidonic acid, docosahexaenoic acid, and (±)10(11)-EpDPA were also significantly reduced. Our study thus provides potential molecular mechanisms for DN progression and suggests that targeting the key enzymes or supplying some lipids may be a promising avenue in the treatment of DN, especially advanced-stage DN.
Affiliation(s)
- Qian Sha
- Department of Pharmacy, The Affiliated Hospital of Xuzhou Medical University, Xuzhou, China; Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, China
- Jinxiu Lyu
- Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, China
- Meng Zhao
- Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, China
- Haijuan Li
- Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, China
- Mengzhe Guo
- Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, China
- Qiang Sun
- Jiangsu Key Laboratory of New Drug Research and Clinical Pharmacy, Xuzhou Medical University, Xuzhou, China
|
208
|
Burton-Pimentel KJ, Pimentel G, Hughes M, Michielsen CC, Fatima A, Vionnet N, Afman LA, Roche HM, Brennan L, Ibberson M, Vergères G. Discriminating Dietary Responses by Combining Transcriptomics and Metabolomics Data in Nutrition Intervention Studies. Mol Nutr Food Res 2020; 65:e2000647. [PMID: 33325641 PMCID: PMC8221028 DOI: 10.1002/mnfr.202000647] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/06/2020] [Revised: 11/03/2020] [Indexed: 12/17/2022]
Abstract
Scope Combining different “omics” data types in a single, integrated analysis may better characterize the effects of diet on human health. Methods and results The performance of two data integration tools, similarity network fusion tool (SNFtool) and Data Integration Analysis for Biomarker discovery using Latent variable approaches for “Omics” (DIABLO; MixOmics), in discriminating responses to diet and metabolic phenotypes is investigated by combining transcriptomics and metabolomics datasets from three human intervention studies: a postprandial crossover study testing dairy foods (n = 7; study 1), a postprandial challenge study comparing obese and non‐obese subjects (n = 13; study 2), and an 8‐week parallel intervention study that assessed three diets with variable lipid content on fasting parameters (n = 39; study 3). In study 1, combining datasets using SNF or DIABLO significantly improves sample classification. For studies 2 and 3, the value of SNF integration depends on the dietary groups being compared, while DIABLO discriminates samples well but does not perform better than transcriptomic data alone. Conclusion The integration of associated “omics” datasets can help clarify the subtle signals observed in nutritional interventions. The performance of each integration tool is differently influenced by study design, size of the datasets, and sample size.
Affiliation(s)
- Kathryn J Burton-Pimentel
- Federal Department of Economic Affairs, Education and Research EAER, Agroscope, Schwarzenburgstrasse 161, Bern, 3003, Switzerland
- Grégory Pimentel
- Federal Department of Economic Affairs, Education and Research EAER, Agroscope, Schwarzenburgstrasse 161, Bern, 3003, Switzerland
- Maria Hughes
- UCD Institute of Food and Health, School of Public Health, Physiotherapy, and Sports Science, University College Dublin, Belfield, Dublin 4, D04 C7X2, Ireland; Diabetes Complications Research Centre, Conway Institute of Biomolecular and Biomedical Research, Belfield, Dublin 4, Ireland; Nutrigenomics Research Group, UCD Conway Institute and UCD Institute of Food and Health, School of Public Health, Physiotherapy and Sports Science, Belfield, Dublin 4, D04 V1W8, Ireland
- Charlotte Cjr Michielsen
- Nutrition, Metabolism and Genomics Group, Division of Human Nutrition and Health, Wageningen University and Research, P.O. Box 17, Wageningen, 6700 AA, The Netherlands
- Attia Fatima
- UCD Institute of Food and Health, School of Public Health, Physiotherapy, and Sports Science, University College Dublin, Belfield, Dublin 4, D04 C7X2, Ireland; Nutrigenomics Research Group, UCD Conway Institute and UCD Institute of Food and Health, School of Public Health, Physiotherapy and Sports Science, Belfield, Dublin 4, D04 V1W8, Ireland
- Nathalie Vionnet
- Service of Endocrinology, Diabetes and Metabolism, Lausanne University Hospital, Lausanne, 1011, Switzerland
- Lydia A Afman
- Nutrition, Metabolism and Genomics Group, Division of Human Nutrition and Health, Wageningen University and Research, P.O. Box 17, Wageningen, 6700 AA, The Netherlands
- Helen M Roche
- UCD Institute of Food and Health, School of Public Health, Physiotherapy, and Sports Science, University College Dublin, Belfield, Dublin 4, D04 C7X2, Ireland; Diabetes Complications Research Centre, Conway Institute of Biomolecular and Biomedical Research, Belfield, Dublin 4, Ireland; Nutrigenomics Research Group, UCD Conway Institute and UCD Institute of Food and Health, School of Public Health, Physiotherapy and Sports Science, Belfield, Dublin 4, D04 V1W8, Ireland; Institute for Global Food Security, Queens University Belfast, Belfast, BT7 1NN, United Kingdom
- Lorraine Brennan
- UCD Institute of Food & Health, School of Agriculture and Food Science, University College Dublin, Belfield, Dublin 4, D04 V1W8, Ireland
- Mark Ibberson
- Vital IT, Quartier UNIL-Sorge, Lausanne, 1015, Switzerland; Swiss Institute of Bioinformatics, Quartier UNIL-Sorge, Lausanne, 1015, Switzerland
- Guy Vergères
- Federal Department of Economic Affairs, Education and Research EAER, Agroscope, Schwarzenburgstrasse 161, Bern, 3003, Switzerland
|
209
|
Romanis CS, Pearson LA, Neilan BA. Cyanobacterial blooms in wastewater treatment facilities: Significance and emerging monitoring strategies. J Microbiol Methods 2020; 180:106123. [PMID: 33316292 DOI: 10.1016/j.mimet.2020.106123] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/15/2020] [Revised: 12/06/2020] [Accepted: 12/08/2020] [Indexed: 12/30/2022]
Abstract
Municipal wastewater treatment facilities (WWTFs) are prone to the proliferation of cyanobacterial species which thrive in stable, nutrient-rich environments. Dense cyanobacterial blooms frequently disrupt treatment processes and the supply of recycled water due to their production of extracellular polymeric substances, which hinder microfiltration, and toxins, which pose a health risk to end-users. A variety of methods are employed by water utilities for the identification and monitoring of cyanobacteria and their toxins in WWTFs, including microscopy, flow cytometry, ELISA, chemoanalytical methods, and more recently, molecular methods. Here we review the literature on the occurrence and significance of cyanobacterial blooms in WWTFs and discuss the pros and cons of the various strategies for monitoring these potentially hazardous events. Particular focus is directed towards next-generation metagenomic sequencing technologies for the development of site-specific cyanobacterial bloom management strategies. Long-term multi-omic observations will enable the identification of indicator species and the development of site-specific bloom dynamics models for the mitigation and management of cyanobacterial blooms in WWTFs. While emerging metagenomic tools could potentially provide deep insight into the diversity and flux of problematic cyanobacterial species in these systems, they should be considered a complement to, rather than a replacement of, quantitative chemoanalytical approaches.
Affiliation(s)
- Caitlin S Romanis
- School of Environmental and Life Sciences, University of Newcastle, Newcastle 2308, Australia
- Leanne A Pearson
- School of Environmental and Life Sciences, University of Newcastle, Newcastle 2308, Australia
- Brett A Neilan
- School of Environmental and Life Sciences, University of Newcastle, Newcastle 2308, Australia.
|
210
|
Bokulich NA, Ziemski M, Robeson MS, Kaehler BD. Measuring the microbiome: Best practices for developing and benchmarking microbiomics methods. Comput Struct Biotechnol J 2020; 18:4048-4062. [PMID: 33363701 PMCID: PMC7744638 DOI: 10.1016/j.csbj.2020.11.049] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 09/25/2020] [Revised: 11/27/2020] [Accepted: 11/28/2020] [Indexed: 12/12/2022] Open
Abstract
Microbiomes are integral components of diverse ecosystems, and increasingly recognized for their roles in the health of humans, animals, plants, and other hosts. Given their complexity (both in composition and function), the effective study of microbiomes (microbiomics) relies on the development, optimization, and validation of computational methods for analyzing microbial datasets, such as from marker-gene (e.g., 16S rRNA gene) and metagenome data. This review describes best practices for benchmarking and implementing computational methods (and software) for studying microbiomes, with particular focus on unique characteristics of microbiomes and microbiomics data that should be taken into account when designing and testing microbiomics methods.
Affiliation(s)
- Nicholas A. Bokulich
- Laboratory of Food Systems Biotechnology, Institute of Food, Nutrition, and Health, ETH Zurich, Switzerland
- Michal Ziemski
- Laboratory of Food Systems Biotechnology, Institute of Food, Nutrition, and Health, ETH Zurich, Switzerland
- Michael S. Robeson
- University of Arkansas for Medical Sciences, Department of Biomedical Informatics, Little Rock, AR, USA
|
211
|
Wu Z, Lawrence PJ, Ma A, Zhu J, Xu D, Ma Q. Single-Cell Techniques and Deep Learning in Predicting Drug Response. Trends Pharmacol Sci 2020; 41:1050-1065. [PMID: 33153777 PMCID: PMC7669610 DOI: 10.1016/j.tips.2020.10.004] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/05/2020] [Revised: 10/04/2020] [Accepted: 10/09/2020] [Indexed: 12/19/2022]
Abstract
Rapidly developing single-cell sequencing analyses produce more comprehensive profiles of the genomic, transcriptomic, and epigenomic heterogeneity of tumor subpopulations than do traditional bulk sequencing analyses. Moreover, single-cell techniques allow the response of a tumor to drug exposure to be more thoroughly investigated. Deep learning (DL) models have successfully extracted features from complex bulk sequence data to predict drug responses. We review recent innovations in single-cell technologies and DL-based approaches related to drug sensitivity predictions. We believe that, by using insights from bulk sequence data, deep transfer learning (DTL) can facilitate the use of single-cell data for training superior DL-based drug prediction models.
Affiliation(s)
- Zhenyu Wu
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- Patrick J Lawrence
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- Anjun Ma
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA
- Jian Zhu
- Department of Pathology, The Ohio State University, Columbus, OH 43210, USA
- Dong Xu
- Department of Electrical Engineering and Computer Science, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
- Qin Ma
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43210, USA.
|
212
|
Lv J, Wang J, Shang X, Liu F, Guo S. Survival prediction in patients with colon adenocarcinoma via multi-omics data integration using a deep learning algorithm. Biosci Rep 2020; 40:BSR20201482. [PMID: 33258470 PMCID: PMC7753845 DOI: 10.1042/bsr20201482] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/02/2020] [Revised: 11/25/2020] [Accepted: 11/30/2020] [Indexed: 01/20/2023] Open
Abstract
This study proposed a deep learning (DL) algorithm to predict survival in patients with colon adenocarcinoma (COAD) based on multi-omics integration. The survival-sensitive model was constructed using an autoencoder for DL implementation based on The Cancer Genome Atlas (TCGA) data of patients with COAD. The autoencoder framework was compared to PCA, NMF, t-SNE, and univariable Cox-PH model for identifying survival-related features. The prognostic robustness of the inferred survival risk groups was validated using three independent confirmation cohorts. Differential expression analysis, Pearson's correlation analysis, construction of miRNA-target gene network, and function enrichment analysis were performed. Two risk groups with significant survival differences were identified in TCGA set using the autoencoder-based model (log-rank p-value = 5.51e-07). The autoencoder framework showed superior performance compared to PCA, NMF, t-SNE, and the univariable Cox-PH model based on the C-index, log-rank p-value, and Brier score. The robustness of the classification model was successfully verified in three independent validation sets. There were 1271 differentially expressed genes, 10 differentially expressed miRNAs, and 12 hypermethylated genes between the survival risk groups. Among these, miR-133b and its target genes (GNB4, PTPRZ1, RUNX1T1, EPHA7, GPM6A, BICC1, and ADAMTS5) were used to construct a network. These genes were significantly enriched in ECM-receptor interaction, focal adhesion, PI3K-Akt signaling pathway, and glucose metabolism-related pathways. The risk subgroups obtained through a multi-omics data integration pipeline using the DL algorithm had good robustness. miR-133b and its target genes could be potential diagnostic markers. The results would assist in elucidating the possible pathogenesis of COAD.
Affiliation(s)
- Jiudi Lv
- Department of General Surgery Three, Xinxiang Central Hospital, No. 56 Jinsui Avenue, Xinxiang, Henan 453000, China
- Junjie Wang
- Department of Oncology Medicine Three, Xinxiang Central Hospital, No. 56 Jinsui Avenue, Xinxiang, Henan 453000, China
- Xiujuan Shang
- Department of General Surgery Three, Xinxiang Central Hospital, No. 56 Jinsui Avenue, Xinxiang, Henan 453000, China
- Fangfang Liu
- Department of General Surgery Three, Xinxiang Central Hospital, No. 56 Jinsui Avenue, Xinxiang, Henan 453000, China
- Shixun Guo
- Severe Medical Section, Xinxiang Central Hospital, No. 56 Jinsui Avenue, Xinxiang, Henan 453000, China
|
213
|
A Customizable Analysis Flow in Integrative Multi-Omics. Biomolecules 2020; 10:biom10121606. [PMID: 33260881 PMCID: PMC7760368 DOI: 10.3390/biom10121606] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/30/2020] [Revised: 11/20/2020] [Accepted: 11/23/2020] [Indexed: 12/21/2022] Open
Abstract
The number of researchers using multi-omics is growing. Though still expensive, every year it is cheaper to perform multi-omic studies, often exponentially so. In addition to its increasing accessibility, multi-omics reveals a view of systems biology to an unprecedented depth. Thus, multi-omics can be used to answer a broad range of biological questions in finer resolution than previous methods. We used six omic measurements—four nucleic acid (i.e., genomic, epigenomic, transcriptomics, and metagenomic) and two mass spectrometry (proteomics and metabolomics) based—to highlight an analysis workflow on this type of data, which is often vast. This workflow is not exhaustive of all the omic measurements or analysis methods, but it will provide an experienced or even a novice multi-omic researcher with the tools necessary to analyze their data. This review begins with analyzing a single ome and study design, and then synthesizes best practices in data integration techniques that include machine learning. Furthermore, we delineate methods to validate findings from multi-omic integration. Ultimately, multi-omic integration offers a window into the complexity of molecular interactions and a comprehensive view of systems biology.
|
214
|
Demetrowitsch TJ, Schlicht K, Knappe C, Zimmermann J, Jensen-Kroll J, Pisarevskaja A, Brix F, Brandes J, Geisler C, Marinos G, Sommer F, Schulte DM, Kaleta C, Andersen V, Laudes M, Schwarz K, Waschina S. Precision Nutrition in Chronic Inflammation. Front Immunol 2020; 11:587895. [PMID: 33329569 PMCID: PMC7719806 DOI: 10.3389/fimmu.2020.587895] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/27/2020] [Accepted: 10/22/2020] [Indexed: 12/11/2022] Open
Abstract
The molecular foundation of chronic inflammatory diseases (CIDs) can differ markedly between individuals. As our understanding of the biochemical mechanisms underlying individual disease manifestations and progressions expands, new strategies to adjust treatments to the patient's characteristics will continue to profoundly transform clinical practice. Nutrition has long been recognized as an important determinant of inflammatory disease phenotypes and treatment response. Yet empirical work demonstrating the therapeutic effectiveness of patient-tailored nutrition remains scarce. This is mainly due to the challenges presented by long-term effects of nutrition, variations in inter-individual gastrointestinal microbiota, the multiplicity of human metabolic pathways potentially affected by food ingredients, nutrition behavior, and the complexity of food composition. Historically, these challenges have been addressed in both human studies and experimental model laboratory studies primarily by using individual nutrition data collection in tandem with large-scale biomolecular data acquisition (e.g. genomics, metabolomics, etc.). This review highlights recent findings in the field of precision nutrition and their potential implications for the development of personalized treatment strategies for CIDs. It emphasizes the importance of computational approaches to integrate nutritional information into multi-omics data analysis and to predict which molecular mechanisms may explain how nutrients intersect with disease pathways. We conclude that recent findings point towards the unexhausted potential of nutrition as part of personalized medicine in chronic inflammation.
Affiliation(s)
- Tobias J. Demetrowitsch
  - Division of Food Technology, Institute of Human Nutrition and Food Science, Kiel University, Kiel, Germany
- Kristina Schlicht
  - Division of Endocrinology, Diabetes and Clinical Nutrition, Department of Medicine 1, Kiel University, Kiel, Germany
- Carina Knappe
  - Division of Endocrinology, Diabetes and Clinical Nutrition, Department of Medicine 1, Kiel University, Kiel, Germany
- Johannes Zimmermann
  - Research Group Medical Systems Biology, Institute of Experimental Medicine, Kiel University, Kiel, Germany
- Julia Jensen-Kroll
  - Division of Food Technology, Institute of Human Nutrition and Food Science, Kiel University, Kiel, Germany
- Alina Pisarevskaja
  - Division of Food Technology, Institute of Human Nutrition and Food Science, Kiel University, Kiel, Germany
  - Division of Nutriinformatics, Institute of Human Nutrition and Food Science, Kiel University, Kiel, Germany
- Fynn Brix
  - Division of Food Technology, Institute of Human Nutrition and Food Science, Kiel University, Kiel, Germany
- Juliane Brandes
  - Division of Endocrinology, Diabetes and Clinical Nutrition, Department of Medicine 1, Kiel University, Kiel, Germany
- Corinna Geisler
  - Division of Endocrinology, Diabetes and Clinical Nutrition, Department of Medicine 1, Kiel University, Kiel, Germany
- Georgios Marinos
  - Research Group Medical Systems Biology, Institute of Experimental Medicine, Kiel University, Kiel, Germany
- Felix Sommer
  - Institute of Clinical Molecular Biology (IKMB), Kiel University, Kiel, Germany
- Dominik M. Schulte
  - Division of Endocrinology, Diabetes and Clinical Nutrition, Department of Medicine 1, Kiel University, Kiel, Germany
- Christoph Kaleta
  - Research Group Medical Systems Biology, Institute of Experimental Medicine, Kiel University, Kiel, Germany
- Vibeke Andersen
  - Institute of Regional Research, University of Southern Denmark, Odense, Denmark
  - Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
  - Focused Research Unit for Molecular Diagnostic and Clinical Research, University Hospital of Southern Denmark, Aabenraa, Denmark
- Matthias Laudes
  - Division of Endocrinology, Diabetes and Clinical Nutrition, Department of Medicine 1, Kiel University, Kiel, Germany
- Karin Schwarz
  - Division of Food Technology, Institute of Human Nutrition and Food Science, Kiel University, Kiel, Germany
- Silvio Waschina
  - Division of Nutriinformatics, Institute of Human Nutrition and Food Science, Kiel University, Kiel, Germany
215
Zhang WH, Wang WQ, Han X, Gao HL, Li TJ, Xu SS, Li S, Xu HX, Li H, Ye LY, Lin X, Wu CT, Long J, Yu XJ, Liu L. Advances on diagnostic biomarkers of pancreatic ductal adenocarcinoma: A systems biology perspective. Comput Struct Biotechnol J 2020; 18:3606-3614. [PMID: 33304458 PMCID: PMC7710502 DOI: 10.1016/j.csbj.2020.11.018] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/31/2020] [Revised: 11/08/2020] [Accepted: 11/10/2020] [Indexed: 12/26/2022] Open
Abstract
Pancreatic ductal adenocarcinoma (PDAC) is a lethal malignancy that is usually diagnosed at an advanced stage when curative surgery is no longer an option. Robust diagnostic biomarkers with high sensitivity and specificity for early detection are urgently needed. Systems biology provides a powerful tool for understanding diseases and solving challenging biological problems, allowing biomarkers to be identified and quantified with increasing accuracy, sensitivity, and comprehensiveness. Here, we present a comprehensive overview of efforts to identify biomarkers of PDAC using genomics, transcriptomics, proteomics, metabonomics, and bioinformatics. A systems biology perspective provides a crucial “network” for integrating multi-omics approaches to biomarker identification, shedding additional light on early PDAC detection.
Affiliation(s)
- Wu-Hu Zhang
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Wen-Quan Wang
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Xuan Han
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- He-Li Gao
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Tian-Jiao Li
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Shuai-Shuai Xu
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Shuo Li
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Hua-Xiang Xu
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Hao Li
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Long-Yun Ye
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Xuan Lin
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Chun-Tao Wu
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Jiang Long
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Xian-Jun Yu
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
- Liang Liu
  - Department of Pancreatic Surgery, Fudan University Shanghai Cancer Center, Shanghai, China; Department of Oncology, Shanghai Medical College, Fudan University, Shanghai, China; Shanghai Pancreatic Cancer Institute, Shanghai, China; Pancreatic Cancer Institute, Fudan University, Shanghai, China
216
Chitoiu L, Dobranici A, Gherghiceanu M, Dinescu S, Costache M. Multi-Omics Data Integration in Extracellular Vesicle Biology-Utopia or Future Reality? Int J Mol Sci 2020; 21:ijms21228550. [PMID: 33202771 PMCID: PMC7697477 DOI: 10.3390/ijms21228550] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 10/15/2020] [Revised: 11/10/2020] [Accepted: 11/11/2020] [Indexed: 12/15/2022] Open
Abstract
Extracellular vesicles (EVs) are membranous structures derived from the endosomal system or generated by plasma membrane shedding. Due to their composition of DNA, RNA, proteins, and lipids, EVs have garnered a lot of attention as an essential mechanism of cell-to-cell communication, with various implications in physiological and pathological processes. EVs are not only a highly heterogeneous population by means of size and biogenesis, but they are also a source of diverse, functionally rich biomolecules. Recent advances in high-throughput processing of biological samples have facilitated the development of databases comprised of characteristic genomic, transcriptomic, proteomic, metabolomic, and lipidomic profiles for EV cargo. Despite the in-depth approach used to map functional molecules in EV-mediated cellular cross-talk, few integrative methods have been applied to analyze the molecular interplay in these targeted delivery systems. New perspectives arise from the field of systems biology, where accounting for heterogeneity may lead to finding patterns in an apparently random pool of data. In this review, we map the biological and methodological causes of heterogeneity in EV multi-omics data and present current applications or possible statistical methods for integrating such data while keeping track of the current bottlenecks in the field.
Affiliation(s)
- Leona Chitoiu
  - Ultrastructural Pathology and Bioimaging Laboratory, ‘Victor Babeș’ National Institute of Pathology, Bucharest 050096, Romania
- Alexandra Dobranici
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest 050095, Romania
- Mihaela Gherghiceanu
  - Ultrastructural Pathology and Bioimaging Laboratory, ‘Victor Babeș’ National Institute of Pathology, Bucharest 050096, Romania
  - Department of Cellular, Molecular Biology and Histology, ‘Carol Davila’ University of Medicine and Pharmacy, Bucharest 050474, Romania
- Sorina Dinescu
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest 050095, Romania
  - Research Institute of the University of Bucharest, University of Bucharest, Bucharest 050663, Romania
- Marieta Costache
  - Department of Biochemistry and Molecular Biology, University of Bucharest, Bucharest 050095, Romania
  - Research Institute of the University of Bucharest, University of Bucharest, Bucharest 050663, Romania
217
Edison AS, Colonna M, Gouveia GJ, Holderman NR, Judge MT, Shen X, Zhang S. NMR: Unique Strengths That Enhance Modern Metabolomics Research. Anal Chem 2020; 93:478-499. [DOI: 10.1021/acs.analchem.0c04414] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 12/16/2022]
218
Labory J, Fierville M, Ait-El-Mkadem S, Bannwarth S, Paquis-Flucklinger V, Bottini S. Multi-Omics Approaches to Improve Mitochondrial Disease Diagnosis: Challenges, Advances, and Perspectives. Front Mol Biosci 2020; 7:590842. [PMID: 33240932 PMCID: PMC7667268 DOI: 10.3389/fmolb.2020.590842] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/03/2020] [Accepted: 10/14/2020] [Indexed: 01/06/2023] Open
Abstract
Mitochondrial diseases (MD) are rare disorders caused by deficiency of the mitochondrial respiratory chain, which provides energy in each cell. They are characterized by high clinical and genetic heterogeneity, and in most patients the responsible gene is unknown. Diagnosis is based on the identification of the causative gene, which allows genetic counseling, prenatal diagnosis, understanding of pathological mechanisms, and personalized therapeutic approaches. Despite the emergence of Next Generation Sequencing (NGS), to date more than one out of two patients remains undiagnosed because the responsible gene has not been identified. The technologies currently used for detecting causal variants (genetic alterations) are far from complete: they leave many variants of unknown significance (VUS) and rely mainly on whole exome sequencing, thus neglecting non-coding variants. The complexity of the human genome and its regulation at multiple levels has led biologists to develop several assays to interrogate different aspects of biological processes. While a one-dimensional, single-omics investigation offers a peek into this complex system, the combination of different omics data allows the discovery of coherent signatures. To integrate data from different omics, the community of computational biologists and bioinformaticians has developed several approaches and tools. However, it is difficult to know which is best suited to predicting diverse phenotypic outcomes. First attempts to use multi-omics approaches showed an improvement in diagnostic power. However, we are far from a complete understanding of MD and their diagnosis. After reviewing multi-omics algorithms developed in recent years, we propose a novel data-driven classification and discuss how multi-omics will change and improve the diagnosis of MD. Given the growing use of multi-omics approaches in MD, we foresee that this work will help establish good practices for multi-omics data integration to improve the prediction of phenotypic outcomes and the diagnostic power for MD.
Affiliation(s)
- Justine Labory
  - Université Côte d’Azur, Center of Modeling, Simulation and Interactions, Nice, France
- Morgane Fierville
  - Université Côte d’Azur, Center of Modeling, Simulation and Interactions, Nice, France
- Samira Ait-El-Mkadem
  - Université Côte d’Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre hospitalier universitaire (CHU) de Nice, Nice, France
- Sylvie Bannwarth
  - Université Côte d’Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre hospitalier universitaire (CHU) de Nice, Nice, France
- Véronique Paquis-Flucklinger
  - Université Côte d’Azur, Center of Modeling, Simulation and Interactions, Nice, France
  - Université Côte d’Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre hospitalier universitaire (CHU) de Nice, Nice, France
- Silvia Bottini
  - Université Côte d’Azur, Center of Modeling, Simulation and Interactions, Nice, France
219
Rodosthenous T, Shahrezaei V, Evangelou M. Integrating multi-OMICS data through sparse canonical correlation analysis for the prediction of complex traits: a comparison study. Bioinformatics 2020; 36:4616-4625. [PMID: 32437529 PMCID: PMC7750936 DOI: 10.1093/bioinformatics/btaa530] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 12/05/2019] [Revised: 04/22/2020] [Accepted: 05/16/2020] [Indexed: 01/08/2023] Open
Abstract
Motivation: Recent developments in technology have enabled researchers to collect multiple OMICS datasets for the same individuals. The conventional approach for understanding the relationships between the collected datasets and the complex trait of interest is to analyse each OMICS dataset separately from the rest, or to test for associations between the OMICS datasets. In this work we show that integrating multiple OMICS datasets, instead of analysing them separately, improves our understanding of the relationships between them as well as the predictive accuracy for the tested trait. Several approaches have been proposed for the integration of heterogeneous and high-dimensional (p≫n) data, such as OMICS. The sparse variant of canonical correlation analysis (CCA) is a promising one: it penalizes the canonical variables to produce sparse latent variables while achieving maximal correlation between the datasets. In recent years, a number of approaches for implementing sparse CCA (sCCA) have been proposed; they differ in their objective functions and in the iterative algorithms used to obtain the sparse latent variables, and they make different assumptions about the original datasets.
Results: Through a comparative study we have explored the performance of the conventional CCA proposed by Parkhomenko et al., the penalized matrix decomposition CCA proposed by Witten and Tibshirani, and its extension proposed by Suo et al. These methods were modified to allow for different penalty functions. Although sCCA is an unsupervised learning approach for understanding the relationships between datasets, we have recast the problem as a supervised learning one and investigated how the computed latent variables can be used for predicting complex traits. The approaches were also extended to allow for multiple (more than two) datasets, where the trait was included as one of the input datasets. Both approaches have shown improvement over conventional predictive models that include one or multiple datasets.
Availability and implementation: https://github.com/theorod93/sCCA.
Supplementary information: Supplementary data are available at Bioinformatics online.
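The alternating, penalized update that sparse CCA methods share can be sketched in a few lines. This is a minimal illustration and not the authors' implementation (their code is at the URL above). The function names, the toy data generator, the fixed-fraction soft threshold, and the starting vector are all assumptions made for the example; the published methods tune the penalty differently.

```python
import math
import random


def soft_threshold(a, frac):
    """Shrink every entry toward zero by frac * max|a| (assumed penalty rule)."""
    t = frac * max(abs(x) for x in a)
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in a]


def unit_norm(a):
    """Scale a vector to Euclidean length 1 (all-zero vectors are returned as is)."""
    norm = math.sqrt(sum(x * x for x in a))
    return [x / norm for x in a] if norm > 0 else list(a)


def standardized_columns(rows):
    """Return the columns of a samples-by-features table, centred and scaled.

    Assumes no column is constant (its standard deviation would be zero).
    """
    n = len(rows)
    cols = []
    for col in zip(*rows):
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        cols.append([(x - mean) / sd for x in col])
    return cols


def sparse_cca_rank1(X, Y, frac_u=0.5, frac_v=0.5, n_iter=50):
    """Alternate sparse updates of the weight vectors u (for X) and v (for Y)."""
    n = len(X)
    x_cols, y_cols = standardized_columns(X), standardized_columns(Y)
    # C[i][j] is the sample correlation between feature i of X and feature j of Y.
    C = [[sum(a * b for a, b in zip(xc, yc)) / n for yc in y_cols] for xc in x_cols]
    p, q = len(C), len(C[0])
    # Heuristic start (an assumption): the row of C with the largest norm.
    v = unit_norm(max(C, key=lambda row: sum(c * c for c in row)))
    u = [0.0] * p
    for _ in range(n_iter):
        u = unit_norm(soft_threshold(
            [sum(C[i][j] * v[j] for j in range(q)) for i in range(p)], frac_u))
        v = unit_norm(soft_threshold(
            [sum(C[i][j] * u[i] for i in range(p)) for j in range(q)], frac_v))
    return u, v


def make_toy_data(n=200, seed=0):
    """Hypothetical data: X features 0-1 and Y feature 0 share one latent signal."""
    rng = random.Random(seed)
    X, Y = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        X.append([z + 0.1 * rng.gauss(0, 1), z + 0.1 * rng.gauss(0, 1)]
                 + [rng.gauss(0, 1) for _ in range(4)])
        Y.append([z + 0.1 * rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(4)])
    return X, Y
```

Calling `sparse_cca_rank1(*make_toy_data())` should leave nonzero weights only on the features that carry the shared signal, which is the sparsity property the abstract refers to. A larger `frac_u` or `frac_v` gives sparser weight vectors.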
Affiliation(s)
- Vahid Shahrezaei
  - Department of Mathematics, Imperial College London, London SW7 2AZ, UK
- Marina Evangelou
  - Department of Mathematics, Imperial College London, London SW7 2AZ, UK
220
Low Entropy Sub-Networks Prevent the Integration of Metabolomic and Transcriptomic Data. Entropy 2020; 22:e22111238. [PMID: 33287006 PMCID: PMC7712986 DOI: 10.3390/e22111238] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2020] [Revised: 10/23/2020] [Accepted: 10/27/2020] [Indexed: 02/08/2023]
Abstract
The rapidly growing amount of biological data from many different high-throughput experiments opens up new possibilities for data- and model-driven inference. Alongside these possibilities come risks related to data integration techniques, which are not widely taken into account. In particular, approaches based on flux balance analysis (FBA) are sensitive to the structure of the metabolic network, in which low-entropy clusters can prevent inference of the activity of metabolic reactions. In this article, we set out problems that may arise when integrating metabolomic data with gene expression datasets. We analyze common pitfalls, propose possible solutions, and illustrate them with a case study of renal cell carcinoma (RCC). Using the proposed approach, we provide a metabolic description of the known morphological RCC subtypes and suggest the possible existence of a poor-prognosis cluster of patients, commonly characterized by low activity of the drug-transporting enzymes that are crucial in chemotherapy. This finding fits and extends the already known poor-prognosis characteristics of RCC. Finally, this work also aims to point out the problem that arises from integrating high-throughput data with inherently nonuniform, manually curated low-throughput data. In such cases, over-represented information may overshadow non-trivial discoveries.
221
Song M, Greenbaum J, Luttrell J, Zhou W, Wu C, Shen H, Gong P, Zhang C, Deng HW. A Review of Integrative Imputation for Multi-Omics Datasets. Front Genet 2020; 11:570255. [PMID: 33193667 PMCID: PMC7594632 DOI: 10.3389/fgene.2020.570255] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 06/07/2020] [Accepted: 09/16/2020] [Indexed: 01/05/2023] Open
Abstract
Multi-omics studies, which explore the interactions between multiple types of biological factors, have significant advantages over single-omics analysis for their ability to provide a more holistic view of biological processes, uncover the causal and functional mechanisms for complex diseases, and facilitate new discoveries in precision medicine. However, omics datasets often contain missing values, and in multi-omics study designs it is common for individuals to be represented for some omics layers but not all. Since most statistical analyses cannot be applied directly to the incomplete datasets, imputation is typically performed to infer the missing values. Integrative imputation techniques which make use of the correlations and shared information among multi-omics datasets are expected to outperform approaches that rely on single-omics information alone, resulting in more accurate results for the subsequent downstream analyses. In this review, we provide an overview of the currently available imputation methods for handling missing values in bioinformatics data with an emphasis on multi-omics imputation. In addition, we also provide a perspective on how deep learning methods might be developed for the integrative imputation of multi-omics datasets.
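The idea of borrowing information across omics layers can be illustrated with a small nearest-neighbour sketch. It is not one of the methods surveyed in the review, only an assumed toy scheme: samples that lack a whole layer (marked `None`) are filled with the average profile of the samples closest to them in a second, complete layer. The function name and data layout are hypothetical.

```python
import math


def impute_from_other_layer(layer_a, layer_b, k=3):
    """Fill samples missing from layer_b using neighbours found in layer_a.

    layer_a: complete samples-by-features list of lists (e.g. transcriptomics).
    layer_b: same sample order; a row is None when the sample was not profiled.
    Returns a new list; the inputs are left unchanged.
    """
    observed = [i for i, row in enumerate(layer_b) if row is not None]
    if not observed:
        raise ValueError("layer_b has no observed samples to borrow from")
    completed = []
    for i, row in enumerate(layer_b):
        if row is not None:
            completed.append(list(row))
            continue
        # Rank the samples observed in layer_b by Euclidean distance in layer_a.
        nearest = sorted(observed,
                         key=lambda j: math.dist(layer_a[i], layer_a[j]))[:k]
        n_features = len(layer_b[nearest[0]])
        # Average the neighbours' layer_b profiles feature by feature.
        completed.append([sum(layer_b[j][f] for j in nearest) / len(nearest)
                          for f in range(n_features)])
    return completed
```

A single-omics imputer could only use values within `layer_b` itself; here the neighbourhood is defined by `layer_a`, which is what makes the scheme integrative. The choice of `k` and of Euclidean distance are illustrative defaults, not recommendations from the review.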
Affiliation(s)
- Meng Song
  - School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, United States
- Jonathan Greenbaum
  - Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, United States
- Joseph Luttrell
  - School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, United States
- Weihua Zhou
  - College of Computing, Michigan Technological University, Houghton, MI, United States
- Chong Wu
  - Department of Statistics, Florida State University, Tallahassee, FL, United States
- Hui Shen
  - Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, United States
- Ping Gong
  - Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, United States
- Chaoyang Zhang
  - School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, United States
- Hong-Wen Deng
  - Tulane Center of Biomedical Informatics and Genomics, School of Medicine, Tulane University, New Orleans, LA, United States
222
Xu K, Aouizerat BE. Searching for Genomic Biomarkers for Major Depressive Disorder in Peripheral Immune Cells. Biol Psychiatry 2020; 88:591-593. [PMID: 32972513 DOI: 10.1016/j.biopsych.2020.07.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2020] [Revised: 07/28/2020] [Accepted: 07/29/2020] [Indexed: 11/29/2022]
Affiliation(s)
- Ke Xu
  - Department of Psychiatry, Yale School of Medicine, New Haven, Connecticut; Connecticut Veteran Healthcare System, West Haven, Connecticut
- Bradley E Aouizerat
  - Bluestone Center for Clinical Research, College of Dentistry, New York University, New York, New York; Department of Oral and Maxillofacial Surgery, College of Dentistry, New York University, New York, New York
223
Statistical and Machine-Learning Analyses in Nutritional Genomics Studies. Nutrients 2020; 12:nu12103140. [PMID: 33066636 PMCID: PMC7602401 DOI: 10.3390/nu12103140] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/11/2020] [Revised: 10/08/2020] [Accepted: 10/10/2020] [Indexed: 12/18/2022] Open
Abstract
Nutritional compounds may have an influence on different OMICs levels, including genomics, epigenomics, transcriptomics, proteomics, metabolomics, and metagenomics. The integration of OMICs data is challenging but may provide new knowledge to explain the mechanisms involved in the metabolism of nutrients and diseases. Traditional statistical analyses play an important role in description and data association; however, these statistical procedures are not sufficiently powered to interpret large integrated multiple-OMICs (multi-OMICs) datasets. Machine learning (ML) approaches can play a major role in the interpretation of multi-OMICs data in nutrition research. Specifically, ML can be used for data mining, sample clustering, and classification to produce predictive models and algorithms for the integration of multi-OMICs data in response to dietary intake. The objective of this review was to investigate the strategies used for the analysis of multi-OMICs data in nutrition studies. Sixteen recent studies that aimed to understand the association between dietary intake and multi-OMICs data are summarized. Multivariate analysis is the more commonly used approach in multi-OMICs nutrition studies. Overall, as nutrition research incorporates multi-OMICs data, novel analysis approaches such as ML need to complement traditional statistical analyses to fully explain the impact of nutrition on health and disease.
224
Seneviratne CJ, Suriyanarayanan T, Widyarman AS, Lee LS, Lau M, Ching J, Delaney C, Ramage G. Multi-omics tools for studying microbial biofilms: current perspectives and future directions. Crit Rev Microbiol 2020; 46:759-778. [PMID: 33030973 DOI: 10.1080/1040841x.2020.1828817] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 02/07/2023]
Abstract
The advent of omics technologies has greatly improved our understanding of microbial biology, particularly in the last two decades. The field of microbial biofilms is, however, relatively new, consolidated in the 1980s. The morphogenic switching by microbes from planktonic to biofilm phenotype confers numerous survival advantages such as resistance to desiccation, antibiotics, biocides, ultraviolet radiation, and host immune responses, thereby complicating treatment strategies for pathogenic microorganisms. Hence, understanding the mechanisms governing the biofilm phenotype can result in efficient treatment strategies directed specifically against molecular markers mediating this process. The application of omics technologies for studying microbial biofilms is relatively less explored and holds great promise in furthering our understanding of biofilm biology. In this review, we provide an overview of the application of omics tools such as transcriptomics, proteomics, and metabolomics as well as multi-omics approaches for studying microbial biofilms in the current literature. We also highlight how the use of omics tools directed at various stages of the biological information flow, from genes to metabolites, can be integrated via multi-omics platforms to provide a holistic view of biofilm biology. Following this, we propose a future artificial intelligence-based multi-omics platform that can predict the pathways associated with different biofilm phenotypes.
Affiliation(s)
- Chaminda J Seneviratne
  - Singapore Oral Microbiomics Initiative (SOMI), National Dental Research Institute Singapore, National Dental Centre, Singapore, Singapore; Duke NUS Medical School, Singapore, Singapore
- Tanujaa Suriyanarayanan
  - Singapore Oral Microbiomics Initiative (SOMI), National Dental Research Institute Singapore, National Dental Centre, Singapore, Singapore; Duke NUS Medical School, Singapore, Singapore
- Armelia Sari Widyarman
  - Department of Microbiology, Faculty of Dentistry, Trisakti University, Grogol, West Jakarta, Indonesia
- Lye Siang Lee
  - Duke-NUS Medical School, Metabolomics Lab, Cardiovascular and Metabolic Disorders, Singapore, Singapore
- Matthew Lau
  - Singapore Oral Microbiomics Initiative (SOMI), National Dental Research Institute Singapore, National Dental Centre, Singapore, Singapore
- Jianhong Ching
  - Duke-NUS Medical School, Metabolomics Lab, Cardiovascular and Metabolic Disorders, Singapore, Singapore
- Christopher Delaney
  - School of Medicine, Dentistry & Nursing, Glasgow Dental Hospital & School, University of Glasgow, Glasgow, UK
- Gordon Ramage
  - School of Medicine, Dentistry & Nursing, Glasgow Dental Hospital & School, University of Glasgow, Glasgow, UK
225
|
Cruickshank IJ, Carley KM. Characterizing communities of hashtag usage on twitter during the 2020 COVID-19 pandemic by multi-view clustering. Applied Network Science 2020; 5:66. [PMID: 32953977 PMCID: PMC7492790 DOI: 10.1007/s41109-020-00317-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/01/2020] [Accepted: 09/08/2020] [Indexed: 05/31/2023]
Abstract
The COVID-19 pandemic has produced a flurry of online activity on social media sites. As such, analysis of social media data during the COVID-19 pandemic can produce unique insights into discussion topics and how those topics evolve over the course of the pandemic. In this study, we propose analyzing discussion topics on Twitter by clustering hashtags. In order to obtain high-quality clusters of the Twitter hashtags, we also propose a novel multi-view clustering technique that incorporates multiple different data types that can be used to describe how users interact with hashtags. The results of our multi-view clustering show that there are distinct temporal and topical trends present within COVID-19 twitter discussion. In particular, we find that some topical clusters of hashtags shift over the course of the pandemic, while others are persistent throughout, and that there are distinct temporal trends in hashtag usage. This study is the first to use multi-view clustering to analyze hashtags and the first analysis of the greater trends of discussion occurring online during the COVID-19 pandemic.
226
Mudadu MDA, Zerlotini A. Machado: Open source genomics data integration framework. Gigascience 2020; 9:giaa097. [PMID: 32930331 PMCID: PMC7490629 DOI: 10.1093/gigascience/giaa097] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/11/2020] [Revised: 07/13/2020] [Accepted: 08/29/2020] [Indexed: 12/04/2022] Open
Abstract
BACKGROUND Genome projects and multiomics experiments generate huge volumes of data that must be stored, mined, and transformed into useful knowledge. All this information is supposed to be accessible and, if possible, browsable afterwards. Computational biologists have been dealing with this scenario for more than a decade and have been implementing software and databases to meet this challenge. The GMOD's (Generic Model Organism Database) biological relational database schema, known as Chado, is one of the few successful open source initiatives; it is widely adopted and many software packages are able to connect to it. FINDINGS We have been developing an open source software package named Machado, a genomics data integration framework implemented in Python, to enable research groups to both store and visualize genomics data. The framework relies on the Chado database schema and, therefore, should be intuitive for current developers to adopt or to run on top of existing databases. It has several data-loading tools for genomics and transcriptomics data and also for annotation results from tools such as BLAST, InterProScan, OrthoMCL, and LSTrAP. There is an API to connect to JBrowse, and a web visualization tool is implemented using Django Views and Templates. The Haystack library integrated with the ElasticSearch engine was used to implement a Google-like search, i.e., a single auto-complete search box that provides fast results and filters. CONCLUSION Machado aims to be a modern object-relational framework that uses the latest Python libraries to produce an effective open source resource for genomics research.
Affiliation(s)
| | - Adhemar Zerlotini
- Embrapa Informática Agropecuária, Campinas, São Paulo, Post Code 13083–886, PO Box 6041, Brazil
| |
|
227
|
Mudadu MDA, Zerlotini A. Machado: Open source genomics data integration framework. Gigascience 2020; 9:5905760. [PMID: 32930331 DOI: 10.1101/2020.05.08.084731] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2020] [Revised: 07/13/2020] [Accepted: 08/29/2020] [Indexed: 05/28/2023] Open
|
228
|
Fu Y, Xu J, Tang Z, Wang L, Yin D, Fan Y, Zhang D, Deng F, Zhang Y, Zhang H, Wang H, Xing W, Yin L, Zhu S, Zhu M, Yu M, Li X, Liu X, Yuan X, Zhao S. A gene prioritization method based on a swine multi-omics knowledgebase and a deep learning model. Commun Biol 2020; 3:502. [PMID: 32913254 PMCID: PMC7483748 DOI: 10.1038/s42003-020-01233-4] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 03/18/2020] [Accepted: 08/07/2020] [Indexed: 12/27/2022] Open
Abstract
The analyses of multi-omics data have revealed candidate genes for objective traits. However, they are integrated poorly, especially in non-model organisms, and they pose a great challenge for prioritizing candidate genes for follow-up experimental verification. Here, we present a general convolutional neural network model that integrates multi-omics information to prioritize the candidate genes of objective traits. By applying this model to Sus scrofa, which is a non-model organism, but one of the most important livestock animals, the model precision was 72.9%, recall 73.5%, and F1-Measure 73.4%, demonstrating a good prediction performance compared with previous studies in Arabidopsis thaliana and Oryza sativa. Additionally, to facilitate the use of the model, we present ISwine (http://iswine.iomics.pro/), which is an online comprehensive knowledgebase in which we incorporated almost all the published swine multi-omics data. Overall, the results suggest that the deep learning strategy will greatly facilitate analyses of multi-omics integration in the future. Yuhua Fu et al. develop a CNN model that integrates multi-omics information to prioritize candidate genes of objective traits. Their model performs well when applied to important livestock non-model animals like Sus scrofa. Finally, the authors present ISwine, an online comprehensive knowledgebase which includes all published swine omics data to facilitate the integration of heterogeneous data.
Affiliation(s)
- Yuhua Fu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China.,School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Jingya Xu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Zhenshuang Tang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Lu Wang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Dong Yin
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Yu Fan
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Dongdong Zhang
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Fei Deng
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Yanping Zhang
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Haohao Zhang
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Haiyan Wang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Wenhui Xing
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China
| | - Lilin Yin
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Shilin Zhu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Mengjin Zhu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Mei Yu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Xinyun Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China
| | - Xiaolei Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China.
| | - Xiaohui Yuan
- School of Computer Science and Technology, Wuhan University of Technology, 430070, Wuhan, Hubei, P.R. China.
| | - Shuhong Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, 430070, Wuhan, Hubei, P.R. China.
| |
|
229
|
Ma A, McDermaid A, Xu J, Chang Y, Ma Q. Integrative Methods and Practical Challenges for Single-Cell Multi-omics. Trends Biotechnol 2020; 38:1007-1022. [PMID: 32818441 PMCID: PMC7442857 DOI: 10.1016/j.tibtech.2020.02.013] [Citation(s) in RCA: 118] [Impact Index Per Article: 29.5] [Received: 12/19/2019] [Revised: 02/27/2020] [Accepted: 02/28/2020] [Indexed: 12/19/2022]
Abstract
Fast-developing single-cell multimodal omics (scMulti-omics) technologies enable the measurement of multiple modalities, such as DNA methylation, chromatin accessibility, RNA expression, protein abundance, gene perturbation, and spatial information, from the same cell. scMulti-omics can comprehensively explore and identify cell characteristics, while also presenting challenges to the development of computational methods and tools for integrative analyses. Here, we review these integrative methods and summarize the existing tools for studying a variety of scMulti-omics data. The various functionalities and practical challenges in using the available tools in the public domain are explored through several case studies. Finally, we identify remaining challenges and future trends in scMulti-omics modeling and analyses.
Affiliation(s)
- Anjun Ma
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43235, USA
| | - Adam McDermaid
- Imagenetics, Sanford Health, Sioux Falls, SD 57104, USA; Department of Internal Medicine, University of South Dakota, Vermillion, SD 57069, USA
| | - Jennifer Xu
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43235, USA; Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
| | - Yuzhou Chang
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43235, USA
| | - Qin Ma
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH 43235, USA.
| |
|
230
|
Ouattara DA, Remolue L, Becker J, Perret M, Bunescu A, Hennig K, Biliaut E, Badin A, Giacomini C, Reynier F, Andreoni C, Béquet F, Lecine P, De Luca K. An integrated transcriptomics and metabolomics study of the immune response of newly hatched chicks to the cytosine-phosphate-guanine oligonucleotide stimulation. Poult Sci 2020; 99:4360-4372. [PMID: 32867980 PMCID: PMC7598132 DOI: 10.1016/j.psj.2020.06.017] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/28/2019] [Revised: 05/26/2020] [Accepted: 06/19/2020] [Indexed: 11/13/2022] Open
Abstract
The immunological immaturity of the innate immune system during the first week post-hatch enables pathogens to infect chickens, leading to the death of the animals. Current preventive solutions to improve the resistance of chicks to infections include vaccination, breeding, and sanitation. Other prophylactic solutions have been investigated, such as the stimulation of animal health with immunostimulants. Recent studies showed that administration of immune-modulators to one-day-old chicks, or in ovo, significantly reduces mortality in experimental bacterial or viral infection challenge models. Owing to a lack of molecular biomarkers required to evaluate chicken immune responses and assess the efficacy of vaccines or immune-modulators, challenge models are still used. One way to reduce challenge experiments is to define molecular signatures through omics approaches, resulting in new methodologies to rapidly screen candidate molecules or vaccines. This study aims to identify a dual transcriptomics and metabolomics blood signature after administration of CpG-ODN (cytosine-phosphate-guanine oligodeoxynucleotides), a reference immune-stimulatory molecule. A clinical study was conducted with chicks, and transcriptomics and metabolomics analyses were performed on whole-blood and plasma samples, respectively. Differentially expressed genes and metabolites with different abundance were identified in chicks treated with CpG-ODN. The results showed that CpG-ODN activated the innate immune system within hours after administration, and its effect lasted over time, as metabolomics and transcriptomics profiles still varied 6 days after administration. In conclusion, through an integrated clinical omics approach, we deciphered in part the mode of action of CpG-ODN in post-hatch chicks.
Affiliation(s)
| | - Lydie Remolue
- Boehringer Ingelheim Animal Health, R&D, Lyon, France
| | - Jérémie Becker
- BIOASTER Microbiology Technology Institute, Lyon 69007, France
| | - Magali Perret
- BIOASTER Microbiology Technology Institute, Lyon 69007, France
| | - Andrei Bunescu
- BIOASTER Microbiology Technology Institute, Lyon 69007, France
| | - Kristin Hennig
- BIOASTER Microbiology Technology Institute, Lyon 69007, France
| | - Emeline Biliaut
- BIOASTER Microbiology Technology Institute, Lyon 69007, France
| | | | | | | | | | - Frédéric Béquet
- BIOASTER Microbiology Technology Institute, Lyon 69007, France.
| | - Patrick Lecine
- BIOASTER Microbiology Technology Institute, Lyon 69007, France
| | | |
|
231
|
Rappoport N, Safra R, Shamir R. MONET: Multi-omic module discovery by omic selection. PLoS Comput Biol 2020; 16:e1008182. [PMID: 32931516 PMCID: PMC7518594 DOI: 10.1371/journal.pcbi.1008182] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/28/2020] [Revised: 09/25/2020] [Accepted: 07/22/2020] [Indexed: 01/19/2023] Open
Abstract
Recent advances in experimental biology allow creation of datasets where several genome-wide data types (called omics) are measured per sample. Integrative analysis of multi-omic datasets in general, and clustering of samples in such datasets specifically, can improve our understanding of biological processes and discover different disease subtypes. In this work we present MONET (Multi Omic clustering by Non-Exhaustive Types), which takes a unique approach to multi-omic clustering. MONET discovers modules of similar samples, such that each module is allowed to have a clustering structure for only a subset of the omics. This approach differs from most existing multi-omic clustering algorithms, which assume a common structure across all omics, and from several recent algorithms that model distinct cluster structures. We tested MONET extensively on simulated data, on an image dataset, and on ten multi-omic cancer datasets from TCGA. Our analysis shows that MONET compares favorably with other multi-omic clustering methods. We demonstrate MONET's biological and clinical relevance by analyzing its results for Ovarian Serous Cystadenocarcinoma. We also show that MONET is robust to missing data, can cluster genes in multi-omic datasets, and can reveal modules of cell types in single-cell multi-omic data. Our work shows that MONET is a valuable tool that can provide complementary results to those provided by existing algorithms for multi-omic analysis.
Affiliation(s)
- Nimrod Rappoport
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
| | - Roy Safra
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
| | - Ron Shamir
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
| |
|
232
|
Yu X, Wang T, Huang S, Zeng P. How Can Gene-Expression Information Improve Prognostic Prediction in TCGA Cancers: An Empirical Comparison Study on Regularization and Mixed Cox Models. Front Genet 2020; 11:920. [PMID: 32973875 PMCID: PMC7472843 DOI: 10.3389/fgene.2020.00920] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/03/2020] [Accepted: 07/23/2020] [Indexed: 12/30/2022] Open
Abstract
Background Previous cancer prognostic prediction models often consider only the most important transcriptomic expressions, and their power is limited. It is unknown whether prediction power can be further improved when additional transcriptomic information is incorporated. Methods To integrate transcriptomes, four models are compared based on 32 types of cancer in the Cancer Genome Atlas, including the general Cox model with only clinical covariates, the Cox model with a lasso penalty (coxlasso), the Cox model with an elastic net penalty (coxenet), and the mixed-effects Cox model (coxlmm). Furthermore, we partition the survival variance into the relative contribution of clinical and transcriptomic components within the framework of coxlmm. Finally, the influence of different numbers of genes was evaluated in the context of coxlmm. Results Compared with the clinical covariates–only Cox model, the average prediction gain was 2.4% for coxlasso, 4.2% for coxenet, and 7.2% for coxlmm across 16 low-censored cancers; a significant elevation of prediction power was observed for SARC, SKCM, LGG, PAAD, and HNSC. Similar findings were observed for all 32 cancers with the average prediction gain of 2.7, 3.8, and 5.8% for coxlasso, coxenet, and coxlmm. Coxlmm always had comparable or better prediction performance relative to coxlasso and coxenet with an average of 2.8% prediction improvement across the 16 low-censored cancers. In addition, it is shown that the predictive accuracy of coxlmm generally increases with the number of genes included. The survival variance partition analysis demonstrates that the transcriptomic contribution was higher for some cancers (e.g., LGG, CESC, PAAD, SKCM, and SARC) and lower for others (e.g., BRCA, COAD, KIRC, and STAD). Conclusion This study demonstrates that the integration of transcriptomic information can substantially improve prognostic prediction accuracy, but the prediction performance is cancer-specific and varies across cancer types. It further reveals that gene expression exhibits distinct contributions to survival variation across cancers.
Affiliation(s)
- Xinghao Yu
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
| | - Ting Wang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China
| | - Shuiping Huang
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China.,Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
| | - Ping Zeng
- Department of Epidemiology and Biostatistics, School of Public Health, Xuzhou Medical University, Xuzhou, China.,Center for Medical Statistics and Data Analysis, School of Public Health, Xuzhou Medical University, Xuzhou, China
| |
|
233
|
Yang L, Fan W, Xu Y. Metaproteomics insights into traditional fermented foods and beverages. Compr Rev Food Sci Food Saf 2020; 19:2506-2529. [PMID: 33336970 DOI: 10.1111/1541-4337.12601] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 03/08/2020] [Revised: 06/14/2020] [Accepted: 06/17/2020] [Indexed: 12/13/2022]
Abstract
Traditional fermented foods and beverages (TFFB) are important dietary components. Multi-omics techniques have been applied to all aspects of TFFB research to clarify the composition and nutritional value of TFFB, and to reveal the microbial community, microbial interactions, fermentative kinetics, and metabolic profiles during the fermentation process of TFFB. Because of the advantages of metaproteomics in providing functional information, this technology has increasingly been used in research to assess the functional diversity of microbial communities. Metaproteomics is gradually gaining attention in the field of TFFB research because it can reveal the nature of microorganism function at the protein level. This paper reviews the common methods of metaproteomics applied in TFFB research; systematically summarizes the results of metaproteomics research on TFFB, such as sauces, wines, fermented tea, cheese, and fermented fish; and compares the differences in conclusions reached through metaproteomics versus other omics methods. Metaproteomics has great advantages in revealing the microbial functions in TFFB and the interaction between the materials and microbial community. In the future, metaproteomics should be further applied to the study of functional protein markers and protein interaction in TFFB; multi-omics technology requires further integration to reveal the molecular nature of TFFB fermentation.
Affiliation(s)
- Liang Yang
- Key Laboratory of Industrial Biotechnology of Ministry of Education, Laboratory of Brewing Microbiology and Applied Enzymology, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu, China
| | - Wenlai Fan
- Key Laboratory of Industrial Biotechnology of Ministry of Education, Laboratory of Brewing Microbiology and Applied Enzymology, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu, China
| | - Yan Xu
- Key Laboratory of Industrial Biotechnology of Ministry of Education, Laboratory of Brewing Microbiology and Applied Enzymology, School of Biotechnology, Jiangnan University, Wuxi, Jiangsu, China
| |
|
234
|
Patel SK, George B, Rai V. Artificial Intelligence to Decode Cancer Mechanism: Beyond Patient Stratification for Precision Oncology. Front Pharmacol 2020; 11:1177. [PMID: 32903628 PMCID: PMC7438594 DOI: 10.3389/fphar.2020.01177] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/25/2020] [Accepted: 07/20/2020] [Indexed: 12/13/2022] Open
Abstract
The multitude of multi-omics data generated cost-effectively using advanced high-throughput technologies has created a challenging domain for research in Artificial Intelligence (AI). Data curation poses a significant challenge because different parameters, instruments, and sample preparation approaches are employed to generate these big data sets. AI could reduce the fuzziness and randomness in data handling and build a platform for the data ecosystem, and thus serve as the primary choice for data mining and big data analysis to make informed decisions. However, the application of AI remains challenging for researchers/clinicians lacking specific training in computational tools and informatics. Cancer is a major cause of death worldwide, accounting for an estimated 9.6 million deaths in 2018. Certain cancers, such as pancreatic and gastric cancers, are detected only after they have reached advanced stages, with frequent relapses. Cancer is one of the most complex diseases, affecting a range of organs with diverse disease progression mechanisms and effectors ranging from gene-epigenetics to a wide array of metabolites. Hence a comprehensive study, including genomics, epi-genomics, transcriptomics, proteomics, and metabolomics, along with medical/mass-spectrometry imaging, patient clinical history, treatments provided, genetics, and disease endemicity, is essential. The Cancer Moonshot℠ Research Initiatives by the NIH National Cancer Institute aim to collect as much information as possible from different regions of the world and build a cancer data repository. AI could play an immense role in (a) analysis of complex and heterogeneous data sets (multi-omics and/or inter-omics), (b) data integration to provide a holistic disease molecular mechanism, (c) identification of diagnostic and prognostic markers, and (d) monitoring of patients' responses to drugs/treatments and recovery. AI enables precision disease management well beyond the prevalent disease stratification patterns, such as differential expression and supervised classification. This review highlights critical advances and challenges in omics data analysis, dealing with lab-to-lab data variability, and data integration. We also describe data mining and AI methods used to obtain robust results for precision medicine from "big" data. In the future, AI could be expanded to achieve ground-breaking progress in disease management.
Affiliation(s)
- Sandip Kumar Patel
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Mumbai, India
- Buck Institute for Research on Aging, Novato, CA, United States
| | - Bhawana George
- Department of Hematopathology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
| | - Vineeta Rai
- Department of Entomology & Plant Pathology, North Carolina State University, Raleigh, NC, United States
| |
|
235
|
Mochida K, Nishii R, Hirayama T. Decoding Plant-Environment Interactions That Influence Crop Agronomic Traits. PLANT & CELL PHYSIOLOGY 2020; 61:1408-1418. [PMID: 32392328 PMCID: PMC7434589 DOI: 10.1093/pcp/pcaa064] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/31/2020] [Accepted: 04/26/2020] [Indexed: 05/16/2023]
Abstract
To ensure food security in the face of increasing global demand due to population growth and progressive urbanization, it will be crucial to integrate emerging technologies in multiple disciplines to accelerate overall throughput of gene discovery and crop breeding. Plant agronomic traits often appear during the plants' later growth stages due to the cumulative effects of their lifetime interactions with the environment. Therefore, decoding plant-environment interactions by elucidating plants' temporal physiological responses to environmental changes throughout their lifespans will facilitate the identification of genetic and environmental factors, timing and pathways that influence complex end-point agronomic traits, such as yield. Here, we discuss the expected role of the life-course approach to monitoring plant and crop health status in improving crop productivity by enhancing the understanding of plant-environment interactions. We review recent advances in analytical technologies for monitoring health status in plants based on multi-omics analyses and strategies for integrating heterogeneous datasets from multiple omics areas to identify informative factors associated with traits of interest. In addition, we showcase emerging phenomics techniques that enable the noninvasive and continuous monitoring of plant growth by various means, including three-dimensional phenotyping, plant root phenotyping, implantable/injectable sensors and affordable phenotyping devices. Finally, we present an integrated review of analytical technologies and applications for monitoring plant growth, developed across disciplines, such as plant science, data science and sensors and Internet-of-things technologies, to improve plant productivity.
Affiliation(s)
- Keiichi Mochida
- RIKEN Center for Sustainable Resource Science, Tsurumi-ku, Yokohama, Japan
- Kihara Institute for Biological Research, Yokohama City University, Totsuka-ku, Yokohama, Japan
- Graduate School of Nanobioscience, Yokohama City University, Kanazawa-ku, Yokohama, Japan
- Institute of Plant Science and Resources, Okayama University, Kurashiki, Japan
- Corresponding author: Fax, +81-45-503-9609
| | - Ryuei Nishii
- School of Information and Data Sciences, Nagasaki University, Nagasaki, Japan
| | - Takashi Hirayama
- Institute of Plant Science and Resources, Okayama University, Kurashiki, Japan
| |
|
236
|
Arbet J, Brokamp C, Meinzen-Derr J, Trinkley KE, Spratt HM. Lessons and tips for designing a machine learning study using EHR data. J Clin Transl Sci 2020; 5:e21. [PMID: 33948244 PMCID: PMC8057454 DOI: 10.1017/cts.2020.513] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/18/2020] [Revised: 06/18/2020] [Accepted: 07/13/2020] [Indexed: 02/08/2023] Open
Abstract
Machine learning (ML) provides the ability to examine massive datasets and uncover patterns within data without relying on a priori assumptions such as specific variable associations, linearity in relationships, or prespecified statistical interactions. However, the application of ML to healthcare data has been met with mixed results, especially when using administrative datasets such as the electronic health record. The black box nature of many ML algorithms contributes to an erroneous assumption that these algorithms can overcome major data issues inherent in large administrative healthcare data. As with other research endeavors, good data and analytic design are crucial to ML-based studies. In this paper, we provide an overview of common misconceptions about ML, the corresponding truths, and suggestions for incorporating these methods into healthcare research while maintaining a sound study design.
Collapse
Affiliation(s)
- Jaron Arbet
- Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado-Denver Anschutz Medical Campus, Aurora, CO, USA
| | - Cole Brokamp
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
| | - Jareen Meinzen-Derr
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
| | - Katy E. Trinkley
- Department of Clinical Pharmacy, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado, Aurora, CO, USA
- Department of Medicine, School of Medicine, University of Colorado, Aurora, CO, USA
| | - Heidi M. Spratt
- Department of Preventive Medicine and Population Health, University of Texas Medical Branch, Galveston, TX, USA
| |
|
237
|
Chakraborty N, Schmitt CW, Honnold CL, Moyler C, Butler S, Nachabe H, Gautam A, Hammamieh R. Protocol Improvement for RNA Extraction From Compromised Frozen Specimens Generated in Austere Conditions: A Path Forward to Transcriptomics-Pathology Systems Integration. Front Mol Biosci 2020; 7:142. [PMID: 32793629 PMCID: PMC7387682 DOI: 10.3389/fmolb.2020.00142] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/05/2020] [Accepted: 06/10/2020] [Indexed: 01/08/2023] Open
Abstract
At the heart of the phenome-to-genome approach are high-throughput assays, which are liable to produce false results. This risk can be mitigated by minimizing sample bias, specifically by recycling the same tissue specimen for both phenotypic and genotypic investigations. Therefore, our aim is to suggest a methodology for obtaining robust results from frozen specimens of compromised quality, particularly if the sample is produced in conditions with limited resources. For example, generating samples at the International Space Station (ISS) is challenging because the time and laboratory footprint allotted to a project can get expensive. In an effort to be economical with available resources, snap-frozen euthanized mice are the straightforward solution; however, this method increases the risk of temperature abuse during the thawing process at the beginning of the tissue collection. We found that prolonged immersion of a snap-frozen mouse carcass in 10% neutral buffered formalin at 4°C yielded minimal microscopic signs of ice crystallization and delivered tissues with histomorphology that is optimal for hematoxylin and eosin (H&E) staining and fixation on glass slides. We further optimized a method to sequester the tissue specimen from the H&E slides using an incubator shaker. Using this method, we were able to recover an optimal amount of RNA that could be used for downstream transcriptomics assays. Overall, we demonstrated a protocol that enables us to maximize the scientific value of tissues collected in austere conditions. Furthermore, our protocol can suggest an improvement in the spatial resolution of transcriptomic assays.
Affiliation(s)
- Nabarun Chakraborty
- Geneva Foundation, Walter Reed Army Institute of Research, Silver Spring, MD, United States
- Medical Readiness Systems Biology, Walter Reed Army Institute of Research, Silver Spring, MD, United States
| | - Connie W Schmitt
- Comparative Pathology, US Army Medical Research Institute of Chemical Defense, Gunpowder, MD, United States
| | - Cary L Honnold
- Comparative Pathology, US Army Medical Research Institute of Chemical Defense, Gunpowder, MD, United States
| | - Candace Moyler
- Medical Readiness Systems Biology, Walter Reed Army Institute of Research, Silver Spring, MD, United States
- ORISE, Walter Reed Army Institute of Research, Silver Spring, MD, United States
| | - Stephen Butler
- Geneva Foundation, Walter Reed Army Institute of Research, Silver Spring, MD, United States
- Medical Readiness Systems Biology, Walter Reed Army Institute of Research, Silver Spring, MD, United States
| | - Hisham Nachabe
- Medical Readiness Systems Biology, Walter Reed Army Institute of Research, Silver Spring, MD, United States
- ORISE, Walter Reed Army Institute of Research, Silver Spring, MD, United States
| | - Aarti Gautam
- Medical Readiness Systems Biology, Walter Reed Army Institute of Research, Silver Spring, MD, United States
| | - Rasha Hammamieh
- Medical Readiness Systems Biology, Walter Reed Army Institute of Research, Silver Spring, MD, United States
| |
|
238
|
Application of Multiblock Analysis on Small Metabolomic Multi-Tissue Dataset. Metabolites 2020; 10:metabo10070295. [PMID: 32709053 PMCID: PMC7407932 DOI: 10.3390/metabo10070295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2020] [Revised: 07/14/2020] [Accepted: 07/15/2020] [Indexed: 11/16/2022] Open
Abstract
Data integration has been proven to provide valuable information. The information extracted using data integration in the form of multiblock analysis can pinpoint both common and unique trends in the different blocks. When working with small multiblock datasets, the number of possible integration methods is drastically reduced. To investigate the application of multiblock analysis in cases where one has a small number of samples and a lack of statistical power, we studied a small metabolomic multiblock dataset containing six blocks (i.e., tissue types), only including common metabolites. We used a single-model multiblock analysis method called the joint and unique multiblock analysis (JUMBA) and compared it to a commonly used method, concatenated principal component analysis (PCA). These methods were used to detect trends in the dataset and identify underlying factors responsible for metabolic variations. Using JUMBA, we were able to interpret the extracted components and link them to relevant biological properties. JUMBA shows how the observations are related to one another, the stability of these relationships, and to what extent each of the blocks contributes to the components. These results indicate that multiblock methods can be useful even with a small number of samples.
|
239
|
Randhawa V, Pathania S. Advancing from protein interactomes and gene co-expression networks towards multi-omics-based composite networks: approaches for predicting and extracting biological knowledge. Brief Funct Genomics 2020; 19:364-376. [PMID: 32678894 DOI: 10.1093/bfgp/elaa015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/16/2020] [Revised: 05/31/2020] [Accepted: 06/15/2020] [Indexed: 01/17/2023] Open
Abstract
Prediction of biological interaction networks from single-omics data has been extensively implemented to understand various aspects of biological systems. However, more recently, there is a growing interest in integrating multi-omics datasets for the prediction of interactomes that provide a global view of biological systems with higher descriptive capability, as compared to single omics. In this review, we have discussed various computational approaches implemented to infer and analyze two of the most important and well studied interactomes: protein-protein interaction networks and gene co-expression networks. We have explicitly focused on recent methods and pipelines implemented to infer and extract biologically important information from these interactomes, starting from utilizing single-omics data and then progressing towards multi-omics data. Accordingly, recent examples and case studies are also briefly discussed. Overall, this review will provide a proper understanding of the latest developments in protein and gene network modelling and will also help in extracting practical knowledge from them.
Affiliation(s)
- Vinay Randhawa
- Department of Biochemistry, Panjab University, Chandigarh, 160014, India
| | - Shivalika Pathania
- Department of Biotechnology, Panjab University, Chandigarh, 160014, India
| |
|
240
|
O'Hara E, Neves ALA, Song Y, Guan LL. The Role of the Gut Microbiome in Cattle Production and Health: Driver or Passenger? Annu Rev Anim Biosci 2020; 8:199-220. [PMID: 32069435 DOI: 10.1146/annurev-animal-021419-083952] [Citation(s) in RCA: 107] [Impact Index Per Article: 26.8] [Indexed: 11/09/2022]
Abstract
Ruminant production systems face significant challenges currently, driven by heightened awareness of their negative environmental impact and the rapidly rising global population. Recent findings have underscored how the composition and function of the rumen microbiome are associated with economically valuable traits, including feed efficiency and methane emission. Although omics-based technological advances in the last decade have revolutionized our understanding of host-associated microbial communities, there remains incongruence over the correct approach for analysis of large omic data sets. A global approach that examines host/microbiome interactions in both the rumen and the lower digestive tract is required to harness the full potential of the gastrointestinal microbiome for sustainable ruminant production. This review highlights how the ruminant animal production community may identify and exploit the causal relationships between the gut microbiome and host traits of interest for a practical application of omic data to animal health and production.
Affiliation(s)
- Eóin O'Hara
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada
| | - André L A Neves
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada
| | - Yang Song
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada
- College of Animal Science and Technology, Inner Mongolia University for the Nationalities, Tongliao, China 028000
| | - Le Luo Guan
- Department of Agricultural, Food & Nutritional Science, University of Alberta, Edmonton, Alberta T6G 2P5, Canada
| |
|
241
|
Maity AK, Lee SC, Mallick BK, Sarkar TR. Bayesian structural equation modeling in multiple omics data with application to circadian genes. Bioinformatics 2020; 36:3951-3958. [PMID: 32369552 DOI: 10.1093/bioinformatics/btaa286] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/18/2019] [Revised: 03/30/2020] [Accepted: 04/27/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: It is well known that integrating different data sources is reliable because of its potential to unveil new functionalities of genomic expression that might be dormant in a single-source analysis. Moreover, different studies have justified the more powerful analyses of multi-platform data. Toward this, in this study, we consider the circadian genes' omics profiles, such as copy number changes and RNA-sequence data, along with their survival response. We develop a Bayesian structural equation model coupled with linear regressions and a log-normal accelerated failure-time regression to integrate the information between these two platforms and predict the survival of the subjects. We place conjugate priors on the regression parameters and derive the Gibbs sampler from their conditional distributions. RESULTS: Our extensive simulation study shows that the integrative model provides a better fit to the data than its closest competitor. The analyses of glioblastoma cancer data and the breast cancer data from TCGA, the largest genomics and transcriptomics database, support our findings. AVAILABILITY AND IMPLEMENTATION: The developed method is wrapped in an R package available at https://github.com/MAITYA02/semmcmc. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
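To make the "conjugate priors plus Gibbs sampler" idea in this abstract concrete, the following is a minimal sketch in Python. It is not the authors' structural equation model: it assumes a toy intercept-only normal model with a normal prior on the mean and an inverse-gamma prior on the variance, and the function name `gibbs_normal`, the hyperparameters, and the simulated data are all hypothetical choices for illustration.

```python
import random

def gibbs_normal(y, n_iter=2000, burn_in=500,
                 m0=0.0, s0_sq=100.0, a0=2.0, b0=2.0, seed=0):
    """Gibbs sampler for y_i ~ Normal(mu, sigma^2) with conjugate priors
    mu ~ Normal(m0, s0_sq) and sigma^2 ~ Inverse-Gamma(a0, b0).
    Toy model only; all hyperparameter values are illustrative assumptions."""
    rng = random.Random(seed)
    n = len(y)
    total = sum(y)
    mu, sigma_sq = total / n, 1.0
    draws = []
    for it in range(n_iter):
        # Full conditional of mu given sigma^2 is normal.
        precision = 1.0 / s0_sq + n / sigma_sq
        mean = (m0 / s0_sq + total / sigma_sq) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # Full conditional of sigma^2 given mu is inverse-gamma(shape, rate).
        shape = a0 + n / 2.0
        rate = b0 + 0.5 * sum((v - mu) ** 2 for v in y)
        # gammavariate takes a scale parameter, so pass 1 / rate and invert.
        sigma_sq = 1.0 / rng.gammavariate(shape, 1.0 / rate)
        if it >= burn_in:
            draws.append((mu, sigma_sq))
    return draws

# Simulated data (hypothetical): 200 observations around a true mean of 2.0.
data_rng = random.Random(1)
y = [data_rng.gauss(2.0, 1.0) for _ in range(200)]
draws = gibbs_normal(y)
post_mu = sum(d[0] for d in draws) / len(draws)
print(f"posterior mean of mu: {post_mu:.3f}")
```

Because each full conditional has a closed form under conjugacy, every step is a direct draw with no accept/reject stage; a regression version would replace the scalar mean with a coefficient vector and a multivariate normal conditional.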
Affiliation(s)
- Arnab Kumar Maity
- Early Clinical Development Oncology Statistics, Pfizer Inc., San Diego, CA 92121, USA
| | - Tapasree Roy Sarkar
- Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- Department of Biology, Texas A&M University, College Station, TX 77843, USA
| |
|
242
|
Guinot F, Szafranski M, Chiquet J, Zancarini A, Le Signor C, Mougel C, Ambroise C. Fast computation of genome-metagenome interaction effects. Algorithms Mol Biol 2020; 15:13. [PMID: 32625242 PMCID: PMC7329492 DOI: 10.1186/s13015-020-00173-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/15/2019] [Accepted: 06/17/2020] [Indexed: 01/01/2023] Open
Abstract
Motivation: Association studies have been widely used to search for associations between common genetic variants and a given phenotype. However, it is now generally accepted that genes and environment must be examined jointly when estimating phenotypic variance. In this work we consider two types of biological markers: genotypic markers, which characterize an observation in terms of inherited genetic information, and metagenomic markers, which are related to the environment. Both types of markers are available in their millions and can be used to characterize any observation uniquely. Objective: Our focus is on detecting interactions between groups of genetic and metagenomic markers in order to gain a better understanding of the complex relationship between environment and genome in the expression of a given phenotype. Contributions: We propose a novel approach for efficiently detecting interactions between complementary datasets in a high-dimensional setting with a reduced computational cost. The method, named SICOMORE, reduces the dimension of the search space by selecting a subset of supervariables in the two complementary datasets. These supervariables are given by a weighted group structure defined on sets of variables at different scales. A Lasso selection is then applied on each type of supervariable to obtain a subset of potential interactions that will be explored via linear model testing. Results: We compare SICOMORE with other approaches in simulations, with varying sample sizes, noise, and numbers of true interactions. SICOMORE exhibits convincing results in terms of recall, as well as competitive performance with respect to running time. The method is also used to detect interactions between genomic markers in Medicago truncatula and metagenomic markers in its rhizosphere bacterial community.
Software availability: An R package is available [4], along with its documentation and associated scripts, allowing the reader to reproduce the results presented in the paper.
|
243
|
A survey on single and multi omics data mining methods in cancer data classification. J Biomed Inform 2020; 107:103466. [DOI: 10.1016/j.jbi.2020.103466] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 12/11/2019] [Revised: 05/01/2020] [Accepted: 05/31/2020] [Indexed: 01/09/2023]
|
244
|
Shi WJ, Zhuang Y, Russell PH, Hobbs BD, Parker MM, Castaldi PJ, Rudra P, Vestal B, Hersh CP, Saba LM, Kechris K. Unsupervised discovery of phenotype-specific multi-omics networks. Bioinformatics 2020; 35:4336-4343. [PMID: 30957844 DOI: 10.1093/bioinformatics/btz226] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 06/12/2018] [Revised: 02/01/2019] [Accepted: 04/05/2019] [Indexed: 12/15/2022] Open
Abstract
MOTIVATION: Complex diseases often involve a wide spectrum of phenotypic traits. Better understanding of the biological mechanisms relevant to each trait promotes understanding of the etiology of the disease and the potential for targeted and effective treatment plans. There have been many efforts towards omics data integration and network reconstruction, but limited work has examined the incorporation of relevant (quantitative) phenotypic traits. RESULTS: We propose a novel technique, sparse multiple canonical correlation network analysis (SmCCNet), for integrating multiple omics data types along with a quantitative phenotype of interest, and for constructing multi-omics networks that are specific to the phenotype. As a case study, we focus on miRNA-mRNA networks. Through simulations, we demonstrate that SmCCNet has better overall prediction performance compared to popular gene expression network construction and integration approaches under realistic settings. Applying SmCCNet to studies on chronic obstructive pulmonary disease (COPD) and breast cancer, we found enrichment of known relevant pathways (e.g. the Cadherin pathway for COPD and the interferon-gamma signaling pathway for breast cancer) as well as less known omics features that may be important to the diseases. Although those applications focus on miRNA-mRNA co-expression networks, SmCCNet is applicable to a variety of omics and other data types. It can also be easily generalized to incorporate multiple quantitative phenotypes simultaneously. The versatility of SmCCNet suggests great potential of the approach in many areas. AVAILABILITY AND IMPLEMENTATION: The SmCCNet algorithm is written in R, and is freely available on the web at https://cran.r-project.org/web/packages/SmCCNet/index.html. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- W Jenny Shi
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Yonghua Zhuang
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Pamela H Russell
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Brian D Hobbs
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Margaret M Parker
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Peter J Castaldi
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Pratyaydipta Rudra
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Department of Statistics, Oklahoma State University, Stillwater, OK
| | - Brian Vestal
- Center for Genes, Environment & Health, National Jewish Health, Denver, CO, USA
| | - Craig P Hersh
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Laura M Saba
- Department of Pharmaceutical Sciences, University of Colorado, Aurora, CO, USA
| | - Katerina Kechris
- Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| |
|
245
|
Nicora G, Vitali F, Dagliati A, Geifman N, Bellazzi R. Integrated Multi-Omics Analyses in Oncology: A Review of Machine Learning Methods and Tools. Front Oncol 2020; 10:1030. [PMID: 32695678 PMCID: PMC7338582 DOI: 10.3389/fonc.2020.01030] [Citation(s) in RCA: 110] [Impact Index Per Article: 27.5] [Received: 01/30/2020] [Accepted: 05/26/2020] [Indexed: 12/16/2022] Open
Abstract
In recent years, high-throughput sequencing technologies have provided unprecedented opportunities to depict cancer samples at multiple molecular levels. The integration and analysis of these multi-omics datasets is a crucial step to gain actionable knowledge in a precision medicine framework. This paper explores recent data-driven methodologies that have been developed and applied to respond to major challenges of stratified medicine in oncology, including patients' phenotyping, biomarker discovery, and drug repurposing. We systematically retrieved peer-reviewed journal articles published from 2014 to 2019, then selected and thoroughly described the tools presenting the most promising innovations regarding the integration of heterogeneous data, the machine learning methodologies that successfully tackled the complexity of multi-omics data, and the frameworks to deliver actionable results for clinical practice. The review is organized according to the applied methods: Deep Learning, Network-based Methods, Clustering, Feature Extraction and Transformation, and Factorization. We provide an overview of the tools available in each methodological group and underline the relationship among the different categories. Our analysis revealed how multi-omics datasets could be exploited to drive precision oncology, but also current limitations in the development of multi-omics data integration.
Affiliation(s)
- Giovanna Nicora
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
| | - Francesca Vitali
- Center for Innovation in Brain Science, University of Arizona, Tucson, AZ, United States
- Department of Neurology, College of Medicine, University of Arizona, Tucson, AZ, United States
- Center for Biomedical Informatics and Biostatistics, University of Arizona, Tucson, AZ, United States
| | - Arianna Dagliati
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
- Centre for Health Informatics, The University of Manchester, Manchester, United Kingdom
- The Manchester Molecular Pathology Innovation Centre, The University of Manchester, Manchester, United Kingdom
| | - Nophar Geifman
- Centre for Health Informatics, The University of Manchester, Manchester, United Kingdom
- The Manchester Molecular Pathology Innovation Centre, The University of Manchester, Manchester, United Kingdom
| | - Riccardo Bellazzi
- Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy
| |
|
246
|
Multi-omics network analysis reveals distinct stages in the human aging progression in epidermal tissue. Aging (Albany NY) 2020; 12:12393-12409. [PMID: 32554863 PMCID: PMC7343460 DOI: 10.18632/aging.103499] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 02/20/2020] [Accepted: 06/01/2020] [Indexed: 01/05/2023]
Abstract
In recent years, reports of non-linear regulations in age- and longevity-associated biological processes have been accumulating. Inspired by methodological advances in precision medicine involving the integrative analysis of multi-omics data, we sought to investigate the potential of multi-omics integration to identify distinct stages in the aging progression from ex vivo human skin tissue. For this we generated transcriptome and methylome profiling data from suction blister lesions of female subjects between 21 and 76 years, which were integrated using a network fusion approach. Unsupervised cluster analysis on the combined network identified four distinct subgroupings exhibiting a significant age-association. As indicated by DNAm age analysis and Hallmark of Aging enrichment signals, the stages captured the biological aging state more clearly than a mere grouping by chronological age and could further be recovered in a longitudinal validation cohort with high stability. Characterization of the biological processes driving the phases using machine learning enabled a data-driven reconstruction of the order of Hallmark of Aging manifestation. Finally, we investigated non-linearities in the mid-life aging progression captured by the aging phases and identified a far-reaching non-linear increase in transcriptional noise in the pathway landscape in the transition from mid- to late-life.
|
247
|
Xie L, Varathan P, Nho K, Saykin AJ, Salama P, Yan J. Identification of functionally connected multi-omic biomarkers for Alzheimer's disease using modularity-constrained Lasso. PLoS One 2020; 15:e0234748. [PMID: 32555747 PMCID: PMC7299377 DOI: 10.1371/journal.pone.0234748] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/17/2019] [Accepted: 06/02/2020] [Indexed: 12/16/2022] Open
Abstract
Large-scale genome-wide association studies (GWASs) have led to the discovery of many genetic risk factors in Alzheimer's disease (AD), such as APOE, TOMM40 and CLU. Despite this significant progress, it remains a major challenge to functionally validate these genetic findings and translate them into targetable mechanisms. Integration of multiple types of molecular data is increasingly used to address this problem. In this paper, we propose a modularity-constrained Lasso model to jointly analyze genotype, gene expression and protein expression data for the discovery of functionally connected multi-omic biomarkers in AD. With a prior network capturing the functional relationships between SNPs, genes and proteins, the newly introduced penalty term maximizes the global modularity of the subnetwork involving selected markers and encourages the selection of multi-omic markers with dense functional connectivity, instead of individual markers. We applied this new model to real data collected in the ROS/MAP cohort, where cognitive performance was used as the disease quantitative trait. A functionally connected subnetwork involving 276 multi-omic biomarkers, including SNPs, genes and proteins, was identified to bear predictive power. Within this subnetwork, multiple trans-omic paths from SNPs to genes and then proteins were observed. This suggests that cognitive performance deterioration in AD patients can potentially be a result of genetic variations, owing to their cascade effect on the downstream transcriptome and proteome levels.
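The abstract does not give the exact form of the modularity penalty, so the following Python sketch only illustrates the quantity it refers to, using Newman's standard modularity Q for an undirected, unweighted network. The toy SNP/gene/protein network, the node names, and the function `modularity` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity Q = sum_c [ L_c / m - (d_c / 2m)^2 ], where L_c is the
    number of edges inside community c, d_c the summed degree of its nodes,
    and m the total number of edges."""
    m = len(edges)
    intra = defaultdict(int)   # edges with both endpoints in the same community
    degree = defaultdict(int)  # summed node degree per community
    for u, v in edges:
        degree[community_of[u]] += 1
        degree[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            intra[community_of[u]] += 1
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Hypothetical prior network: two SNP-gene-protein triangles joined by one edge.
edges = [
    ("snp1", "gene1"), ("gene1", "prot1"), ("snp1", "prot1"),
    ("snp2", "gene2"), ("gene2", "prot2"), ("snp2", "prot2"),
    ("prot1", "snp2"),
]
dense = {"snp1": "A", "gene1": "A", "prot1": "A",
         "snp2": "B", "gene2": "B", "prot2": "B"}
mixed = {"snp1": "A", "gene1": "B", "prot1": "A",
         "snp2": "B", "gene2": "A", "prot2": "B"}

q_dense = modularity(edges, dense)
q_mixed = modularity(edges, mixed)
print(q_dense, q_mixed)
```

A grouping that follows the densely connected triangles scores higher than one that cuts across them, which is the behaviour a modularity-based penalty is meant to reward when choosing markers.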
Affiliation(s)
- Linhui Xie
- Department of Electrical and Computer Engineering, Indiana University Purdue University Indianapolis, Indianapolis, Indiana, United States of America
| | - Pradeep Varathan
- Department of BioHealth Informatics, Indiana University Purdue University Indianapolis, Indianapolis, Indiana, United States of America
| | - Kwangsik Nho
- Department of Radiology and Imaging Sciences, School of Medicine, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
| | - Andrew J. Saykin
- Department of Radiology and Imaging Sciences, School of Medicine, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
| | - Paul Salama
- Department of Electrical and Computer Engineering, Indiana University Purdue University Indianapolis, Indianapolis, Indiana, United States of America
| | - Jingwen Yan
- Department of Radiology and Imaging Sciences, School of Medicine, Indiana University School of Medicine, Indianapolis, Indiana, United States of America
- Department of BioHealth Informatics, Indiana University Purdue University Indianapolis, Indianapolis, Indiana, United States of America
| |
|
248
|
Meng X, Zhao X, Ding X, Li Y, Cao G, Chu Z, Su X, Liu Y, Chen X, Guo J, Cai Z, Ding X. Integrated Functional Omics Analysis of Flavonoid-Related Metabolism in AtMYB12 Transcript Factor Overexpressed Tomato. J Agric Food Chem 2020; 68:6776-6787. [PMID: 32396374 DOI: 10.1021/acs.jafc.0c01894] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 05/07/2023]
Abstract
Genetic engineering (GE) technology is widely used in plant modification. However, the results of modification may not exactly meet expectations. Herein, we propose a new multi-omics method for GE plant evaluation based on the optimized use of the metID algorithm. Using this method, we found for the first time that flavonoid accumulation in GE tomatoes comes at the expense of l-phenylalanine. Meanwhile, the ceramide series of sphingolipids is synthesized de novo from l-serine, and ceramides are the primary source of vesicles coated with flavonoids and secreted from the endoplasmic reticulum. Therefore, the accumulation of the ceramide series of sphingolipids changed the cell component of intracellular organelles. Furthermore, the improvement of the method allows us to identify more metabolites related to dysregulated pathways.
Affiliation(s)
- Xuanlin Meng
- College of Plant Protection, State Key Laboratory of Crop Biology, Shandong Agricultural University, Taian, Shandong 271000, People's Republic of China
- Department of Chemistry and State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong Special Administrative Region of the People's Republic of China
| | - Xingchen Zhao
- Department of Chemistry and State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong Special Administrative Region of the People's Republic of China
- Xiangyu Ding
- College of Plant Protection, State Key Laboratory of Crop Biology, Shandong Agricultural University, Taian, Shandong 271000, People's Republic of China
- Yang Li
- College of Plant Protection, State Key Laboratory of Crop Biology, Shandong Agricultural University, Taian, Shandong 271000, People's Republic of China
- Guodong Cao
- Department of Chemistry and State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong Special Administrative Region of the People's Republic of China
- Zhaohui Chu
- College of Plant Protection, State Key Laboratory of Crop Biology, Shandong Agricultural University, Taian, Shandong 271000, People's Republic of China
- Xiuli Su
- Department of Chemistry and State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong Special Administrative Region of the People's Republic of China
- Yuanchen Liu
- Department of Chemistry and State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong Special Administrative Region of the People's Republic of China
- Xiangfeng Chen
- Key Laboratory for Applied Technology of Sophisticated Analytic Instrument, Qilu University of Technology (Shandong Academy of Science), Jinan, Shandong 250014, People's Republic of China
- Jinggong Guo
- Center for Multi-Omics Research, State Key Laboratory of Cotton Biology, Institute of Plant Stress Biology, Henan University, Kaifeng, Henan 475004, People's Republic of China
- Zongwei Cai
- Department of Chemistry and State Key Laboratory of Environmental and Biological Analysis, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong Special Administrative Region of the People's Republic of China
- Xinhua Ding
- College of Plant Protection, State Key Laboratory of Crop Biology, Shandong Agricultural University, Taian, Shandong 271000, People's Republic of China
|
249
|
Rappoport N, Shamir R. NEMO: cancer subtyping by integration of partial multi-omic data. Bioinformatics 2020; 35:3348-3356. [PMID: 30698637 PMCID: PMC6748715 DOI: 10.1093/bioinformatics/btz058] [Citation(s) in RCA: 105] [Impact Index Per Article: 26.3] [Received: 09/30/2018] [Revised: 12/23/2018] [Accepted: 01/25/2019] [Indexed: 01/10/2023]
Abstract
Motivation: Cancer subtypes were usually defined based on molecular characterization of single omic data. Increasingly, measurements of multiple omic profiles for the same cohort are available. Defining cancer subtypes using multi-omic data may improve our understanding of cancer, and suggest more precise treatment for patients. Results: We present NEMO (NEighborhood based Multi-Omics clustering), a novel algorithm for multi-omics clustering. Importantly, NEMO can be applied to partial datasets in which some patients have data for only a subset of the omics, without performing data imputation. In extensive testing on ten cancer datasets spanning 3168 patients, NEMO achieved results comparable to the best of nine state-of-the-art multi-omics clustering algorithms on full data and showed an improvement on partial data. On some of the partial data tests, PVC, a multi-view algorithm, performed better, but it is limited to two omics and to positive partial data. Finally, we demonstrate the advantage of NEMO in detailed analysis of partial data of AML patients. NEMO is fast and much simpler than existing multi-omics clustering algorithms, and avoids iterative optimization. Availability and implementation: Code for NEMO and for reproducing all NEMO results in this paper is available on GitHub: https://github.com/Shamir-Lab/NEMO. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Nimrod Rappoport
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
- Ron Shamir
- Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel
|
250
|
Wei Z, Zhang Y, Weng W, Chen J, Cai H. Survey and comparative assessments of computational multi-omics integrative methods with multiple regulatory networks identifying distinct tumor compositions across pan-cancer data sets. Brief Bioinform 2020; 22:5856342. [PMID: 32533167 DOI: 10.1093/bib/bbaa102] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/27/2020] [Revised: 05/02/2020] [Accepted: 05/04/2020] [Indexed: 12/20/2022]
Abstract
The significance of pan-cancer categories has recently been recognized as widespread in cancer research. Pan-cancer categorizes a cancer based on its molecular pathology rather than an organ. The molecular similarities among multi-omics data found in different cancer types can play several roles in both biological processes and therapeutic developments. Therefore, an integrated analysis for various genomic data is frequently used to reveal novel genetic and molecular mechanisms. However, a variety of algorithms for multi-omics clustering have been proposed in different fields. The comparison of different computational clustering methods in pan-cancer analysis performance remains unclear. To increase the utilization of current integrative methods in pan-cancer analysis, we first provide an overview of five popular computational integrative tools: similarity network fusion, integrative clustering of multiple genomic data types (iCluster), cancer integration via multi-kernel learning (CIMLR), perturbation clustering for data integration and disease subtyping (PINS) and low-rank clustering (LRACluster). Then, a priori interactions in multi-omics data were incorporated to detect prominent molecular patterns in pan-cancer data sets. Finally, we present comparative assessments of these methods, with discussion over key issues in applying these algorithms. We found that all five methods can identify distinct tumor compositions. The pan-cancer samples can be reclassified into several groups by different proportions. Interestingly, each method can classify the tumors into categories that are different from original cancer types or subtypes, especially for ovarian serous cystadenocarcinoma (OV) and breast invasive carcinoma (BRCA) tumors. In addition, all clusters of the five computational methods show notable prognostic values. Furthermore, both the 9 recurrent differential genes and the 15 common pathway characteristics were identified across all the methods. The results and discussion can help the community select appropriate integrative tools according to different research tasks or aims in pan-cancer analysis.
Affiliation(s)
- Zhuohui Wei
- Computer Science and Engineering, South China University of Technology
- Yue Zhang
- School of Computer Science, Guangdong Polytechnic Normal University
- Wanlin Weng
- Computer Science and Engineering, South China University of Technology
- Jiazhou Chen
- Computer Science and Engineering, South China University of Technology
- Hongmin Cai
- Computer Science and Engineering, South China University of Technology
|