1
Belfiori M, Lazzari L, Hezzell M, Angelini GD, Dong T. Transcriptomics, Proteomics and Bioinformatics in Atrial Fibrillation: A Descriptive Review. Bioengineering (Basel) 2025; 12:149. [PMID: 40001669 PMCID: PMC11851880 DOI: 10.3390/bioengineering12020149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2025] [Revised: 01/28/2025] [Accepted: 02/02/2025] [Indexed: 02/27/2025] Open
Abstract
Atrial fibrillation (AF) is the most frequent cardiac arrhythmia, with an estimated five million cases globally. This condition increases the likelihood of developing cardiovascular complications such as thromboembolic events, with a fivefold increase in risk of both heart failure and stroke. Contemporary challenges include achieving a better understanding of AF pathophysiology and optimizing therapeutic options, given the current lack of efficacy and the adverse effects of antiarrhythmic drug therapy. Hence, the identification of novel biomarkers in biological samples would greatly impact the diagnostic and therapeutic opportunities offered to AF patients. Long noncoding RNAs, microRNAs, circular RNAs, and genes involved in heart cell differentiation are particularly relevant to understanding gene regulatory effects on AF pathophysiology. Proteomic remodeling may also play an important role in the structural, electrical, ion channel, and interactome dysfunctions associated with AF pathogenesis. Techniques for processing RNA and proteomic samples range from RNA sequencing and microarrays to a wide range of mass spectrometry techniques such as Orbitrap, Quadrupole, LC-MS, and hybrid systems. Since AF atrial tissue samples require a more invasive approach to be retrieved and analyzed, blood plasma biomarkers were also considered. A range of different sample preprocessing techniques and bioinformatic methods across studies were examined. The objective of this descriptive review is to examine the most recent developments of transcriptomics, proteomics, and bioinformatics in atrial fibrillation.
Affiliation(s)
- Martina Belfiori
  - School of Medicine and Surgery, Università degli Studi di Milano-Bicocca, 20126 Milano, Italy
- Lisa Lazzari
  - School of Medicine and Surgery, Università degli Studi di Milano-Bicocca, 20126 Milano, Italy
- Melanie Hezzell
  - Bristol Veterinary School, University of Bristol, Langford House, Langford, Bristol BS40 5DU, UK
- Gianni D. Angelini
  - Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol BS2 8HW, UK
- Tim Dong
  - Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol BS2 8HW, UK
2
Li W, Ballard J, Zhao Y, Long Q. Knowledge-guided learning methods for integrative analysis of multi-omics data. Comput Struct Biotechnol J 2024; 23:1945-1950. [PMID: 38736693 PMCID: PMC11087912 DOI: 10.1016/j.csbj.2024.04.053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/17/2024] [Accepted: 04/18/2024] [Indexed: 05/14/2024] Open
Abstract
Integrative analysis of multi-omics data has the potential to yield valuable and comprehensive insights into the molecular mechanisms underlying complex diseases such as cancer and Alzheimer's disease. However, a number of analytical challenges complicate multi-omics data integration. For instance, -omics data are usually high-dimensional, and sample sizes in multi-omics studies tend to be modest. Furthermore, when genes in an important pathway have relatively weak signal, it can be difficult to detect them individually. There is a growing body of literature on knowledge-guided learning methods that can address these challenges by incorporating biological knowledge such as functional genomics and functional proteomics into multi-omics data analysis. These methods have been shown to outperform their counterparts that do not utilize biological knowledge in tasks including prediction, feature selection, clustering, and dimension reduction. In this review, we survey recently developed methods and applications of knowledge-guided multi-omics data integration methods and discuss future research directions.
Affiliation(s)
- Wenrui Li
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, 19104, PA, USA
- Jenna Ballard
  - Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, 3700 Hamilton Walk, Philadelphia, 19104, PA, USA
- Yize Zhao
  - Department of Biostatistics, School of Public Health, Yale University, 60 College Street, New Haven, 06510, CT, USA
- Qi Long
  - Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, 423 Guardian Drive, Philadelphia, 19104, PA, USA
3
Hayes CN, Nakahara H, Ono A, Tsuge M, Oka S. From Omics to Multi-Omics: A Review of Advantages and Tradeoffs. Genes (Basel) 2024; 15:1551. [PMID: 39766818 PMCID: PMC11675490 DOI: 10.3390/genes15121551] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2024] [Revised: 11/25/2024] [Accepted: 11/28/2024] [Indexed: 01/11/2025] Open
Abstract
Bioinformatics is a rapidly evolving field charged with cataloging, disseminating, and analyzing biological data. Bioinformatics started with genomics, but while genomics focuses more narrowly on the genes comprising a genome, bioinformatics now encompasses a much broader range of omics technologies. Overcoming barriers of scale and effort that plagued earlier sequencing methods, bioinformatics adopted an ambitious strategy involving high-throughput and highly automated assays. However, as the list of omics technologies continues to grow, the field of bioinformatics has changed in two fundamental ways. Despite enormous success in expanding our understanding of the biological world, the failure of bulk methods to account for biologically important variability among cells of the same or different type has led to a major shift toward single-cell and spatially resolved omics methods, which attempt to disentangle the conflicting signals contained in heterogeneous samples by examining individual cells or cell clusters. The second major shift has been the attempt to integrate two or more different classes of omics data in a single multimodal analysis to identify patterns that bridge biological layers. For example, unraveling the cause of disease may reveal a metabolite deficiency caused by the failure of an enzyme to be phosphorylated because a gene is not expressed due to aberrant methylation as a result of a rare germline variant. Conclusions: There is a fine line between superficial understanding and analysis paralysis, but like a detective novel, multi-omics increasingly provides the clues we need, if only we are able to see them.
Affiliation(s)
- C. Nelson Hayes
  - Department of Gastroenterology, Graduate School of Biomedical & Health Sciences, Hiroshima University, Hiroshima 734-8551, Japan
- Hikaru Nakahara
  - Department of Clinical and Molecular Genetics, Hiroshima University, Hiroshima 734-8551, Japan
- Atsushi Ono
  - Department of Gastroenterology, Graduate School of Biomedical & Health Sciences, Hiroshima University, Hiroshima 734-8551, Japan
- Masataka Tsuge
  - Department of Gastroenterology, Graduate School of Biomedical & Health Sciences, Hiroshima University, Hiroshima 734-8551, Japan
  - Liver Center, Hiroshima University, Hiroshima 734-8551, Japan
- Shiro Oka
  - Department of Gastroenterology, Graduate School of Biomedical & Health Sciences, Hiroshima University, Hiroshima 734-8551, Japan
4
Kobel CM, Merkesvik J, Burgos IMT, Lai W, Øyås O, Pope PB, Hvidsten TR, Aho VTE. Integrating host and microbiome biology using holo-omics. Mol Omics 2024; 20:438-452. [PMID: 38963125 DOI: 10.1039/d4mo00017j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/05/2024]
Abstract
Holo-omics is the use of omics data to study a host and its inherent microbiomes - a biological system known as a "holobiont". A microbiome that exists in such a space often encounters habitat stability and in return provides metabolic capacities that can benefit its host. Here we present an overview of beneficial host-microbiome systems and propose and discuss several methodological frameworks that can be used to investigate the intricacies of the many as yet undefined host-microbiome interactions that influence holobiont homeostasis. While this is an emerging field, we anticipate that ongoing methodological advancements will enhance the biological resolution that is necessary to improve our understanding of host-microbiome interplay to make meaningful interpretations and biotechnological applications.
Affiliation(s)
- Carl M Kobel
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Jenny Merkesvik
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
- Wanxin Lai
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
- Ove Øyås
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Phillip B Pope
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
  - Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology (QUT), Translational Research Institute, Woolloongabba, Queensland, Australia
- Torgeir R Hvidsten
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
- Velma T E Aho
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
5
Jain S, Safo SE. DeepIDA-GRU: a deep learning pipeline for integrative discriminant analysis of cross-sectional and longitudinal multiview data with applications to inflammatory bowel disease classification. Brief Bioinform 2024; 25:bbae339. [PMID: 39007595 PMCID: PMC11771283 DOI: 10.1093/bib/bbae339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 02/29/2024] [Accepted: 06/28/2024] [Indexed: 07/16/2024] Open
Abstract
Biomedical research now commonly integrates diverse data types or views from the same individuals to better understand the pathobiology of complex diseases, but the challenge lies in meaningfully integrating these diverse views. Existing methods often require the same type of data from all views (cross-sectional data only or longitudinal data only) or do not consider any class outcome in the integration method, which presents limitations. To overcome these limitations, we have developed a pipeline that harnesses the power of statistical and deep learning methods to integrate cross-sectional and longitudinal data from multiple sources. In addition, it identifies key variables that contribute to the association between views and the separation between classes, providing deeper biological insights. This pipeline includes variable selection/ranking using linear and nonlinear methods, feature extraction using functional principal component analysis and Euler characteristics, and joint integration and classification using dense feed-forward networks for cross-sectional data and recurrent neural networks for longitudinal data. We applied this pipeline to cross-sectional and longitudinal multiomics data (metagenomics, transcriptomics and metabolomics) from an inflammatory bowel disease (IBD) study and identified microbial pathways, metabolites and genes that discriminate by IBD status, providing information on the etiology of IBD. We conducted simulations to compare the two feature extraction methods.
Affiliation(s)
- Sarthak Jain
  - Department of Electrical Engineering, University of Minnesota, Minneapolis, MN 55455, United States
- Sandra E Safo
  - Division of Biostatistics and Health Data Science, University of Minnesota, Minneapolis, MN 55455, United States
6
Mukherjee A, Kar I, Patra AK. Understanding anthelmintic resistance in livestock using "omics" approaches. Environ Sci Pollut Res Int 2023; 30:125439-125463. [PMID: 38015400 DOI: 10.1007/s11356-023-31045-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 11/08/2023] [Indexed: 11/29/2023]
Abstract
Widespread and improper use of various anthelmintics, together with genetic and epidemiological factors, has resulted in anthelmintic-resistant (AR) helminth populations in livestock. Such resistance is currently quite common globally in gastrointestinal nematode (GIN) infections of different livestock animals, including sheep, goats, and cattle. Therefore, the mechanisms underlying AR in parasitic worm species have been the subject of ample research to tackle this challenge. Current and emerging technologies in the disciplines of genomics, transcriptomics, metabolomics, and proteomics in livestock species have advanced the understanding of the intricate molecular AR mechanisms in many major parasites. These technologies have improved the identification of possible biomarkers of resistant parasites and the ability to find the causative genes, regulatory networks, and pathways governing AR development, including the dynamics of helminth infection and host-parasite infections. In this review, various "omics"-driven technologies including genome scan, candidate gene, quantitative trait loci, transcriptomic, proteomic, and metabolomic approaches are described to understand AR of parasites of veterinary importance. Challenges and future prospects of these "omics" approaches are also discussed.
Affiliation(s)
- Ayan Mukherjee
  - Department of Animal Biotechnology, West Bengal University of Animal and Fishery Sciences, Nadia, Mohanpur, West Bengal, India
- Indrajit Kar
  - Department of Avian Sciences, West Bengal University of Animal and Fishery Sciences, Nadia, Mohanpur, West Bengal, India
- Amlan Kumar Patra
  - American Institute for Goat Research, Langston University, Oklahoma, 73050, USA
7
Downing T, Angelopoulos N. A primer on correlation-based dimension reduction methods for multi-omics analysis. J R Soc Interface 2023; 20:20230344. [PMID: 37817584 PMCID: PMC10565429 DOI: 10.1098/rsif.2023.0344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 09/19/2023] [Indexed: 10/12/2023] Open
Abstract
The continuing advances of omic technologies mean that it is now more tangible to measure the numerous features collectively reflecting the molecular properties of a sample. When multiple omic methods are used, statistical and computational approaches can exploit these large, connected profiles. Multi-omics is the integration of different omic data sources from the same biological sample. In this review, we focus on correlation-based dimension reduction approaches for single omic datasets, followed by methods for pairs of omics datasets, before detailing further techniques for three or more omic datasets. We also briefly detail network methods when three or more omic datasets are available and which complement correlation-oriented tools. To aid readers new to this area, these are all linked to relevant R packages that can implement these procedures. Finally, we discuss scenarios of experimental design and present road maps that simplify the selection of appropriate analysis methods. This review will help researchers navigate emerging methods for multi-omics and integrating diverse omic datasets appropriately. This raises the opportunity of implementing population multi-omics with large sample sizes as omics technologies and our understanding improve.
Affiliation(s)
- Tim Downing
  - Pirbright Institute, Pirbright, Surrey, UK
  - Department of Biotechnology, Dublin City University, Dublin, Ireland
8
Arıkan M, Muth T. Integrated multi-omics analyses of microbial communities: a review of the current state and future directions. Mol Omics 2023; 19:607-623. [PMID: 37417894 DOI: 10.1039/d3mo00089c] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 07/08/2023]
Abstract
Integrated multi-omics analyses of microbiomes have become increasingly common in recent years as the emerging omics technologies provide an unprecedented opportunity to better understand the structural and functional properties of microbial communities. Consequently, there is a growing need for and interest in the concepts, approaches, considerations, and available tools for investigating diverse environmental and host-associated microbial communities in an integrative manner. In this review, we first provide a general overview of each omics analysis type, including a brief history, typical workflow, primary applications, strengths, and limitations. Then, we inform on both experimental design and bioinformatics analysis considerations in integrated multi-omics analyses, elaborate on the current approaches and commonly used tools, and highlight the current challenges. Finally, we discuss the expected key advances, emerging trends, potential implications on various fields from human health to biotechnology, and future directions.
Affiliation(s)
- Muzaffer Arıkan
  - Regenerative and Restorative Medicine Research Center (REMER), Research Institute for Health Sciences and Technologies (SABITA), Istanbul Medipol University, Istanbul, Turkey
  - Department of Medical Biology, Faculty of Medicine, Istanbul Medipol University, Istanbul, Turkey
- Thilo Muth
  - Section eScience (S.3), Federal Institute for Materials Research and Testing (BAM), Berlin, Germany
9
Gygi JP, Kleinstein SH, Guan L. Predictive overfitting in immunological applications: Pitfalls and solutions. Hum Vaccin Immunother 2023; 19:2251830. [PMID: 37697867 PMCID: PMC10498807 DOI: 10.1080/21645515.2023.2251830] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/01/2023] [Revised: 07/27/2023] [Accepted: 08/21/2023] [Indexed: 09/13/2023] Open
Abstract
Overfitting describes the phenomenon where a highly predictive model on the training data generalizes poorly to future observations. It is a common concern when applying machine learning techniques to contemporary medical applications, such as predicting vaccination response and disease status in infectious disease or cancer studies. This review examines the causes of overfitting and offers strategies to counteract it, focusing on model complexity reduction, reliable model evaluation, and harnessing data diversity. Through discussion of the underlying mathematical models and illustrative examples using both synthetic data and published real datasets, our objective is to equip analysts and bioinformaticians with the knowledge and tools necessary to detect and mitigate overfitting in their research.
Affiliation(s)
- Jeremy P. Gygi
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA
- Steven H. Kleinstein
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA
  - Department of Pathology, Yale School of Medicine, New Haven, CT, USA
  - Department of Immunobiology, Yale School of Medicine, New Haven, CT, USA
- Leying Guan
  - Program in Computational Biology & Bioinformatics, Yale University, New Haven, CT, USA
  - Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
10
Sameh M, Khalaf HM, Anwar AM, Osama A, Ahmed EA, Mahgoub S, Ezzeldin S, Tanios A, Alfishawy M, Said AF, Mohamed MS, Sayed AA, Magdeldin S. Integrated multiomics analysis to infer COVID-19 biological insights. Sci Rep 2023; 13:1802. [PMID: 36720931 PMCID: PMC9888750 DOI: 10.1038/s41598-023-28816-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/15/2022] [Accepted: 01/25/2023] [Indexed: 02/02/2023] Open
Abstract
Three years after the pandemic, we still have an imprecise comprehension of the pathogen landscape and we are left with an urgent need for early detection methods and effective therapy for severe COVID-19 patients. The implications of infection go beyond pulmonary damage since the virus hijacks the host's cellular machinery and consumes its resources. Here, we profiled the plasma proteome and metabolome of a cohort of 57 control and severe COVID-19 cases using high-resolution mass spectrometry. We analyzed their proteome and metabolome profiles at multiple depths and with multiple methodologies, including conventional single-omics analysis and multi-omics integrative methods, to identify the most comprehensive method for portraying an in-depth molecular landscape of the disease. Our findings revealed that integrating the knowledge-based and statistical-based techniques (knowledge-statistical network) outperformed other methods not only at the pathway detection level but also in the number of features detected within pathways. The versatile usage of this approach could provide us with a better understanding of the molecular mechanisms behind any biological system and provide multi-dimensional therapeutic solutions by simultaneously targeting more than one pathogenic factor.
Affiliation(s)
- Mahmoud Sameh
  - Basic Research Department, Proteomics and Metabolomics Research Program, Children's Cancer Hospital 57357 (CCHE-57357), Cairo, Egypt
- Hossam M Khalaf
  - Intensive Care Unit, As-Salam International Hospital, Cairo, Egypt
- Ali Mostafa Anwar
  - Basic Research Department, Proteomics and Metabolomics Research Program, Children's Cancer Hospital 57357 (CCHE-57357), Cairo, Egypt
- Aya Osama
  - Basic Research Department, Proteomics and Metabolomics Research Program, Children's Cancer Hospital 57357 (CCHE-57357), Cairo, Egypt
- Eman Ali Ahmed
  - Basic Research Department, Proteomics and Metabolomics Research Program, Children's Cancer Hospital 57357 (CCHE-57357), Cairo, Egypt
  - Department of Pharmacology, Faculty of Veterinary Medicine, Suez Canal University, Ismailia, 41522, Egypt
- Sebaey Mahgoub
  - Basic Research Department, Proteomics and Metabolomics Research Program, Children's Cancer Hospital 57357 (CCHE-57357), Cairo, Egypt
- Shahd Ezzeldin
  - Basic Research Department, Proteomics and Metabolomics Research Program, Children's Cancer Hospital 57357 (CCHE-57357), Cairo, Egypt
- Anthony Tanios
  - Basic Research Department, Proteomics and Metabolomics Research Program, Children's Cancer Hospital 57357 (CCHE-57357), Cairo, Egypt
- Mostafa Alfishawy
  - Infectious Diseases Consultants and Academic Researchers of Egypt (IDCARE), Cairo, Egypt
  - Alazhar Center for Allergy and Immunology, Cairo, Egypt
- Azza Farag Said
  - Department of Pulmonary Medicine, Faculty of Medicine, Minia University, Minia, Egypt
- Maged Salah Mohamed
  - Department of Anesthesia and Intensive Care, Kasr Al Ainy, Cairo University, Cairo, Egypt
- Ahmed A Sayed
  - Department of Basic Research, Genomics Program, Children's Cancer Hospital 57357, Cairo, Egypt
  - Department of Biochemistry, Faculty of Science, Ain Shams University, Cairo, Egypt
- Sameh Magdeldin
  - Basic Research Department, Proteomics and Metabolomics Research Program, Children's Cancer Hospital 57357 (CCHE-57357), Cairo, Egypt
  - Department of Physiology, Faculty of Veterinary Medicine, Suez Canal University, Ismailia, Egypt
11
Chen H, Caffo B, Stein-O’Brien G, Liu J, Langmead B, Colantuoni C, Xiao L. Two-stage linked component analysis for joint decomposition of multiple biologically related data sets. Biostatistics 2022; 23:1200-1217. [PMID: 35358296 PMCID: PMC9566367 DOI: 10.1093/biostatistics/kxac005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/31/2021] [Revised: 01/13/2022] [Accepted: 01/24/2022] [Indexed: 02/03/2023] Open
Abstract
Integrative analysis of multiple data sets has the potential of fully leveraging the vast amount of high throughput biological data being generated. In particular such analysis will be powerful in making inference from publicly available collections of genetic, transcriptomic and epigenetic data sets which are designed to study shared biological processes, but which vary in their target measurements, biological variation, unwanted noise, and batch variation. Thus, methods that enable the joint analysis of multiple data sets are needed to gain insights into shared biological processes that would otherwise be hidden by unwanted intra-data set variation. Here, we propose a method called two-stage linked component analysis (2s-LCA) to jointly decompose multiple biologically related experimental data sets with biological and technological relationships that can be structured into the decomposition. The consistency of the proposed method is established and its empirical performance is evaluated via simulation studies. We apply 2s-LCA to jointly analyze four data sets focused on human brain development and identify meaningful patterns of gene expression in human neurogenesis that have shared structure across these data sets.
Affiliation(s)
- Huan Chen
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Brian Caffo
  - Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, 21205, USA
- Jinrui Liu
  - Department of Neurology, Johns Hopkins University, Baltimore, MD, 21287, USA
- Ben Langmead
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, USA
- Carlo Colantuoni
  - Department of Neuroscience, Johns Hopkins University, Baltimore, MD, 21205, USA
  - Department of Neurology, Johns Hopkins University, Baltimore, MD, 21287, USA
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, 21201, USA
- Luo Xiao
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, 27607, USA
12
Vahabi N, Michailidis G. Unsupervised Multi-Omics Data Integration Methods: A Comprehensive Review. Front Genet 2022; 13:854752. [PMID: 35391796 PMCID: PMC8981526 DOI: 10.3389/fgene.2022.854752] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 01/14/2022] [Accepted: 02/28/2022] [Indexed: 12/26/2022] Open
Abstract
Through the developments of Omics technologies and dissemination of large-scale datasets, such as those from The Cancer Genome Atlas, Alzheimer’s Disease Neuroimaging Initiative, and Genotype-Tissue Expression, it is becoming increasingly possible to study complex biological processes and disease mechanisms more holistically. However, to obtain a comprehensive view of these complex systems, it is crucial to integrate data across various Omics modalities, and also leverage external knowledge available in biological databases. This review aims to provide an overview of multi-Omics data integration methods with different statistical approaches, focusing on unsupervised learning tasks, including disease onset prediction, biomarker discovery, disease subtyping, module discovery, and network/pathway analysis. We also briefly review feature selection methods, multi-Omics data sets, and resources/tools that constitute critical components for carrying out the integration.
Affiliation(s)
- Nasim Vahabi
  - Informatics Institute, University of Florida, Gainesville, FL, United States
- George Michailidis
  - Informatics Institute, University of Florida, Gainesville, FL, United States
13
Ganugi P, Fiorini A, Ardenti F, Caffi T, Bonini P, Taskin E, Puglisi E, Tabaglio V, Trevisan M, Lucini L. Nitrogen use efficiency, rhizosphere bacterial community, and root metabolome reprogramming due to maize seed treatment with microbial biostimulants. Physiol Plant 2022; 174:e13679. [PMID: 35362106 PMCID: PMC9324912 DOI: 10.1111/ppl.13679] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/02/2021] [Revised: 02/26/2022] [Accepted: 03/25/2022] [Indexed: 06/14/2023]
Abstract
Seed inoculation with beneficial microorganisms has gained importance as it has been proven to show biostimulant activity in plants, especially in terms of abiotic/biotic stress tolerance and plant growth promotion, representing a sustainable way to ensure yield stability under low-input sustainable agriculture. Nevertheless, limited knowledge is available concerning the molecular and physiological processes underlying the root-inoculant symbiosis or plant response at the root system level. Our work aimed to integrate the interrelationship between agronomic traits, rhizosphere microbial population and metabolic processes in roots, following seed treatment with either arbuscular mycorrhizal fungi (AMF) or Plant Growth-Promoting Rhizobacteria (PGPR). To this aim, maize was grown under open field conditions with either optimal or reduced nitrogen availability. Both seed treatments increased nitrogen uptake efficiency under reduced nitrogen supply and revealed some microbial community changes among treatments at the root microbiome level and limited yield increases, while significant changes could be observed at the metabolome level. Amino acid, lipid, flavone, lignan, and phenylpropanoid concentrations were mostly modulated. Integrative analysis of multi-omics datasets (Multiple Co-Inertia Analysis) highlighted a strong correlation between the metagenomics and the untargeted metabolomics datasets, suggesting a coordinated modulation of root physiological traits.
Collapse
Affiliation(s)
- Paola Ganugi
- Department for Sustainable Food Process, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Andrea Fiorini
- Department of Sustainable Crop Production, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Federico Ardenti
- Department of Sustainable Crop Production, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Tito Caffi
- Department of Sustainable Crop Production, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Eren Taskin
- Department for Sustainable Food Process, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Edoardo Puglisi
- Department for Sustainable Food Process, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Vincenzo Tabaglio
- Department of Sustainable Crop Production, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Marco Trevisan
- Department for Sustainable Food Process, Università Cattolica del Sacro Cuore, Piacenza, Italy
- Luigi Lucini
- Department for Sustainable Food Process, Università Cattolica del Sacro Cuore, Piacenza, Italy
|
14
|
Santiago-Rodriguez TM, Hollister EB. Multi 'omic data integration: A review of concepts, considerations, and approaches. Semin Perinatol 2021; 45:151456. [PMID: 34256961 DOI: 10.1016/j.semperi.2021.151456] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Indexed: 02/07/2023]
Abstract
The application of 'omic techniques including, but not limited to, genomics/metagenomics, transcriptomics/meta-transcriptomics, proteomics/meta-proteomics, and metabolomics to generate multiple datasets from a single sample has facilitated hypothesis generation leading to the identification of biological, molecular and ecological functions and mechanisms, as well as associations and correlations. Despite their power and promise, a variety of challenges must be considered in the successful design and execution of a multi-omics study. In this review, various 'omic technologies applicable to single- and meta-organisms (i.e., host + microbiome) are described, and considerations for sample collection, storage and processing prior to data generation and analysis, as well as approaches to data storage, dissemination and analysis, are discussed. Finally, case studies are included as examples of multi-omic applications providing novel insights and a more holistic understanding of biological processes.
Affiliation(s)
- Emily B Hollister
- Diversigen, Inc, 3 Greenway Plaza, Suite 1575, Houston, TX 77046, USA.
|
15
|
Vahabi N, McDonough CW, Desai AA, Cavallari LH, Duarte JD, Michailidis G. Cox-sMBPLS: An Algorithm for Disease Survival Prediction and Multi-Omics Module Discovery Incorporating Cis-Regulatory Quantitative Effects. Front Genet 2021; 12:701405. [PMID: 34408773 PMCID: PMC8366414 DOI: 10.3389/fgene.2021.701405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2021] [Accepted: 07/07/2021] [Indexed: 12/03/2022] Open
Abstract
Background: The development of high-throughput techniques has enabled profiling a large number of biomolecules across a number of molecular compartments. The challenge then becomes to integrate such multimodal Omics data to gain insights into biological processes and disease onset and progression mechanisms. Further, given the high dimensionality of such data, incorporating prior biological information on interactions between molecular compartments when developing statistical models for data integration is beneficial, especially in settings involving a small number of samples. Results: We develop a supervised model for time-to-event data (e.g., death, biochemical recurrence) that simultaneously accounts for redundant information within Omics profiles and leverages prior biological associations between them through a multi-block PLS framework. The interactions between data from different molecular compartments (e.g., epigenome, transcriptome, methylome) were captured by using cis-regulatory quantitative effects in the proposed model. The model, coined Cox-sMBPLS, exhibits superior prediction performance and improved feature selection based on both simulation studies and analysis of data from heart failure patients. Conclusion: The proposed supervised Cox-sMBPLS model can effectively incorporate prior biological information in the survival prediction system, leading to improved prediction performance and feature selection. It also enables the identification of multi-Omics modules of biomolecules that impact the patients’ survival probability and provides insights into potentially relevant risk factors that merit further investigation.
Affiliation(s)
- Nasim Vahabi
- Informatics Institute, University of Florida, Gainesville, FL, United States
- Caitrin W McDonough
- Department of Pharmacotherapy and Translational Research, Center for Pharmacogenomics and Precision Medicine, University of Florida, Gainesville, FL, United States
- Ankit A Desai
- Department of Medicine, Indiana University, Indianapolis, IN, United States
- Larisa H Cavallari
- Department of Pharmacotherapy and Translational Research, Center for Pharmacogenomics and Precision Medicine, University of Florida, Gainesville, FL, United States
- Julio D Duarte
- Department of Pharmacotherapy and Translational Research, Center for Pharmacogenomics and Precision Medicine, University of Florida, Gainesville, FL, United States
- George Michailidis
- Informatics Institute, University of Florida, Gainesville, FL, United States
|