1
Amirmahani F, Ebrahimi N, Molaei F, Faghihkhorasani F, Jamshidi Goharrizi K, Mirtaghi SM, Borjian‐Boroujeni M, Hamblin MR. Approaches for the integration of big data in translational medicine: single‐cell and computational methods. Ann N Y Acad Sci 2021; 1493:3-28. [DOI: 10.1111/nyas.14544] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/23/2020] [Revised: 10/31/2020] [Accepted: 11/12/2020] [Indexed: 12/11/2022]
Affiliation(s)
- Farzane Amirmahani
  - Genetics Division, Department of Cell and Molecular Biology and Microbiology, Faculty of Science and Technology, University of Isfahan, Isfahan, Iran
- Nasim Ebrahimi
  - Genetics Division, Department of Cell and Molecular Biology and Microbiology, Faculty of Science and Technology, University of Isfahan, Isfahan, Iran
- Fatemeh Molaei
  - Department of Anesthesiology, Faculty of Paramedical, Jahrom University of Medical Sciences, Jahrom, Iran
- Michael R. Hamblin
  - Laser Research Centre, Faculty of Health Science, University of Johannesburg, South Africa
2
Stanstrup J, Broeckling CD, Helmus R, Hoffmann N, Mathé E, Naake T, Nicolotti L, Peters K, Rainer J, Salek RM, Schulze T, Schymanski EL, Stravs MA, Thévenot EA, Treutler H, Weber RJM, Willighagen E, Witting M, Neumann S. The metaRbolomics Toolbox in Bioconductor and beyond. Metabolites 2019; 9:E200. [PMID: 31548506 PMCID: PMC6835268 DOI: 10.3390/metabo9100200] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 08/11/2019] [Revised: 09/16/2019] [Accepted: 09/17/2019] [Indexed: 11/17/2022] Open
Abstract
Metabolomics aims to measure and characterise the complex composition of metabolites in a biological system. Metabolomics studies involve sophisticated analytical techniques such as mass spectrometry and nuclear magnetic resonance spectroscopy, and generate large amounts of high-dimensional and complex experimental data. Open source processing and analysis tools are of major interest in light of innovative, open and reproducible science. The scientific community has developed a wide range of open source software, providing freely available advanced processing and analysis approaches. The programming and statistics environment R has emerged as one of the most popular environments to process and analyse Metabolomics datasets. A major benefit of such an environment is the possibility of connecting different tools into more complex workflows. Combining reusable data processing R scripts with the experimental data thus allows for open, reproducible research. This review provides an extensive overview of existing packages in R for different steps in a typical computational metabolomics workflow, including data processing, biostatistics, metabolite annotation and identification, and biochemical network and pathway analysis. Multifunctional workflows, possible user interfaces and integration into workflow management systems are also reviewed. In total, this review summarises more than two hundred metabolomics specific packages primarily available on CRAN, Bioconductor and GitHub.
Affiliation(s)
- Jan Stanstrup
  - Preventive and Clinical Nutrition, University of Copenhagen, Rolighedsvej 30, 1958 Frederiksberg C, Denmark.
- Corey D Broeckling
  - Proteomics and Metabolomics Facility, Colorado State University, Fort Collins, CO 80523, USA.
- Rick Helmus
  - Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, 1098 XH Amsterdam, The Netherlands.
- Nils Hoffmann
  - Leibniz-Institut für Analytische Wissenschaften-ISAS-e.V., Otto-Hahn-Straße 6b, 44227 Dortmund, Germany.
- Ewy Mathé
  - Department of Biomedical Informatics, College of Medicine, The Ohio State University, Columbus, OH 43210, USA.
- Thomas Naake
  - Max Planck Institute of Molecular Plant Physiology, 14476 Potsdam-Golm, Germany.
- Luca Nicolotti
  - The Australian Wine Research Institute, Metabolomics Australia, PO Box 197, Adelaide SA 5064, Australia.
- Kristian Peters
  - Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
- Johannes Rainer
  - Institute for Biomedicine, Eurac Research, Affiliated Institute of the University of Lübeck, 39100 Bolzano, Italy.
- Reza M Salek
  - The International Agency for Research on Cancer, 150 cours Albert Thomas, CEDEX 08, 69372 Lyon, France.
- Tobias Schulze
  - Department of Effect-Directed Analysis, Helmholtz Centre for Environmental Research-UFZ, Permoserstraße 15, 04318 Leipzig, Germany.
- Emma L Schymanski
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, 6 avenue du Swing, L-4367 Belvaux, Luxembourg.
- Michael A Stravs
  - Eawag, Swiss Federal Institute of Aquatic Science and Technology, Überlandstrasse 133, 8600 Dubendorf, Switzerland.
- Etienne A Thévenot
  - CEA, LIST, Laboratory for Data Sciences and Decision, MetaboHUB, Gif-Sur-Yvette F-91191, France.
- Hendrik Treutler
  - Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
- Ralf J M Weber
  - Phenome Centre Birmingham and School of Biosciences, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK.
- Egon Willighagen
  - Department of Bioinformatics-BiGCaT, NUTRIM, Maastricht University, 6229 ER Maastricht, The Netherlands.
- Michael Witting
  - Research Unit Analytical BioGeoChemistry, Helmholtz Zentrum München, 85764 Neuherberg, Germany.
  - Chair of Analytical Food Chemistry, Technische Universität München, 85354 Weihenstephan, Germany.
- Steffen Neumann
  - Leibniz Institute of Plant Biochemistry (IPB Halle), Bioinformatics and Scientific Data, 06120 Halle, Germany.
  - German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Deutscher Platz 5e, 04103 Leipzig, Germany.
3
Gu W, Yildirimman R, Van der Stuyft E, Verbeeck D, Herzinger S, Satagopam V, Barbosa-Silva A, Schneider R, Lange B, Lehrach H, Guo Y, Henderson D, Rowe A. Data and knowledge management in translational research: implementation of the eTRIKS platform for the IMI OncoTrack consortium. BMC Bioinformatics 2019; 20:164. [PMID: 30935364 PMCID: PMC6444691 DOI: 10.1186/s12859-019-2748-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/28/2018] [Accepted: 03/18/2019] [Indexed: 01/04/2023] Open
Abstract
Background: For large international research consortia, such as those funded by the European Union’s Horizon 2020 programme or the Innovative Medicines Initiative, good data coordination practices and tools are essential for the successful collection, organization and analysis of the resulting data. Research consortia are attempting ever more ambitious science to better understand disease, by leveraging technologies such as whole genome sequencing, proteomics, patient-derived biological models and computer-based systems biology simulations.
Results: The IMI eTRIKS consortium is charged with the task of developing an integrated knowledge management platform capable of supporting the complexity of the data generated by such research programmes. In this paper, using the example of the OncoTrack consortium, we describe a typical use case in translational medicine. The tranSMART knowledge management platform was implemented to support data from observational clinical cohorts, drug response data from cell culture models and drug response data from mouse xenograft tumour models. The high dimensional (omics) data from the molecular analyses of the corresponding biological materials were linked to these collections, so that users could browse and analyse these to derive candidate biomarkers.
Conclusions: In all these steps, data mapping, linking and preparation are handled automatically by the tranSMART integration platform. Therefore, researchers without specialist data handling skills can focus directly on the scientific questions, without spending undue effort on processing the data and data integration, which are otherwise a burden and the most time-consuming part of translational research data analysis.
Electronic supplementary material: The online version of this article (10.1186/s12859-019-2748-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Wei Gu
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Sascha Herzinger
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Venkata Satagopam
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Adriano Barbosa-Silva
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Reinhard Schneider
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Bodo Lange
  - Alacris Theranostics GmbH, Berlin, Germany
- Hans Lehrach
  - Alacris Theranostics GmbH, Berlin, Germany
  - Max Planck Institute for Molecular Genetics, Berlin, Germany
  - Dahlem Centre for Genome Research and Medical Systems Biology, Berlin, Germany
- Yike Guo
  - Data Science Institute, Imperial College London, London, UK
- Anthony Rowe
  - Janssen Research and Development Ltd, High Wycombe, UK
4
Graze RM, Tzeng RY, Howard TS, Arbeitman MN. Perturbation of IIS/TOR signaling alters the landscape of sex-differential gene expression in Drosophila. BMC Genomics 2018; 19:893. [PMID: 30526477 PMCID: PMC6288939 DOI: 10.1186/s12864-018-5308-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 04/05/2018] [Accepted: 11/23/2018] [Indexed: 12/15/2022] Open
Abstract
Background: The core functions of the insulin/insulin-like signaling and target of rapamycin (IIS/TOR) pathway are nutrient sensing, energy homeostasis, growth, and regulation of stress responses. This pathway is also known to interact directly and indirectly with the sex determination regulatory hierarchy. The IIS/TOR pathway plays a role in directing sexually dimorphic traits, including dimorphism of growth, metabolism, stress and behavior. Previous studies of sexually dimorphic gene expression in the adult head, which includes both nervous system and endocrine tissues, have revealed variation in sex-differential expression, depending in part on genotype and environment. To understand the degree to which the environmentally responsive insulin signaling pathway contributes to sexual dimorphism of gene expression, we examined the effect of perturbation of the pathway on gene expression in male and female Drosophila heads.
Results: Our data reveal a large effect of insulin signaling on gene expression, with greater than 50% of genes examined changing expression. Males and females have a shared gene expression response to knock-down of InR function, with significant enrichment for pathways involved in metabolism. Perturbation of insulin signaling has a greater impact on gene expression in males, with more genes changing expression and with gene expression differences of larger magnitude. Primarily as a consequence of the response in males, we find that reduced insulin signaling results in a striking increase in sex-differential expression. This includes sex-differences in expression of immune, defense and stress response genes, genes involved in modulating reproductive behavior, genes linking insulin signaling and ageing, and in the insulin signaling pathway itself.
Conclusions: Our results demonstrate that perturbation of insulin signaling results in thousands of genes displaying sex differences in expression that are not differentially expressed in control conditions. Thus, insulin signaling may play a role in variability of somatic, sex-differential expression. The finding that perturbation of the IIS/TOR pathway results in an altered landscape of sex-differential expression suggests a role of insulin signaling in the physiological underpinnings of trade-offs, sexual conflict and sex differences in expression variability.
Electronic supplementary material: The online version of this article (10.1186/s12864-018-5308-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Rita M Graze
  - Department of Biological Sciences, Auburn University, 101 Rouse Life Sciences building, Auburn, AL, 36849-5407, USA
- Ruei-Ying Tzeng
  - Biomedical Sciences Department, Florida State University, College of Medicine, 1115 West Call Street, Tallahassee, FL, 32306, USA
- Tiffany S Howard
  - Department of Biological Sciences, Auburn University, 101 Rouse Life Sciences building, Auburn, AL, 36849-5407, USA
- Michelle N Arbeitman
  - Biomedical Sciences Department, Florida State University, College of Medicine, 1115 West Call Street, Tallahassee, FL, 32306, USA
5
Zeng ISL, Lumley T. Review of Statistical Learning Methods in Integrated Omics Studies (An Integrated Information Science). Bioinform Biol Insights 2018; 12:1177932218759292. [PMID: 29497285 PMCID: PMC5824897 DOI: 10.1177/1177932218759292] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 10/29/2017] [Accepted: 01/24/2018] [Indexed: 12/14/2022] Open
Abstract
Integrated omics is becoming a new channel for investigating the complex molecular system in modern biological science and sets a foundation for systematic learning for precision medicine. The statistical/machine learning methods that have emerged in the past decade for integrated omics are not only innovative but also multidisciplinary, integrating knowledge from biology, medicine, statistics, machine learning, and artificial intelligence. Here, we review the nontrivial classes of learning methods from the statistical aspects and streamline these learning methods within the statistical learning framework. The intriguing findings from the review are that the methods used are generalizable to other disciplines with complex systematic structure, and that integrated omics is part of an integrated information science which has collated and integrated different types of information for inferences and decision making. We review the statistical learning methods of exploratory and supervised learning from 42 publications. We also discuss the strengths and limitations of the extended principal component analysis, cluster analysis, network analysis, and regression methods. Statistical techniques such as penalization for sparsity induction when there are fewer observations than the number of features, and using a Bayesian approach when there is prior knowledge to be integrated, are also included in the commentary. For completeness, a table of currently available software and packages for omics from 23 publications is summarized in the appendix.
Affiliation(s)
- Irene Sui Lan Zeng
  - Department of Statistics, Faculty of Science, The University of Auckland, Auckland, New Zealand
- Thomas Lumley
  - Department of Statistics, Faculty of Science, The University of Auckland, Auckland, New Zealand
6
Abstract
Data processing and analysis are major bottlenecks in high-throughput metabolomic experiments. Recent advancements in data acquisition platforms are driving trends toward increasing data size (e.g., petabyte scale) and complexity (multiple omic platforms). Improvements in data analysis software and in silico methods are similarly required to effectively utilize these advancements and link the acquired data with biological interpretations. Herein, we provide an overview of recently developed and freely available metabolomic tools, algorithms, databases, and data analysis frameworks. This overview of popular tools for MS and NMR-based metabolomics is organized into the following sections: data processing, annotation, analysis, and visualization. The following overview of newly developed tools helps to better inform researchers to support the emergence of metabolomics as an integral tool for the study of biochemistry, systems biology, environmental analysis, health, and personalized medicine.
Affiliation(s)
- Biswapriya B Misra
  - Department of Genetics, Texas Biomedical Research Institute, San Antonio, TX, USA
- Johannes F Fahrmann
  - Department of Clinical Cancer Prevention, University of Texas MD Anderson Cancer Center, TX, USA
7
Using the Semantic Web for Rapid Integration of WikiPathways with Other Biological Online Data Resources. PLoS Comput Biol 2016; 12:e1004989. [PMID: 27336457 PMCID: PMC4918977 DOI: 10.1371/journal.pcbi.1004989] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 02/15/2016] [Accepted: 05/17/2016] [Indexed: 12/21/2022] Open
Abstract
The diversity of online resources storing biological data in different formats provides a challenge for bioinformaticians to integrate and analyse their biological data. The semantic web provides a standard to facilitate knowledge integration using statements built as triples describing a relation between two objects. WikiPathways, an online collaborative pathway resource, is now available in the semantic web through a SPARQL endpoint at http://sparql.wikipathways.org. Having biological pathways in the semantic web allows rapid integration with data from other resources that contain information about elements present in pathways using SPARQL queries. In order to convert WikiPathways content into meaningful triples we developed two new vocabularies that capture the graphical representation and the pathway logic, respectively. Each gene, protein, and metabolite in a given pathway is defined with a standard set of identifiers to support linking to several other biological resources in the semantic web. WikiPathways triples were loaded into the Open PHACTS discovery platform and are available through its Web API (https://dev.openphacts.org/docs) to be used in various tools for drug development. We combined various semantic web resources with the newly converted WikiPathways content using a variety of SPARQL query types and third-party resources, such as the Open PHACTS API. The ability to use pathway information to form new links across diverse biological data highlights the utility of integrating WikiPathways in the semantic web.
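The triple model mentioned in this abstract can be illustrated with a minimal, self-contained sketch. It is not the WikiPathways vocabulary or its SPARQL endpoint: the `ex:` identifiers, the predicate names, the toy data, and the helper functions `match` and `labels_in_pathway` are all hypothetical and chosen only for illustration. The sketch shows how statements stored as (subject, predicate, object) triples can be joined to answer a question such as "which labelled genes belong to a given pathway", which is the kind of pattern a SPARQL query expresses.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical toy data; these identifiers are NOT real WikiPathways terms.
TRIPLES: List[Triple] = [
    ("ex:pathway1", "ex:hasTitle", "Glycolysis"),
    ("ex:gene1", "ex:isPartOf", "ex:pathway1"),
    ("ex:gene2", "ex:isPartOf", "ex:pathway1"),
    ("ex:gene1", "ex:hasLabel", "HK1"),
    ("ex:gene2", "ex:hasLabel", "PFKM"),
    ("ex:gene3", "ex:isPartOf", "ex:pathway2"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching a pattern; None acts as a wildcard."""
    for ts, tp, to in triples:
        if (s is None or s == ts) and (p is None or p == tp) and (o is None or o == to):
            yield (ts, tp, to)


def labels_in_pathway(triples: List[Triple], pathway: str) -> List[str]:
    """Join two patterns: members of the pathway, then each member's label."""
    members = [s for s, _, _ in match(triples, p="ex:isPartOf", o=pathway)]
    labels: List[str] = []
    for member in members:
        labels.extend(o for _, _, o in match(triples, s=member, p="ex:hasLabel"))
    return sorted(labels)


if __name__ == "__main__":
    print(labels_in_pathway(TRIPLES, "ex:pathway1"))  # ['HK1', 'PFKM']
```

In a real setting the same join would be written as a SPARQL query with two triple patterns sharing a variable and sent to an endpoint; the in-memory version above only mirrors that logic so it can run without network access.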
8
Abstract
Allosteric effects of mutations, ligand binding, or post-translational modifications on protein function occur through changes to the protein's shape, or conformation. In a cell, there are many copies of the same protein, all experiencing these perturbations in a dynamic fashion and fluctuating through different conformations and activity states. According to the "conformational selection and population shift" theory, ligand binding selects a particular conformation. This perturbs the ensemble and induces a population shift. In a new PLOS Biology paper, Melacini and colleagues describe a novel model of protein regulation, the "Double-Conformational Selection Model", which demonstrates how two tandem ligand-binding domains interact to regulate protein function. Here we explain how tandem domains with tuned interactions, but not single domains, can provide a blueprint for sensitive activation sensors within a narrow window of ligand concentration, thereby promoting signaling control.
Affiliation(s)
- Ruth Nussinov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick, Maryland, United States of America
  - Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
  - Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
- Chung-Jung Tsai
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick, Maryland, United States of America
  - Frederick National Laboratory for Cancer Research, National Cancer Institute, Frederick, Maryland, United States of America
9
Kutmon M, Riutta A, Nunes N, Hanspers K, Willighagen EL, Bohler A, Mélius J, Waagmeester A, Sinha SR, Miller R, Coort SL, Cirillo E, Smeets B, Evelo CT, Pico AR. WikiPathways: capturing the full diversity of pathway knowledge. Nucleic Acids Res 2015; 44:D488-94. [PMID: 26481357 PMCID: PMC4702772 DOI: 10.1093/nar/gkv1024] [Citation(s) in RCA: 292] [Impact Index Per Article: 32.4] [Received: 09/16/2015] [Accepted: 09/28/2015] [Indexed: 12/19/2022] Open
Abstract
WikiPathways (http://www.wikipathways.org) is an open, collaborative platform for capturing and disseminating models of biological pathways for data visualization and analysis. Since our last NAR update, 4 years ago, WikiPathways has experienced massive growth in content, which continues to be contributed by hundreds of individuals each year. New aspects of the diversity and depth of the collected pathways are described from the perspective of researchers interested in using pathway information in their studies. We provide updates on extensions and services to support pathway analysis and visualization via popular standalone tools, i.e. PathVisio and Cytoscape, web applications and common programming environments. We introduce the Quick Edit feature for pathway authors and curators, in addition to new means of publishing pathways and maintaining custom pathway collections to serve specific research topics and communities. In addition to the latest milestones in our pathway collection and curation effort, we also highlight the latest means to access the content as publishable figures, as standard data files, and as linked data, including bulk and programmatic access.
Affiliation(s)
- Martina Kutmon
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Anders Riutta
  - Gladstone Institutes, San Francisco, California, CA 94158, USA
- Nuno Nunes
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Egon L Willighagen
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Anwesha Bohler
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Jonathan Mélius
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Andra Waagmeester
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
  - Micelio, Antwerp, 2180 Antwerp, Belgium
- Sravanthi R Sinha
  - Keshav Memorial Institute of Technology, Hyderabad, Telangana 500029, India
- Ryan Miller
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Susan L Coort
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Elisa Cirillo
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Bart Smeets
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Chris T Evelo
  - Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands
- Alexander R Pico
  - Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, 6229 ER Maastricht, The Netherlands