1
|
Fresnais L, Perin O, Riu A, Grall R, Ott A, Fromenty B, Gallardo JC, Stingl M, Frainay C, Jourdan F, Poupin N. A strategy to detect metabolic changes induced by exposure to chemicals from large sets of condition-specific metabolic models computed with enumeration techniques. BMC Bioinformatics 2024; 25:234. [PMID: 38992584 PMCID: PMC11238488 DOI: 10.1186/s12859-024-05845-z] [Received: 09/28/2023] [Accepted: 06/14/2024] [Indexed: 07/13/2024] Open
Abstract
BACKGROUND The growing abundance of in vitro omics data, coupled with the need to reduce animal testing in the safety assessment of chemical compounds, and to eliminate it altogether in the evaluation of cosmetics, highlights the need for adequate computational methodologies. Data from omics technologies allow the exploration of a wide range of biological processes and therefore provide a better understanding of the mechanisms of action (MoA) related to chemical exposure in biological systems. However, analyzing these large datasets remains difficult because the modulations they capture span multiple biological processes.
RESULTS To address this, we propose a strategy to reduce information overload by computing, from transcriptomics data, a comprehensive metabolic sub-network reflecting the metabolic impact of a chemical. The proposed strategy integrates transcriptomic data into a genome-scale metabolic network through the enumeration of condition-specific metabolic models, thereby translating transcriptomics data into reaction activity probabilities. Based on these results, a graph algorithm is applied to retrieve user-readable sub-networks reflecting the possible metabolic MoA (mMoA) of chemicals. This strategy has been implemented as a three-step workflow. The first step consists of building cell condition-specific models reflecting the metabolic impact of each exposure condition, while accounting for the diversity of possible optimal solutions with a partial enumeration algorithm. In the second step, we address the challenge of analyzing thousands of enumerated condition-specific networks by computing differentially activated reactions (DARs) between the two sets of enumerated possible condition-specific models. Finally, in the third step, DARs are grouped into clusters of functionally interconnected metabolic reactions, representing possible mMoA, using a distance-based clustering and subnetwork extraction method.
The first part of the workflow was exemplified on eight molecules selected for their known human hepatotoxic outcomes, which are associated with specific MoAs well described in the literature, and for which we retrieved primary human hepatocyte transcriptomic data from Open TG-GATEs. We then applied the strategy further to model and visualize more precisely the mMoA associated with two of these eight molecules (amiodarone and valproic acid). The approach proved to go beyond gene-based analysis by identifying mMoA both when few genes were significantly differentially expressed (2 differentially expressed genes (DEGs) for amiodarone), by bringing additional information from the network topology, and when a very large number of genes were differentially expressed (5709 DEGs for valproic acid). In both cases, the results of our strategy fit well with evidence from the literature regarding known MoA. Beyond these confirmations, the workflow highlighted other potential, unexplored mMoA.
CONCLUSION The proposed strategy allows toxicology experts to decipher which part of cellular metabolism is expected to be affected by exposure to a given chemical. The originality of the approach lies in its combination of different metabolic modelling approaches (constraint-based and graph modelling). The application to two model molecules shows the strong potential of the approach for the interpretation and visual mining of complex in vitro omics data. The presented strategy is freely available as a Python module (https://pypi.org/project/manamodeller/) and Jupyter notebooks (https://github.com/LouisonF/MANA).
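The second step of the workflow described above (comparing two sets of enumerated models to find DARs) can be illustrated with a minimal sketch. It is not the manamodeller API. It assumes that each enumerated model is encoded as a binary vector of reaction activities, that a reaction's activity probability is the fraction of enumerated models in which it is active, and that a reaction counts as a DAR when that fraction differs between conditions by at least a user-chosen `threshold`. The function names, the encoding, and the threshold rule are all hypothetical.

```python
import numpy as np


def activity_frequencies(solutions):
    """Fraction of enumerated models in which each reaction is active.

    `solutions` is a (n_models, n_reactions) array of 0/1 values
    (assumed encoding: 1 = reaction active in that enumerated model).
    """
    return np.asarray(solutions, dtype=float).mean(axis=0)


def differentially_activated(control, treated, threshold=0.8):
    """Return {reaction_index: frequency difference} for candidate DARs.

    Assumed rule: a reaction is a DAR when its activity frequency
    differs between the two sets by at least `threshold`.
    Positive values mean "more often active in the treated set".
    """
    diff = activity_frequencies(treated) - activity_frequencies(control)
    return {i: float(d) for i, d in enumerate(diff) if abs(d) >= threshold}


# Toy data: 4 enumerated models per condition, 3 reactions.
control = [[1, 0, 1], [1, 0, 1], [1, 1, 0], [1, 0, 1]]
treated = [[1, 1, 0], [1, 1, 0], [1, 1, 0], [1, 1, 1]]
dars = differentially_activated(control, treated, threshold=0.7)
print(dars)
```

Comparing frequencies rather than single models is what lets thousands of alternative optimal solutions be summarized per reaction. A real analysis would map the indices back to reaction identifiers in the genome-scale network before clustering.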
Affiliation(s)
- Louison Fresnais
  - UMR1331 Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
  - L'Oréal Research and Innovation, Aulnay-sous-Bois, France
- Olivier Perin
  - L'Oréal Research and Innovation, Aulnay-sous-Bois, France
- Anne Riu
  - L'Oréal Research and Innovation, Aulnay-sous-Bois, France
- Romain Grall
  - L'Oréal Research and Innovation, Aulnay-sous-Bois, France
- Alban Ott
  - L'Oréal Research and Innovation, Aulnay-sous-Bois, France
- Bernard Fromenty
  - Institut NUMECAN (Nutrition Metabolisms and Cancer) UMR_A 1317, UMR_S 1241, INSERM, Univ Rennes, INRAE, 35000, Rennes, France
- Jean-Clément Gallardo
  - UMR1331 Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Maximilian Stingl
  - UMR1331 Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Clément Frainay
  - UMR1331 Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Fabien Jourdan
  - UMR1331 Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
  - MetaboHUB-MetaToul, National Infrastructure of Metabolomics and Fluxomics, Toulouse, France
- Nathalie Poupin
  - UMR1331 Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
|
2
|
Wieder C, Cooke J, Frainay C, Poupin N, Bowler R, Jourdan F, Kechris KJ, Lai RPJ, Ebbels T. PathIntegrate: Multivariate modelling approaches for pathway-based multi-omics data integration. PLoS Comput Biol 2024; 20:e1011814. [PMID: 38527092 PMCID: PMC10994553 DOI: 10.1371/journal.pcbi.1011814] [Received: 01/10/2024] [Revised: 04/04/2024] [Accepted: 03/11/2024] [Indexed: 03/27/2024] Open
Abstract
As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular to the pathway-level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. PathIntegrate is available as an open-source Python package.
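The transformation from the molecular level to the pathway level can be sketched as follows. This is not the PathIntegrate API. It uses the mean of per-molecule z-scores as a deliberately simple stand-in for a single-sample pathway analysis method, and the function name, argument layout, and pathway dictionary are hypothetical. The resulting sample-by-pathway scores are the kind of features a downstream predictive model would then take as input.

```python
import numpy as np


def pathway_scores(X, molecule_names, pathways):
    """Collapse a samples x molecules matrix into per-sample pathway scores.

    Assumed scoring rule (a simple stand-in for single-sample pathway
    analysis): z-score each molecule across samples, then average the
    z-scores of the molecules that belong to each pathway.
    """
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # constant molecules get a z-score of 0
    Z = (X - X.mean(axis=0)) / std

    index = {name: j for j, name in enumerate(molecule_names)}
    scores = {}
    for pathway, members in pathways.items():
        cols = [index[m] for m in members if m in index]
        if cols:  # skip pathways with no measured molecules
            scores[pathway] = Z[:, cols].mean(axis=1)
    return scores


# Toy data: 2 samples, 3 molecules, hypothetical pathway definitions.
X = [[1.0, 2.0, 3.0],
     [3.0, 2.0, 1.0]]
names = ["a", "b", "c"]
pathways = {"P1": ["a", "b"], "P2": ["c"], "P3": ["not_measured"]}
scores = pathway_scores(X, names, pathways)
for pathway, values in scores.items():
    print(pathway, values)
```

Averaging over pathway members is one reason grouping can help in low signal-to-noise settings: independent noise on individual molecules partly cancels, while a shared shift is preserved.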
Affiliation(s)
- Cecilia Wieder
  - Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, Faculty of Medicine, Imperial College London, London, United Kingdom
- Juliette Cooke
  - Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Clement Frainay
  - Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Nathalie Poupin
  - Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Russell Bowler
  - National Jewish Health, Denver, Colorado, United States of America
- Fabien Jourdan
  - MetaboHUB-Metatoul, National Infrastructure of Metabolomics and Fluxomics, Toulouse, France
- Katerina J. Kechris
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, Colorado, United States of America
- Rachel PJ Lai
  - Department of Infectious Disease, Faculty of Medicine, Imperial College London, London, United Kingdom
- Timothy Ebbels
  - Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, Faculty of Medicine, Imperial College London, London, United Kingdom
|
3
|
Wieder C, Cooke J, Frainay C, Poupin N, Bowler R, Jourdan F, Kechris KJ, Lai RP, Ebbels T. PathIntegrate: Multivariate modelling approaches for pathway-based multi-omics data integration. bioRxiv 2024:2024.01.09.574780. [PMID: 38260498 PMCID: PMC10802464 DOI: 10.1101/2024.01.09.574780] [Indexed: 01/24/2024]
Abstract
As terabytes of multi-omics data are being generated, there is an ever-increasing need for methods facilitating the integration and interpretation of such data. Current multi-omics integration methods typically output lists, clusters, or subnetworks of molecules related to an outcome. Even with expert domain knowledge, discerning the biological processes involved is a time-consuming activity. Here we propose PathIntegrate, a method for integrating multi-omics datasets based on pathways, designed to exploit knowledge of biological systems and thus provide interpretable models for such studies. PathIntegrate employs single-sample pathway analysis to transform multi-omics datasets from the molecular to the pathway-level, and applies a predictive single-view or multi-view model to integrate the data. Model outputs include multi-omics pathways ranked by their contribution to the outcome prediction, the contribution of each omics layer, and the importance of each molecule in a pathway. Using semi-synthetic data we demonstrate the benefit of grouping molecules into pathways to detect signals in low signal-to-noise scenarios, as well as the ability of PathIntegrate to precisely identify important pathways at low effect sizes. Finally, using COPD and COVID-19 data we showcase how PathIntegrate enables convenient integration and interpretation of complex high-dimensional multi-omics datasets. The PathIntegrate Python package is available at https://github.com/cwieder/PathIntegrate.
Affiliation(s)
- Cecilia Wieder
  - Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, Faculty of Medicine, Imperial College London, London, United Kingdom
- Juliette Cooke
  - Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Clement Frainay
  - Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Nathalie Poupin
  - Toxalim (Research Centre in Food Toxicology), Université de Toulouse, INRAE, ENVT, INP-Purpan, UPS, Toulouse, France
- Russell Bowler
  - National Jewish Health, 1400 Jackson Street, Denver, CO, 80206, USA
- Fabien Jourdan
  - MetaboHUB-Metatoul, National Infrastructure of Metabolomics and Fluxomics, Toulouse, France
- Katerina J Kechris
  - Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO, United States of America
- Rachel PJ Lai
  - Department of Infectious Disease, Faculty of Medicine, Imperial College London, London, United Kingdom
- Timothy Ebbels
  - Section of Bioinformatics, Division of Systems Medicine, Department of Metabolism, Digestion, and Reproduction, Faculty of Medicine, Imperial College London, London, United Kingdom
|
4
|
Molversmyr H, Øyås O, Rotnes F, Vik JO. Extracting functionally accurate context-specific models of Atlantic salmon metabolism. NPJ Syst Biol Appl 2023; 9:19. [PMID: 37244928 DOI: 10.1038/s41540-023-00280-x] [Received: 01/18/2023] [Accepted: 05/05/2023] [Indexed: 05/29/2023] Open
Abstract
Constraint-based models (CBMs) are used to study metabolic network structure and function in organisms ranging from microbes to multicellular eukaryotes. Published CBMs are usually generic rather than context-specific, meaning that they do not capture differences in reaction activities, which, in turn, determine metabolic capabilities, between cell types, tissues, environments, or other conditions. Only a subset of a CBM's metabolic reactions and capabilities are likely to be active in any given context, and several methods have therefore been developed to extract context-specific models from generic CBMs through integration of omics data. We tested the ability of six model extraction methods (MEMs) to create functionally accurate context-specific models of Atlantic salmon using a generic CBM (SALARECON) and liver transcriptomics data from contexts differing in water salinity (life stage) and dietary lipids. Three MEMs (iMAT, INIT, and GIMME) outperformed the others in terms of functional accuracy, which we defined as the extracted models' ability to perform context-specific metabolic tasks inferred directly from the data, and one MEM (GIMME) was faster than the others. Context-specific versions of SALARECON consistently outperformed the generic version, showing that context-specific modeling better captures salmon metabolism. Thus, we demonstrate that results from human studies also hold for a non-mammalian animal and major livestock species.
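The abstract defines functional accuracy as an extracted model's ability to perform context-specific metabolic tasks. One way to turn that into a number is sketched below. The scoring rule (fraction of tasks whose observed outcome matches the expected one), the function name, and the task labels are assumptions made for illustration, not the metric used in the study. Checking whether a model can actually perform a task would require a constraint-based simulation, which is outside this sketch.

```python
def functional_accuracy(expected, performed):
    """Fraction of metabolic tasks whose outcome matches expectation.

    `expected` maps each task name to True (the context-specific model
    should perform it) or False (it should not).
    `performed` is the set of task names the extracted model can perform.
    Assumed rule: a task is a hit when both agree.
    """
    if not expected:
        raise ValueError("at least one task is required")
    hits = sum(
        1 for task, should_perform in expected.items()
        if (task in performed) == should_perform
    )
    return hits / len(expected)


# Hypothetical tasks inferred for one context, and two extracted models.
expected = {"task_1": True, "task_2": True, "task_3": False, "task_4": False}
model_a = {"task_1", "task_3"}   # performs one expected and one unexpected task
model_b = {"task_1", "task_2"}   # performs exactly the expected tasks
print(functional_accuracy(expected, model_a))
print(functional_accuracy(expected, model_b))
```

A score of this kind allows extraction methods, or a context-specific model and its generic parent, to be compared on the same task list.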
Affiliation(s)
- Håvard Molversmyr
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Ove Øyås
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Filip Rotnes
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
- Jon Olav Vik
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
  - Faculty of Biosciences, Norwegian University of Life Sciences, Ås, Norway
|