Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Total Articles

47
(from Reference Citation Analysis)

Article PDFs (27)

Cited by > 0 (39)

Searched Name

Juho Rousu

Citation Analysis

Abbasi F, Rousu J. New methods for drug synergy prediction: A mini-review. Curr Opin Struct Biol 2024;86:102827. [PMID: 38705070 DOI: 10.1016/j.sbi.2024.102827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2023] [Revised: 04/12/2024] [Accepted: 04/12/2024] [Indexed: 05/07/2024]

Armah-Sekum RE, Szedmak S, Rousu J. Protein function prediction through multi-view multi-label latent tensor reconstruction. BMC Bioinformatics 2024;25:174. [PMID: 38698340 PMCID: PMC11067221 DOI: 10.1186/s12859-024-05789-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Accepted: 04/17/2024] [Indexed: 05/05/2024] Open

Astero M, Rousu J. Learning symmetry-aware atom mapping in chemical reactions through deep graph matching. J Cheminform 2024;16:46. [PMID: 38650016 PMCID: PMC11036715 DOI: 10.1186/s13321-024-00841-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 04/07/2024] [Indexed: 04/25/2024] Open

Abstract

Accurate atom mapping, which establishes correspondences between atoms in reactants and products, is a crucial step in analyzing chemical reactions. In this paper, we present a novel end-to-end approach that formulates the atom mapping problem as a deep graph matching task. Our proposed model, AMNet (Atom Matching Network), utilizes molecular graph representations and employs various atom and bond features using graph neural networks to capture the intricate structural characteristics of molecules, ensuring precise atom correspondence predictions. Notably, AMNet incorporates the consideration of molecule symmetry, enhancing accuracy while simultaneously reducing computational complexity. The integration of the Weisfeiler-Lehman isomorphism test for symmetry identification refines the model's predictions. Furthermore, our model maps the entire atom set in a chemical reaction, offering a comprehensive approach beyond focusing solely on the main molecules in reactions. We evaluated AMNet's performance on a subset of USPTO reaction datasets, addressing various tasks, including assessing the impact of molecular symmetry identification, understanding the influence of feature selection on AMNet performance, and comparing its performance with the state-of-the-art method. The results reveal an average accuracy of 97.3% on mapped atoms, with 99.7% of reactions correctly mapped when the correct mapped atom is within the top 10 predicted atoms.

Scientific contribution

The paper introduces a novel end-to-end deep graph matching model for atom mapping, utilizing molecular graph representations to capture structural characteristics effectively. It enhances accuracy by integrating symmetry detection through the Weisfeiler-Lehman test, reducing the number of possible mappings and improving efficiency. Unlike previous methods, it maps the entire reaction, not just main components, providing a comprehensive view. Additionally, by integrating efficient graph matching techniques, it reduces computational complexity, making atom mapping more feasible.
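To illustrate the symmetry-identification step mentioned in this abstract, here is a minimal sketch of one-dimensional Weisfeiler-Lehman colour refinement on a molecular graph. It is not AMNet's implementation: the function names `wl_colors` and `symmetry_classes`, the input format (a dict of element labels and an adjacency dict), and the fixed number of rounds are assumptions made for the example. Atoms that end with the same colour are candidates for being symmetry-equivalent; equal colours are a necessary condition for equivalence, not a proof of it.

```python
from collections import defaultdict


def wl_colors(labels, adjacency, rounds=3):
    """Run 1-dimensional Weisfeiler-Lehman colour refinement.

    labels:    dict mapping atom index -> initial label (e.g. element symbol)
    adjacency: dict mapping atom index -> list of neighbouring atom indices
    """
    colors = dict(labels)
    for _ in range(rounds):
        # An atom's signature is its own colour plus the sorted multiset
        # of its neighbours' colours.
        signatures = {
            node: (colors[node], tuple(sorted(colors[n] for n in adjacency[node])))
            for node in adjacency
        }
        # Compress each distinct signature to a small integer colour.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        colors = {node: palette[signatures[node]] for node in adjacency}
    return colors


def symmetry_classes(labels, adjacency, rounds=3):
    """Group atoms that share a final WL colour (candidate symmetric atoms)."""
    groups = defaultdict(list)
    for node, color in wl_colors(labels, adjacency, rounds).items():
        groups[color].append(node)
    return sorted(sorted(group) for group in groups.values())


# Hydrogen-suppressed propane, C-C-C: the two terminal carbons are equivalent.
propane_labels = {0: "C", 1: "C", 2: "C"}
propane_bonds = {0: [1], 1: [0, 2], 2: [1]}
print(symmetry_classes(propane_labels, propane_bonds))  # [[0, 2], [1]]
```

Grouping atoms this way shrinks the set of distinct mappings a matcher has to consider, since swapping two atoms in the same class yields an equivalent mapping.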

Sandström H, Rissanen M, Rousu J, Rinke P. Data-Driven Compound Identification in Atmospheric Mass Spectrometry. Adv Sci (Weinh) 2024;11:e2306235. [PMID: 38095508 PMCID: PMC10885664 DOI: 10.1002/advs.202306235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/04/2023] [Indexed: 02/24/2024]

Sabzevari M, Szedmak S, Penttilä M, Jouhten P, Rousu J. Strain design optimization using reinforcement learning. PLoS Comput Biol 2022;18:e1010177. [PMID: 35658018 PMCID: PMC9200333 DOI: 10.1371/journal.pcbi.1010177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2021] [Revised: 06/15/2022] [Accepted: 05/06/2022] [Indexed: 11/18/2022] Open

Abstract

Engineered microbial cells present a sustainable alternative to fossil-based synthesis of chemicals and fuels. Cellular synthesis routes are readily assembled and introduced into microbial strains using state-of-the-art synthetic biology tools. However, the optimization of the strains required to reach industrially feasible production levels is far less efficient. It typically relies on trial and error, leading to high uncertainty in total duration and cost. New techniques that can cope with the complexity and limited mechanistic knowledge of cellular regulation are called for to guide strain optimization. In this paper, we put forward a multi-agent reinforcement learning (MARL) approach that learns from experiments to tune the metabolic enzyme levels so that production is improved. Our method is model-free and does not assume prior knowledge of the microbe’s metabolic network or its regulation. The multi-agent approach is well suited to making use of parallel experiments such as the multi-well plates commonly used for screening microbial strains. We demonstrate the method’s capabilities using the genome-scale kinetic model of Escherichia coli, k-ecoli457, as a surrogate for in vivo cell behaviour in cultivation experiments. We investigate aspects of the method’s performance relevant to practical applicability in strain engineering, i.e. the speed of convergence towards the optimum response, noise tolerance, and the statistical stability of the solutions found. We further evaluate the proposed MARL approach in improving L-tryptophan production by the yeast Saccharomyces cerevisiae, using publicly available experimental data on the performance of a combinatorial strain library. Overall, our results show that multi-agent reinforcement learning is a promising approach for guiding strain optimization beyond mechanistic knowledge, with the goal of obtaining industrially attractive production levels faster and more reliably.

Kong W, Midena G, Chen Y, Athanasiadis P, Wang T, Rousu J, He L, Aittokallio T. Systematic review of computational methods for drug combination prediction. Comput Struct Biotechnol J 2022;20:2807-2814. [PMID: 35685365 PMCID: PMC9168078 DOI: 10.1016/j.csbj.2022.05.055] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/14/2022] [Revised: 05/27/2022] [Accepted: 05/27/2022] [Indexed: 12/26/2022] Open

Bach E, Rogers S, Williamson J, Rousu J. Probabilistic framework for integration of mass spectrum and retention time information in small molecule identification. Bioinformatics 2021;37:1724-1731. [PMID: 33244585 PMCID: PMC8289373 DOI: 10.1093/bioinformatics/btaa998] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/18/2020] [Revised: 10/27/2020] [Accepted: 11/17/2020] [Indexed: 11/14/2022] Open

Wang T, Szedmak S, Wang H, Aittokallio T, Pahikkala T, Cichonska A, Rousu J. Modeling drug combination effects via latent tensor reconstruction. Bioinformatics 2021;37:i93-i101. [PMID: 34252952 PMCID: PMC8336593 DOI: 10.1093/bioinformatics/btab308] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/08/2023] Open

Hjörleifsson Eldjárn G, Ramsay A, van der Hooft JJJ, Duncan KR, Soldatou S, Rousu J, Daly R, Wandy J, Rogers S. Ranking microbial metabolomic and genomic links in the NPLinker framework using complementary scoring functions. PLoS Comput Biol 2021;17:e1008920. [PMID: 33945539 PMCID: PMC8130963 DOI: 10.1371/journal.pcbi.1008920] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 10/07/2020] [Revised: 05/18/2021] [Accepted: 03/26/2021] [Indexed: 12/31/2022] Open

Dührkop K, Nothias LF, Fleischauer M, Reher R, Ludwig M, Hoffmann MA, Petras D, Gerwick WH, Rousu J, Dorrestein PC, Böcker S. Systematic classification of unknown metabolites using high-resolution fragmentation mass spectra. Nat Biotechnol 2021;39:462-471. [PMID: 33230292 DOI: 10.1038/s41587-020-0740-8] [Citation(s) in RCA: 233] [Impact Index Per Article: 77.7] [Received: 04/15/2020] [Accepted: 10/16/2020] [Indexed: 12/12/2022]

Wang T, Gautam P, Rousu J, Aittokallio T. Systematic mapping of cancer cell target dependencies using high-throughput drug screening in triple-negative breast cancer. Comput Struct Biotechnol J 2020;18:3819-3832. [PMID: 33335681 PMCID: PMC7720026 DOI: 10.1016/j.csbj.2020.11.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/30/2020] [Revised: 10/23/2020] [Accepted: 11/01/2020] [Indexed: 12/31/2022] Open

Abstract

While high-throughput drug screening offers possibilities to profile phenotypic responses of hundreds of compounds, elucidation of the cell context-specific mechanisms of drug action requires additional analyses. To that end, we developed a computational target deconvolution pipeline that identifies the key target dependencies based on collective drug response patterns in each cell line separately. The pipeline combines quantitative drug-cell line responses with drug-target interaction networks among both intended on- and potent off-targets to identify pharmaceutically actionable and selective therapeutic targets. To demonstrate its performance, the target deconvolution pipeline was applied to 310 small molecules tested on 20 genetically and phenotypically heterogeneous triple-negative breast cancer (TNBC) cell lines to identify cell line-specific target mechanisms in terms of cytotoxic and cytostatic drug target vulnerabilities. The functional essentiality of each protein target was quantified with a target addiction score (TAS), as a measure of dependency of the cell line on the therapeutic target. The target dependency profiling was shown to capture inhibitory information that is complementary to that obtained from the structure or sensitivity of the drugs. Comparison of the TAS profiles and gene essentiality scores from CRISPR-Cas9 knockout screens revealed that certain proteins with low gene essentiality showed high target addictions, suggesting that they might be functioning as protein groups, and therefore be resistant to single gene knock-out. The comparative analysis discovered protein groups of potential multi-target synthetic lethal interactions, for instance, among histone deacetylases (HDACs). Our integrated approach also recovered a number of well-established TNBC cell line-specific drivers and known TNBC therapeutic targets, such as HDACs and cyclin-dependent kinases (CDKs). 
The present work provides novel insights into druggable vulnerabilities for TNBC, and opportunities to identify multi-target synthetic lethal interactions for further studies.

Voutilainen S, Heinonen M, Andberg M, Jokinen E, Maaheimo H, Pääkkönen J, Hakulinen N, Rouvinen J, Lähdesmäki H, Kaski S, Rousu J, Penttilä M, Koivula A. Substrate specificity of 2-deoxy-D-ribose 5-phosphate aldolase (DERA) assessed by different protein engineering and machine learning methods. Appl Microbiol Biotechnol 2020;104:10515-10529. [PMID: 33147349 PMCID: PMC7671976 DOI: 10.1007/s00253-020-10960-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/22/2020] [Revised: 10/01/2020] [Accepted: 10/12/2020] [Indexed: 11/29/2022]

Heinonen M, Osmala M, Mannerström H, Wallenius J, Kaski S, Rousu J, Lähdesmäki H. Bayesian metabolic flux analysis reveals intracellular flux couplings. Bioinformatics 2020;35:i548-i557. [PMID: 31510676 PMCID: PMC6612884 DOI: 10.1093/bioinformatics/btz315] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 01/01/2023] Open

Bhadra S, Blomberg P, Castillo S, Rousu J. Principal metabolic flux mode analysis. Bioinformatics 2019;34:2409-2417. [PMID: 29420676 PMCID: PMC6041797 DOI: 10.1093/bioinformatics/bty049] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 07/07/2017] [Accepted: 02/06/2018] [Indexed: 01/01/2023] Open

Abstract

Motivation

In the analysis of metabolism, two distinct and complementary approaches are frequently used: principal component analysis (PCA) and stoichiometric flux analysis. PCA is able to capture the main modes of variability in a set of experiments and does not make many prior assumptions about the data, but does not inherently take into account the flux mode structure of metabolism. Stoichiometric flux analysis methods, such as Flux Balance Analysis (FBA) and Elementary Mode Analysis, on the other hand, are able to capture the metabolic flux modes; however, they are primarily designed for the analysis of single samples at a time and are not best suited for exploratory analysis on large sets of samples.

Results

We propose a new methodology for the analysis of metabolism, called Principal Metabolic Flux Mode Analysis (PMFA), which marries the PCA and stoichiometric flux analysis approaches in an elegant regularized optimization framework. In short, the method incorporates a variance maximization objective from PCA coupled with a stoichiometric regularizer, which penalizes projections that are far from any flux modes of the network. For interpretability, we also introduce a sparse variant of PMFA that favours flux modes that contain a small number of reactions. Our experiments demonstrate the versatility and capabilities of our methodology. The proposed method can be applied to genome-scale metabolic networks in an efficient way, as PMFA does not enumerate elementary modes. In addition, the method is more robust on out-of-steady-state experimental data than competing flux mode analysis approaches.

Availability and implementation

Matlab software for PMFA and SPMFA and the dataset used in the experiments are available at https://github.com/aalto-ics-kepaco/PMFA.

Supplementary information

Supplementary data are available at Bioinformatics online.
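The combination of a PCA variance objective with a stoichiometric regularizer, as described in the Results above, can be sketched in a few lines. This is an illustration under stated assumptions, not the PMFA formulation from the paper or its Matlab code: it assumes the penalty takes the simple quadratic form lam * ||S w||^2, where S is the stoichiometric matrix (metabolites by reactions), so that directions violating the steady-state condition S w = 0 are discouraged. The name `pmfa_direction` and the parameter `lam` are hypothetical.

```python
import numpy as np


def pmfa_direction(X, S, lam=10.0):
    """Return a unit vector w maximising  w^T C w - lam * ||S w||^2.

    X:   (samples x reactions) matrix of flux measurements
    S:   (metabolites x reactions) stoichiometric matrix
    lam: weight of the steady-state penalty (assumed quadratic form)
    """
    Xc = X - X.mean(axis=0)            # centre the data, as in PCA
    C = Xc.T @ Xc / (X.shape[0] - 1)   # sample covariance of the fluxes
    M = C - lam * (S.T @ S)            # penalised objective matrix (symmetric)
    eigvals, eigvecs = np.linalg.eigh(M)
    return eigvecs[:, -1]              # eigenvector of the largest eigenvalue


# Toy network: one metabolite produced by reaction 0 and consumed by reaction 1.
S = np.array([[1.0, -1.0, 0.0]])
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.5])

w_pca = pmfa_direction(X, S, lam=0.0)   # plain PCA direction
w_reg = pmfa_direction(X, S, lam=1e6)   # strongly regularised direction
print(np.abs(S @ w_pca), np.abs(S @ w_reg))
```

With `lam=0` the result is the first principal component, which here points mostly along reaction 0 and violates the mass balance. With a large `lam` the direction is pushed into the null space of S, so it behaves like a feasible flux mode.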

Brouard C, Bassé A, d'Alché-Buc F, Rousu J. Improved Small Molecule Identification through Learning Combinations of Kernel Regression Models. Metabolites 2019;9:E160. [PMID: 31374904 PMCID: PMC6724104 DOI: 10.3390/metabo9080160] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 06/29/2019] [Revised: 07/30/2019] [Accepted: 07/31/2019] [Indexed: 01/15/2023] Open

Bach E, Szedmak S, Brouard C, Böcker S, Rousu J. Liquid-chromatography retention order prediction for metabolite identification. Bioinformatics 2018;34:i875-i883. [DOI: 10.1093/bioinformatics/bty590] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 12/21/2022] Open

Cichonska A, Pahikkala T, Szedmak S, Julkunen H, Airola A, Heinonen M, Aittokallio T, Rousu J. Learning with multiple pairwise kernels for drug bioactivity prediction. Bioinformatics 2018;34:i509-i518. [PMID: 29949975 PMCID: PMC6022556 DOI: 10.1093/bioinformatics/bty277] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Indexed: 11/17/2022] Open

Abstract

Motivation

Many inference problems in bioinformatics, including drug bioactivity prediction, can be formulated as pairwise learning problems, in which one is interested in making predictions for pairs of objects, e.g. drugs and their targets. Kernel-based approaches have emerged as powerful tools for solving problems of that kind, and multiple kernel learning (MKL) in particular offers promising benefits, as it enables integrating various types of complex biomedical information sources in the form of kernels, along with learning their importance for the prediction task. However, the immense size of pairwise kernel spaces remains a major bottleneck, making the existing MKL algorithms computationally infeasible even for a small number of input pairs.

Results

We introduce pairwiseMKL, the first method for time- and memory-efficient learning with multiple pairwise kernels. pairwiseMKL first determines the mixture weights of the input pairwise kernels, and then learns the pairwise prediction function. Both steps are performed efficiently without explicit computation of the massive pairwise matrices, therefore making the method applicable to solving large pairwise learning problems. We demonstrate the performance of pairwiseMKL in two related tasks of quantitative drug bioactivity prediction using up to 167 995 bioactivity measurements and 3120 pairwise kernels: (i) prediction of anticancer efficacy of drug compounds across a large panel of cancer cell lines; and (ii) prediction of target profiles of anticancer compounds across their kinome-wide target spaces. We show that pairwiseMKL provides accurate predictions using sparse solutions in terms of selected kernels, and therefore it also automatically identifies data sources relevant for the prediction problem.

Availability and implementation

Code is available at https://github.com/aalto-ics-kepaco.

Supplementary information

Supplementary data are available at Bioinformatics online.
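The Results above state that both steps avoid explicit computation of the pairwise kernel matrices. One standard identity that makes this kind of saving possible, when a pairwise kernel is the Kronecker product of a drug kernel and a target kernel, is sketched below. It illustrates the general idea only and is not the pairwiseMKL algorithm; the function name `pairwise_kernel_matvec` and the Kronecker-product structure are assumptions made for the example.

```python
import numpy as np


def pairwise_kernel_matvec(K_drug, K_target, v):
    """Compute (K_drug kron K_target) @ v without building the Kronecker product.

    Uses the identity (A kron B) vec(V) = vec(A V B^T) for row-major vec,
    so memory stays at O(n_d^2 + n_t^2) instead of O(n_d^2 * n_t^2).
    """
    n_d, n_t = K_drug.shape[0], K_target.shape[0]
    V = v.reshape(n_d, n_t)              # one row per drug, one column per target
    return (K_drug @ V @ K_target.T).ravel()


# Small check against the explicit Kronecker product.
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
B = rng.normal(size=(3, 3))
v = rng.normal(size=12)
print(np.allclose(np.kron(A, B) @ v, pairwise_kernel_matvec(A, B, v)))
```

For n_d drugs and n_t targets, the explicit pairwise matrix has (n_d * n_t)^2 entries, while the factored product only ever touches the two small kernels and an n_d by n_t matrix.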

Bhadra S, Rousu J. Analysis of Fluxomic Experiments with Principal Metabolic Flux Mode Analysis. Methods Mol Biol 2018;1807:141-161. [PMID: 30030809 DOI: 10.1007/978-1-4939-8561-6_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]

Cankorur-Cetinkaya A, Dias JML, Kludas J, Slater NKH, Rousu J, Oliver SG, Dikicioglu D. Erratum: CamOptimus: a tool for exploiting complex adaptive evolution to optimise experiments and processes in biotechnology. Microbiology (Reading) 2017;163:1369. [DOI: 10.1099/mic.0.000530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2017] [Accepted: 08/21/2017] [Indexed: 11/18/2022] Open

Brouard C, Shen H, Dührkop K, d'Alché-Buc F, Böcker S, Rousu J. Fast metabolite identification with Input Output Kernel Regression. Bioinformatics 2017;32:i28-i36. [PMID: 27307628 PMCID: PMC4908330 DOI: 10.1093/bioinformatics/btw246] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Indexed: 11/13/2022] Open

Cankorur-Cetinkaya A, Dias JML, Kludas J, Slater NKH, Rousu J, Oliver SG, Dikicioglu D. CamOptimus: a tool for exploiting complex adaptive evolution to optimize experiments and processes in biotechnology. Microbiology (Reading) 2017. [PMID: 28635591 PMCID: PMC5817226 DOI: 10.1099/mic.0.000477] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 01/09/2023]

Schymanski EL, Ruttkies C, Krauss M, Brouard C, Kind T, Dührkop K, Allen F, Vaniya A, Verdegem D, Böcker S, Rousu J, Shen H, Tsugawa H, Sajed T, Fiehn O, Ghesquière B, Neumann S. Critical Assessment of Small Molecule Identification 2016: automated methods. J Cheminform 2017;9:22. [PMID: 29086042 PMCID: PMC5368104 DOI: 10.1186/s13321-017-0207-1] [Citation(s) in RCA: 96] [Impact Index Per Article: 13.7] [Received: 12/14/2016] [Accepted: 03/13/2017] [Indexed: 12/30/2022] Open

Abstract

BACKGROUND

The fourth round of the Critical Assessment of Small Molecule Identification (CASMI) Contest (www.casmi-contest.org) was held in 2016, with two new categories for automated methods. This article covers the 208 challenges in Categories 2 and 3, without and with metadata, from organization, participation, results and post-contest evaluation of CASMI 2016 through to perspectives for future contests and small molecule annotation/identification.

RESULTS

The Input Output Kernel Regression (CSI:IOKR) machine learning approach performed best in "Category 2: Best Automatic Structural Identification-In Silico Fragmentation Only", won by Team Brouard with 41% challenge wins. The winner of "Category 3: Best Automatic Structural Identification-Full Information" was Team Kind (MS-FINDER), with 76% challenge wins. The best methods were able to achieve over 30% Top 1 ranks in Category 2, with all methods ranking the correct candidate in the Top 10 in around 50% of challenges. This success rate rose to 70% Top 1 ranks in Category 3, with candidates in the Top 10 in over 80% of the challenges. The machine learning and chemistry-based approaches are shown to perform in complementary ways.

CONCLUSIONS

The improvement in (semi-)automated fragmentation methods for small molecule identification has been substantial. The achieved high rates of correct candidates in the Top 1 and Top 10, despite large candidate numbers, open up great possibilities for high-throughput annotation of untargeted analysis for "known unknowns". As more high quality training data becomes available, the improvements in machine learning methods will likely continue, but the alternative approaches still provide valuable complementary information. Improved integration of experimental context will also improve identification success further for "real life" annotations. The true "unknown unknowns" remain to be evaluated in future CASMI contests.
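The Top 1 and Top 10 rates quoted in this abstract are the fraction of challenges in which the correct candidate is ranked at or above a given position. A minimal sketch of that arithmetic follows; the function name `top_k_rate` and the rank list are hypothetical and are not CASMI 2016 data.

```python
def top_k_rate(ranks, k):
    """Fraction of challenges whose correct candidate is ranked k or better.

    ranks: rank of the correct candidate in each challenge (1 = best)
    """
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


# Hypothetical ranks of the correct structure in eight challenges.
example_ranks = [1, 3, 12, 1, 7, 25, 2, 1]
print(top_k_rate(example_ranks, 1))   # 0.375
print(top_k_rate(example_ranks, 10))  # 0.75
```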

Affiliation(s)

Emma L Schymanski. Eawag: Swiss Federal Institute for Aquatic Science and Technology, Überlandstrasse 133, 8600, Dübendorf, Switzerland.
Christoph Ruttkies. Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany.
Martin Krauss. Department of Effect-Directed Analysis, UFZ: Helmholtz Centre for Environmental Research, Permoserstrasse 15, 04318, Leipzig, Germany.
Céline Brouard. Department of Computer Science, Aalto University, Konemiehentie 2, 02150, Espoo, Finland; Helsinki Institute for Information Technology, Tekniikantie 14, 02150, Espoo, Finland.
Tobias Kind. West Coast Metabolomics Center and Genome Center, University of California Davis, 451 Health Sciences Drive, Davis, CA, 95616, USA.
Kai Dührkop. Chair of Bioinformatics, Friedrich-Schiller-University, Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany.
Felicity Allen. Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E9, Canada.
Arpana Vaniya. West Coast Metabolomics Center and Genome Center, University of California Davis, 451 Health Sciences Drive, Davis, CA, 95616, USA; Department of Chemistry, University of California Davis, One Shields Avenue, Davis, CA, 95616, USA.
Dries Verdegem. Metabolomics Expertise Center, Vesalius Research Center (VRC), VIB, KU Leuven - University of Leuven, 3000, Louvain, Belgium.
Sebastian Böcker. Chair of Bioinformatics, Friedrich-Schiller-University, Jena, Ernst-Abbe-Platz 2, 07743, Jena, Germany.
Juho Rousu. Department of Computer Science, Aalto University, Konemiehentie 2, 02150, Espoo, Finland; Helsinki Institute for Information Technology, Tekniikantie 14, 02150, Espoo, Finland.
Huibin Shen. Department of Computer Science, Aalto University, Konemiehentie 2, 02150, Espoo, Finland; Helsinki Institute for Information Technology, Tekniikantie 14, 02150, Espoo, Finland.
Hiroshi Tsugawa. RIKEN Center for Sustainable Resource Science (CSRS), 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan.
Tanvir Sajed. Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E9, Canada.
Oliver Fiehn. West Coast Metabolomics Center and Genome Center, University of California Davis, 451 Health Sciences Drive, Davis, CA, 95616, USA; Department of Biochemistry, Faculty of Sciences, King Abdulaziz University, Jeddah, Saudi Arabia.
Bart Ghesquière. Metabolomics Expertise Center, Vesalius Research Center (VRC), VIB, KU Leuven - University of Leuven, 3000, Louvain, Belgium.
Steffen Neumann. Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Weinberg 3, 06120, Halle, Germany.

van Dijk ADJ, Lähdesmäki H, de Ridder D, Rousu J. Selected proceedings of Machine Learning in Systems Biology: MLSB 2016. BMC Bioinformatics 2016;17:437. [PMID: 28105910 PMCID: PMC5249013 DOI: 10.1186/s12859-016-1305-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022] Open

Kludas J, Arvas M, Castillo S, Pakula T, Oja M, Brouard C, Jäntti J, Penttilä M, Rousu J. Machine Learning of Protein Interactions in Fungal Secretory Pathways. PLoS One 2016;11:e0159302. [PMID: 27441920 PMCID: PMC4956264 DOI: 10.1371/journal.pone.0159302] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 02/15/2016] [Accepted: 06/30/2016] [Indexed: 12/18/2022] Open

Cichonska A, Rousu J, Marttinen P, Kangas AJ, Soininen P, Lehtimäki T, Raitakari OT, Järvelin MR, Salomaa V, Ala-Korpela M, Ripatti S, Pirinen M. metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis. Bioinformatics 2016;32:1981-9. [PMID: 27153689 PMCID: PMC4920109 DOI: 10.1093/bioinformatics/btw052] [Citation(s) in RCA: 76] [Impact Index Per Article: 9.5] [Received: 07/16/2015] [Revised: 12/04/2015] [Accepted: 01/19/2016] [Indexed: 01/22/2023] Open

Affiliation(s)

Anna Cichonska. Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland; Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.
Juho Rousu. Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.
Pekka Marttinen. Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.
Antti J Kangas. Computational Medicine, University of Oulu, Oulu University Hospital and Biocenter Oulu, Oulu, Finland.
Pasi Soininen. Computational Medicine, University of Oulu, Oulu University Hospital and Biocenter Oulu, Oulu, Finland; NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland.
Terho Lehtimäki. Department of Clinical Chemistry, Fimlab Laboratories, University of Tampere School of Medicine, Tampere, Finland.
Olli T Raitakari. Department of Clinical Physiology and Nuclear Medicine, University of Turku and Turku University Hospital, Turku, Finland; Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Department of Clinical Physiology and Nuclear Medicine, Turku University Hospital, Turku, Finland.
Marjo-Riitta Järvelin. Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment & Health, School of Public Health, Imperial College London, London, UK; Centre for Life Course Epidemiology, Faculty of Medicine, University of Oulu, Oulu, Finland; Biocenter Oulu, University of Oulu, Oulu, Finland; Unit of Primary Care, Oulu University Hospital, Oulu, Finland.
Veikko Salomaa. National Institute for Health and Welfare, Helsinki, Finland.
Mika Ala-Korpela. Computational Medicine, University of Oulu, Oulu University Hospital and Biocenter Oulu, Oulu, Finland; NMR Metabolomics Laboratory, School of Pharmacy, University of Eastern Finland, Kuopio, Finland; Computational Medicine, School of Social and Community Medicine and the Medical Research Council Integrative Epidemiology Unit, University of Bristol, Bristol, UK.
Samuli Ripatti. Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland; Public Health, University of Helsinki, Helsinki, Finland; and Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK.
Matti Pirinen. Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland.


Honeyborne I, McHugh TD, Kuittinen I, Cichonska A, Evangelopoulos D, Ronacher K, van Helden PD, Gillespie SH, Fernandez-Reyes D, Walzl G, Rousu J, Butcher PD, Waddell SJ. Profiling persistent tubercule bacilli from patient sputa during therapy predicts early drug efficacy. BMC Med 2016;14:68. [PMID: 27055815 PMCID: PMC4825072 DOI: 10.1186/s12916-016-0609-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 12/23/2015] [Accepted: 03/23/2016] [Indexed: 12/20/2022] Open

Abstract

BACKGROUND

New treatment options are needed to maintain and improve therapy for tuberculosis, which caused the death of 1.5 million people in 2013 despite potential for an 86% treatment success rate. A greater understanding of Mycobacterium tuberculosis (M.tb) bacilli that persist through drug therapy will aid drug development programs. Predictive biomarkers for treatment efficacy are also a research priority.

METHODS AND RESULTS

Genome-wide transcriptional profiling was used to map the mRNA signatures of M.tb from the sputa of 15 patients before and 3, 7 and 14 days after the start of standard regimen drug treatment. The mRNA profiles of bacilli through the first 2 weeks of therapy reflected drug activity at 3 days with transcriptional signatures at days 7 and 14 consistent with reduced M.tb metabolic activity similar to the profile of pre-chemotherapy bacilli. These results suggest that a pre-existing drug-tolerant M.tb population dominates sputum before and after early drug treatment, and that the mRNA signature at day 3 marks the killing of a drug-sensitive sub-population of bacilli. Modelling patient indices of disease severity with bacterial gene expression patterns demonstrated that both microbiological and clinical parameters were reflected in the divergent M.tb responses and provided evidence that factors such as bacterial load and disease pathology influence the host-pathogen interplay and the phenotypic state of bacilli. Transcriptional signatures were also defined that predicted measures of early treatment success (rate of decline in bacterial load over 3 days, TB test positivity at 2 months, and bacterial load at 2 months).

CONCLUSIONS

This study defines the transcriptional signature of M.tb bacilli that have been expectorated in sputum after two weeks of drug therapy, characterizing the phenotypic state of bacilli that persist through treatment. We demonstrate that variability in clinical manifestations of disease is detectable in bacterial sputa signatures, and that the changing M.tb mRNA profiles 0-2 weeks into chemotherapy predict the efficacy of treatment 6 weeks later. These observations advocate assaying dynamic bacterial phenotypes through drug therapy as biomarkers for treatment success.


Affiliation(s)

Isobella Honeyborne Centre for Clinical Microbiology, University College London, London, NW3 2PF, UK
Timothy D McHugh Centre for Clinical Microbiology, University College London, London, NW3 2PF, UK
Iitu Kuittinen Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland
Anna Cichonska Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland, Institute for Molecular Medicine Finland FIMM, University of Helsinki, Helsinki, Finland
Dimitrios Evangelopoulos Centre for Clinical Microbiology, University College London, London, NW3 2PF, UK
Katharina Ronacher Department of Science and Technology/National Research Foundation Centre of Excellence for Biomedical Tuberculosis Research and Medical Research Council Centre for TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Western Cape, South Africa
Paul D van Helden Department of Science and Technology/National Research Foundation Centre of Excellence for Biomedical Tuberculosis Research and Medical Research Council Centre for TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Western Cape, South Africa
Stephen H Gillespie Medical and Biological Sciences Building, University of St Andrews, North Haugh, St Andrews, Fife, KY16 9TF, UK
Delmiro Fernandez-Reyes Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK, Department of Paediatrics, University College Hospital, College of Medicine of the University of Ibadan, Ibadan, Nigeria
Gerhard Walzl Department of Science and Technology/National Research Foundation Centre of Excellence for Biomedical Tuberculosis Research and Medical Research Council Centre for TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Western Cape, South Africa
Juho Rousu Department of Computer Science, Helsinki Institute for Information Technology HIIT, Aalto University, Espoo, Finland
Philip D Butcher Institute for Infection and Immunity, St George's University of London, London, SW17 0RE, UK
Simon J Waddell Brighton and Sussex Medical School, University of Sussex, Brighton, BN1 9PX, UK


Rantasalo A, Czeizler E, Virtanen R, Rousu J, Lähdesmäki H, Penttilä M, Jäntti J, Mojzita D. Synthetic Transcription Amplifier System for Orthogonal Control of Gene Expression in Saccharomyces cerevisiae. PLoS One 2016;11:e0148320. [PMID: 26901642 PMCID: PMC4762949 DOI: 10.1371/journal.pone.0148320] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 07/22/2015] [Accepted: 01/15/2016] [Indexed: 12/26/2022] Open

Cichonska A, Rousu J, Aittokallio T. Identification of drug candidates and repurposing opportunities through compound-target interaction networks. Expert Opin Drug Discov 2015;10:1333-45. [PMID: 26429153 DOI: 10.1517/17460441.2015.1096926] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Indexed: 12/25/2022]

Shen H, Dührkop K, Böcker S, Rousu J. Metabolite identification through multiple kernel learning on fragmentation trees. Bioinformatics 2014;30:i157-64. [PMID: 24931979 PMCID: PMC4058957 DOI: 10.1093/bioinformatics/btu275] [Citation(s) in RCA: 76] [Impact Index Per Article: 7.6] [Indexed: 02/01/2023]

Su H, Rousu J. Multilabel classification through random graph ensembles. Mach Learn 2014. [DOI: 10.1007/s10994-014-5465-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 10/24/2022]

Pitkänen E, Jouhten P, Hou J, Syed MF, Blomberg P, Kludas J, Oja M, Holm L, Penttilä M, Rousu J, Arvas M. Comparative genome-scale reconstruction of gapless metabolic networks for present and ancestral species. PLoS Comput Biol 2014;10:e1003465. [PMID: 24516375 PMCID: PMC3916221 DOI: 10.1371/journal.pcbi.1003465] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 07/07/2013] [Accepted: 12/18/2013] [Indexed: 12/12/2022] Open

Abstract

We introduce a novel computational approach, CoReCo, for comparative metabolic reconstruction and provide genome-scale metabolic network models for 49 important fungal species. Leveraging the exponential growth in sequenced genome availability, our method reconstructs genome-scale gapless metabolic networks simultaneously for a large number of species by integrating sequence data in a probabilistic framework. High reconstruction accuracy is demonstrated by comparisons to the well-curated Saccharomyces cerevisiae consensus model and large-scale knock-out experiments. Our comparative approach is particularly useful in scenarios where the quality of available sequence data is lacking, and when reconstructing evolutionarily distant species. Moreover, the reconstructed networks are fully carbon mapped, allowing their use in 13C flux analysis. We demonstrate the functionality and usability of the reconstructed fungal models with a computational steady-state biomass production experiment, as these fungi include some of the most important production organisms in industrial biotechnology. In contrast to many existing reconstruction techniques, only minimal manual effort is required before the reconstructed models are usable in flux balance experiments. CoReCo is available at http://esaskar.github.io/CoReCo/.

Advances in next-generation sequencing technologies are revolutionizing molecular biology. Sequencing-enabled cost-effective characterization of microbial genomes is a particularly exciting development in metabolic engineering. There, considerable effort has been put into reconstructing genome-scale metabolic networks that describe the collection of hundreds to thousands of biochemical reactions available for a microbial cell. These network models are instrumental in understanding microbial metabolism and guiding metabolic engineering efforts to improve biochemical yields. We have developed a novel computational method, CoReCo, which bridges the growing gap between the availability of sequenced genomes and respective reconstructed metabolic networks. The method reconstructs genome-scale metabolic networks simultaneously for related microbial species. It utilizes the available sequencing data from these species to correct for incomplete and missing data. We used the method to reconstruct metabolic networks for a set of 49 fungal species, providing the method with protein sequence data and a phylogenetic tree describing the evolutionary relationships between the species. We demonstrate the applicability of the method by comparing a metabolic reconstruction of Saccharomyces cerevisiae to the manually curated, high-quality consensus network. We also provide an easy-to-use implementation of the method, usable both in single computer and distributed computing environments.


Shen H, Zamboni N, Heinonen M, Rousu J. Metabolite Identification through Machine Learning- Tackling CASMI Challenge Using FingerID. Metabolites 2013;3:484-505. [PMID: 24958002 PMCID: PMC3901273 DOI: 10.3390/metabo3020484] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 04/01/2013] [Revised: 05/24/2013] [Accepted: 05/30/2013] [Indexed: 01/28/2023] Open

Rousu J, Agranoff DD, Sodeinde O, Shawe-Taylor J, Fernandez-Reyes D. Biomarker discovery by sparse canonical correlation analysis of complex clinical phenotypes of tuberculosis and malaria. PLoS Comput Biol 2013;9:e1003018. [PMID: 23637585 PMCID: PMC3630122 DOI: 10.1371/journal.pcbi.1003018] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 07/18/2012] [Accepted: 02/18/2013] [Indexed: 11/25/2022] Open

Abstract

Biomarker discovery aims to find small subsets of relevant variables in ‘omics data that correlate with the clinical syndromes of interest. Despite the fact that clinical phenotypes are usually characterized by a complex set of clinical parameters, current computational approaches assume univariate targets, e.g. diagnostic classes, against which associations are sought. We propose an approach based on asymmetrical sparse canonical correlation analysis (SCCA) that finds multivariate correlations between the ‘omics measurements and the complex clinical phenotypes. We correlated plasma proteomics data to multivariate overlapping complex clinical phenotypes from tuberculosis and malaria datasets. We discovered relevant ‘omic biomarkers that have a high correlation to profiles of clinical measurements and are remarkably sparse, containing 1.5–3% of all ‘omic variables. We show that using clinical view projections we obtain remarkable improvements in diagnostic class prediction, up to 11% in tuberculosis and up to 5% in malaria. Our approach finds proteomic-biomarkers that correlate with complex combinations of clinical-biomarkers. Using the clinical-biomarkers improves the accuracy of diagnostic class prediction while not requiring the measurement of plasma proteomic profiles of each subject. Our approach makes it feasible to use ‘omics data to build accurate diagnostic algorithms that can be deployed to community health centres lacking the expensive ‘omics measurement capabilities.

Many infectious diseases such as tuberculosis and malaria are challenging both for scientists trying to understand the biochemical basis of the diseases and for medical doctors making diagnosis. The challenges arise both from the dependence of the diseases on sets of proteins and from the complexity of the symptoms. Biomarkers denote small sets of measurements that correlate with the phenotype of interest. They have potential use both in advancing the basic biomedical research of infectious diseases and in facilitating predictive diagnostic tools. We propose a new method for biomarker discovery that works by finding canonical correlations between two sets of data, the plasma proteomic profiles and clinical profiles of the subjects. We show that the method is able to find candidate proteomic biomarkers that correlate with combinations of clinical variables, called the clinical biomarkers. Using the clinical biomarkers improves the accuracy of diagnostic class prediction while not requiring the expensive plasma proteomic profiles to be measured for each subject.


Heinonen M, Shen H, Zamboni N, Rousu J. Metabolite identification and molecular fingerprint prediction through machine learning. Bioinformatics 2012;28:2333-41. [DOI: 10.1093/bioinformatics/bts437] [Citation(s) in RCA: 119] [Impact Index Per Article: 9.9] [Indexed: 11/14/2022] Open

Heinonen M, Lappalainen S, Mielikäinen T, Rousu J. Computing Atom Mappings for Biochemical Reactions without Subgraph Isomorphism. J Comput Biol 2011;18:43-58. [DOI: 10.1089/cmb.2009.0216] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 11/12/2022] Open

Pitkänen E, Rousu J, Ukkonen E. Computational methods for metabolic reconstruction. Curr Opin Biotechnol 2010;21:70-7. [DOI: 10.1016/j.copbio.2010.01.010] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 10/31/2009] [Revised: 01/17/2010] [Accepted: 01/20/2010] [Indexed: 12/19/2022]

Pitkänen E, Jouhten P, Rousu J. Inferring branching pathways in genome-scale metabolic networks. BMC Syst Biol 2009;3:103. [PMID: 19874610 PMCID: PMC2791103 DOI: 10.1186/1752-0509-3-103] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Received: 07/20/2009] [Accepted: 10/29/2009] [Indexed: 11/17/2022]

Abstract

Background

A central problem in computational metabolic modelling is how to find biochemically plausible pathways between metabolites in a metabolic network. Two general, complementary frameworks have been utilized to find metabolic pathways: constraint-based modelling and graph-theoretical path finding approaches. In constraint-based modelling, one aims to find pathways where metabolites are balanced in a pseudo steady-state. Constraint-based methods, such as elementary flux mode analysis, have typically a high computational cost stemming from a large number of steady-state pathways in a typical metabolic network. On the other hand, graph-theoretical approaches avoid the computational complexity of constraint-based methods by solving a simpler problem of finding shortest paths. However, while scaling well with network size, graph-theoretic methods generally tend to return more false positive pathways than constraint-based methods.

Results

In this paper, we introduce a computational method, ReTrace, for finding biochemically relevant, branching metabolic pathways in an atom-level representation of metabolic networks. The method finds compact pathways which transfer a high fraction of atoms from source to target metabolites by considering combinations of linear shortest paths. In contrast to current steady-state pathway analysis methods, our method scales up well and is able to operate on genome-scale models. Further, we show that the pathways produced are biochemically meaningful by an example involving the biosynthesis of inosine 5'-monophosphate (IMP). In particular, the method is able to avoid typical problems associated with graph-theoretic approaches such as the need to define side metabolites or pathways not carrying any net carbon flux appearing in results. Finally, we discuss an application involving reconstruction of amino acid pathways of a recently sequenced organism demonstrating how measurement data can be easily incorporated into ReTrace analysis. ReTrace is licensed under GPL and is freely available for academic use at http://www.cs.helsinki.fi/group/sysfys/software/retrace/.

Conclusion

ReTrace is a useful method in metabolic path finding tasks, combining some of the best aspects in constraint-based and graph-theoretic methods. It finds use in a multitude of tasks ranging from metabolic engineering to metabolic reconstruction of recently sequenced organisms.


Astikainen K, Holm L, Pitkänen E, Szedmak S, Rousu J. Towards structured output prediction of enzyme function. BMC Proc 2008;2 Suppl 4:S2. [PMID: 19091049 PMCID: PMC2654971 DOI: 10.1186/1753-6561-2-s4-s2] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Indexed: 11/25/2022] Open

Heinonen M, Rantanen A, Mielikäinen T, Kokkonen J, Kiuru J, Ketola RA, Rousu J. FiD: a software for ab initio structural identification of product ions from tandem mass spectrometric data. Rapid Commun Mass Spectrom 2008;22:3043-3052. [PMID: 18763276 DOI: 10.1002/rcm.3701] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Indexed: 05/26/2023]

Abstract

We present FiD (Fragment iDentificator), a software tool for the structural identification of product ions produced with tandem mass spectrometric measurement of low molecular weight organic compounds. Tandem mass spectrometry (MS/MS) has proven to be an indispensable tool in modern, cell-wide metabolomics and fluxomics studies. In such studies, the structural information of the MS(n) product ions is usually needed in the downstream analysis of the measurement data. The manual identification of the structures of MS(n) product ions is, however, a nontrivial task requiring expertise, and calls for computer assistance. Commercial software tools, such as Mass Frontier and ACD/MS Fragmenter, rely on fragmentation rule databases for the identification of MS(n) product ions. FiD, on the other hand, conducts a combinatorial search over all possible fragmentation paths and outputs a ranked list of alternative structures. This gives the user an advantage in situations where the MS/MS data of compounds with less well-known fragmentation mechanisms are processed. FiD software implements two fragmentation models, the single-step model that ignores intermediate fragmentation states and the multi-step model, which allows for complex fragmentation pathways. The software works for MS/MS data produced both in positive- and negative-ion modes. The software has an easy-to-use graphical interface with built-in visualization capabilities for structures of product ions and fragmentation pathways. In our experiments involving amino acids and sugar-phosphates, often found, e.g., in the central carbon metabolism of yeasts, FiD software correctly predicted the structures of product ions on average in 85% of the cases. The FiD software is free for academic use and is available for download from www.cs.helsinki.fi/group/sysfys/software/fragid.


Rantanen A, Rousu J, Jouhten P, Zamboni N, Maaheimo H, Ukkonen E. An analytic and systematic framework for estimating metabolic flux ratios from 13C tracer experiments. BMC Bioinformatics 2008;9:266. [PMID: 18534038 PMCID: PMC2430715 DOI: 10.1186/1471-2105-9-266] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Received: 01/14/2008] [Accepted: 06/06/2008] [Indexed: 11/10/2022] Open

Abstract

Background

Metabolic fluxes provide invaluable insight on the integrated response of a cell to environmental stimuli or genetic modifications. Current computational methods for estimating the metabolic fluxes from ¹³C isotopomer measurement data rely either on manual derivation of analytic equations constraining the fluxes or on the numerical solution of a highly nonlinear system of isotopomer balance equations. In the first approach, analytic equations have to be tediously derived for each organism, substrate or labelling pattern, while in the second approach, the global nature of an optimum solution is difficult to prove and comprehensive measurements of external fluxes to augment the ¹³C isotopomer data are typically needed.

Results

We present a novel analytic framework for estimating metabolic flux ratios in the cell from ¹³C isotopomer measurement data. In the presented framework, equation systems constraining the fluxes are derived automatically from the model of the metabolism of an organism. The framework is designed to be applicable with all metabolic network topologies, ¹³C isotopomer measurement techniques, substrates and substrate labelling patterns.

By analyzing nuclear magnetic resonance (NMR) and mass spectrometry (MS) measurement data obtained from the experiments on glucose with the model micro-organisms Bacillus subtilis and Saccharomyces cerevisiae we show that our framework is able to automatically produce the flux ratios discovered so far by the domain experts with tedious manual analysis. Furthermore, we show by in silico calculability analysis that our framework can rapidly produce flux ratio equations – as well as predict when the flux ratios are unobtainable by linear means – also for substrates not related to glucose.

Conclusion

The core of the ¹³C metabolic flux analysis framework introduced in this article consists of flow and independence analysis of metabolic fragments and techniques for manipulating isotopomer measurements with vector space techniques. These methods facilitate efficient, analytic computation of the ratios between the fluxes of pathways that converge to a common junction metabolite. The framework can be seen as a generalization and formalization of the existing tradition for computing metabolic flux ratios, where equations constraining flux ratios are manually derived, usually without explicitly showing the formal proofs of the validity of the equations.


Kaski S, Rousu J, Ukkonen E. Probabilistic modeling and machine learning in structural and systems biology. BMC Bioinformatics 2007. [PMCID: PMC1892067 DOI: 10.1186/1471-2105-8-s2-s1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022] Open

Rantanen A, Mielikäinen T, Rousu J, Maaheimo H, Ukkonen E. Planning optimal measurements of isotopomer distributions for estimation of metabolic fluxes. Bioinformatics 2006;22:1198-206. [PMID: 16504982 DOI: 10.1093/bioinformatics/btl069] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/13/2022] Open

Elomaa T, Rousu J. Efficient Multisplitting Revisited: Optima-Preserving Elimination of Partition Candidates. Data Min Knowl Discov 2004. [DOI: 10.1023/b:dami.0000015868.85039.e6] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Indexed: 11/12/2022]

Elomaa T, Rousu J. Necessary and Sufficient Pre-processing in Numerical Range Discretization. Knowl Inf Syst 2003. [DOI: 10.1007/s10115-003-0099-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]

Rantanen A, Rousu J, Kokkonen JT, Tarkiainen V, Ketola RA. Computing positional isotopomer distributions from tandem mass spectrometric data. Metab Eng 2002;4:285-94. [PMID: 12646323 DOI: 10.1006/mben.2002.0232] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Indexed: 11/22/2022]

Elomaa T, Rousu J. J Intell Inf Syst 2002;18:55-70. [DOI: 10.1023/a:1012920624627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]

Elomaa T, Rousu J. Mach Learn 1999;36:201-244. [DOI: 10.1023/a:1007674919412] [Citation(s) in RCA: 96] [Impact Index Per Article: 3.8] [Indexed: 11/12/2022]