1
|
Shen H, Dührkop K, Böcker S, Rousu J. Metabolite identification through multiple kernel learning on fragmentation trees. Bioinformatics 2014; 30:i157-64. [PMID: 24931979 PMCID: PMC4058957 DOI: 10.1093/bioinformatics/btu275] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Indexed: 02/01/2023]
Abstract
Motivation: Metabolite identification from tandem mass spectrometric data is a key task in metabolomics. Various computational methods have been proposed for the identification of metabolites from tandem mass spectra. Fragmentation tree methods explore the space of possible ways in which the metabolite can fragment, and base the metabolite identification on scoring of these fragmentation trees. Machine learning methods have been used to map mass spectra to molecular fingerprints; predicted fingerprints, in turn, can be used to score candidate molecular structures. Results: Here, we combine fragmentation tree computations with kernel-based machine learning to predict molecular fingerprints and identify molecular structures. We introduce a family of kernels capturing the similarity of fragmentation trees, and combine these kernels using recently proposed multiple kernel learning approaches. Experiments on two large reference datasets show that the new methods significantly improve molecular fingerprint prediction accuracy. These improvements result in better metabolite identification, doubling the number of metabolites ranked at the top position of the candidates list. Contact: huibin.shen@aalto.fi Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Huibin Shen
- Department of Information and Computer Science, Aalto University, Espoo, Finland, Helsinki Institute for Information Technology, Espoo, Finland and Chair for Bioinformatics, Friedrich Schiller University Jena, Jena, Germany
- Kai Dührkop
- Department of Information and Computer Science, Aalto University, Espoo, Finland, Helsinki Institute for Information Technology, Espoo, Finland and Chair for Bioinformatics, Friedrich Schiller University Jena, Jena, Germany
- Sebastian Böcker
- Department of Information and Computer Science, Aalto University, Espoo, Finland, Helsinki Institute for Information Technology, Espoo, Finland and Chair for Bioinformatics, Friedrich Schiller University Jena, Jena, Germany
- Juho Rousu
- Department of Information and Computer Science, Aalto University, Espoo, Finland, Helsinki Institute for Information Technology, Espoo, Finland and Chair for Bioinformatics, Friedrich Schiller University Jena, Jena, Germany
|
2
|
Hufsky F, Scheubert K, Böcker S. Computational mass spectrometry for small-molecule fragmentation. Trends Analyt Chem 2014. [DOI: 10.1016/j.trac.2013.09.008] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Indexed: 11/26/2022]
|
3
|
Abstract
Motivation: Mass spectrometry allows sensitive, automated and high-throughput analysis of small molecules such as metabolites. One major bottleneck in metabolomics is the identification of 'unknown' small molecules not in any database. Recently, fragmentation tree alignments have been introduced for the automated comparison of the fragmentation patterns of small molecules. Fragmentation pattern similarities are strongly correlated with the chemical similarity of the molecules, and allow us to cluster compounds based solely on their fragmentation patterns. Results: Aligning fragmentation trees is computationally hard. Nevertheless, we present three exact algorithms for the problem: a dynamic programming (DP) algorithm, a sparse variant of the DP, and an Integer Linear Program (ILP). Evaluation of our methods on three different datasets showed that thousands of alignments can be computed in a matter of minutes using DP, even for 'challenging' instances. Running times of the sparse DP were an order of magnitude better than for the classical DP. The ILP was clearly outperformed by both DP approaches. We also found that for both DP algorithms, computing the 1% slowest alignments required as much time as computing the 99% fastest.
Affiliation(s)
- Franziska Hufsky
- Chair for Bioinformatics, Friedrich-Schiller-University, Jena, Germany
|