Picaud V, Giovannelli JF, Truntzer C, Charrier JP, Giremus A, Grangeat P, Mercier C. Linear MALDI-ToF simultaneous spectrum deconvolution and baseline removal. BMC Bioinformatics 2018; 19:123. [PMID: 29621971 PMCID: PMC5887234 DOI: 10.1186/s12859-018-2116-3] [Received: 07/13/2017] [Accepted: 03/13/2018] [Indexed: 11/10/2022]
Abstract
Background: Thanks to its reasonable cost and simple sample preparation procedure, linear MALDI-ToF spectrometry is a growing technology for clinical microbiology. With appropriate spectrum databases, this technology can be used for early identification of pathogens in body fluids. However, due to the low resolution of linear MALDI-ToF instruments, robust and accurate peak picking remains a challenging task. In this context we propose a new algorithm for extracting peaks from raw spectra. With this method the spectrum baseline and spectrum peaks are processed jointly. The approach relies on an additive model consisting of a smooth baseline plus a sparse peak list convolved with a known peak shape. The model is then fitted under a Gaussian noise model. The proposed method is well suited to processing low-resolution spectra with a strong baseline and unresolved peaks.

Results: We developed a new peak deconvolution procedure. The paper describes the derivation of the method and discusses some of its interpretations. The algorithm is then described in pseudo-code form, where the required optimization procedure is detailed. On synthetic data the method is compared to a more conventional approach. The new method reduces artifacts caused by the usual two-step procedure of baseline removal followed by peak extraction. Finally, some results on real linear MALDI-ToF spectra are provided.

Conclusions: We introduced a new method for peak picking in which peak deconvolution and baseline computation are performed jointly. On simulated data we showed that this global approach performs better than a classical one in which baseline and peaks are processed sequentially. A dedicated experiment was conducted on real spectra. In this study a collection of spectra of spiked proteins was acquired and then analyzed. Better performance of the proposed method, in terms of accuracy and reproducibility, was observed and validated by an extended statistical analysis.
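The additive model named in the abstract (smooth baseline, plus a sparse peak list convolved with a known peak shape, plus Gaussian noise) can be illustrated with a minimal sketch of the forward model only. This is not the authors' fitting algorithm. The Gaussian peak shape, the exponential baseline, and every function name and parameter value below are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_peak_shape(half_width, sigma):
    """Hypothetical 'known' peak shape: a unit-height Gaussian kernel."""
    t = np.arange(-half_width, half_width + 1)
    return np.exp(-0.5 * (t / sigma) ** 2)

def synthesize_spectrum(n, peak_positions, peak_heights, shape, noise_std, rng):
    """Simulate y = baseline + (sparse peaks convolved with shape) + noise."""
    x = np.arange(n)
    # Assumed smooth, slowly decaying baseline (illustrative choice only).
    baseline = 5.0 * np.exp(-x / (0.3 * n))
    # Sparse peak list: zero everywhere except at the peak positions.
    sparse = np.zeros(n)
    sparse[peak_positions] = peak_heights
    # Convolution with the peak shape blurs each spike into a peak.
    peaks = np.convolve(sparse, shape, mode="same")
    # Additive Gaussian noise.
    noise = rng.normal(0.0, noise_std, size=n)
    return baseline, sparse, peaks, baseline + peaks + noise

rng = np.random.default_rng(0)
shape = gaussian_peak_shape(half_width=20, sigma=4.0)
baseline, sparse, peaks, y = synthesize_spectrum(
    n=500,
    peak_positions=[120, 130, 300],   # 120 and 130 overlap (unresolved peaks)
    peak_heights=[2.0, 1.5, 3.0],
    shape=shape,
    noise_std=0.05,
    rng=rng,
)
```

In this toy spectrum the two peaks at positions 120 and 130 overlap, which is the low-resolution situation the abstract refers to. Recovering `sparse` and `baseline` jointly from `y` is the estimation problem the paper addresses.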
Affiliation(s)
- Vincent Picaud
- University of Bordeaux, IMS, UMR 5218, Talence, 33400, France
- Caroline Truntzer
- CLIPP, Pôle de Recherche Université de Bourgogne, Dijon, 21000, France
- Jean-Philippe Charrier
- Technology Research Department, Innovation Unit, Marcy l'Étoile, bioMérieux SA, Marcy l'Étoile, 69280, France
- Audrey Giremus
- University of Bordeaux, IMS, UMR 5218, Talence, 33400, France
- Pierre Grangeat
- Université Grenoble Alpes, Grenoble, 38000, France; CEA, LETI, MINATEC Campus, DTBS, 17 Rue des Martyrs, Grenoble, 38054, France
- Catherine Mercier
- Service de Biostatistique, Hospices Civils de Lyon, Lyon, 69000, France; Université Lyon 1, Villeurbanne, 69100, France; CNRS UMR5558, Laboratoire de Biométrie et Biologie Évolutive, Équipe Biostatistique-Santé, Villeurbanne, 69100, France