1
Lu M, Jiang H, Wang R, An S, Wang J, Yu C. Injectiondesign: web service of plate design with optimized stratified block randomization for modern GC/LC-MS-based sample preparation. BMC Bioinformatics 2023; 24:489. [PMID: 38124029 PMCID: PMC10734102 DOI: 10.1186/s12859-023-05598-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 12/04/2023] [Indexed: 12/23/2023]
Abstract
BACKGROUND: Plate design is a necessary and time-consuming operation for GC/LC-MS-based sample preparation. The implementation of the inter-batch balancing algorithm and the intra-batch randomization algorithm can have a significant impact on the final results. For researchers without programming skills, a stable and efficient online service for plate design is necessary. RESULTS: Here we describe InjectionDesign, a free online plate design service focused on GC/LC-MS-based multi-omics experiment design. It separates the position design from the sequence design, making the output more compatible with the requirements of a modern mass spectrometer-based laboratory. In addition, it implements an optimized block randomization algorithm that is better suited to stratifying samples with an unbalanced distribution. It is easy to use, with built-in support for common instrument models and quick export to a worksheet. CONCLUSIONS: InjectionDesign is an open-source project based on Java. Researchers can get the source code for the project from GitHub: https://github.com/CSi-Studio/InjectionDesign. A free web service is also provided: http://www.injection.design.
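The idea of stratified block randomization mentioned in this abstract can be illustrated with a minimal sketch. This is not InjectionDesign's optimized algorithm, which the abstract does not specify; the function name `stratified_plates`, its parameters, and the round-robin dealing strategy are all assumptions made for illustration. Samples are shuffled within each stratum, dealt across plates so that every stratum is spread as evenly as possible even when strata have unequal sizes, and then shuffled within each plate.

```python
import random
from collections import defaultdict


def stratified_plates(samples, n_plates, seed=0):
    """Illustrative only: assign (sample_id, stratum) pairs to n_plates plates.

    Returns a list of plates, each a list of sample ids in randomized order.
    """
    rng = random.Random(seed)  # fixed seed makes the design reproducible

    # Group sample ids by stratum (e.g. case/control, sex, collection site).
    by_stratum = defaultdict(list)
    for sample_id, stratum in samples:
        by_stratum[stratum].append(sample_id)

    plates = [[] for _ in range(n_plates)]
    next_plate = 0  # carried across strata so plate sizes stay within one sample
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)  # random order within the stratum
        for sample_id in members:
            plates[next_plate].append(sample_id)  # deal round-robin
            next_plate = (next_plate + 1) % n_plates

    # Randomize the order of samples within each plate.
    for plate in plates:
        rng.shuffle(plate)
    return plates


if __name__ == "__main__":
    # Unbalanced example: 10 cases and 4 controls on 3 plates.
    demo = [(f"case{i}", "case") for i in range(10)]
    demo += [(f"ctrl{i}", "control") for i in range(4)]
    for index, plate in enumerate(stratified_plates(demo, 3, seed=1), start=1):
        print(f"plate {index}: {plate}")
```

With this dealing scheme, the number of samples from any one stratum differs by at most one between plates, which is the balancing property the sketch is meant to show.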
Affiliation(s)
- Miaoshan Lu
  - Zhejiang University, Hangzhou, Zhejiang, China
  - School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Hengxuan Jiang
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Ruimin Wang
  - School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Fudan University, Shanghai, China
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Shaowei An
  - School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Institute of Biology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, 310024, Zhejiang, China
  - Fudan University, Shanghai, China
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
- Jiawei Wang
  - Carbon Silicon (Hangzhou) Biotechnology Co., Ltd, Hangzhou, China
- Changbin Yu
  - Shandong First Medical University and Shandong Academy of Medical Sciences, Jinan, China
2
Burger B, Vaudel M, Barsnes H. Automated splitting into batches for observational biomedical studies with sequential processing. Biostatistics 2023; 24:1031-1044. [PMID: 35536588 PMCID: PMC10583723 DOI: 10.1093/biostatistics/kxac014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2021] [Revised: 04/07/2022] [Accepted: 04/08/2022] [Indexed: 10/19/2023]
Abstract
Experimental design usually focuses on the setting where treatments and/or other aspects of interest can be manipulated. However, in observational biomedical studies with sequential processing, the set of available samples is often fixed, and the problem is thus rather the ordering and allocation of samples to batches such that comparisons between different treatments can be made with similar precision. In certain situations, this allocation can be done by hand, but this rapidly becomes impractical with more challenging cohort setups. Here, we present a fast and intuitive algorithm to generate balanced allocations of samples to batches for any single-variable model where the treatment variable is nominal. This greatly simplifies the grouping of samples into batches, makes the process reproducible, and provides a marked improvement over completely random allocations. The general challenges of allocation and why good solutions can be hard to find are also discussed, as well as potential extensions to multivariable settings.
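To make the contrast with completely random allocation concrete, the following is a rough sketch of how the balance of an allocation might be scored. It is not the algorithm presented by the authors, which the abstract does not describe in detail; the `imbalance` score (sum of squared deviations from proportional group counts), the helper names, and the best-of-many-shuffles search are assumptions chosen only for illustration.

```python
import random
from collections import Counter


def imbalance(batches):
    """Illustrative score: squared deviation of each group's count in each
    batch from the count expected under proportional allocation (0 is ideal)."""
    all_labels = [label for batch in batches for label in batch]
    totals = Counter(all_labels)
    n = len(all_labels)
    score = 0.0
    for batch in batches:
        counts = Counter(batch)
        for group, total in totals.items():
            expected = total * len(batch) / n
            score += (counts[group] - expected) ** 2
    return score


def random_allocation(labels, batch_size, rng):
    """Completely random allocation: shuffle, then cut into consecutive batches."""
    shuffled = labels[:]
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]


def best_of_random(labels, batch_size, tries=200, seed=0):
    """Naive baseline: keep the least imbalanced of several random allocations."""
    rng = random.Random(seed)
    candidates = (random_allocation(labels, batch_size, rng) for _ in range(tries))
    return min(candidates, key=imbalance)


if __name__ == "__main__":
    labels = ["A"] * 6 + ["B"] * 6  # nominal treatment variable with two levels
    single = random_allocation(labels, 4, random.Random(0))
    print("one random draw:", imbalance(single))
    print("best of 200 draws:", imbalance(best_of_random(labels, 4)))
```

A score like this only ranks candidate allocations; a dedicated algorithm such as the one the paper presents is intended to construct balanced allocations directly rather than by repeated shuffling.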
Affiliation(s)
- Bram Burger
  - Computational Biology Unit (CBU), Department of Informatics, University of Bergen, 5008 Bergen, Norway
  - Proteomics Unit (PROBE), Department of Biomedicine, University of Bergen, 5020 Bergen, Norway
  - Department of Medical Genetics, Haukeland University Hospital, 5021 Bergen, Norway
- Marc Vaudel
  - Department of Clinical Science, University of Bergen, 5020 Bergen, Norway
- Harald Barsnes
  - Computational Biology Unit (CBU), Department of Informatics, University of Bergen, 5008 Bergen, Norway
  - Proteomics Unit (PROBE), Department of Biomedicine, University of Bergen, 5020 Bergen, Norway
3
Zamora Obando HR, Duarte GHB, Simionato AVC. Metabolomics Data Treatment: Basic Directions of the Full Process. Advances in Experimental Medicine and Biology 2021; 1336:243-264. [PMID: 34628635 DOI: 10.1007/978-3-030-77252-9_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
Abstract
The present chapter describes basic aspects of the main steps for data processing on mass spectrometry-based metabolomics platforms, focusing on the main objectives and important considerations of each step. Initially, an overview of metabolomics and the pivotal techniques applied in the field is presented. Important features of data acquisition and preprocessing, such as data compression, noise filtering, and baseline correction, are reviewed with a focus on practical aspects. Peak detection, deconvolution, and alignment, as well as missing values, are also discussed. Special attention is given to chemical and mathematical normalization approaches and the role of quality control (QC) samples. Methods for uni- and multivariate statistical analysis, and the data pretreatment that could affect them, are reviewed, emphasizing the most widely used multivariate methods, i.e., principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA), orthogonal partial least squares-discriminant analysis (OPLS-DA), and hierarchical cluster analysis (HCA). Criteria for model validation and software used in data processing are also addressed. The chapter ends with some concerns about the minimal requirements for reporting metadata in metabolomics.
Affiliation(s)
- Hans Rolando Zamora Obando
  - Department of Analytical Chemistry, Institute of Chemistry, University of Campinas, Campinas, SP, Brazil
4
Statistical analysis in metabolic phenotyping. Nat Protoc 2021; 16:4299-4326. [PMID: 34321638 DOI: 10.1038/s41596-021-00579-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 05/28/2020] [Accepted: 05/27/2021] [Indexed: 01/09/2023]
Abstract
Metabolic phenotyping is an important tool in translational biomedical research. The advanced analytical technologies commonly used for phenotyping, including mass spectrometry (MS) and nuclear magnetic resonance (NMR) spectroscopy, generate complex data requiring tailored statistical analysis methods. Detailed protocols have been published for data acquisition by liquid NMR, solid-state NMR, ultra-performance liquid chromatography (LC-)MS and gas chromatography (GC-)MS on biofluids or tissues and their preprocessing. Here we propose an efficient protocol (guidelines and software) for statistical analysis of metabolic data generated by these methods. Code for all steps is provided, and no prior coding skill is necessary. We offer efficient solutions for the different steps required within the complete phenotyping data analytics workflow: scaling, normalization, outlier detection, multivariate analysis to explore and model study-related effects, selection of candidate biomarkers, validation, multiple testing correction and performance evaluation of statistical models. We also provide a statistical power calculation algorithm and safeguards to ensure robust and meaningful experimental designs that deliver reliable results. We exemplify the protocol with a two-group classification study and data from an epidemiological cohort; however, the protocol can be easily modified to cover a wider range of experimental designs or incorporate different modeling approaches. This protocol describes a minimal set of analyses needed to rigorously investigate typical datasets encountered in metabolic phenotyping.