Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Leach DT, Stratton KG, Irvahn J, Richardson R, Webb-Robertson BJM, Bramer LM. malbacR: A Package for Standardized Implementation of Batch Correction Methods for Omics Data. Anal Chem 2023;95:12195-12199. [PMID: 37551970 DOI: 10.1021/acs.analchem.3c01289] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 08/09/2023]

Cited by Other Article(s)

Bramer LM, Nakayasu ES, Flores JE, Van Eyk JE, MacCoss MJ, Parikh HM, Metz TO, Webb-Robertson BJM. Data from a multi-year targeted proteomics study of a longitudinal birth cohort of type 1 diabetes. Sci Data 2025;12:112. [PMID: 39833216 PMCID: PMC11747092 DOI: 10.1038/s41597-024-04249-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2024] [Accepted: 12/05/2024] [Indexed: 01/22/2025] Open

Schumann Y, Gocke A, Neumann JE. Computational Methods for Data Integration and Imputation of Missing Values in Omics Datasets. Proteomics 2025;25:e202400100. [PMID: 39740174 DOI: 10.1002/pmic.202400100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Revised: 11/08/2024] [Accepted: 11/26/2024] [Indexed: 01/02/2025]

Abstract

Molecular profiling of different omic-modalities (e.g., DNA methylomics, transcriptomics, proteomics) in biological systems represents the basis for research and clinical decision-making. Measurement-specific biases, so-called batch effects, often hinder the integration of independently acquired datasets, and missing values further hamper the applicability of typical data processing algorithms. In addition to careful experimental design, well-defined standards in data acquisition and data exchange, the alleviation of these phenomena particularly requires a dedicated data integration and preprocessing pipeline. This review aims to give a comprehensive overview of computational methods for data integration and missing value imputation for omic data analyses. We provide formal definitions for missing value mechanisms and propose a novel statistical taxonomy for batch effects, especially in the presence of missing data. Based on an automated document search and systematic literature review, we describe 32 distinct data integration methods from five main methodological categories, as well as 37 algorithms for missing value imputation from five separate categories. Additionally, this review highlights multiple quantitative evaluation methods to aid researchers in selecting a suitable set of methods for their work. Finally, this work provides an integrated discussion of the relevance of batch effects and missing values in omics with corresponding method recommendations. We then propose a comprehensive three-step workflow from the study conception to final data analysis and deduce perspectives for future research. Eventually, we present a comprehensive flow chart as well as exemplary decision trees to aid practitioners in the selection of specific approaches for imputation and data integration in their studies.

Marković S, Jadranin M, Miladinović Z, Gavrilović A, Avramović N, Takić M, Tasic L, Tešević V, Mandić B. LC-HRMS Lipidomic Fingerprints in Serbian Cohort of Schizophrenia Patients. Int J Mol Sci 2024;25:10266. [PMID: 39408605 PMCID: PMC11476971 DOI: 10.3390/ijms251910266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2024] [Revised: 09/12/2024] [Accepted: 09/15/2024] [Indexed: 10/20/2024] Open

Abstract

Schizophrenia (SCH) is a major mental illness that causes impaired cognitive function and long-term disability, so the requirements for reliable biomarkers for early diagnosis and therapy of SCH are essential. The objective of this work was an untargeted lipidomic study of serum samples from a Serbian cohort including 30 schizophrenia (SCH) patients and 31 non-psychiatric control (C) individuals by applying liquid chromatography (LC) coupled with high-resolution mass spectrometry (HRMS) and chemometric analyses. Principal component analysis (PCA) of all samples indicated no clear separation between SCH and C groups but indicated clear gender separation in the C group. Multivariate statistical analyses (PCA and orthogonal partial least squares discriminant analysis (OPLS-DA)) of gender-differentiated SCH and C groups established forty-nine differential lipids in the differentiation of male SCH (SCH-M) patients and male controls (C-M), while sixty putative biomarkers were identified in the differentiation of female SCH patients (SCH-F) and female controls (C-F). Lipidomic study of gender-differentiated groups, between SCH-M and C-M and between SCH-F and C-F groups, confirmed that lipids metabolism was altered and the content of the majority of the most affected lipid classes, glycerophospholipids (GP), sphingolipids (SP), glycerolipids (GL) and fatty acids (FA), was decreased compared to controls. From differential lipid metabolites with higher content in both SCH-M and SCH-F patients groups compared to their non-psychiatric controls, there were four common lipid molecules: ceramides Cer 34:2, and Cer 34:1, lysophosphatidylcholine LPC 16:0 and triacylglycerol TG 48:2. Significant alteration of lipids metabolism confirmed the importance of metabolic pathways in the pathogenesis of schizophrenia.

Gong Y, Ding W, Wang P, Wu Q, Yao X, Yang Q. Evaluating Machine Learning Methods of Analyzing Multiclass Metabolomics. J Chem Inf Model 2023;63:7628-7641. [PMID: 38079572 DOI: 10.1021/acs.jcim.3c01525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/26/2023]

Abstract

Multiclass metabolomic studies have become popular for revealing the differences in multiple stages of complex diseases, various lifestyles, or the effects of specific treatments. In multiclass metabolomics, there are multiple data manipulation steps for analyzing raw data, which consist of data filtering, the imputation of missing values, data normalization, marker identification, sample separation, classification, and so on. In each step, several to dozens of machine learning methods can be chosen for the given data set, with potentially hundreds or thousands of method combinations in the whole data processing chain. Therefore, a clear understanding of these machine learning methods is helpful for selecting an appropriate method combination for obtaining stable and reliable analytical results of specific data. However, there has rarely been an overall introduction or evaluation of these methods based on multiclass metabolomic data. Herein, detailed descriptions of these machine learning methods in multiple data manipulation steps are reviewed. Moreover, an assessment of these methods was performed using a benchmark data set for multiclass metabolomics. First, 12 imputation methods for imputing missing values were evaluated based on the PSS (Procrustes statistical shape analysis) and NRMSE (normalized root-mean-square error) values. Second, 17 normalization methods for processing multiclass metabolomic data were evaluated by applying the PMAD (pooled median absolute deviation) value. Third, different methods of identifying markers of multiclass metabolomics were evaluated based on the CWrel (relative weighted consistency) value. Fourth, nine classification methods for constructing multiclass models were assessed using the AUC (area under the curve) value. Performance evaluations of machine learning methods are highly recommended to select the most appropriate method combination before performing the final analysis of the given data. Overall, detailed descriptions and evaluation of various machine learning methods are expected to improve analyses of multiclass metabolomic data.