Glehr G, Riquelme P, Kronenberg K, Lohmayer R, López-Madrona VJ, Kapinsky M, Schlitt HJ, Geissler EK, Spang R, Haferkamp S, Hutchinson JA. Restricting datasets to classifiable samples augments discovery of immune disease biomarkers.
Nat Commun 2024;
15:5417. [PMID:
38926389 PMCID:
PMC11208602 DOI:
10.1038/s41467-024-49094-3]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2023] [Accepted: 05/14/2024] [Indexed: 06/28/2024] Open
Abstract
Immunological diseases are typically heterogeneous in clinical presentation, severity and response to therapy. Biomarkers of immune diseases often reflect this variability, especially compared to their regulated behaviour in health. This leads to a common difficulty that frustrates biomarker discovery and interpretation - namely, unequal dispersion of immune disease biomarker expression between patient classes necessarily limits a biomarker's informative range. To solve this problem, we introduce dataset restriction, a procedure that splits datasets into classifiable and unclassifiable samples. Applied to synthetic flow cytometry data, restriction identifies biomarkers that are otherwise disregarded. In advanced melanoma, restriction finds biomarkers of immune-related adverse event risk after immunotherapy and enables us to build multivariate models that accurately predict immunotherapy-related hepatitis. Hence, dataset restriction augments discovery of immune disease biomarkers, increases predictive certainty for classifiable samples and improves multivariate models incorporating biomarkers with a limited informative range. This principle can be directly extended to any classification task.
Collapse