Chu HO, Buchan E, Smith D, Goldberg Oppenheimer P. Development and application of an optimised Bayesian shrinkage prior for spectroscopic biomedical diagnostics.
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2024; 245:108014. [PMID: 38246097 DOI: 10.1016/j.cmpb.2024.108014]
[Received: 11/08/2023] [Revised: 01/06/2024] [Accepted: 01/08/2024] [Indexed: 01/23/2024]
Abstract
BACKGROUND AND OBJECTIVE
Classification of vibrational spectra is often challenging for biological substances because they contain similar molecular bonds, which interfere with spectral outputs. Various approaches have been widely studied to address this. However, whilst providing powerful estimations, these techniques are computationally intensive and frequently overfit the data. Shrinkage priors, which favour models with relatively few predictor variables, are often applied in Bayesian penalisation techniques to avoid overfitting.
METHODS
Using the logit-normal continuous analogue of the spike-and-slab (LN-CASS) as the shrinkage prior, we established a classification model for accurate spectral analysis, which was found to be faster than the conventional least absolute shrinkage and selection operator (LASSO), horseshoe or spike-and-slab approaches. These models were examined against coefficient data based on a linear regression model and against vibrational spectra produced via density functional theory calculations. They were then applied to Raman spectra from saliva to classify samples by sex.
RESULTS
When applied to the spectra acquired from saliva, the evaluated models exhibited high accuracy (AUC > 90%) even when the number of parameters was higher than the number of observations. Analyses of spectra with all Bayesian models yielded high classification accuracy upon cross-validation. Further, for saliva sensing, LN-CASS was the only classifier to achieve 100% accuracy in predicting the output under leave-one-out cross-validation.
CONCLUSIONS
The models have potential applications in aiding diagnosis from small spectroscopic datasets and are compatible with a range of spectroscopic data formats, as seen with the classification of IR and Raman spectra. These results are highly promising for emerging developments of spectroscopic platforms for biomedical diagnostic sensing systems.