Study on outlier detection method of the near infrared spectroscopy analysis by
probability metric.
SPECTROCHIMICA ACTA. PART A, MOLECULAR AND BIOMOLECULAR SPECTROSCOPY 2022; 280:121473. [PMID: 35717926 DOI: 10.1016/j.saa.2022.121473]
[Received: 04/14/2022] [Revised: 05/29/2022] [Accepted: 06/03/2022] [Indexed: 06/15/2023]
Abstract
The high dimensionality and non-linearity of near-infrared (NIR) spectral data make outliers difficult to measure. This paper proposes a probability-based outlier detection method that uses a copula function to obtain the distribution probability of the spectral data and identify outliers at each wavelength. A negative logarithmic function is also used to quantify the overall variation of the joint distribution for the outliers. The method not only enlarges the spectral difference between typical samples and outliers but also adapts to multiple types of outliers. Moreover, the statistical jump degree is introduced to determine the outlier threshold automatically, which avoids both empirical threshold setting and the misjudgment of outliers. To evaluate the algorithm, it was applied to the recognition of different cases and types of outliers and compared with the commonly used PCA-Mahalanobis distance, spectral residual (SR), and leverage methods. The experimental results showed that the probability-based outlier detection method effectively improved outlier identification and calibration performance for NIR analysis.
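
The scoring and thresholding steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. As assumptions, it replaces the fitted copula with rank-based empirical tail probabilities at each wavelength, averages their negative logarithms into one score per spectrum, and reads "jump degree" as the largest gap between consecutive sorted scores. The function names `neg_log_tail_scores` and `largest_jump_threshold` and the synthetic spectra are hypothetical.

```python
import numpy as np


def neg_log_tail_scores(spectra):
    """Score each spectrum by the mean negative log tail probability.

    spectra: array of shape (n_samples, n_wavelengths).
    Assumption: empirical ranks stand in for a fitted copula.
    """
    X = np.asarray(spectra, dtype=float)
    n = X.shape[0]
    # Rank of every sample within each wavelength column, from 1 to n.
    ranks = X.argsort(axis=0).argsort(axis=0) + 1
    left = ranks / n               # empirical P(value <= x)
    right = (n - ranks + 1) / n    # empirical P(value >= x)
    tail = np.minimum(left, right)  # always >= 1/n, so the log is finite
    return -np.log(tail).mean(axis=1)


def largest_jump_threshold(scores):
    """Place the threshold at the largest gap in the upper half of the sorted scores.

    Assumption: this is only one possible reading of a "jump degree" rule.
    """
    s = np.sort(np.asarray(scores, dtype=float))
    if s.size < 3:
        raise ValueError("need at least 3 scores")
    start = s.size // 2            # ignore gaps among the lowest scores
    k = start + int(np.argmax(np.diff(s[start:])))
    return (s[k] + s[k + 1]) / 2.0


# Synthetic demonstration data (hypothetical, not from the paper).
rng = np.random.default_rng(0)
wavelengths = np.linspace(0.0, 1.0, 50)
baseline = np.sin(2 * np.pi * wavelengths)
typical = baseline + 0.01 * rng.standard_normal((40, 50))
shifted = baseline + 0.5 + 0.01 * rng.standard_normal((2, 50))
spectra = np.vstack([typical, shifted])  # rows 40 and 41 are the shifted spectra

scores = neg_log_tail_scores(spectra)
threshold = largest_jump_threshold(scores)
flagged = scores > threshold
print(np.flatnonzero(flagged))
```

Averaging over wavelengths keeps the score comparable across spectra of different lengths. A sum would rank samples identically, so the choice here is one of convenience.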