1. Wai Tsang K, Tsung F, Xu Z. Knockoff procedure for false discovery rate control in high-dimensional data streams. J Appl Stat 2023; 50:2970-2983. [PMID: 37808615 PMCID: PMC10557548 DOI: 10.1080/02664763.2023.2200496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Accepted: 04/03/2023] [Indexed: 10/10/2023]
Abstract
Motivated by applications to root-cause identification of faults in high-dimensional data streams that may have very limited samples after faults are detected, we consider multiple testing in models for multivariate statistical process control (SPC). With quick fault detection, only a small portion of the data streams can be assumed to be out-of-control (OC). Identifying those OC data streams while controlling the number of false discoveries is a long-standing problem. It is challenging because few OC samples are available once the process is terminated upon fault detection. Although several false discovery rate (FDR) controlling methods have been proposed, people may prefer other methods for quick detection. Building on the recently developed knockoff filter, we propose a knockoff procedure that can be combined with other fault detection methods: it does not change the stopping time, but may identify a different set of faults in order to control the FDR. A theorem for the FDR control of the proposed procedure is provided. Simulation studies show that the proposed procedure can control the FDR while maintaining high power. We also illustrate its performance in an application to semiconductor manufacturing processes that motivated this development.
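As a rough illustration of the generic knockoff filter selection step that this abstract builds on (not the authors' procedure for data streams), the sketch below applies the standard knockoff+ threshold to a vector of feature statistics `W`. It assumes that large positive values of `W` are evidence for a signal and that null statistics are symmetric about zero. The function name `knockoff_plus_select`, the example statistics, and the target level `q` are all illustrative assumptions.

```python
import numpy as np

def knockoff_plus_select(W, q=0.2):
    """Generic knockoff+ selection (hypothetical helper, not the paper's code).

    Returns the indices j with W[j] >= T and the threshold T, where T is the
    smallest t among the nonzero |W[j]| such that
        (1 + #{j: W[j] <= -t}) / max(1, #{j: W[j] >= t}) <= q.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds: distinct nonzero magnitudes, in increasing order.
    candidates = np.unique(np.abs(W[W != 0]))
    for t in candidates:
        # Negative statistics stand in for the unknown number of false positives.
        fdp_hat = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return np.flatnonzero(W >= t), t
    # No threshold meets the target level: select nothing.
    return np.array([], dtype=int), np.inf

# Illustrative statistics: six large positive values and four small ones.
W = np.array([5.0, 4.2, 3.9, 3.1, 2.8, 2.5, -0.3, 0.2, -0.1, -0.4])
selected, threshold = knockoff_plus_select(W, q=0.2)
print(selected, threshold)
```

The `1 +` in the numerator is what makes this the conservative knockoff+ variant; with six selections, the smallest achievable estimate is 1/6, so a target level below that would select nothing for this example.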
Affiliation(s)
- Ka Wai Tsang
- School of Data Science, The Chinese University of Hong Kong, Shenzhen, Guangdong 518172, People's Republic of China
- Fugee Tsung
- Department of Industrial Engineering and Decision Analytics, Hong Kong University of Science and Technology, Hong Kong
- Zhihao Xu
- Department of Statistics, University of Michigan, Ann Arbor, MI, USA
2. Xu F, Shu L, Li Y, Wang B. Joint Diagnosis of High-dimensional Process Mean and Covariance Matrix based on Bayesian Model Selection. Technometrics 2023. [DOI: 10.1080/00401706.2023.2182366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Affiliation(s)
- Feng Xu
- College of Science, Guilin University of Technology, Guilin, China
- Lianjie Shu
- Faculty of Business Administration, University of Macau, China
- Yanting Li
- Department of Industrial Engineering and Management, Shanghai Jiao Tong University, Shanghai, China
- Binhui Wang
- School of Management, Jinan University, Guangzhou, China
3. Zan X, Wang D, Xian X. Spatial Rank-Based Augmentation for Nonparametric Online Monitoring and Adaptive Sampling of Big Data Streams. Technometrics 2022. [DOI: 10.1080/00401706.2022.2143903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Xin Zan
- Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL 32611
- Di Wang
- Department of Industrial Engineering and Management, School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Xiaochen Xian
- Department of Industrial and Systems Engineering, University of Florida, Gainesville, FL 32611
4. Yue J, Liu L. A New Nonparametric Multivariate Control Scheme for Simultaneous Monitoring Changes in Location and Scale. Comput Math Methods Med 2022; 2022:3385825. [PMID: 35832137 PMCID: PMC9273427 DOI: 10.1155/2022/3385825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Revised: 06/20/2022] [Accepted: 06/21/2022] [Indexed: 11/18/2022]
Abstract
Real-time monitoring of the breast cancer index is becoming increasingly important, as it can help advance the diagnosis and treatment of breast cancer. In modern medical processes, simultaneously monitoring changes in the location and scale of observations makes control schemes convenient to implement, but it can be challenging. In this paper, we consider a new nonparametric control scheme for monitoring location and scale parameters in multivariate processes. The proposed method is easy to implement, and the performance of the proposed control procedure is discussed. We then compare the proposed scheme with some competing methods. Simulation results show that the proposed scheme can efficiently detect a range of shifts. The proposed chart can trigger an alert and detect changes in the breast cancer index in a timely manner.
Affiliation(s)
- Jin Yue
- College of Mathematics and Physics, Chengdu University of Technology, Chengdu 610059, China
- School of Mathematics and VC & VR Key Lab of Sichuan Province, Sichuan Normal University, Chengdu 610068, China
- Liu Liu
- College of Mathematics and Physics, Chengdu University of Technology, Chengdu 610059, China
- School of Mathematics and VC & VR Key Lab of Sichuan Province, Sichuan Normal University, Chengdu 610068, China
5. Zhang W, Mei Y. Bandit Change-Point Detection for Real-Time Monitoring High-Dimensional Data Under Sampling Control. Technometrics 2022; 65:33-43. [PMID: 36950530 PMCID: PMC10027391 DOI: 10.1080/00401706.2022.2054861] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/24/2020] [Revised: 03/09/2022] [Accepted: 03/12/2022] [Indexed: 10/18/2022]
Abstract
In many real-world problems involving real-time monitoring of high-dimensional streaming data, one wants to detect an undesired event or change quickly once it occurs, but under a sampling control constraint: in resource-constrained environments, one might be able to observe or use only selected components of the data for decision-making at each time step. In this paper, we propose incorporating multi-armed bandit approaches into sequential change-point detection to develop an efficient bandit change-point detection algorithm, based on the limiting Bayesian approach, that incorporates prior knowledge of potential changes. Our proposed algorithm, termed Thompson-Sampling-Shiryaev-Roberts-Pollak (TSSRP), consists of two policies per time step: the adaptive sampling policy applies the Thompson Sampling algorithm to balance exploration for acquiring long-term knowledge against exploitation for immediate reward gain, and the statistical decision policy fuses the local Shiryaev-Roberts-Pollak statistics via sum shrinkage techniques to determine whether to raise a global alarm. Extensive numerical simulations and case studies demonstrate the statistical and computational efficiency of our proposed TSSRP algorithm.
6. Hahn G. Online multivariate changepoint detection with type I error control and constant time/memory updates per series. Stat Probab Lett 2022. [DOI: 10.1016/j.spl.2021.109258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
7. Chen Y, Wang T, Samworth RJ. High-dimensional, multiscale online changepoint detection. J R Stat Soc Series B Stat Methodol 2022. [DOI: 10.1111/rssb.12447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Affiliation(s)
- Yudong Chen
- University of Cambridge, Cambridge, Cambridgeshire, UK
- Tengyao Wang
- London School of Economics and Political Science, London, UK
- University College London, London, UK
8. Gösmann J, Stoehr C, Heiny J, Dette H. Sequential change point detection in high dimensional time series. Electron J Stat 2022. [DOI: 10.1214/22-ejs2027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Holger Dette
- Department of Mathematics, Ruhr University Bochum
9. Xiang D, Qiu P, Wang D, Li W. Reliable Post-Signal Fault Diagnosis for Correlated High-Dimensional Data Streams. Technometrics 2021. [DOI: 10.1080/00401706.2021.1979100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Dongdong Xiang
- KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, China
- Peihua Qiu
- Department of Biostatistics, University of Florida, Gainesville, USA
- Dezhi Wang
- KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, China
- School of Mathematics and Statistics, Lanzhou University, Lanzhou, China
- Wendong Li
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
10. Gómez AME, Li D, Paynabar K. An Adaptive Sampling Strategy for Online Monitoring and Diagnosis of High-Dimensional Streaming Data. Technometrics 2021. [DOI: 10.1080/00401706.2021.1967198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/20/2022]
Affiliation(s)
- Dan Li
- School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA
- Kamran Paynabar
- School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA
11. Fang Z, Li W, Liu X, Pu X, Xiang D. Online monitoring of high-dimensional binary data streams with application to extreme weather surveillance. J Appl Stat 2021; 49:4122-4136. [DOI: 10.1080/02664763.2021.1971633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Zhiwen Fang
- KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, People's Republic of China
- Wendong Li
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, People's Republic of China
- Xin Liu
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, People's Republic of China
- Xiaolong Pu
- KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, People's Republic of China
- Dongdong Xiang
- KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, People's Republic of China
12. Yue J, Liu L. A dynamic sampling for monitoring nonparametric multivariate processes. Commun Stat Simul Comput 2021. [DOI: 10.1080/03610918.2021.1945628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
Affiliation(s)
- Jin Yue
- School of Mathematics and VC&VR Lab, Sichuan Normal University, Chengdu, Sichuan, China
- Liu Liu
- School of Mathematics and VC&VR Lab, Sichuan Normal University, Chengdu, Sichuan, China
13. Zhao W, Wang Z, Wu C. Adaptive multivariate EWMA charts for monitoring sparse mean shifts based on parameter optimization design. J Stat Comput Simul 2021. [DOI: 10.1080/00949655.2021.1904242] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/21/2022]
Affiliation(s)
- Wei Zhao
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, People's Republic of China
- Zhijun Wang
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, People's Republic of China
- Chunjie Wu
- School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, People's Republic of China
14. On the distribution of the T2 statistic, used in statistical process monitoring, for high-dimensional data. Stat Probab Lett 2021. [DOI: 10.1016/j.spl.2020.108919] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
15. Ren H, Zou C, Chen N, Li R. Large-Scale Datastreams Surveillance via Pattern-Oriented-Sampling. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2020.1819295] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
Affiliation(s)
- Haojie Ren
- School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, China
- Department of Statistics, The Pennsylvania State University at University Park, State College, PA
- Changliang Zou
- School of Statistics and Data Science, LPMC and KLMDASR, Nankai University, Tianjin, China
- Nan Chen
- Department of Industrial Systems Engineering and Management, National University of Singapore, Singapore
- Runze Li
- Department of Statistics, The Pennsylvania State University at University Park, State College, PA
16. Guo L, Modarres R. Two multivariate online change detection models. J Appl Stat 2020; 49:427-448. [PMID: 35707208 PMCID: PMC9196088 DOI: 10.1080/02664763.2020.1815674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2020] [Accepted: 08/22/2020] [Indexed: 10/23/2022]
Abstract
Online change point detection methods monitor changes in the distribution of a data stream. This article discusses two nonparametric online change detection methods based on the energy statistic and Mahalanobis depth. To apply the energy statistic, we use a sliding-window algorithm with efficient training and updating procedures. For Mahalanobis depth, we propose an algorithm to train the threshold with a desired level of protection against false alarms and discuss the factors that influence the threshold. Numerical studies evaluate and compare the performance of the proposed models with three existing methods for detecting changes in the mean and variability of a data stream. The methods are applied to detecting changes in the flow volume of the Mississippi River.
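To make the energy-statistic idea concrete, here is a minimal sketch of the two-sample energy distance between a reference window and a recent window of a stream. It is not the authors' algorithm: the helper names `mean_pairwise_dist` and `energy_distance`, the window sizes, and the simulated data are illustrative assumptions, and no threshold training or efficient updating is shown.

```python
import numpy as np

def mean_pairwise_dist(A, B):
    """Mean Euclidean distance over all pairs (a, b), a in A and b in B.

    A and B are 2-D arrays of shape (n, d) and (m, d).
    """
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).mean()

def energy_distance(X, Y):
    """Energy distance 2*E|X-Y| - E|X-X'| - E|Y-Y'| (V-statistic form).

    Hypothetical helper: values near zero suggest the two windows share a
    distribution, while larger values suggest a change.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    return (2.0 * mean_pairwise_dist(X, Y)
            - mean_pairwise_dist(X, X)
            - mean_pairwise_dist(Y, Y))

# Simulated example (assumed data): a reference window and two recent windows.
rng = np.random.default_rng(0)
reference = rng.normal(size=(200, 3))
recent_same = rng.normal(size=(50, 3))               # no change
recent_shifted = rng.normal(loc=1.0, size=(50, 3))   # mean shift in every component

stat_same = energy_distance(reference, recent_same)
stat_shifted = energy_distance(reference, recent_shifted)
print(stat_same, stat_shifted)
```

In a sliding-window scheme, the recent window would advance with each new observation and the statistic would be compared against a threshold calibrated on in-control data.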
Affiliation(s)
- Lingzhe Guo
- Department of Statistics, The George Washington University, Washington, DC, USA
- Reza Modarres
- Department of Statistics, The George Washington University, Washington, DC, USA
17.
Affiliation(s)
- Peihua Qiu
- Department of Biostatistics, University of Florida, Gainesville, FL
18. Li W, Xiang D, Tsung F, Pu X. A Diagnostic Procedure for High-Dimensional Data Streams via Missed Discovery Rate Control. Technometrics 2019. [DOI: 10.1080/00401706.2019.1575284] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 10/27/2022]
Affiliation(s)
- Wendong Li
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, China
- Dongdong Xiang
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, China
- Fugee Tsung
- Department of Industrial Engineering and Decision Analytics, Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Xiaolong Pu
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science-MOE, School of Statistics, East China Normal University, Shanghai, China
19. Yan H, Paynabar K, Shi J. Real-Time Monitoring of High-Dimensional Functional Data Streams via Spatio-Temporal Smooth Sparse Decomposition. Technometrics 2018. [DOI: 10.1080/00401706.2017.1346522] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Indexed: 10/19/2022]
Affiliation(s)
- Hao Yan
- Georgia Institute of Technology, Atlanta, GA
- Jianjun Shi
- Georgia Institute of Technology, Atlanta, GA
20. Deng L, Zi X, Li Z. False discovery rates for large-scale model checking under certain dependence. Commun Stat Theory Methods 2018. [DOI: 10.1080/03610926.2017.1300279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Affiliation(s)
- Lu Deng
- Institute of Statistics and LPMC, Nankai University, Tianjin, P. R. China
- Xuemin Zi
- School of Science, Tianjin University of Technology and Education, Tianjin, P. R. China
- Zhonghua Li
- Institute of Statistics and LPMC, Nankai University, Tianjin, P. R. China
21. Xian X, Wang A, Liu K. A Nonparametric Adaptive Sampling Strategy for Online Monitoring of Big Data Streams. Technometrics 2017. [DOI: 10.1080/00401706.2017.1317291] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 10/19/2022]
Affiliation(s)
- Xiaochen Xian
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI
- Andi Wang
- Department of Industrial Engineering and Logistics Management, Hong Kong University of Science and Technology, Clear Waterbay, NT, Hong Kong
- Kaibo Liu
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI
22. Lai TL, Tsang KW. Discussion on “Sequential detection/isolation of abrupt changes” by Igor V. Nikiforov. Seq Anal 2016. [DOI: 10.1080/07474946.2016.1206372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]