1
Liu Y, Luo S, Li J. Hypothesis tests in ordinal predictive models with optimal accuracy. Biometrics 2024; 80:ujae079. [PMID: 39166461 DOI: 10.1093/biomtc/ujae079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Revised: 05/13/2024] [Accepted: 08/02/2024] [Indexed: 08/23/2024]
Abstract
In real-world applications involving multi-class ordinal discrimination, a common approach is to aggregate multiple predictive variables into a linear combination, aiming to develop a classifier with high prediction accuracy. Assessment of such multi-class classifiers often utilizes the hypervolume under ROC manifolds (HUM). When dealing with a substantial pool of potential predictors and achieving optimal HUM, it becomes imperative to conduct appropriate statistical inference. However, prevalent methodologies in existing literature are computationally expensive. We propose to use the jackknife empirical likelihood method to address this issue. The Wilks' theorem under moderate conditions is established and the power analysis under the Pitman alternative is provided. We also introduce a novel network-based rapid computation algorithm specifically designed for computing a general multi-sample $U$-statistic in our test procedure. To compare our approach against existing approaches, we conduct extensive simulations. Results demonstrate the superior performance of our method in terms of test size, power, and implementation time. Furthermore, we apply our method to analyze a real medical dataset and obtain some new findings.
Affiliation(s)
- Yuyang Liu
  - Shanghai Zhangjiang Institute of Mathematics, Shanghai, 201203, China
- Shan Luo
  - Department of Statistics, School of Mathematical Sciences, Shanghai Jiao Tong University, Shanghai, 200240, China
- Jialiang Li
  - Department of Statistics and Data Science, National University of Singapore, Singapore, 117546, Singapore
  - Duke University-NUS Graduate Medical School, National University of Singapore, Singapore, 169857, Singapore
2
Mosier BR, Bantis LE. Combining multiple biomarkers linearly to minimize the Euclidean distance of the closest point on the receiver operating characteristic surface to the perfection corner in trichotomous settings. Stat Methods Med Res 2024; 33:647-668. [PMID: 38445348 PMCID: PMC11234871 DOI: 10.1177/09622802241233768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
The performance of individual biomarkers in discriminating between two groups, typically the healthy and the diseased, may be limited. Thus, there is interest in developing statistical methodologies for biomarker combinations with the aim of improving upon the individual discriminatory performance. There is extensive literature referring to biomarker combinations under the two-class setting. However, the corresponding literature under a three-class setting is limited. In our study, we provide parametric and nonparametric methods that allow investigators to optimally combine biomarkers that seek to discriminate between three classes by minimizing the Euclidean distance from the receiver operating characteristic surface to the perfection corner. Using this Euclidean distance as the objective function allows for estimation of the optimal combination coefficients along with the optimal cutoff values for the combined score. An advantage of the proposed methods is that they can accommodate biomarker data from all three groups simultaneously, as opposed to a pairwise analysis such as the one implied by the three-class Youden index. We illustrate that the derived true classification rates exhibit narrower confidence intervals than those derived from the Youden-based approach under a parametric, flexible parametric, and nonparametric kernel-based framework. We evaluate our approaches through extensive simulations and apply them to real data sets that refer to liver cancer patients.
Affiliation(s)
- Brian R Mosier
  - University of Kansas Medical Center, Kansas City, KS, USA
  - EMB Statistical Solutions, LLC, KS, USA
3
Maiti R, Li J, Das P, Liu X, Feng L, Hausenloy DJ, Chakraborty B. A distribution-free smoothed combination method to improve discrimination accuracy in multi-category classification. Stat Methods Med Res 2023; 32:242-266. [PMID: 36384309 DOI: 10.1177/09622802221137742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Results from multiple diagnostic tests are combined in many ways to improve the overall diagnostic accuracy. For binary classification, maximization of the empirical estimate of the area under the receiver operating characteristic curve has widely been used to produce an optimal linear combination of multiple biomarkers. However, in the presence of a large number of biomarkers, this method proves to be computationally expensive and difficult to implement since it involves maximization of a discontinuous, non-smooth function for which gradient-based methods cannot be used directly. The complexity of this problem further increases when the classification problem becomes multi-category. In this article, we develop a linear combination method that maximizes a smooth approximation of the empirical Hyper-volume Under Manifolds for the multi-category outcome. We approximate HUM by replacing the indicator function with the sigmoid function and normal cumulative distribution function. With such smooth approximations, efficient gradient-based algorithms are employed to obtain better solutions with less computing time. We show that under some regularity conditions, the proposed method yields consistent estimates of the coefficient parameters. We derive the asymptotic normality of the coefficient estimates. A simulation study is performed to study the effectiveness of our proposed method as compared to other existing methods. The method is illustrated using two real medical data sets.
Affiliation(s)
- Raju Maiti
  - Economic Research Unit, Indian Statistical Institute Kolkata, Kolkata, India
- Jialiang Li
  - Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore
- Priyam Das
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Xueqing Liu
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
- Lei Feng
  - Department of Psychological Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- Derek J Hausenloy
  - Cardiovascular and Metabolic Disorders Program, Duke-NUS Medical School, Singapore, Singapore
  - National Heart Research Institute Singapore, National Heart Centre, Singapore, Singapore
  - Yong Loo Lin School of Medicine, National University Singapore, Singapore, Singapore
  - The Hatter Cardiovascular Institute, University College London, London, UK
  - Cardiovascular Research Center, College of Medical and Health Sciences, Asia University, Taichung
- Bibhas Chakraborty
  - Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore
  - Centre for Quantitative Medicine, Duke-NUS Medical School, Singapore, Singapore
  - Department of Biostatistics and Bioinformatics, Duke University, USA
4
Das P, De D, Maiti R, Kamal M, Hutcheson KA, Fuller CD, Chakraborty B, Peterson CB. Estimating the optimal linear combination of predictors using spherically constrained optimization. BMC Bioinformatics 2022; 23:436. [PMID: 36261805 PMCID: PMC9583504 DOI: 10.1186/s12859-022-04953-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2022] [Accepted: 09/19/2022] [Indexed: 01/01/2023]
Abstract
BACKGROUND: In the context of a binary classification problem, the optimal linear combination of continuous predictors can be estimated by maximizing the area under the receiver operating characteristic curve. For ordinal responses, the optimal predictor combination can similarly be obtained by maximization of the hypervolume under the manifold (HUM). Since the empirical HUM is discontinuous, non-differentiable, and possibly multi-modal, solving this maximization problem requires a global optimization technique. Estimation of the optimal coefficient vector using existing global optimization techniques is computationally expensive, becoming prohibitive as the number of predictors and the number of outcome categories increase.
RESULTS: We propose an efficient derivative-free black-box optimization technique based on pattern search to solve this problem, which we refer to as Spherically Constrained Optimization Routine (SCOR). Through extensive simulation studies, we demonstrate that the proposed method achieves better performance than existing methods including the step-down algorithm. Finally, we apply the proposed method to predict the severity of swallowing difficulty after radiation therapy for oropharyngeal cancer based on radiation dose to various structures in the head and neck.
CONCLUSIONS: Our proposed method addresses an important challenge in combining multiple biomarkers to predict an ordinal outcome. This problem is particularly relevant to medical research, where it may be of interest to diagnose a disease with various stages of progression or a toxicity with multiple grades of severity. We provide the implementation of our proposed SCOR method as an R package, available online at https://CRAN.R-project.org/package=SCOR.
Affiliation(s)
- Priyam Das
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Debsurya De
  - Indian Statistical Institute, Kolkata, India
- Raju Maiti
  - Centre for Quantitative Medicine, Duke-National University of Singapore Medical School, Singapore, Singapore
- Mona Kamal
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Katherine A Hutcheson
  - Department of Head and Neck Surgery, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Clifton D Fuller
  - Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Bibhas Chakraborty
  - Centre for Quantitative Medicine, Duke-National University of Singapore Medical School, Singapore, Singapore
  - Department of Statistics and Applied Probability, National University of Singapore, Singapore, Singapore
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
- Christine B Peterson
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
5
Hua J, Tian L. Combining multiple biomarkers to linearly maximize the diagnostic accuracy under ordered multi-class setting. Stat Methods Med Res 2021; 30:1101-1118. [PMID: 33522437 DOI: 10.1177/0962280220987587] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
In both clinical studies and biomedical research, it is common practice to combine multiple biomarkers to improve the overall diagnostic performance. Although a large number of statistical methods exist for biomarker combination under binary classification, research on this topic under the multi-class setting is sparse. The overall diagnostic accuracy, i.e. the sum of correct classification rates, directly measures the classification accuracy of the combined biomarkers. Hence the overall accuracy can serve as an important objective function for biomarker combination, especially when the combined biomarkers are used for the purpose of making a medical diagnosis. In this paper, we address the problem of combining multiple biomarkers to directly maximize the overall diagnostic accuracy by presenting several grid search methods and derivation-based methods. A comprehensive simulation study was conducted to compare the performance of these methods. Finally, an ovarian cancer data set is analyzed.
Affiliation(s)
- Jia Hua
  - Department of Biostatistics, School of Public Health and Health Professions, University at Buffalo, Buffalo, NY, USA
- Lili Tian
  - Department of Biostatistics, School of Public Health and Health Professions, University at Buffalo, Buffalo, NY, USA
6
Huang L, Li J. Weighted volume under the three-way receiver operating characteristic surface. Stat Methods Med Res 2018; 28:3627-3648. [PMID: 30453845 DOI: 10.1177/0962280218812211] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022]
Abstract
It is often necessary to differentiate subjects from multiple categories using medical tests. We may then adopt statistical measures to characterize the performance of these tests. The three-way ROC analysis has been proposed to evaluate the diagnostic accuracy of medical tests with three categories, reflecting the correct classification probabilities across all possible decision thresholds. The geometry of the ROC surface is carefully studied, leading to numerical summary measures such as the volume under the surface. This paper generalizes the global volume under the surface of three-way ROC analysis to the weighted volume under the surface (WVUS) by introducing a weight function emphasizing particular regions of correct classification probabilities. This generalization practically allows researchers to calculate the diagnostic accuracy for a medical or clinical biomarker while satisfactorily high probabilities of correct classification for one or two classes are conditionally ensured. We provide the asymptotic properties of the proposed nonparametric and parametric estimators of WVUS, which could easily lend support to statistical inferences. Some simulations have been conducted to assess the proposed estimators and also to demonstrate the necessity of WVUS. A real data analysis about liver cancer illustrates our methodology.
Affiliation(s)
- Lei Huang
  - Southwest Jiaotong University, School of Mathematics, Department of Statistics, Chengdu, China
- Jialiang Li
  - Duke University NUS Graduate Medical School, Singapore Eye Research Institute, National University of Singapore, Singapore, Singapore
7
Xu ZQ, Zhang P, Chai YQ, Wang HJ, Yuan R. A biosensor based on a 3D-DNA walking machine network and distance-controlled electrochemiluminescence energy transfer for ultrasensitive detection of tenascin C and lead ions. Chem Commun (Camb) 2018; 54:8741-8744. [DOI: 10.1039/c8cc04953j] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/15/2022]
Abstract
An electrochemiluminescence biosensor was proposed based on distance-controlled energy transfer and a 3D-DNA walking machine network.
Affiliation(s)
- Zi-Qi Xu
  - Key Laboratory on Luminescence and Real-Time Analytical Chemistry, Ministry of Education, School of Chemistry and Chemical Engineering, Southwest University, Chongqing 400715
- Pu Zhang
  - Key Laboratory on Luminescence and Real-Time Analytical Chemistry, Ministry of Education, School of Chemistry and Chemical Engineering, Southwest University, Chongqing 400715
- Ya-Qin Chai
  - Key Laboratory on Luminescence and Real-Time Analytical Chemistry, Ministry of Education, School of Chemistry and Chemical Engineering, Southwest University, Chongqing 400715
- Hai-Jun Wang
  - Key Laboratory on Luminescence and Real-Time Analytical Chemistry, Ministry of Education, School of Chemistry and Chemical Engineering, Southwest University, Chongqing 400715
- Ruo Yuan
  - Key Laboratory on Luminescence and Real-Time Analytical Chemistry, Ministry of Education, School of Chemistry and Chemical Engineering, Southwest University, Chongqing 400715