1. Wang J, Tian L. Optimal Cut-Point Selection Methods Under Binary Classification When Subclasses Are Involved. Pharm Stat 2024. [PMID: 38972714 DOI: 10.1002/pst.2413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 05/21/2024] [Accepted: 05/30/2024] [Indexed: 07/09/2024]
Abstract
In practice, we often encounter binary classification problems where both main classes consist of multiple subclasses. For example, in an ovarian cancer study where biomarkers were evaluated for their accuracy in distinguishing noncancer cases from cancer cases, the noncancer class consists of healthy subjects and benign cases, while the cancer class consists of subjects at both early and late stages. This article aims to provide a large number of optimal cut-point selection methods for such settings. Furthermore, we also study confidence interval estimation of the optimal cut-points. Simulation studies are carried out to explore the performance of the proposed cut-point selection methods as well as confidence interval estimation methods. A real ovarian cancer data set is analyzed using the proposed methods.
Affiliation(s)
- Jia Wang
  - Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
2. Wang J, Yin J, Tian L. Evaluating joint confidence region of hypervolume under ROC manifold and generalized Youden index. Stat Med 2024; 43:869-889. [PMID: 38115806 DOI: 10.1002/sim.9998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Revised: 10/25/2023] [Accepted: 12/05/2023] [Indexed: 12/21/2023]
Abstract
In biomarker evaluation/diagnostic studies, the hypervolume under the receiver operating characteristic manifold ($\mathrm{HUM}_K$) and the generalized Youden index ($J_K$) are the most popular measures for assessing classification accuracy under multiple classes. While $\mathrm{HUM}_K$ is frequently used to evaluate the overall accuracy, $J_K$ provides direct measure of accuracy at the optimal cut-points. Simultaneous evaluation of $\mathrm{HUM}_K$ and $J_K$ provides a comprehensive picture about the classification accuracy of the biomarker/diagnostic test under consideration. This article studies both parametric and non-parametric approaches for estimating the confidence region of $\mathrm{HUM}_K$ and $J_K$ for a single biomarker. The performances of the proposed methods are investigated by an extensive simulation study and are applied to a real data set from the Alzheimer's Disease Neuroimaging Initiative.
Affiliation(s)
- Jia Wang
  - Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
- Jingjing Yin
  - Department of Biostatistics, Epidemiology and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, Georgia, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
3. Krishnamoorthy K, Murshed MM. Confidence estimation based on data from independent studies. Stat Methods Med Res 2024; 33:42-60. [PMID: 38055982 DOI: 10.1177/09622802231217644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
The problem of finding confidence intervals based on data from several independent studies or experiments is considered. A general method of finding confidence intervals by inverting a combined test is proposed. The combined tests considered are the Fisher test, the weighted inverse normal test, the inverse chi-square test and the inverse Cauchy test. The method is illustrated for finding confidence intervals for a common mean of several normal populations, common correlation coefficient of several bivariate normal populations, common coefficient of variation, common mean of several lognormal populations, and for a common mean of several gamma populations. For each case, the confidence intervals based on the combined tests are compared with the other available approximate confidence intervals with respect to coverage probability and precision. R functions to compute all confidence intervals are provided in a supplementary file. The methods are illustrated using several practical examples.
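As a rough illustration of one of the combined tests named in this abstract, the sketch below computes Fisher's combined p-value from independent p-values. It is not the authors' R code and it does not perform the test inversion that yields a confidence interval. The function name `fisher_combined_pvalue` is hypothetical, and the closed-form chi-square tail for even degrees of freedom is used so that only the standard library is needed.

```python
import math


def fisher_combined_pvalue(p_values):
    """Fisher's method: -2 * sum(log p_i) is chi-square with 2k df under H0."""
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    # half_stat is the Fisher statistic divided by 2.
    half_stat = -sum(math.log(p) for p in p_values)
    # Chi-square survival function with 2k df (closed form for even df):
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    tail = sum(half_stat ** i / math.factorial(i) for i in range(k))
    return math.exp(-half_stat) * tail
```

With a single study the combined p-value reduces to the study's own p-value, which is a convenient sanity check.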
Affiliation(s)
- Md Monzur Murshed
  - Department of Mathematics, University of Louisiana at Lafayette, Lafayette, LA, USA
4. Nan N, Tian L. A new accuracy metric under three classes when subclasses are involved and its confidence interval estimation. Stat Med 2023; 42:5207-5228. [PMID: 37779490 DOI: 10.1002/sim.9908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 07/26/2023] [Accepted: 09/04/2023] [Indexed: 10/03/2023]
Abstract
"Compound multi-class classification" refers to the setting where three or more main classes are involved and at least one of the main classes have multiple subclasses. A common practice in evaluating biomarker performance under "compound multi-class classification" is "subclasses pooling." In this article, we first explore the downsides of accuracy metrics based on pooled data. Then we propose a new accuracy measure proper for "compound multi-class classification" with three ordinal main classes, namely "volume under compound ROC surface ($\mathrm{VUS}_C$)." The proposed $\mathrm{VUS}_C$ evaluates the accuracy of a biomarker appropriately by identifying main classes without requiring specification of an ordering for marker values of subclasses within each main class. For confidence interval estimation of $\mathrm{VUS}_C$, both parametric and nonparametric methods are studied, and simulation studies are carried out to assess coverage probabilities. A subset of Alzheimer's Disease Neuroimaging Initiative study dataset is analyzed.
Affiliation(s)
- Nan Nan
  - Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
5. La-ongkaew M, Niwitpong SA, Niwitpong S. Estimating average wind speed in Thailand using confidence intervals for common mean of several Weibull distributions. PeerJ 2023; 11:e15513. [PMID: 37366422 PMCID: PMC10290832 DOI: 10.7717/peerj.15513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2022] [Accepted: 05/15/2023] [Indexed: 06/28/2023] [Open Access]
Abstract
The Weibull distribution has been used to analyze data from many fields, including engineering, survival and lifetime analysis, and weather forecasting, particularly wind speed data. It is useful to measure the central tendency of wind speed data in specific locations using statistical parameters for instance the mean to accurately forecast the severity of future catastrophic events. In particular, the common mean of several independent wind speed samples collected from different locations is a useful statistic. To explore wind speed data from several areas in Surat Thani province, a large province in southern Thailand, we constructed estimates of the confidence interval for the common mean of several Weibull distributions using the Bayesian equitailed confidence interval and the highest posterior density interval using the gamma prior. Their performances are compared with those of the generalized confidence interval and the adjusted method of variance estimates recovery based on their coverage probabilities and expected lengths. The results demonstrate that when the common mean is small and the sample size is large, the Bayesian highest posterior density interval performed the best since its coverage probabilities were higher than the nominal confidence level and it provided the shortest expected lengths. Moreover, the generalized confidence interval performed well in some scenarios whereas adjusted method of variance estimates recovery did not. The approaches were used to estimate the common mean of real wind speed datasets from several areas in Surat Thani province, Thailand, fitted to Weibull distributions. These results support the simulation results in that the Bayesian methods performed the best. Hence, the Bayesian highest posterior density interval is the most appropriate method for establishing the confidence interval for the common mean of several Weibull distributions.
Affiliation(s)
- Manussaya La-ongkaew
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
- Sa-Aat Niwitpong
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
- Suparat Niwitpong
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
6. Thangjai W, Niwitpong SA, Niwitpong S. Estimation of common percentile of rainfall datasets in Thailand using delta-lognormal distributions. PeerJ 2022; 10:e14498. [PMID: 36523461 PMCID: PMC9745915 DOI: 10.7717/peerj.14498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Accepted: 11/10/2022] [Indexed: 12/12/2022] [Open Access]
Abstract
Weighted percentiles in many areas can be used to investigate the overall trend in a particular context. In this article, the confidence intervals for the common percentile are constructed to estimate rainfall in Thailand. The confidence interval for the common percentile helps to indicate the intensity of rainfall. Herein, four new approaches for estimating confidence intervals for the common percentile of several delta-lognormal distributions are presented: the fiducial generalized confidence interval, the adjusted method of variance estimates recovery, and two Bayesian approaches using fiducial quantity and approximate fiducial distribution. Monte Carlo simulation was used to evaluate the coverage probabilities and average lengths via the R statistical program. The proposed confidence intervals are compared in terms of their coverage probabilities and average lengths, and the results of a comparative study based on these metrics indicate that one of the Bayesian confidence intervals is better than the others. The efficacies of the approaches are also illustrated by applying them to daily rainfall datasets from various regions in Thailand.
Affiliation(s)
- Warisa Thangjai
  - Department of Statistics, Faculty of Science, Ramkhamhaeng University, Bangkok, Thailand
- Sa-Aat Niwitpong
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
- Suparat Niwitpong
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
7. Point and Interval Estimation of Powers of Scale Parameters for Two Normal Populations with a Common Mean. Stat Pap (Berl) 2022. [DOI: 10.1007/s00362-022-01361-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
8. Gao Y, Tian L. Confidence interval estimation for the difference and ratio of the means of two gamma distributions. COMMUN STAT-SIMUL C 2022. [DOI: 10.1080/03610918.2022.2116646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Affiliation(s)
- Yi Gao
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
9. Khatun H, Tripathy MR, Pal N. Hypothesis testing and interval estimation for quantiles of two normal populations with a common mean. COMMUN STAT-THEOR M 2022. [DOI: 10.1080/03610926.2020.1845735] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Affiliation(s)
- Habiba Khatun
  - Department of Mathematics, National Institute of Technology Rourkela, Rourkela, India
- Manas Ranjan Tripathy
  - Department of Mathematics, National Institute of Technology Rourkela, Rourkela, India
- Nabendu Pal
  - Department of Mathematics, University of Louisiana at Lafayette, Lafayette, Louisiana, USA
10. Yan L. Confidence interval estimation of the common mean of several gamma populations. PLoS One 2022; 17:e0269971. [PMID: 35714130 PMCID: PMC9205481 DOI: 10.1371/journal.pone.0269971] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/20/2021] [Accepted: 06/01/2022] [Indexed: 11/18/2022] [Open Access]
Abstract
Gamma distributions are widely used in applied fields due to their flexibility in accommodating right-skewed data. Although inference methods for a single gamma mean have been well studied, research on the common mean of several gamma populations is sparse. This paper addresses the problem of confidence interval estimation of the common mean of several gamma populations using the concept of generalized inference and the method of variance estimates recovery (MOVER). Simulation studies demonstrate that several proposed approaches can provide confidence intervals with satisfactory coverage probabilities even at small sample sizes. The proposed methods are illustrated using two examples.
Affiliation(s)
- Li Yan
  - Department of Biostatistics and Bioinformatics, Roswell Park Comprehensive Cancer Center, Buffalo, NY, United States of America
11. Gao Y, Tian L. Interval estimation for the difference of two correlated gamma means: a generalized inference method and hybrid methods. J STAT COMPUT SIM 2022. [DOI: 10.1080/00949655.2022.2046747] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Yi Gao
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
12. Yin J, Samawi H, Tian L. Joint inference about the AUC and Youden index for paired biomarkers. Stat Med 2022; 41:37-64. [PMID: 34964512 DOI: 10.1002/sim.9222] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/07/2020] [Revised: 09/22/2021] [Accepted: 09/27/2021] [Indexed: 11/05/2022]
Abstract
It is common to compare biomarkers' diagnostic or prognostic performance using some summary ROC measures such as the area under the ROC curve (AUC) or the Youden index. We propose to compare two paired biomarkers using both the AUC and the Youden index since the two indices describe different aspects of the ROC curve. This comparison can be made by estimating the joint confidence region (an elliptical area) of the differences of the paired AUCs and the Youden indices. Furthermore, for deciding if one marker is better than the other in terms of both the AUC and the Youden index ($J$), we can test $H_0: \mathrm{AUC}_a \le \mathrm{AUC}_b$ or $J_a \le J_b$ against $H_a: \mathrm{AUC}_a > \mathrm{AUC}_b$ and $J_a > J_b$ using the paired differences. The construction of such a joint hypothesis is an example of the multivariate order-restricted hypotheses. For such a hypothesis, we propose and compare three testing procedures: (1) the intersection-union test (IUT); (2) the conditional test; and (3) the joint test. The performance of the proposed inference methods was evaluated and compared through simulations. The simulation results demonstrate that the proposed joint confidence region maintains the desired confidence level, and all three tests maintain the type I error under the null. Furthermore, among the three proposed testing methods, the conditional test is the preferred approach with markedly larger power consistently than the other two competing methods.
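For readers unfamiliar with the Youden index mentioned in this abstract, the following minimal sketch computes its empirical estimate, $J = \max_c \{\text{sensitivity}(c) + \text{specificity}(c) - 1\}$, together with the maximizing cut-off. It is an illustration only, not the paired-biomarker inference procedure of the paper. The function name `empirical_youden` is hypothetical, and the sketch assumes that larger marker values indicate disease.

```python
def empirical_youden(healthy, diseased):
    """Return (J, cutoff) maximizing sensitivity + specificity - 1.

    Assumes larger marker values indicate disease: a subject is called
    positive when its value is strictly greater than the cutoff.
    """
    if not healthy or not diseased:
        raise ValueError("both groups must be non-empty")
    best_j, best_c = None, None
    # Candidate cut-offs are the observed marker values.
    for c in sorted(set(healthy) | set(diseased)):
        sens = sum(x > c for x in diseased) / len(diseased)
        spec = sum(x <= c for x in healthy) / len(healthy)
        j = sens + spec - 1.0
        # Keep the first cut-off that attains the maximum.
        if best_j is None or j > best_j:
            best_j, best_c = j, c
    return best_j, best_c
```

For perfectly separated groups such as `healthy = [1, 2, 3]` and `diseased = [4, 5, 6]`, the index equals 1 at the cut-off 3.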
Affiliation(s)
- Jingjing Yin
  - Department of Biostatistics, Epidemiology and Environmental Health Sciences, Georgia Southern University, Statesboro, Georgia, USA
- Hani Samawi
  - Department of Biostatistics, Epidemiology and Environmental Health Sciences, Georgia Southern University, Statesboro, Georgia, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
13. Schaible BJ, Yin J. Joint confidence region estimation on predictive values. Pharm Stat 2021; 20:1147-1167. [PMID: 34021708 DOI: 10.1002/pst.2131] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/01/2020] [Revised: 04/29/2021] [Accepted: 05/01/2021] [Indexed: 01/04/2023]
Abstract
For evaluating diagnostic accuracy of inherently continuous diagnostic tests/biomarkers, sensitivity and specificity are well-known measures both of which depend on a diagnostic cut-off, which is usually estimated. Sensitivity (specificity) is the conditional probability of testing positive (negative) given the true disease status. However, a more relevant question is "what is the probability of having (not having) a disease if a test is positive (negative)?". Such post-test probabilities are denoted as positive predictive value (PPV) and negative predictive value (NPV). The PPV and NPV at the same estimated cut-off are correlated, hence it is desirable to make the joint inference on PPV and NPV to account for such correlation. Existing inference methods for PPV and NPV focus on the individual confidence intervals and they were developed under binomial distribution assuming binary instead of continuous test results. Several approaches are proposed to estimate the joint confidence region as well as the individual confidence intervals of PPV and NPV. Simulation results indicate the proposed approaches perform well with satisfactory coverage probabilities for normal and non-normal data and, additionally, outperform existing methods with improved coverage as well as narrower confidence intervals for PPV and NPV. The Alzheimer's Disease Neuroimaging Initiative (ADNI) data set is used to illustrate the proposed approaches and compare them with the existing methods.
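The relationship between the pre-test quantities (sensitivity, specificity) and the post-test probabilities (PPV, NPV) described in this abstract follows from Bayes' rule once a disease prevalence is supplied. The sketch below shows only that plug-in calculation. It is not the joint confidence region method of the paper, and the function name `predictive_values` and its `prevalence` argument are assumptions introduced for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' rule."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1]")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    # PPV = P(disease | test positive); undefined if no one tests positive.
    ppv = true_pos / (true_pos + false_pos) if true_pos + false_pos > 0 else float("nan")
    # NPV = P(no disease | test negative); undefined if no one tests negative.
    npv = true_neg / (true_neg + false_neg) if true_neg + false_neg > 0 else float("nan")
    return ppv, npv
```

Because both values are computed at the same cut-off (through the same sensitivity and specificity), their estimates are correlated, which is the motivation for joint inference stated above.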
Affiliation(s)
- Braydon J Schaible
  - Department of Biostatistics, Epidemiology and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, Georgia, USA
- Jingjing Yin
  - Department of Biostatistics, Epidemiology and Environmental Health Sciences, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, Georgia, USA
14. Gao Y, Tian L. Confidence interval estimation for sensitivity and difference between two sensitivities at a given specificity under tree ordering. Stat Med 2021; 40:3695-3723. [PMID: 33906262 DOI: 10.1002/sim.8993] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/18/2020] [Revised: 03/24/2021] [Accepted: 04/01/2021] [Indexed: 11/07/2022]
Abstract
This article considers a setting in diagnostic studies (or biomarker study) which involves a healthy class and a diseased class and the latter consists of several subclasses. The problem of interest is to evaluate the accuracy of a biomarker (or a diagnostic test) measured on a continuous scale correctly identifying healthy subjects from diseased subjects without requiring specification of an ordering in terms of marker values for subclasses relative to each other within the diseased class. Such setting is quite common in practice and it falls in the framework of tree ordering or umbrella ordering. This article explores several parametric and nonparametric approaches for estimating confidence intervals of sensitivity of single biomarker and difference between sensitivities of two correlated biomarkers under tree ordering at a given specificity. The performances of all the methods are evaluated and compared by a comprehensive simulation study. A published microarray data set is analyzed using the proposed methods.
Affiliation(s)
- Yi Gao
  - Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
15. Thangjai W, Niwitpong SA, Niwitpong S. Confidence intervals for the common coefficient of variation of rainfall in Thailand. PeerJ 2020; 8:e10004. [PMID: 33005493 PMCID: PMC7513754 DOI: 10.7717/peerj.10004] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/17/2020] [Accepted: 08/30/2020] [Indexed: 11/20/2022] [Open Access]
Abstract
The log-normal distribution is often used to analyze environmental data like daily rainfall amounts. Rainfall is of interest in Thailand because highly variable climates can lead to periodic water stress and scarcity. The mean, standard deviation or coefficient of variation of the rainfall in the area is usually estimated. The climate moisture index is the ratio of plant water demand to precipitation. The climate moisture index should use the coefficient of variation instead of the standard deviation for comparison between areas with widely different means. A larger coefficient of variation indicates greater dispersion, whereas a lower coefficient of variation indicates lower risk. The common coefficient of variation, which is the weighted coefficient of variation based on k areas, presents the average daily rainfall. Therefore, the common coefficient of variation is used to describe overall water problems of k areas. In this paper, we propose four novel approaches for the confidence interval estimation of the common coefficient of variation of log-normal distributions based on the fiducial generalized confidence interval (FGCI), method of variance estimates recovery (MOVER), computational, and Bayesian approaches. A Monte Carlo simulation was used to evaluate the coverage probabilities and average lengths of the confidence intervals. In terms of coverage probability, the results show that the FGCI approach provided the best confidence interval estimates for most cases except when there were six populations ($k = 6$) and the sample sizes were small ($n_i < 50$), for which the MOVER confidence interval estimates were the best. The efficacies of the proposed approaches are illustrated with an example using real-life daily rainfall datasets from regions of Thailand.
Affiliation(s)
- Warisa Thangjai
  - Department of Statistics, Faculty of Science, Ramkhamhaeng University, Bangkok, Thailand
- Sa-Aat Niwitpong
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut's University of Technology North Bangkok, Bangkok, Thailand
- Suparat Niwitpong
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut's University of Technology North Bangkok, Bangkok, Thailand
16. Kim DH, Lee WD, Kang SG, Kim Y. Noninformative priors for linear combinations of normal means with unequal variances. J Korean Stat Soc 2018. [DOI: 10.1016/j.jkss.2018.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
17. Thangjai W, Niwitpong SA, Niwitpong S. Adjusted generalized confidence intervals for the common coefficient of variation of several normal populations. COMMUN STAT-SIMUL C 2018. [DOI: 10.1080/03610918.2018.1484138] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/27/2022]
Affiliation(s)
- Warisa Thangjai
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
- Sa-Aat Niwitpong
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
- Suparat Niwitpong
  - Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
18. Malekzadeh A, Kharrati-Kopaei M. Inferences on the common mean of several normal populations under heteroscedasticity. Comput Stat 2018. [DOI: 10.1007/s00180-017-0789-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 10/18/2022]
19. Feng Y, Tian L. Measuring diagnostic accuracy for biomarkers under tree-ordering. Stat Methods Med Res 2018; 28:1328-1346. [DOI: 10.1177/0962280218755810] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/19/2023]
Abstract
In the field of diagnostic studies for tree or umbrella ordering, under which the marker measurement for one class is lower or higher than those for the rest unordered classes, there exist a few diagnostic measures such as the naive AUC (NAUC), the umbrella volume (UV), and the recently proposed TAUC, i.e. area under a ROC curve for tree or umbrella ordering (TROC). However, an important characteristic about tree or umbrella ordering has been neglected. This paper mainly focuses on promoting the use of the integrated false negative rate under tree ordering (ITFNR) as an additional diagnostic measure besides TAUC, and proposing the idea of using (TAUC, ITFNR) instead of TAUC to evaluate the diagnostic accuracy of a biomarker under tree or umbrella ordering. Parametric and non-parametric approaches for constructing joint confidence region of (TAUC, ITFNR) are proposed. Simulation studies under a variety of settings are carried out to assess and compare the performance of these methods. In the end, a published microarray data set is analyzed.
Affiliation(s)
- Yingdong Feng
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
20. Yin J, Nakas CT, Tian L, Reiser B. Confidence intervals for differences between volumes under receiver operating characteristic surfaces (VUS) and generalized Youden indices (GYIs). Stat Methods Med Res 2017; 27:675-688. [PMID: 29233075 DOI: 10.1177/0962280217740787] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/17/2022]
Abstract
This article explores both existing and new methods for the construction of confidence intervals for differences of indices of diagnostic accuracy of competing pairs of biomarkers in three-class classification problems and fills the methodological gaps for both parametric and non-parametric approaches in the receiver operating characteristic surface framework. The most widely used such indices are the volume under the receiver operating characteristic surface and the generalized Youden index. We describe implementation of all methods and offer insight regarding the appropriateness of their use through a large simulation study with different distributional and sample size scenarios. Methods are illustrated using data from the Alzheimer's Disease Neuroimaging Initiative study, where assessment of cognitive function naturally results in a three-class classification setting.
Affiliation(s)
- Jingjing Yin
  - Department of Biostatistics, Jiann-Ping Hsu College of Public Health, Georgia Southern University, Statesboro, GA, USA
- Christos T Nakas
  - Laboratory of Biometry, School of Agriculture, University of Thessaly, Volos, Greece
  - University Institute of Clinical Chemistry, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Lili Tian
  - Department of Biostatistics, University at Buffalo, Buffalo, NY, USA
- Benjamin Reiser
  - Department of Statistics, University of Haifa, Haifa, Israel
22. Kang SG, Lee WD, Kim Y. Objective Bayesian testing on the common mean of several normal distributions under divergence-based priors. Comput Stat 2016. [DOI: 10.1007/s00180-016-0699-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/20/2022]
23. Guo X, Wu H, Li G, Li Q. Inference for the common mean of several Birnbaum–Saunders populations. J Appl Stat 2016. [DOI: 10.1080/02664763.2016.1189521] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/21/2022]
Affiliation(s)
- Xu Guo
  - College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing, People's Republic of China
- Hecheng Wu
  - College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing, People's Republic of China
- Gaorong Li
  - Beijing Institute for Scientific and Engineering Computing, Beijing University of Technology, Beijing, People's Republic of China
- Qiuyue Li
  - College of Science, China Agricultural University, Beijing, People's Republic of China
25
|
Wang D, Attwood K, Tian L. Receiver operating characteristic analysis under tree orderings of disease classes. Stat Med 2015; 35:1907-26. [DOI: 10.1002/sim.6843] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2015] [Revised: 11/15/2015] [Accepted: 11/19/2015] [Indexed: 11/11/2022]
Affiliation(s)
- Dan Wang
- Department of Biostatistics & Bioinformatics; Roswell Park Cancer Institute; Elm and Carlton Streets Buffalo 14263 NY U.S.A
- Department of Biostatistics; SUNY University at Buffalo; 3435 Main St. Buffalo 14214 NY U.S.A
| | - Kristopher Attwood
- Department of Biostatistics & Bioinformatics; Roswell Park Cancer Institute; Elm and Carlton Streets Buffalo 14263 NY U.S.A
| | - Lili Tian
- Department of Biostatistics & Bioinformatics; Roswell Park Cancer Institute; Elm and Carlton Streets Buffalo 14263 NY U.S.A
- Department of Biostatistics; SUNY University at Buffalo; 3435 Main St. Buffalo 14214 NY U.S.A
| |
|
26
|
Yin J, Tian L. Joint inference about sensitivity and specificity at the optimal cut-off point associated with Youden index. Comput Stat Data Anal 2014. [DOI: 10.1016/j.csda.2014.01.021] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 10/25/2022]
|
27
|
|
28
|
Dong T, Kang L, Hutson A, Xiong C, Tian L. Confidence interval estimation of the difference between two sensitivities to the early disease stage. Biom J 2013; 56:270-86. [PMID: 24265123 DOI: 10.1002/bimj.201200012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 01/09/2012] [Revised: 06/18/2013] [Accepted: 08/26/2013] [Indexed: 11/11/2022]
Abstract
Although most of the statistical methods for diagnostic studies focus on disease processes with binary disease status, many diseases can be naturally classified into three ordinal diagnostic categories, that is normal, early stage, and fully diseased. For such diseases, the volume under the ROC surface (VUS) is the most commonly used index of diagnostic accuracy. Because the early disease stage is most likely the optimal time window for therapeutic intervention, the sensitivity to the early diseased stage has been suggested as another diagnostic measure. For the purpose of comparing the diagnostic abilities on early disease detection between two markers, it is of interest to estimate the confidence interval of the difference between sensitivities to the early diseased stage. In this paper, we present both parametric and non-parametric methods for this purpose. An extensive simulation study is carried out for a variety of settings for the purpose of evaluating and comparing the performance of the proposed methods. A real example of Alzheimer's disease (AD) is analyzed using the proposed approaches.
Affiliation(s)
- Tuochuan Dong
- Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, USA
| | | | | | | | | |
|
29
|
Yin J, Tian L. Joint confidence region estimation for area under ROC curve and Youden index. Stat Med 2013; 33:985-1000. [PMID: 24123069 DOI: 10.1002/sim.5992] [Citation(s) in RCA: 102] [Impact Index Per Article: 9.3] [Received: 03/18/2013] [Revised: 08/27/2013] [Accepted: 08/28/2013] [Indexed: 11/07/2022]
Abstract
In the field of diagnostic studies, the area under the ROC curve (AUC) serves as an overall measure of a biomarker/diagnostic test's accuracy. Youden index, defined as the overall correct classification rate minus one at the optimal cut-off point, is another popular index. For continuous biomarkers of binary disease status, although researchers mainly evaluate the diagnostic accuracy using AUC, for the purpose of making diagnosis, Youden index provides an important and direct measure of the diagnostic accuracy at the optimal threshold and hence should be taken into consideration in addition to AUC. Furthermore, AUC and Youden index are generally correlated. In this paper, we initiate the idea of evaluating diagnostic accuracy based on AUC and Youden index simultaneously. As the first step toward this direction, this paper only focuses on the confidence region estimation of AUC and Youden index for a single marker. We present both parametric and non-parametric approaches for estimating joint confidence region of AUC and Youden index. We carry out extensive simulation study to evaluate the performance of the proposed methods. In the end, we apply the proposed methods to a real data set.
Affiliation(s)
- Jingjing Yin
- Department of Biostatistics, University at Buffalo, Buffalo, NY, 14214-3000, U.S.A
| | | |
|
30
|
Tian L. Confidence Interval Estimation for Sensitivity at a Fixed Level of Specificity for Combined Biomarkers. J Biopharm Stat 2013; 23:499-512. [DOI: 10.1080/10543406.2011.616967] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
Affiliation(s)
- Lili Tian
- Department of Biostatistics, University at Buffalo, Buffalo, New York, USA
| |
|
31
|
Abstract
Recently, the concept of generalized treatment effect, defined as P(X > Y) where X and Y denote continuous outcome variables for treatment arm and control arm, respectively, has been proposed as an appropriate measure of treatment effect in clinical trials with parallel design. Compared to the mean difference, the generalized treatment effect has many advantages; for example, it is a scaleless measure and it does not change under monotonic transformations. This article investigates the problem of testing equality of generalized treatment effects among several clinical trials. The proposed approach follows the same vein as the generalized variable method for testing equality of several log-normal means proposed by Li (2009). Numerical study demonstrates that the proposed test has excellent type I error control for clinical trials with small to medium sample sizes. Robustness study shows that the proposed method performs reasonably for categorical data.
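The generalized treatment effect P(X > Y) described in this abstract can be illustrated with a simple empirical estimate. The following is a minimal sketch only, not the generalized variable test proposed in the article: the function name `generalized_treatment_effect` is hypothetical, and counting ties as one half is an assumption borrowed from the Mann-Whitney convention (ties have probability zero for truly continuous outcomes).

```python
import math


def generalized_treatment_effect(treatment, control):
    """Empirical estimate of P(X > Y) from two independent samples.

    Every (x, y) pair is compared; a tie contributes 0.5 (an assumption,
    following the Mann-Whitney convention).
    """
    wins = 0.0
    for x in treatment:
        for y in control:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(treatment) * len(control))


# Illustrative, made-up outcome values for the two arms.
treatment = [3.0, 5.0, 7.0]
control = [1.0, 4.0, 6.0]
effect = generalized_treatment_effect(treatment, control)

# A strictly increasing transformation (here exp) leaves the estimate
# unchanged, because only the ordering of the values matters.
effect_transformed = generalized_treatment_effect(
    [math.exp(v) for v in treatment],
    [math.exp(v) for v in control],
)
```

Because the estimate depends only on ranks, `effect` and `effect_transformed` agree, which is the scale-free property the abstract refers to.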
Affiliation(s)
- Lili Tian
- Department of Biostatistics, SUNY at Buffalo, Buffalo, NY 14214-3000, USA.
| | | | | |
|
32
|
Li X, Williamson PP. Testing on the common mean of normal distributions using Bayesian methods. J Stat Comput Simul 2012. [DOI: 10.1080/00949655.2012.744838] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/27/2022]
|
33
|
Ye RD, Ma TF, Luo K. Inferences on the reliability in balanced and unbalanced one-way random models. J Stat Comput Simul 2012. [DOI: 10.1080/00949655.2012.741598] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/27/2022]
|
34
|
Affiliation(s)
- S. H. Lin
- National Taichung University of Science and Technology, Taichung, Taiwan
| | - R. S. Wang
- National Taichung University of Science and Technology, Taichung, Taiwan
| |
|
35
|
Ye RD, Hu YQ, Luo K. Inferences on the Among-Group Variance Component in Unbalanced Heteroscedastic One-Fold Nested Design. Commun Stat Simul Comput 2012. [DOI: 10.1080/03610918.2011.594533] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/16/2022]
|
36
|
Dong T, Tian L, Hutson A, Xiong C. Parametric and non-parametric confidence intervals of the probability of identifying early disease stage given sensitivity to full disease and specificity with three ordinal diagnostic groups. Stat Med 2011; 30:3532-45. [PMID: 22139763 DOI: 10.1002/sim.4401] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 02/16/2011] [Accepted: 08/12/2011] [Indexed: 12/14/2022]
Abstract
In practice, there exist many disease processes with three ordinal disease classes, that is, the non-diseased stage, the early disease stage, and the fully diseased stage. Because early disease stage is likely the best time window for treatment interventions, it is important to have diagnostic tests that have good diagnostic ability to discriminate the early disease stage from the other two stages. In this paper, we present both parametric and non-parametric approaches for confidence interval estimation of probability of detecting early disease stage given the true classification rates for non-diseased group and diseased group, namely, the specificity and the sensitivity to full disease. We analyze a data set on the clinical diagnosis of early-stage Alzheimer's disease from the neuropsychological database at the Washington University Alzheimer's Disease Research Center using the proposed approaches.
Affiliation(s)
- Tuochuan Dong
- Department of Biostatistics, University at Buffalo, Buffalo, NY 14214-3000, USA
| | | | | | | |
|
37
|
|
38
|
Generalized confidence intervals for the process capability indices in general random effect model with balanced data. Stat Pap (Berl) 2011. [DOI: 10.1007/s00362-009-0216-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
|
39
|
Lai CY, Tian L, Schisterman EF. Exact confidence interval estimation for the Youden index and its corresponding optimal cut-point. Comput Stat Data Anal 2010; 56:1103-1114. [PMID: 27099407 DOI: 10.1016/j.csda.2010.11.023] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.4] [Indexed: 11/19/2022]
Abstract
In diagnostic studies, the receiver operating characteristic (ROC) curve and the area under the ROC curve are important tools in assessing the utility of biomarkers in discriminating between non-diseased and diseased populations. For classifying a patient into the non-diseased or diseased group, an optimal cut-point of a continuous biomarker is desirable. Youden's index (J), defined as the maximum vertical distance between the ROC curve and the diagonal line, serves as another global measure of overall diagnostic accuracy and can be used in choosing an optimal cut-point. The proposed approach is to make use of a generalized approach to estimate the confidence intervals of the Youden index and its corresponding optimal cut-point. Simulation results are provided for comparing the coverage probabilities of the confidence intervals based on the proposed method with those based on the large sample method and the parametric bootstrap method. Finally, the proposed method is illustrated via an application to a data set from a study on Duchenne muscular dystrophy (DMD).
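The definition of Youden's index in this abstract, the maximum vertical distance between the ROC curve and the diagonal, is equivalent to maximizing sensitivity + specificity - 1 over candidate cut-points. The following is a minimal sketch of the plain empirical estimate, not the generalized confidence interval approach of the article. The function name `empirical_youden` is hypothetical, and the sketch assumes that larger marker values indicate disease.

```python
def empirical_youden(non_diseased, diseased):
    """Return (J, cut_point) maximizing sensitivity + specificity - 1.

    Assumption: a subject is classified as diseased when its marker
    value exceeds the cut-point. Candidate cut-points are the observed
    values. If no cut-point gives J > 0, (0.0, None) is returned.
    """
    candidates = sorted(set(non_diseased) | set(diseased))
    best_j, best_cut = 0.0, None
    for c in candidates:
        # Specificity: share of non-diseased subjects at or below c.
        specificity = sum(x <= c for x in non_diseased) / len(non_diseased)
        # Sensitivity: share of diseased subjects strictly above c.
        sensitivity = sum(y > c for y in diseased) / len(diseased)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_j, best_cut = j, c
    return best_j, best_cut


# Illustrative, made-up marker values.
j_hat, cut_hat = empirical_youden([1, 2, 3, 4], [3, 5, 6, 7])
```

With these made-up values the best cut-point is 4: every non-diseased subject falls at or below it and three of the four diseased subjects lie above it.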
Affiliation(s)
- Chin-Ying Lai
- Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, USA
| | - Lili Tian
- Department of Biostatistics, University at Buffalo, Buffalo, NY 14214, USA
| | - Enrique F Schisterman
- Division of Epidemiology, Statistics and Prevention Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, DHHS, 6100 Executive Blvd, 7B03, Rockville, Bethesda, MD, USA
| |
|
40
|
Tian L, Xiong C, Lai CY, Vexler A. Exact confidence interval estimation for the difference in diagnostic accuracy with three ordinal diagnostic groups. J Stat Plan Inference 2010; 141:549-558. [PMID: 23538945 DOI: 10.1016/j.jspi.2010.07.004] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 11/29/2022]
Abstract
In the cases with three ordinal diagnostic groups, the important measures of diagnostic accuracy are the volume under surface (VUS) and the partial volume under surface (PVUS) which are the extended forms of the area under curve (AUC) and the partial area under curve (PAUC). This article addresses confidence interval estimation of the difference in paired VUSs and the difference in paired PVUSs. To focus especially on studies with small to moderate sample sizes, we propose an approach based on the concepts of generalized inference. A Monte Carlo study demonstrates that the proposed approach generally can provide confidence intervals with reasonable coverage probabilities even at small sample sizes. The proposed approach is compared to a parametric bootstrap approach and a large sample approach through simulation. Finally, the proposed approach is illustrated via an application to a data set of blood test results of anemia patients.
Affiliation(s)
- Lili Tian
- Department of Biostatistics, University at Buffalo, 249 Farber Hall, 3435 Main St. Bldg. 26 Buffalo, NY 14214-3000, USA
| | | | | | | |
|
41
|
Ye RD, Ma TF, Wang SG. Inferences on the common mean of several inverse Gaussian populations. Comput Stat Data Anal 2010. [DOI: 10.1016/j.csda.2009.09.039] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 10/20/2022]
|
42
|
Confidence interval estimation of partial area under curve based on combined biomarkers. Comput Stat Data Anal 2010. [DOI: 10.1016/j.csda.2009.09.016] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022]
|
43
|
Tian L, Vexler A, Yan L, Schisterman EF. Confidence interval estimation of the difference between paired AUCs based on combined biomarkers. J Stat Plan Inference 2009; 139:3725-3732. [PMID: 19946609 DOI: 10.1016/j.jspi.2009.05.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
In many diagnostic studies, multiple diagnostic tests are performed on each subject or multiple disease markers are available. Commonly, the information should be combined to improve the diagnostic accuracy. We consider the problem of comparing the discriminatory abilities between two groups of biomarkers. Specifically, this article focuses on confidence interval estimation of the difference between paired AUCs based on optimally combined markers under the assumption of multivariate normality. Simulation studies demonstrate that the proposed generalized variable approach provides confidence intervals with satisfying coverage probabilities at finite sample sizes. The proposed method can also easily provide P-values for hypothesis testing. Application to analysis of a subset of data from a study on coronary heart disease illustrates the utility of the method in practice.
Affiliation(s)
- Lili Tian
- Department of Biostatistics, University at Buffalo, 249 Farber Hall, 3435 Main St. Bldg. 26 Buffalo, NY 14214-3000, USA
| | | | | | | |
|
44
|
|
45
|
Inferences on the difference and ratio of the means of two inverse Gaussian distributions. J Stat Plan Inference 2008. [DOI: 10.1016/j.jspi.2007.09.005] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022]
|
46
|
Tian L, Wilding GE. Confidence interval estimation of a common correlation coefficient. Comput Stat Data Anal 2008. [DOI: 10.1016/j.csda.2008.04.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]
|
47
|
|
48
|
Tian L. Inferences about the between-study variance in meta-analysis with normally distributed outcomes. Biom J 2008; 50:248-56. [PMID: 18383448 DOI: 10.1002/bimj.200710408] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/05/2022]
Abstract
This paper presents a new approach for confidence interval estimation of the between-study variance in meta-analysis with normally distributed responses based on the concepts of generalized variables. Simulation study shows that the coverage probabilities of the proposed confidence intervals are generally satisfactory. Moreover, the proposed approach can easily provide P-values for hypothesis testing. For meta-analysis of controlled clinical trials or epidemiological studies, within which the responses are normally distributed, the proposed approach is an ideal candidate for making inference about the between-study variance.
Affiliation(s)
- Lili Tian
- Department of Biostatistics, University at Buffalo, School of Public Health and Health Professions, 249 Farber Hall 3435 Main St. Bldg. 26, Buffalo, NY 14214-3000, USA.
| |
|
49
|
Tian L. Generalized Inferences on the Overall Treatment Effect in Meta-analysis with Normally Distributed Outcomes. Biom J 2008; 50:237-47. [DOI: 10.1002/bimj.200710409] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/06/2022]
|
50
|
|