1
|
Vasconcelos JCS, Travassos TDC, Ortega EMM, Cordeiro GM, Oliveira Reis L. Alternative statistical modeling for radical prostatectomy data. J Appl Stat 2023; 51:1007-1022. [PMID: 38524792 PMCID: PMC10956922 DOI: 10.1080/02664763.2023.2229973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2022] [Accepted: 06/18/2023] [Indexed: 03/26/2024]
Abstract
Several statistical models have been proposed in recent years, among them semiparametric regression. In medicine, there are several situations in which a linear regression is impracticable for statistical modeling, especially when the data contain explanatory variables that have a nonlinear relationship with the response variable. Another common situation is when the response variable does not have a unimodal shape, and it is not possible to adopt distributions belonging to the symmetric or asymmetric classes. In this context, a semiparametric heteroskedastic regression is proposed based on an extension of the normal distribution. We then show the usefulness of this model for analyzing the cost of prostate cancer surgery. The predictor variables refer to two groups of patients: one group receives a multimodal local anesthetic solution (Preemptive Target Anesthetic Solution) and the second group is treated with neuraxial blockade (spinal anesthesia/traditional standard). Other relevant predictor variables are also evaluated, allowing an in-depth interpretation of the predictor variables that have a nonlinear effect on the dependent variable, cost. The penalized maximum likelihood method is adopted to estimate the model parameters. The new regression is a useful statistical tool for analyzing medical data.
|
2
|
Vasconcelos JCS, Cordeiro GM, Ortega EMM, dos Santos DP, Vila R. A useful semiparametric regression for climatology. Commun Stat Simul Comput 2022. [DOI: 10.1080/03610918.2022.2107220] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Affiliation(s)
- Julio C. S. Vasconcelos
  - Institute of Science and Technology, UNIFESP, Federal University of São Paulo, São José dos Campos, SP, Brazil
- Gauss M. Cordeiro
  - Department of Statistics, UFPE, Federal University of Pernambuco, Recife, PE, Brazil
- Edwin M. M. Ortega
  - Department of Exact Sciences, ESALQ, University of São Paulo, Piracicaba, SP, Brazil
- Denize P. dos Santos
  - Department of Exact Sciences, ESALQ, University of São Paulo, Piracicaba, SP, Brazil
- Roberto Vila
  - Department of Statistics, UNB, University of Brasilia, Brasilia, Brazil
|
3
|
Vasconcelos JCS, Cordeiro GM, Ortega EMM. The semiparametric regression model for bimodal data with different penalized smoothers applied to climatology, ethanol and air quality data. J Appl Stat 2022; 49:248-267. [DOI: 10.1080/02664763.2020.1803812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- G. M. Cordeiro
  - UFPE, Universidade Federal de Pernambuco, Recife, Brazil
|
4
|
Zou B, Mi X, Tighe PJ, Koch GG, Zou F. On kernel machine learning for propensity score estimation under complex confounding structures. Pharm Stat 2021; 20:752-764. [PMID: 33619894 PMCID: PMC8670098 DOI: 10.1002/pst.2105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2020] [Revised: 12/16/2020] [Accepted: 02/05/2021] [Indexed: 11/11/2022]
Abstract
Post-marketing data offer rich information and cost-effective resources for physicians and policy-makers to address critical scientific questions in clinical practice. However, the complex confounding structures (e.g., nonlinear and nonadditive interactions) embedded in these observational data often pose major analytical challenges for drawing valid conclusions. Furthermore, these data are often made available as electronic health records (EHRs) and are usually massive, with hundreds of thousands of observational records, which introduces additional computational challenges. In this paper, for comparative effectiveness analysis, we propose a statistically robust yet computationally efficient propensity score (PS) approach to adjust for the complex confounding structures. Specifically, we propose a kernel-based machine learning method for flexible and robust PS modeling to obtain valid PS estimates from observational data with complex confounding structures. The estimated propensity score is then used in the second-stage analysis to obtain a consistent estimate of the average treatment effect. An empirical variance estimator based on the bootstrap is adopted. A split-and-merge algorithm is further developed to reduce the computational workload of the proposed method for big data, and to obtain a valid variance estimator of the average treatment effect estimate as a by-product. As shown by extensive numerical studies and an application to a comparative effectiveness analysis of postoperative pain EHR data, the proposed approach consistently outperforms other competing methods, demonstrating its practical utility.
Affiliation(s)
- Baiming Zou
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Xinlei Mi
  - Department of Biostatistics, Columbia University, New York, NY 10032, USA
- Patrick J. Tighe
  - Department of Anesthesiology, University of Florida, Gainesville, FL 32611, USA
- Gary G. Koch
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Fei Zou
  - Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
|
5
|
Semiparametric estimation for average causal effects using propensity score-based spline. J Stat Plan Inference 2021. [DOI: 10.1016/j.jspi.2020.10.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
|
6
|
Sung CL, Hung Y, Rittase W, Zhu C, Wu CFJ. Calibration for Computer Experiments With Binary Responses and Application to Cell Adhesion Study. J Am Stat Assoc 2020. [DOI: 10.1080/01621459.2019.1699419] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/25/2022]
Affiliation(s)
- Chih-Li Sung
  - Department of Statistics and Probability, Michigan State University, East Lansing, MI
- Ying Hung
  - Department of Statistics, Rutgers, The State University of New Jersey, Piscataway, NJ
- William Rittase
  - Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA
- Cheng Zhu
  - Department of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA
- C. F. J. Wu
  - School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA
|
7
|
Souza Vasconcelos JC, Villegas C. Generalized symmetrical partial linear model. J Appl Stat 2020; 48:557-572. [PMID: 35706541 DOI: 10.1080/02664763.2020.1726301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
In this work, we propose a new model called the generalized symmetrical partial linear model, based on the theory of generalized linear models and symmetrical distributions. In our model, the response variable follows a symmetrical distribution such as the normal, Student-t, or power exponential, among others. Following the context of generalized linear models, we replace the traditional linear predictor with a more general predictor in which one covariate is related to the response variable in a nonparametric fashion, that is, we do not specify a parametric function for it. As an example, we could imagine a regression model in which the intercept term is believed to vary in time or geographical location. The backfitting algorithm is used to estimate the parameters of the proposed model. We perform a simulation study to assess the behavior of the penalized maximum likelihood estimators. We use quantile residuals to check the assumptions of the model. Finally, we analyze a real data set on the pH of rivers in Ireland.
|
9
|
Talamakrouni M, El Ghouch A, Van Keilegom I. Parametrically guided local quasi-likelihood with censored data. Electron J Stat 2017. [DOI: 10.1214/17-ejs1293] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/19/2022]
|
10
|
Li CS. A test for the linearity of the nonparametric part of a semiparametric logistic regression model. J Appl Stat 2016. [DOI: 10.1080/02664763.2015.1070803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
12
|
Abstract
Cubic B-splines are used to estimate the nonparametric component of a semiparametric generalized linear model. A penalized log-likelihood ratio test statistic is constructed for the null hypothesis that the nonparametric function is linear. When the number of knots is fixed, its limiting null distribution is that of a linear combination of independent chi-squared random variables, each with one degree of freedom. The smoothing parameter is determined by specifying a value for the asymptotic expected value of the test statistic under the null hypothesis. A simulation study is conducted to evaluate the power of the test, and a real-life dataset is used to illustrate its practical use.
Affiliation(s)
- Chin-Shang Li
  - Division of Biostatistics, Department of Public Health Sciences, University of California, Davis, CA 95616
|
13
|
Abstract
Generalized linear models and the quasi-likelihood method extend ordinary regression models to accommodate more general conditional distributions of the response. Nonparametric methods need no explicit parametric specification, and the resulting model is completely determined by the data themselves. However, nonparametric estimation schemes generally have a slower convergence rate, as with the local polynomial smoothing estimation of nonparametric generalized linear models studied in Fan, Heckman and Wand (1995). In this work, we propose two parametrically guided nonparametric estimation schemes that incorporate prior shape information on the link transformation of the response variable's conditional mean in terms of the predictor variable. Asymptotic results and numerical simulations demonstrate the improvement of our new estimation schemes over the original nonparametric counterpart.
|
15
|
Chen XD, Tang NS, Wang XR. On confidence regions of semiparametric nonlinear reproductive dispersion models. Statistics 2009. [DOI: 10.1080/02331880802689332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
17
|
Antoniadis A. Wavelet methods in statistics: some recent developments and their applications. Stat Surv 2007. [DOI: 10.1214/07-ss014] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.5] [Indexed: 11/19/2022]
|
18
|
Gu C, Kim YJ. Penalized likelihood regression: General formulation and efficient approximation. Can J Stat 2002. [DOI: 10.2307/3316100] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.4] [Indexed: 11/10/2022]
|
19
|
Li B. Nonparametric estimating equations based on a penalized information criterion. Can J Stat 2000. [DOI: 10.2307/3315970] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
|
21
|
Biller C. Adaptive Bayesian Regression Splines in Semiparametric Generalized Linear Models. J Comput Graph Stat 2000. [DOI: 10.1080/10618600.2000.10474869] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 10/28/2022]
|
22
|
Sakamoto W, Shirahata S. Likelihood-based cross-validation score for selecting the smoothing parameter in maximum penalized likelihood estimation. Commun Stat Theory Methods 1999. [DOI: 10.1080/03610929908832379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
Affiliation(s)
- Wataru Sakamoto
  - Department of Informatics and Mathematical Science, Graduate School of Engineering Science, Toyonaka, Osaka, 560-8531, Japan
- Shingo Shirahata
  - Department of Informatics and Mathematical Science, Graduate School of Engineering Science, Toyonaka, Osaka, 560-8531, Japan
|
24
|
Sakamoto W, Shirahata S. Simple calculation of likelihood-based cross-validation score in maximum penalized likelihood estimation of regression functions. J Jpn Soc Comput Stat 1997. [DOI: 10.5183/jjscs1988.10.27] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/11/2022]
|
25
|
Rossini AJ, Tsiatis AA. A Semiparametric Proportional Odds Regression Model for the Analysis of Current Status Data. J Am Stat Assoc 1996. [DOI: 10.1080/01621459.1996.10476939] [Citation(s) in RCA: 53] [Impact Index Per Article: 1.9] [Indexed: 10/28/2022]
|
27
|
Wahba G, Wang Y, Gu C, Klein R, Klein B. Smoothing spline ANOVA for exponential families, with application to the Wisconsin Epidemiological Study of Diabetic Retinopathy: the 1994 Neyman Memorial Lecture. Ann Stat 1995. [DOI: 10.1214/aos/1034713638] [Citation(s) in RCA: 124] [Impact Index Per Article: 4.3] [Indexed: 11/19/2022]
|
28
|
Abstract
Epidemiologists have used the term 'tracking' to connote an individual's maintenance of relative rank of some longitudinally measured characteristic over a given time span. To assess the extent to which an attribute tracks we have first to summarize individual growth curves, and second to quantify the notion of maintenance of relative rank, both in the face of random error. A sequence of papers appearing in 1981 provided differing methodologies for appraising tracking. Here we take a different approach to tracking by using regression trees for longitudinal data. The above two concerns are simultaneously addressed in that the procedure identifies subgroups, defined in terms of covariates, within which the collection of growth curves is homogeneous. After reviewing the existing approaches to tracking we describe the tree-structured methodology, and present an illustrative example pertaining to lung function growth in children.
Affiliation(s)
- M R Segal
  - Division of Biostatistics, University of California, San Francisco 94143-0560
|