Reference Citation Analysis

Total Articles: 449 (from Reference Citation Analysis)

Article PDFs (51)

Cited by > 0 (302)

Searched Name: Donglin Zeng

Results Analysis

Number	Citation Analysis
201	Zhang P, Li M, Chiang C, Wang L, Xiang Y, Cheng L, Feng W, Schleyer TK, Quinney SK, Wu H, Zeng D, Li L. Three-Component Mixture Model-Based Adverse Drug Event Signal Detection for the Adverse Event Reporting System. CPT Pharmacometrics Syst Pharmacol 2018;7:499-506. [PMID: 30091855 PMCID: PMC6118321 DOI: 10.1002/psp4.12294] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/06/2017] [Accepted: 02/26/2018] [Indexed: 01/24/2023]
Abstract: The US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) is an important source for detecting adverse drug event (ADE) signals. In this article, we propose a three-component mixture model (3CMM) for FAERS signal detection. In 3CMM, a drug-ADE pair is assumed to have either a zero relative risk (RR), or a background RR (mean RR = 1), or an increased RR (mean RR > 1). By clearly defining the second component (mean RR = 1) as the null distribution, 3CMM estimates local false discovery rates (FDRs) for ADE signals under the empirical Bayes framework. Compared with existing approaches, the local FDR's top signals have noninferior or better sensitivities to detect true signals in both FAERS analysis and simulation studies. Additionally, we identify that the top signals of different approaches have different patterns, and they are complementary to each other.
MESH Headings: Adverse Drug Reaction Reporting Systems; Complex Mixtures/toxicity; Databases, Factual; Humans; United States; United States Food and Drug Administration
Grants: R01 GM124104 NIGMS NIH HHS; R01 LM011945 NLM NIH HHS
202	Bulka CM, Daviglus ML, Persky VW, Durazo-Arvizu RA, Avilés-Santa ML, Gallo LC, Hosgood HD, Singer RH, Talavera GA, Thyagarajan B, Zeng D, Argos M. Occupational Exposures and Metabolic Syndrome Among Hispanics/Latinos: Cross-Sectional Results From the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). J Occup Environ Med 2018;59:1047-1055. [PMID: 29112602 DOI: 10.1097/jom.0000000000001115] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
Abstract: OBJECTIVE: We assessed the cross-sectional relationships of self-reported current occupational exposures to solvents, metals, and pesticides with metabolic syndrome and its components among 7127 participants in the Hispanic Community Health Study/Study of Latinos. METHODS: Metabolic syndrome was defined as a clustering of abdominal obesity, high triglycerides, low high-density lipoprotein cholesterol, high blood pressure, and/or high fasting glucose. Regression models that incorporated inverse probability of exposure weighting were used to estimate prevalence ratios. RESULTS: Solvent exposure was associated with a 32% higher prevalence of high blood pressure (95% confidence interval: 1.09 to 1.60) than among participants not reporting exposure. No associations were observed for occupational exposures with abdominal obesity, high triglycerides, low high-density lipoprotein, or metabolic syndrome. CONCLUSION: Our findings suggest that solvent exposure may be an important occupational risk factor for high blood pressure among Hispanics/Latinos in the United States.
203	Kim S, Zeng D, Cai J. Analysis of multiple survival events in generalized case-cohort designs. Biometrics 2018;74:1250-1260. [PMID: 29992545 DOI: 10.1111/biom.12923] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/01/2017] [Revised: 05/01/2018] [Accepted: 05/01/2018] [Indexed: 01/04/2023]
Abstract: Generalized case-cohort design has been proposed to assess the effects of exposures on survival outcomes when measuring exposures is expensive and events are not rare in the cohort. In such design, expensive exposure information is collected from both a (stratified) randomly selected subcohort and a subset of individuals with events. In this article, we consider extension of such design to study multiple types of survival events by selecting a proportion of cases for each type of event. We propose a general weighting scheme to analyze data. Furthermore, we examine the optimal choice of weights and show that this optimal weighting yields much improved efficiency gain both asymptotically and in simulation studies. Finally, we apply our proposed methods to data from the Atherosclerosis Risk in Communities study.
Key Words: Case-cohort study; Multiple disease outcomes; Multiple events; Non-rare diseases; Proportional hazards; Stratified sampling; Survival analysis
204	Gao F, Zeng D, Lin DY. Semiparametric regression analysis of interval-censored data with informative dropout. Biometrics 2018;74:1213-1222. [PMID: 29870067 DOI: 10.1111/biom.12911] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/01/2017] [Revised: 02/01/2018] [Accepted: 04/01/2018] [Indexed: 12/01/2022]
Abstract: Interval-censored data arise when the event time of interest can only be ascertained through periodic examinations. In medical studies, subjects may not complete the examination schedule for reasons related to the event of interest. In this article, we develop a semiparametric approach to adjust for such informative dropout in regression analysis of interval-censored data. Specifically, we propose a broad class of joint models, under which the event time of interest follows a transformation model with a random effect and the dropout time follows a different transformation model but with the same random effect. We consider nonparametric maximum likelihood estimation and develop an EM algorithm that involves simple and stable calculations. We prove that the resulting estimators of the regression parameters are consistent, asymptotically normal, and asymptotically efficient with a covariance matrix that can be consistently estimated through profile likelihood. In addition, we show how to consistently estimate the survival function when dropout represents voluntary withdrawal and the cumulative incidence function when dropout is an unavoidable terminal event. Furthermore, we assess the performance of the proposed numerical and inferential procedures through extensive simulation studies. Finally, we provide an application to data on the incidence of diabetes from a major epidemiological cohort study.
Key Words: Joint models; Nonparametric likelihood; Random effects; Semiparametric efficiency; Terminal event; Transformation models
205	Liu Y, Wang Y, Kosorok MR, Zhao Y, Zeng D. Augmented outcome-weighted learning for estimating optimal dynamic treatment regimens. Stat Med 2018;37:3776-3788. [PMID: 29873099 DOI: 10.1002/sim.7844] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 12/22/2017] [Revised: 03/30/2018] [Accepted: 05/12/2018] [Indexed: 11/08/2022]
Abstract: Dynamic treatment regimens (DTRs) are sequential treatment decisions tailored by patient's evolving features and intermediate outcomes at each treatment stage. Patient heterogeneity and the complexity and chronicity of many diseases call for learning optimal DTRs that can best tailor treatment according to each individual's time-varying characteristics (eg, intermediate response over time). In this paper, we propose a robust and efficient approach referred to as Augmented Outcome-weighted Learning (AOL) to identify optimal DTRs from sequential multiple assignment randomized trials. We improve previously proposed outcome-weighted learning to allow for negative weights. Furthermore, to reduce the variability of weights for numeric stability and improve estimation accuracy, in AOL, we propose a robust augmentation to the weights by making use of predicted pseudooutcomes from regression models for Q-functions. We show that AOL still yields Fisher-consistent DTRs even if the regression models are misspecified and that an appropriate choice of the augmentation guarantees smaller stochastic errors in value function estimation for AOL than the previous outcome-weighted learning. Finally, we establish the convergence rates for AOL. The comparative advantage of AOL over existing methods is demonstrated through extensive simulation studies and an application to a sequential multiple assignment randomized trial for major depressive disorder.
Key Words: Q-learning; SMARTs; adaptive intervention; individualized treatment rule; machine learning; outcome-weighted learning; personalized medicine
206	Hardy ST, Zeng D, Kshirsagar AV, Viera AJ, Avery CL, Heiss G. Primary prevention of chronic kidney disease through population-based strategies for blood pressure control: The ARIC study. J Clin Hypertens (Greenwich) 2018;20:1018-1026. [PMID: 29797488 DOI: 10.1111/jch.13311] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 01/09/2018] [Revised: 04/04/2018] [Accepted: 04/20/2018] [Indexed: 12/13/2022]
Abstract: While much of the chronic kidney disease (CKD) literature focuses on the role of blood pressure reduction in delaying CKD progression, little is known about the benefits of modest population-wide decrements in blood pressure on incident CKD. The authors used multivariable linear regression to characterize the impact on incident CKD of two approaches for blood pressure management: (1) a 1-mm Hg reduction in systolic BP across the entire study population; and (2) a 10% reduction in participants with unaware, untreated, and uncontrolled BP above goal as defined by the Seventh Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure (JNC 7) thresholds. Over a mean of 20 years of follow-up (ARIC [Atherosclerosis Risk in Communities] study, n = 15 390), 3852 incident CKD events were ascertained. After adjustment, a 1-mm Hg decrement in systolic BP across the population was associated with an estimated 11.7 (95% confidence interval [CI], 6.2-17.3) and 13.4 (95% CI, 10.3-16.6) fewer CKD events per 100 000 person-years in blacks and whites, respectively. Among participants with BP above JNC 7 goal, a 10% decrease in unaware, untreated, or uncontrolled BP was associated with 3.2 (95% CI, 2.0-4.9), 2.8 (95% CI, 1.8-4.3), and 5.8 (95% CI, 3.6-8.8) fewer CKD events per 100 000 person-years in blacks and 3.1 (95% CI, 2.3-4.1), 0.7 (95% CI, 0.5-0.9), and 1.0 (95% CI, 1.3-2.4) fewer CKD events per 100 000 person-years in whites. Modest population-wide reductions in systolic BP hold potential for the primary prevention of CKD.
Key Words: blood pressure; chronic kidney disease; end-stage renal disease; epidemiology; hypertension; prevention
207	Diao G, Zeng D, Ke C, Ma H, Jiang Q, Ibrahim JG. Semiparametric regression analysis for composite endpoints subject to componentwise censoring. Biometrika 2018. [DOI: 10.1093/biomet/asy013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
208	Zhou Y, Ni X, Wen B, Duan L, Sun H, Yang M, Zou F, Lin Y, Liu Q, Zeng Y, Fu X, Pan K, Jing B, Wang P, Zeng D. Appropriate dose of Lactobacillus buchneri supplement improves intestinal microbiota and prevents diarrhoea in weaning Rex rabbits. Benef Microbes 2018;9:401-416. [DOI: 10.3920/bm2017.0055] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]
Abstract: This study examined the effects on intestinal microbiota and diarrhoea of Lactobacillus buchneri supplementation to the diet of weaning Rex rabbits. To this end, rabbits were treated with L. buchneri at two different doses (LC: 10⁴ cfu/g diet and HC: 10⁵ cfu/g diet) for 4 weeks. PCR-DGGE was used to determine the diversity of the intestinal microbiota, while real-time PCR permitted the detection of individual bacterial species. ELISA and real-time PCR allowed the identification of numerous cytokines in the intestinal tissues. Zonula occludens-1, polymeric immunoglobulin receptor and immunoglobulin A genes were examined to evaluate intestinal barriers. Results showed that the biodiversity of the intestinal microbiota of weaning Rex rabbits improved in the whole tract of the treated groups. The abundance of most detected bacterial species was highly increased in the duodenum, jejunum and ileum after L. buchneri administration. The species abundance in the HC group was more increased than in the LC group when compared to the control. Although the abundance of Enterobacteriaceae exhibited a different pattern, Escherichia coli was inhibited in all treatment groups. Toll-like receptor (TLR)2 and TLR4 genes were down-regulated in all intestinal tissues as the microbiota changed. In the LC group, the secretion of the inflammatory cytokine tumour necrosis factor-α was reduced, the gene expression of the anti-inflammatory cytokine interleukin (IL)-4 was up-regulated and the expression of intestinal-barrier-related genes was enhanced. Conversely, IL-4 expression was increased and the expression of other tested genes did not change in the HC group. The beneficial effects of LC were greater than those of HC or the control in terms of improving the daily weight gain and survival rate of weaning Rex rabbits and reducing their diarrhoea rate. Therefore, 10⁴ cfu/g L. buchneri treatment improved the microbiota of weaning Rex rabbits and prevented diarrhoea in these animals.
209	Zeng D, Zhou R, Yu Y, Luo Y, Zhang J, Sun H, Bin J, Liao Y, Rao J, Zhang Y, Liao W. Gene expression profiles for a prognostic immunoscore in gastric cancer. Br J Surg 2018;105:1338-1348. [PMID: 29691839 PMCID: PMC6099214 DOI: 10.1002/bjs.10871] [Citation(s) in RCA: 133] [Impact Index Per Article: 22.2] [Received: 12/13/2017] [Revised: 02/22/2018] [Accepted: 03/07/2018] [Indexed: 12/12/2022]
Abstract: Background: Increasing evidence has indicated an association between immune infiltration in gastric cancer and clinical outcome. However, reliable prognostic signatures, based on systematic assessments of the immune landscape inferred from bulk tumour transcriptomes, have not been established. The aim was to develop an immune signature, based on the cellular composition of the immune infiltrate inferred from bulk tumour transcriptomes, to improve the prognostic predictions of gastric cancer. Methods: Twenty‐two types of immune cell fraction were estimated based on large public gastric cancer cohorts from the Gene Expression Omnibus using CIBERSORT. An immunoscore based on the fraction of immune cell types was then constructed using a least absolute shrinkage and selection operator (LASSO) Cox regression model. Results: Using the LASSO model, an immunoscore was established consisting of 11 types of immune cell fraction. In the training cohort (490 patients), significant differences were found between high‐ and low‐immunoscore groups in overall survival across and within subpopulations with an identical TNM stage. Multivariable analysis revealed that the immunoscore was an independent prognostic factor (hazard ratio 1·92, 95 per cent c.i. 1·54 to 2·40). The prognostic value of the immunoscore was also confirmed in the validation (210) and entire (700) cohorts. Conclusion: The proposed immunoscore represents a promising signature for estimating overall survival in patients with gastric cancer. Immunoscore predicts prognosis.
210	Zeng D, Hyun N, Cai J. Semiparametric Additive Model for Estimating Risk Difference in Multicenter Studies. Biostat Epidemiol 2018;2:84-98. [PMID: 30631827 PMCID: PMC6322696 DOI: 10.1080/24709360.2018.1445430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2018] [Accepted: 02/14/2018] [Indexed: 10/17/2022]
Abstract: Many cancer studies are conducted in multiple centers. While they have the advantage of more patients and larger population, center-to-center heterogeneity could be significant such that it cannot be ignored in analysis. In this paper, we propose semiparametric additive risk models with a general link function to estimate risk effects while accounting for center-specific baseline function. We propose an estimating equation for inference and show that the derived estimators are consistent and asymptotically normal. Simulation studies demonstrate good small-sample performance of the proposed method. We apply the method to analyze data from the Study of Left Ventricular Dysfunction (SOLVD) in 1990 and discuss application to one-to-one matched design.
Key Words: Additive risk models; Estimating equation; multi-center study; one-to-one matched design; proportional hazards model; recurrent event
Grants: P01 CA142538 NCI NIH HHS; R01 GM124104 NIGMS NIH HHS
211	Stewart TG, Zeng D, Wu MC. Constructing support vector machines with missing data. WIREs Comput Stat 2018. [DOI: 10.1002/wics.1430] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/17/2023]
212	Diao G, Dong J, Zeng D, Ke C, Rong A, Ibrahim JG. Biomarker threshold adaptive designs for survival endpoints. J Biopharm Stat 2018;28:1038-1054. [PMID: 29436940 DOI: 10.1080/10543406.2018.1434191] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 10/18/2022]
Abstract: Due to the importance of precision medicine, it is essential to identify the right patients for the right treatment. Biomarkers, which have been commonly used in clinical research as well as in clinical practice, can facilitate selection of patients with a good response to the treatment. In this paper, we describe a biomarker threshold adaptive design with survival endpoints. In the first stage, we determine subgroups for one or more biomarkers such that patients in these subgroups benefit the most from the new treatment. The analysis in this stage can be based on historical or pilot studies. In the second stage, we sample subjects from the subgroups determined in the first stage and randomly allocate them to the treatment or control group. Extensive simulation studies are conducted to examine the performance of the proposed design. Application to a real data example is provided for implementation of the first-stage algorithms.
Key Words: Adaptive enrichment design; predictive biomarker; survival endpoint; two-stage design
213	Li X, Xie S, Zeng D, Wang Y. Efficient ℓ₀-norm feature selection based on augmented and penalized minimization. Stat Med 2018;37:473-486. [PMID: 29082539 PMCID: PMC5768461 DOI: 10.1002/sim.7526] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 12/09/2015] [Revised: 07/04/2017] [Accepted: 09/13/2017] [Indexed: 11/06/2022]
Abstract: Advances in high-throughput technologies in genomics and imaging yield unprecedentedly large numbers of prognostic biomarkers. To accommodate the scale of biomarkers and study their association with disease outcomes, penalized regression is often used to identify important biomarkers. The ideal variable selection procedure would search for the best subset of predictors, which is equivalent to imposing an ℓ₀-penalty on the regression coefficients. Since this optimization is a nondeterministic polynomial-time hard (NP-hard) problem that does not scale with the number of biomarkers, alternative methods mostly place smooth penalties on the regression parameters, which lead to computationally feasible optimization problems. However, empirical studies and theoretical analyses show that convex approximations of the ℓ₀-norm (eg, ℓ₁) do not outperform their ℓ₀ counterpart. Progress on ℓ₀-norm feature selection has been relatively slower, and the main methods are greedy algorithms such as stepwise regression or orthogonal matching pursuit. Penalized regression based on regularizing the ℓ₀-norm remains much less explored in the literature. In this work, inspired by the recently popular augmenting and data splitting algorithms including alternating direction method of multipliers, we propose a 2-stage procedure for ℓ₀-penalty variable selection, referred to as augmented penalized minimization-L0 (APM-L0). The APM-L0 targets the ℓ₀-norm as closely as possible while keeping computation tractable, efficient, and simple, which is achieved by iterating between a convex regularized regression and a simple hard-thresholding estimation. The procedure can be viewed as arising from regularized optimization with truncated ℓ₁ norm. Thus, we propose to treat the regularization parameter and the thresholding parameter as tuning parameters and select them based on cross-validation. A 1-step coordinate descent algorithm is used in the first stage to significantly improve computational efficiency. Through extensive simulation studies and real data application, we demonstrate superior performance of the proposed method in terms of selection accuracy and computational speed as compared to existing methods. The proposed APM-L0 procedure is implemented in the R-package APML0.
Key Words: ADMM; biomarker signature; censored data; variable selection; ℓ₀-penalty
MESH Headings: Algorithms; Biomarkers; Computer Simulation; Genomics; Humans; Likelihood Functions; Models, Statistical; Prognosis; Regression Analysis
Grants: R01 CA082659 NCI NIH HHS; R37 GM047845 NIGMS NIH HHS; R01 GM047845 NIGMS NIH HHS; R01 NS073671 NINDS NIH HHS; U01 NS082062 NINDS NIH HHS
214	Engeda JC, Holliday KM, Hardy ST, Chakladar S, Lin DY, Talavera GA, Howard BV, Daviglus ML, Pirzada A, Schreiner PJ, Zeng D, Avery CL. Transitions from Ideal to Intermediate Cholesterol Levels may vary by Cholesterol Metric. Sci Rep 2018;8:2782. [PMID: 29426885 PMCID: PMC5807429 DOI: 10.1038/s41598-018-20660-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/14/2017] [Accepted: 01/09/2018] [Indexed: 11/08/2022]
Abstract: To examine the ability of total cholesterol (TC), a low-density lipoprotein cholesterol (LDL-C) proxy widely used in public health initiatives, to capture important population-level shifts away from ideal and intermediate LDL-C throughout adulthood. We estimated age (≥20 years)-, race/ethnic (Caucasian, African American, and Hispanic/Latino)-, and sex-specific net transition probabilities between ideal, intermediate, and poor TC and LDL-C using National Health and Nutrition Examination Survey (2007-2014; N = 13,584) and Hispanic Community Health Study/Study of Latinos (2008-2011; N = 15,612) data in 2016 and validated and calibrated novel Markov-type models designed for cross-sectional data. At age 20, >80% of participants had ideal TC, whereas the race/ethnic- and sex-specific prevalence of ideal LDL-C ranged from 39.2%-59.6%. Net transition estimates suggested that the largest one-year net shifts away from ideal and intermediate LDL-C occurred approximately two decades earlier than peak net population shifts away from ideal and intermediate TC. Public health and clinical initiatives focused on monitoring TC in middle-adulthood may miss important shifts away from ideal and intermediate LDL-C, potentially increasing the duration, perhaps by decades, that large segments of the population are exposed to suboptimal LDL-C.
MESH Headings: Adult; Black or African American; Aged; Cholesterol, LDL/blood; Cross-Sectional Studies; Data Analysis; Female; Health Status; Hispanic or Latino; Humans; Male; Medical Records; Middle Aged; Retrospective Studies; White People
Grants: R21 HL121580 NHLBI NIH HHS; T32 ES007018 NIEHS NIH HHS; T32 HL007055 NHLBI NIH HHS; HHSN268201300025C NHLBI NIH HHS
215	Zhu A, Zeng D, Zhang P, Li L. Estimating causal log-odds ratio using the case-control sample and its application in the pharmaco-epidemiology study. Stat Methods Med Res 2018;28:2165-2178. [PMID: 29355073 DOI: 10.1177/0962280217750175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract: One important goal in pharmaco-epidemiology studies is to understand the causal relationship between drug exposures and their clinical outcomes, including adverse drug events. In order to achieve this goal, however, we need to resolve several challenges. Most pharmaco-epidemiology data are observational, and confounding is largely present due to many co-medications. The pharmaco-epidemiology study data set is often sampled from large medical record databases using a matched case-control design, and it may not be representative of the original patient population in the medical record databases. The data analysis method needs to handle a large sample size that cannot be handled using existing statistical analysis packages. In this paper, we tackle these challenges both methodologically and computationally. We propose a conditional causal log-odds ratio (OR) definition to characterize causal effects of drug exposures on a binary adverse drug event adjusting for individual level confounders. Using a case-control design, we present a propensity score estimation using only case samples and we provide sufficient conditions for the consistency of the estimation of the causal log-odds ratio using case-based propensity scores. Computationally, we implement a principal component analysis to reduce high-dimensional confounders. Extensive simulation studies are performed to demonstrate superior performance of our method to existing methods. Finally, we apply the proposed method to analyze drug-induced myopathy data sampled from a de-identified subset of a medical record database (close to 5 million patient records), The Indiana Network for Patient Care. Our method identified 70 drugs with drug-induced myopathy (p < 0.05) out of 72 drugs that have myopathy side effects on their FDA drug labels. These 70 drugs include three statins that are known for their myopathy side effects.
Key Words: Case-control design; OR; causal inference; pharmaco-epidemiology; principal components; propensity scores
216	Choi J, Zeng D, Olshan AF, Cai J. Joint modeling of survival time and longitudinal outcomes with flexible random effects. Lifetime Data Anal 2018;24:126-152. [PMID: 28856493 PMCID: PMC5756108 DOI: 10.1007/s10985-017-9405-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/28/2016] [Accepted: 08/17/2017] [Indexed: 06/07/2023]
Abstract: Joint models with shared Gaussian random effects have been conventionally used in analysis of longitudinal outcome and survival endpoint in biomedical or public health research. However, misspecifying the normality assumption of random effects can lead to serious bias in parameter estimation and future prediction. In this paper, we study joint models of general longitudinal outcomes and survival endpoint but allow the underlying distribution of shared random effect to be completely unknown. For inference, we propose to use a mixture of Gaussian distributions as an approximation to this unknown distribution and adopt an Expectation-Maximization (EM) algorithm for computation. Either the AIC or the BIC criterion is adopted for selecting the number of mixtures. We demonstrate the proposed method via a number of simulation studies. We illustrate our approach with the data from the Carolina Head and Neck Cancer Study (CHANCE).
Key Words: Gaussian mixtures; Generalized linear mixed model; Maximum likelihood estimator; Random effect; Simultaneous modeling; Stratified Cox proportional hazards model
MESH Headings: Algorithms; Biometry/methods; Computer Simulation; Head and Neck Neoplasms/epidemiology; Head and Neck Neoplasms/psychology; Humans; Likelihood Functions; Linear Models; Longitudinal Studies; Normal Distribution; North Carolina/epidemiology; Proportional Hazards Models; Quality of Life; Survival Analysis
Grants: P01 CA142538 NIH HHS; R01 GM047845 NIGMS NIH HHS; UL1 RR025747 NCRR NIH HHS; P01 CA142538 NCI NIH HHS; R01 ES021900 NIEHS NIH HHS; P2C HD050924 NICHD NIH HHS; R01 ES021900 National Institutes of Health (US)
217	Zhang P, Wu H, Chiang C, Wang L, Binkheder S, Wang X, Zeng D, Quinney SK, Li L. Translational Biomedical Informatics and Pharmacometrics Approaches in the Drug Interactions Research. CPT Pharmacometrics Syst Pharmacol 2017;7:90-102. [PMID: 29193890 PMCID: PMC5824109 DOI: 10.1002/psp4.12267] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 11/07/2017] [Accepted: 11/08/2017] [Indexed: 12/18/2022]
Abstract: Drug interaction is a leading cause of adverse drug events and a major obstacle for current clinical practice. Pharmacovigilance data mining, pharmacokinetic modeling, and text mining are computational and informatics tools for integrating drug interaction knowledge and generating drug interaction hypotheses. We provide a comprehensive overview of these translational biomedical informatics methodologies with related databases. We hope this review illustrates the complementary nature of these informatics approaches and facilitates translational drug interaction research.
MESH Headings: Computational Biology/methods; Data Mining; Databases, Factual; Drug Interactions; Drug-Related Side Effects and Adverse Reactions/epidemiology; Humans; Pharmacovigilance; Translational Research, Biomedical/methods
Grants: R01 GM117206 NIGMS NIH HHS; R01 GM124104 NIGMS NIH HHS
218	Zeng D, Pan J, Hu K, Chi E, Lin DY. Improving the power to establish clinical similarity in a Phase 3 efficacy trial by incorporating prior evidence of analytical and pharmacokinetic similarity. J Biopharm Stat 2017;28:320-332. [PMID: 29173074 DOI: 10.1080/10543406.2017.1397012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022] Abstract To improve patients' access to safe and effective biological medicines, abbreviated licensure pathways for biosimilar and interchangeable biological products have been established in the US, Europe, and other countries around the world. The US Food and Drug Administration and European Medicines Agency have published various guidance documents on the development and approval of biosimilars, which recommend a "totality-of-the-evidence" approach with a stepwise process to demonstrate biosimilarity. The approach relies on comprehensive comparability studies ranging from analytical and nonclinical studies to clinical pharmacokinetic/pharmacodynamic (PK/PD) and efficacy studies. A clinical efficacy study may be necessary to address residual uncertainty about the biosimilarity of the proposed product to the reference product and support a demonstration that there are no clinically meaningful differences. In this article, we propose a statistical strategy that takes into account the similarity evidence from analytical assessments and PK studies in the design and analysis of the clinical efficacy study in order to address residual uncertainty and enhance statistical power and precision. We assume that if the proposed biosimilar product and the reference product are shown to be highly similar with respect to the analytical and PK parameters, then they should also be similar with respect to the efficacy parameters. 
We show that the proposed methods provide correct control of the type I error and improve the power and precision of the efficacy study upon the standard analysis that disregards the prior evidence. We confirm and illustrate the theoretical results through simulation studies based on the biosimilars development experience of many different products.
Key Words: Bioequivalence; biological medicine; biosimilars; equivalence margins; rejection region; stepwise approach; totality of evidence
219	Wang X, Zhang P, Chiang CW, Wu H, Shen L, Ning X, Zeng D, Wang L, Quinney SK, Feng W, Li L. Mixture drug-count response model for the high-dimensional drug combinatory effect on myopathy. Stat Med 2017;37:673-686. [PMID: 29171062 DOI: 10.1002/sim.7545] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2017] [Revised: 09/21/2017] [Accepted: 10/05/2017] [Indexed: 01/24/2023] Abstract Drug-drug interactions (DDIs) are a common cause of adverse drug events (ADEs). The electronic medical record (EMR) database and the FDA's adverse event reporting system (FAERS) database are the major data sources for mining and testing the ADE associated DDI signals. Most DDI data mining methods focus on pair-wise drug interactions, and methods to detect high-dimensional DDIs in medical databases are lacking. In this paper, we propose 2 novel mixture drug-count response models for detecting high-dimensional drug combinations that induce myopathy. The "count" indicates the number of drugs in a combination. One model is called fixed probability mixture drug-count response model with a maximum risk threshold (FMDRM-MRT). The other model is called count-dependent probability mixture drug-count response model with a maximum risk threshold (CMDRM-MRT), in which the mixture probability is count dependent. Compared with the previous mixture drug-count response model (MDRM) developed by our group, these 2 new models show a better likelihood in detecting high-dimensional drug combinatory effects on myopathy. CMDRM-MRT identified and validated (54; 374; 637; 442; 131) 2-way to 6-way drug interactions, respectively, which induce myopathy in both EMR and FAERS databases. We further demonstrate FAERS data capture much higher maximum myopathy risk than EMR data do. 
The consistency of the 2 mixture models' parameters and local false discovery rate estimates is evaluated through statistical simulation studies.
Key Words: FDA's adverse event reporting system; drug-count response model; electronic medical record; high-dimensional drug interactions; myopathy
220	Jung HK, Kuzmiak CM, Kim KW, Choi NM, Kim HJ, Langman EL, Yoon S, Steen D, Zeng D, Gao F. Potential Use of American College of Radiology BI-RADS Mammography Atlas for Reporting and Assessing Lesions Detected on Dedicated Breast CT Imaging: Preliminary Study. Acad Radiol 2017;24:1395-1401. [PMID: 28728854 DOI: 10.1016/j.acra.2017.06.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/05/2016] [Revised: 05/11/2017] [Accepted: 06/08/2017] [Indexed: 01/20/2023] Abstract RATIONALE AND OBJECTIVES Dedicated breast computed tomography (DBCT) is an emerging and promising modality for breast lesions. The objective of this study was to evaluate the potential use of applying the BI-RADS Mammography Atlas 5th Edition for reporting and assessing breast lesions on DBCT. Currently, no atlas exists for DBCT. MATERIALS AND METHODS Four radiologists trained in breast imaging were recruited in this institutional review board-approved, Health Insurance Portability and Accountability Act-compliant study. The enrolled radiologists, who were blinded to mammographic and histopathologic findings, individually reviewed 30 randomized DBCT cases that contained marked lesions. Thirty-four lesions were included in this study: 24 (70.6%) masses, 7 (20.6%) calcifications, and 3 (8.8%) architectural distortions. Eight (23.5%) lesions were malignant and 26 (76.5%) were benign. The reader was asked to specify according to the BI-RADS Mammography Atlas for each marked DBCT lesion: primary findings, features, breast density, and final assessment. We calculated readers' diagnostic performances for differentiating between benign and malignant lesions and interobserver variability for reporting and assessing lesions using a generalized estimating equation and the Fleiss kappa (κ) statistic. 
RESULTS The estimated overall sensitivity of the readers was 0.969, and the specificity was 0.529. There were no significant differences in the sensitivity and the specificity between lesion types. For reporting the presence of a primary finding, overall agreement was substantial (κ = 0.70). In assigning the breast density and the final assessment, the overall agreement was moderate (κ = 0.53) and fair (κ = 0.30), respectively. CONCLUSION The use of the BI-RADS Mammography Atlas 5th Edition for DBCT showed high performance and good agreement among readers.
Key Words: BI-RADS; Breast neoplasm; breast CT; mammography
MESH Headings: Adult; Breast/diagnostic imaging; Breast Density; Breast Neoplasms/diagnostic imaging; Breast Neoplasms/pathology; Calcinosis/diagnostic imaging; Female; Humans; Mammography/standards; Middle Aged; Observer Variation; Sensitivity and Specificity; Single-Blind Method; Tomography, X-Ray Computed
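The agreement labels used in entry 220 (substantial, moderate, fair) match the conventional Landis and Koch bands for κ. As a minimal sketch of how a Fleiss kappa is computed when several readers rate the same lesions, the Python below works on made-up ratings; the function names and the toy data are illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss kappa for a list of subjects, each a list of category labels.

    Every subject must be rated by the same number of raters.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    categories = sorted({label for row in ratings for label in row})
    # counts[i][j] = number of raters who put subject i in category j
    counts = [[Counter(row)[c] for c in categories] for row in ratings]

    # Observed agreement: mean over subjects of the pairwise agreement rate.
    per_subject = [
        (sum(k * k for k in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_subject) / n_subjects

    # Chance agreement from the overall category proportions.
    total = n_subjects * n_raters
    proportions = [sum(row[j] for row in counts) / total
                   for j in range(len(categories))]
    p_chance = sum(p * p for p in proportions)

    if p_chance == 1.0:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_observed - p_chance) / (1.0 - p_chance)


def landis_koch_label(kappa):
    """Conventional verbal label for a kappa value (Landis and Koch bands)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical data: 4 readers assign a final assessment to 3 lesions.
toy = [["benign"] * 4,
       ["malignant"] * 4,
       ["benign", "benign", "malignant", "malignant"]]
print(round(fleiss_kappa(toy), 3), landis_koch_label(0.70))
```

With these bands, the reported values 0.70, 0.53, and 0.30 map to substantial, moderate, and fair, which is how the abstract describes them.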
221	Qiu X, Zeng D, Wang Y. Estimation and evaluation of linear individualized treatment rules to guarantee performance. Biometrics 2017;74:517-528. [PMID: 28960239 DOI: 10.1111/biom.12773] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2017] [Revised: 08/01/2017] [Accepted: 08/01/2017] [Indexed: 11/30/2022] Abstract In clinical practice, an informative and practically useful treatment rule should be simple and transparent. However, because simple rules are likely to be far from optimal, effective methods to construct such rules must guarantee performance, in terms of yielding the best clinical outcome (highest reward) among the class of simple rules under consideration. Furthermore, it is important to evaluate the benefit of the derived rules on the whole sample and in pre-specified subgroups (e.g., vulnerable patients). To achieve both goals, we propose a robust machine learning method to estimate a linear treatment rule that is guaranteed to achieve optimal reward among the class of all linear rules. We then develop a diagnostic measure and inference procedure to evaluate the benefit of the obtained rule and compare it with the rules estimated by other methods. We provide theoretical justification for the proposed method and its inference procedure, and we demonstrate via simulations its superior performance when compared to existing methods. Lastly, we apply the method to the Sequenced Treatment Alternatives to Relieve Depression (STARD) trial on major depressive disorder and show that the estimated optimal linear rule provides a large benefit for mildly depressed and severely depressed patients but manifests a lack-of-fit for moderately depressed patients. 
Key Words: Dynamic treatment regime; Machine learning; Qualitative interaction; Robust loss function; Treatment response heterogeneity
222	Zeng D, Gao F, Lin DY. Maximum likelihood estimation for semiparametric regression models with multivariate interval-censored data. Biometrika 2017;104:505-525. [PMID: 29391606 PMCID: PMC5787874 DOI: 10.1093/biomet/asx029] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2016] [Indexed: 11/13/2022] Open Abstract Interval-censored multivariate failure time data arise when there are multiple types of failure or there is clustering of study subjects and each failure time is known only to lie in a certain interval. We investigate the effects of possibly time-dependent covariates on multivariate failure times by considering a broad class of semiparametric transformation models with random effects, and we study nonparametric maximum likelihood estimation under general interval-censoring schemes. We show that the proposed estimators for the finite-dimensional parameters are consistent and asymptotically normal, with a limiting covariance matrix that attains the semiparametric efficiency bound and can be consistently estimated through profile likelihood. In addition, we develop an EM algorithm that converges stably for arbitrary datasets. Finally, we assess the performance of the proposed methods in extensive simulation studies and illustrate their application using data derived from the Atherosclerosis Risk in Communities Study.
Key Words: Current-status data; EM algorithm; Multivariate failure time data; Nonparametric likelihood; Profile likelihood; Proportional hazards; Proportional odds; Random effects
Grants: R01 CA082659 NCI NIH HHS; R37 AI029168 NIAID NIH HHS; R01 GM047845 NIGMS NIH HHS; R01 NS073671 NINDS NIH HHS; P01 CA142538 NCI NIH HHS; R01 HL149683 NHLBI NIH HHS
223	Gao F, Dong J, Zeng D, Rong A, Ibrahim JG. Pattern mixture models for clinical validation of biomarkers in the presence of missing data. Stat Med 2017;36:2994-3004. [PMID: 28464562 DOI: 10.1002/sim.7328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2016] [Revised: 02/16/2017] [Accepted: 04/07/2017] [Indexed: 11/09/2022] Abstract Targeted therapies for cancers are sometimes only effective in a subset of patients with a particular biomarker status. In clinical development, the biomarker status is typically determined by an investigational-use-only/laboratory-developed test. A market ready test (MRT) is developed later to meet regulatory requirements and for future commercial use. In the USA, the clinical validation of MRT showing efficacy and safety profile of the targeted therapy in the biomarker subgroups determined by MRT is needed for pre-market approval. One of the major challenges in carrying out clinical validation is that the biomarker status per MRT is often missing for many subjects. In this paper, we treat biomarker status as a missing covariate and develop a novel pattern mixture model in the setting of a proportional hazards model for the time-to-event outcome variable. We specify a multinomial regression model for the missing biomarker statuses, and develop an expectation-maximization algorithm by the Method of Weights (Ibrahim, Journal of the American Statistical Association, 1990) to estimate the parameters in the regression model. We use Louis' formula (Louis, Journal of the Royal Statistical Society. Series B, 1982) to obtain standard errors estimates. We examine the performance of our method in extensive simulation studies and apply our method to a clinical trial in metastatic colorectal cancer. Copyright © 2017 John Wiley & Sons, Ltd. 
Key Words: clinical trials; companion diagnostics; missing data
224	Gao F, Liu GF, Zeng D, Xu L, Lin B, Diao G, Golm G, Heyse JF, Ibrahim JG. Control-based imputation for sensitivity analyses in informative censoring for recurrent event data. Pharm Stat 2017;16:424-432. [DOI: 10.1002/pst.1821] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2016] [Revised: 05/11/2017] [Accepted: 07/10/2017] [Indexed: 11/08/2022]
225	Zeng Y, Zeng D, Zhang Y, Ni XQ, Wang J, Jian P, Zhou Y, Li Y, Yin ZQ, Pan KC, Jing B. Lactobacillus plantarum BS22 promotes gut microbial homeostasis in broiler chickens exposed to aflatoxin B1. J Anim Physiol Anim Nutr (Berl) 2017;102:e449-e459. [DOI: 10.1111/jpn.12766] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2017] [Accepted: 05/15/2017] [Indexed: 11/30/2022]
226	Byrne C, Ursin G, Martin CF, Peck JD, Cole EB, Zeng D, Kim E, Yaffe MD, Boyd NF, Heiss G, McTiernan A, Chlebowski RT, Lane DS, Manson JE, Wactawski-Wende J, Pisano ED. Mammographic Density Change With Estrogen and Progestin Therapy and Breast Cancer Risk. J Natl Cancer Inst 2017;109:3064857. [PMID: 28376149 DOI: 10.1093/jnci/djx001] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2014] [Accepted: 01/06/2017] [Indexed: 02/04/2023] Open Abstract Background Estrogen plus progestin therapy increases both mammographic density and breast cancer incidence. Whether mammographic density change associated with estrogen plus progestin initiation predicts breast cancer risk is unknown. Methods We conducted an ancillary nested case-control study within the Women's Health Initiative trial that randomly assigned postmenopausal women to daily conjugated equine estrogen 0.625 mg plus medroxyprogesterone acetate 2.5 mg or placebo. Mammographic density was assessed from mammograms taken prior to and one year after random assignment for 174 women who later developed breast cancer (cases) and 733 healthy women (controls). Logistic regression analyses included adjustment for confounders and baseline mammographic density when appropriate. Results Among women in the estrogen plus progestin arm (97 cases/378 controls), each 1% positive change in percent mammographic density increased breast cancer risk 3% (odds ratio [OR] = 1.03, 95% confidence interval [CI] = 1.01 to 1.06). For women in the highest quintile of mammographic density change (>19.3% increase), breast cancer risk increased 3.6-fold (95% CI = 1.52 to 8.56). The effect of estrogen plus progestin use on breast cancer risk (OR = 1.28, 95% CI = 0.90 to 1.82) was eliminated in this study, after adjusting for change in mammographic density (OR = 1.00, 95% CI = 0.66 to 1.51). 
Conclusions We found the one-year change in mammographic density after estrogen plus progestin initiation predicted subsequent increase in breast cancer risk. All of the increased risk from estrogen plus progestin use was mediated through mammographic density change. Doctors should evaluate changes in mammographic density with women who initiate estrogen plus progestin therapy and discuss the breast cancer risk implications.
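For entry 226, the per-1% odds ratio can be scaled to a larger density change if one assumes, as a logistic model with a linear density-change term does, that the log-odds are linear in the change. The sketch below is illustrative only: the function name is hypothetical, the 10-point change is an arbitrary example rather than a figure from the abstract, and the inputs are the rounded values reported there, so the result is approximate.

```python
def scale_odds_ratio(or_per_unit: float, delta: float) -> float:
    """Odds ratio for a change of `delta` units, assuming log-linearity.

    exp(beta * delta) equals exp(beta) ** delta, so the per-unit odds
    ratio is simply raised to the power `delta`.
    """
    return or_per_unit ** delta


# Reported per-1% estimate and 95% CI from the abstract: 1.03 (1.01 to 1.06).
point = scale_odds_ratio(1.03, 10)
lower = scale_odds_ratio(1.01, 10)
upper = scale_odds_ratio(1.06, 10)
print(f"{point:.2f} ({lower:.2f} to {upper:.2f})")  # prints "1.34 (1.10 to 1.79)"
```

The separately reported 3.6-fold increase for the highest quintile comes from a categorical comparison, so it need not agree with this continuous extrapolation.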
227	Hardy ST, Holliday KM, Chakladar S, Engeda JC, Allen NB, Heiss G, Lloyd-Jones DM, Schreiner PJ, Shay CM, Lin D, Zeng D, Avery CL. Heterogeneity in Blood Pressure Transitions Over the Life Course: Age-Specific Emergence of Racial/Ethnic and Sex Disparities in the United States. JAMA Cardiol 2017;2:653-661. [PMID: 28423153 PMCID: PMC5634332 DOI: 10.1001/jamacardio.2017.0652] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 01/13/2023] Abstract Importance Many studies have assessed racial/ethnic and sex disparities in the prevalence of elevated blood pressure (BP) from childhood to adulthood, yet few have examined differences in age-specific transitions between categories of BP over the life course in contemporary, multiracial/multiethnic populations. Objective To estimate age, racial/ethnic, and sex-specific annual net transition probabilities between categories of BP using Markov modeling of cross-sectional data from the National Health and Nutrition Examination Survey. Design, Setting, and Participants National probability sample (National Health and Nutrition Examination Survey in 2007-2008, 2009-2010, and 2011-2012) of 17 747 African American, white American, and Mexican American participants aged 8 to 80 years. The data were analyzed from September 2014 to November 2015. Main Outcomes and Measures Age-specific American Heart Association-defined BP categories. Results Three National Health and Nutrition Examination Survey cross-sectional samples were used to characterize the ages at which self-reported African American (n = 4973), white American (n = 8886), and Mexican American (n = 3888) populations transitioned between ideal BP, prehypertension, and hypertension across the life course. 
At age 8 years, disparities in the prevalence of ideal BP were observed, with the prevalence being lower among boys (86.6%-88.8%) compared with girls (93.0%-96.3%). From ages 8 to 30 years, annual net transition probabilities from ideal to prehypertension among male individuals were more than 2 times the net transition probabilities of their female counterparts. The largest net transition probabilities for ages 8 to 30 years occurred in African American young men, among whom a net 2.9% (95% CI, 2.3%-3.4%) of those with ideal BP transitioned to prehypertension 1 year later. Mexican American young women aged 8 to 30 years experienced the lowest ideal to prehypertension net transition probabilities (0.6%; 95% CI, 0.3%-0.8%). After age 40 years, ideal to prehypertension net transition probabilities stabilized or decreased (range, 3.0%-4.5%) for men, whereas net transition probabilities for women increased rapidly (range, 2.6%-13.0%). Mexican American women exhibited the largest ideal to prehypertension net transition probabilities after age 60 years. The largest prehypertension to hypertension net transition probabilities occurred at young ages in boys of white race/ethnicity and African Americans, approximately age 8 years and age 25 years, respectively, while net transition probabilities for white women and Mexican Americans increased over the life course. Conclusions and Relevance Heterogeneity in net transition probabilities from ideal BP emerge during childhood, with associated rapid declines in ideal BP observed in boys and African Americans, thus introducing disparities. Primordial prevention beginning in childhood and into early adulthood is necessary to preempt the development of prehypertension and hypertension, as well as associated racial/ethnic and sex disparities. 
MESH Headings: Adolescent; Adult; Black or African American/statistics & numerical data; Age Factors; Aged; Aged, 80 and over; Child; Cross-Sectional Studies; Disease Progression; Ethnicity/statistics & numerical data; Female; Health Status Disparities; Humans; Hypertension/epidemiology; Hypertension/physiopathology; Male; Markov Chains; Mexican Americans/statistics & numerical data; Middle Aged; Prehypertension/epidemiology; Prehypertension/physiopathology; Sex Factors; United States/epidemiology; White People/statistics & numerical data; Young Adult
Grants: T32 ES007018 NIEHS NIH HHS; K99 HL098458 NHLBI NIH HHS; HHSN268201300027C NHLBI NIH HHS; HHSN268201300029C NHLBI NIH HHS; HHSN268201300025C NIA NIH HHS; HHSN268201300026C NHLBI NIH HHS; P30 ES010126 NIEHS NIH HHS; R21 HL121580 NHLBI NIH HHS; T32 DK007750 NIDDK NIH HHS; HHSN268200900041C NHLBI NIH HHS; HHSN268201300028C NHLBI NIH HHS; P2C HD050924 NICHD NIH HHS; T32 HL007055 NHLBI NIH HHS
228	Chen W, Zeng D, Desai A, Badillo M, Feng L, Yan F, Nomie K, Ping L, Ye H, Liang Y, Lee H, Oki Y, Romaguera J, Wang M. Improved outcome for patients with relapsed/refractory mantle cell lymphoma (MCL) who stop ibrutinib +/− rituximab for reasons other than progression of disease. Hematol Oncol 2017. [DOI: 10.1002/hon.2439_114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
229	Diao G, Zeng D, Hu K, Ibrahim JG. Modeling event count data in the presence of informative dropout with application to bleeding and transfusion events in myelodysplastic syndrome. Stat Med 2017;36:3475-3494. [PMID: 28560768 DOI: 10.1002/sim.7351] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2016] [Revised: 04/01/2017] [Accepted: 05/05/2017] [Indexed: 11/05/2022] Abstract In many biomedical studies, it is often of interest to model event count data over the study period. Some patients cannot be followed up for the entire study period owing to informative dropout. The dropout time can potentially provide valuable insight on the rate of the events. We propose a joint semiparametric model for event count data and informative dropout time that allows for correlation through a Gamma frailty. We develop efficient likelihood-based estimation and inference procedures. The proposed nonparametric maximum likelihood estimators are shown to be consistent and asymptotically normal. Furthermore, the asymptotic covariances of the finite-dimensional parameter estimates attain the semiparametric efficiency bound. Extensive simulation studies demonstrate that the proposed methods perform well in practice. We illustrate the proposed methods through an application to a clinical trial for bleeding and transfusion events in myelodysplastic syndrome. Copyright © 2017 John Wiley & Sons, Ltd.
Key Words: Cox model; Poisson regression model; informative dropout; nonparametric maximum likelihood estimators; semiparametric efficiency
230	Gao F, Zeng D, Lin DY. Semiparametric estimation of the accelerated failure time model with partly interval-censored data. Biometrics 2017;73:1161-1168. [PMID: 28444688 DOI: 10.1111/biom.12700] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2016] [Revised: 03/01/2017] [Accepted: 03/01/2017] [Indexed: 11/29/2022] Abstract Partly interval-censored (PIC) data arise when some failure times are exactly observed while others are only known to lie within certain intervals. In this article, we consider efficient semiparametric estimation of the accelerated failure time (AFT) model with PIC data. We first generalize the Buckley-James estimator for right-censored data to PIC data. Then, we develop a one-step estimator by deriving and estimating the efficient score for the regression parameters. We show that under mild regularity conditions the generalized Buckley-James estimator is consistent and asymptotically normal and the one-step estimator is consistent and asymptotically normal with a covariance matrix that attains the semiparametric efficiency bound. We conduct extensive simulation studies to examine the performance of the proposed estimators in finite samples and apply our methods to data derived from an AIDS study.
Key Words: Bootstrap; Buckley-James estimator; Kernel estimation; One-step estimator; Semiparametric efficiency; Survival data
231	Holliday KM, Lin DY, Chakladar S, Castañeda SF, Daviglus ML, Evenson KR, Marquez DX, Qi Q, Shay CM, Sotres-Alvarez D, Vidot DC, Zeng D, Avery CL. Targeting physical activity interventions for adults: When should intervention occur? Prev Med 2017;97:13-18. [PMID: 28024863 PMCID: PMC5337155 DOI: 10.1016/j.ypmed.2016.12.036] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/01/2016] [Revised: 11/29/2016] [Accepted: 12/21/2016] [Indexed: 11/25/2022] Abstract Understanding demographic differences in transitions across physical activity (PA) levels is important for informing PA-promoting interventions, yet few studies have examined these transitions in contemporary multi-ethnic adult populations. We estimated age-, race/ethnicity-, and sex-specific 1-year net transition probabilities (NTPs) for National Health and Nutrition Examination Survey (2007-2012, n=11,556) and Hispanic Community Health Study/Study of Latinos (2008-2011, n=15,585) adult participants using novel Markov-type state transition models developed for cross-sectional data. Among populations with ideal PA (≥150min/week; ranging from 56% (non-Hispanic black females) to 88% (non-Hispanic white males) at age 20), NTPs to intermediate PA (>0-<149min/week) generally increased with age, particularly for non-Hispanic black females for whom a net 0.0% (95% confidence interval (CI): 0.0, 0.2) transitioned from ideal to intermediate PA at age 20; by age 70, the NTP rose to 3.6% (95% CI: 2.3, 4.8). Heterogeneity in intermediate to poor (0min/week) PA NTPs also was observed, with NTPs peaking at age 20 for Hispanic/Latino males and females [age 20 NTP=3.7% (95% CI: 2.0, 5.5) for females and 5.0% (1.2, 8.7) for males], but increasing throughout adulthood for non-Hispanic blacks and whites [e.g. 
age 70 NTP=7.8% (95% CI: 6.1, 9.6%) for black females and 8.1% (4.7, 11.6) for black males]. Demographic differences in PA net transitions across adulthood justify further development of tailored interventions. However, innovative efforts may be required for populations in which large proportions have already transitioned from ideal PA by early adulthood.
Key Words: Minority health; Physical activity; Population-based planning
232	Wang Y, Fu H, Zeng D. Learning Optimal Personalized Treatment Rules in Consideration of Benefit and Risk: with an Application to Treating Type 2 Diabetes Patients with Insulin Therapies. J Am Stat Assoc 2017;113:1-13. [PMID: 30034060 PMCID: PMC6051551 DOI: 10.1080/01621459.2017.1303386] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2015] [Revised: 01/01/2017] [Indexed: 12/26/2022] Abstract Individualized medical decision making is often complex due to patient treatment response heterogeneity. Pharmacotherapy may exhibit distinct efficacy and safety profiles for different patient populations. An "optimal" treatment that maximizes clinical benefit for a patient may also lead to concern of safety due to a high risk of adverse events. Thus, to guide individualized clinical decision making and deliver optimal tailored treatments, maximizing clinical benefit should be considered in the context of controlling for potential risk. In this work, we propose two approaches to identify personalized optimal treatment strategy that maximizes clinical benefit under a constraint on the average risk. We derive the theoretical optimal treatment rule under the risk constraint and draw an analogy to the Neyman-Pearson lemma to prove the theorem. We present algorithms that can be easily implemented by any off-the-shelf quadratic programming package. We conduct extensive simulation studies to show satisfactory risk control when maximizing the clinical benefit. Lastly, we apply our method to a randomized trial of type 2 diabetes patients to guide optimal utilization of the first line insulin treatments based on individual patient characteristics while controlling for the rate of hypoglycemia events. 
We identify baseline glycated hemoglobin level, body mass index, and fasting blood glucose as three key factors among 18 biomarkers to differentiate treatment assignments, and demonstrate a successful control of the risk of hypoglycemia in both the training and testing data set.
Key Words: Benefit Risk Analysis; Hypoglycemia; Machine Learning; Neyman-Pearson Lemma; Personalized Medicine
Grants: R37 GM047845 NIGMS NIH HHS; R01 GM047845 NIGMS NIH HHS; R01 NS073671 NINDS NIH HHS; R01 GM124104 NIGMS NIH HHS; U01 NS082062 NINDS NIH HHS
233	Diao G, Zeng D, Ibrahim JG, Rong A, Lee O, Zhang K, Chen Q. Statistical design of noninferiority multiple region clinical trials to assess global and consistent treatment effects. J Biopharm Stat 2017;27:933-944. [PMID: 28296570 DOI: 10.1080/10543406.2017.1293075] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022] Abstract Noninferiority multiregional clinical trials (MRCTs) have recently received increasing attention in drug development. While a major goal in an MRCT is to estimate the global treatment effect, it is also important to assess the consistency of treatment effects across multiple regions. In this paper, we propose an intuitive definition of consistency of noninferior treatment effects across regions under the random-effects modeling framework. Specifically, we quantify the consistency of treatment effects by the percentage of regions that meet a predefined treatment margin. This new approach enables us to achieve both goals in one modeling framework. We propose to use a signed likelihood ratio test for testing the global treatment effect and the consistency of noninferior treatment effects. In addition, we provide guidelines for the allocation rule to achieve optimal power for testing consistency among multiple regions. Extensive simulation studies are conducted to examine the performance of the proposed methodology. An application to a real data example is provided.
Key Words: Consistency of treatment effects; global treatment effect; multiregional clinical trial; noninferiority clinical trial; random effects model; signed likelihood ratio test
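The consistency measure described in entry 233 can be illustrated under a simple normal random-effects assumption. If regional effects follow theta_i ~ N(mu, tau^2) and a region meets the margin when theta_i > -delta, the expected share of such regions is Phi((mu + delta) / tau). This parameterisation, the sign convention, and the function name are assumptions made for illustration; the abstract does not state the exact model form.

```python
from statistics import NormalDist


def share_of_regions_meeting_margin(mu: float, tau: float, delta: float) -> float:
    """P(theta_i > -delta) when theta_i ~ N(mu, tau^2).

    mu    : global treatment effect (assumed: larger is better)
    tau   : between-region standard deviation, must be positive
    delta : noninferiority margin, expressed as a non-negative number
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    regional_effect = NormalDist(mu=mu, sigma=tau)
    return 1.0 - regional_effect.cdf(-delta)


# Hypothetical values: no global difference, margin equal to one between-region SD.
print(round(share_of_regions_meeting_margin(mu=0.0, tau=1.0, delta=1.0), 3))
```

Under this reading, consistency improves as the global effect or the margin grows relative to the between-region spread.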
234	Deng Y, Zeng D, Zhao J, Cai J. Proportional hazards model with a change point for clustered event data. Biometrics 2017;73:835-845. [PMID: 28257142 DOI: 10.1111/biom.12655] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 11/01/2015] [Revised: 12/01/2016] [Accepted: 12/01/2016] [Indexed: 11/30/2022] Abstract: In many epidemiology studies, family data with survival endpoints are collected to investigate the association between risk factors and disease incidence. Sometimes the risk of the disease may change when a certain risk factor exceeds a certain threshold. Finding this threshold value could be important for disease risk prediction and disease prevention. In this work, we propose a change-point proportional hazards model for clustered event data. The model incorporates the unknown threshold of a continuous variable as a change point in the regression. The marginal pseudo-partial likelihood functions are maximized for estimating the regression coefficients and the unknown change point. We develop a supremum test based on robust score statistics to test the existence of the change point. The inference for the change point is based on the m out of n bootstrap. We establish the consistency and asymptotic distributions of the proposed estimators. The finite-sample performance of the proposed method is demonstrated via extensive simulation studies. Finally, the Strong Heart Family Study dataset is analyzed to illustrate the methods. Key Words: Change point; Clustered event; Proportional hazards model; m out of n bootstrap.
235	Tao R, Zeng D, Lin DY. Efficient Semiparametric Inference Under Two-Phase Sampling, With Applications to Genetic Association Studies. J Am Stat Assoc 2017;112:1468-1476. [PMID: 29479125 DOI: 10.1080/01621459.2017.1295864] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 01/22/2023] Abstract: In modern epidemiological and clinical studies, the covariates of interest may involve genome sequencing, biomarker assay, or medical imaging and thus are prohibitively expensive to measure on a large number of subjects. A cost-effective solution is the two-phase design, under which the outcome and inexpensive covariates are observed for all subjects during the first phase and that information is used to select subjects for measurements of expensive covariates during the second phase. For example, subjects with extreme values of quantitative traits were selected for whole-exome sequencing in the National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP). Herein, we consider general two-phase designs, where the outcome can be continuous or discrete, and inexpensive covariates can be continuous and correlated with expensive covariates. We propose a semiparametric approach to regression analysis by approximating the conditional density functions of expensive covariates given inexpensive covariates with B-spline sieves. We devise a computationally efficient and numerically stable EM-algorithm to maximize the sieve likelihood. In addition, we establish the consistency, asymptotic normality, and asymptotic efficiency of the estimators. Furthermore, we demonstrate the superiority of the proposed methods over existing ones through extensive simulation studies. Finally, we present applications to the aforementioned NHLBI ESP.
Key Words: Biased sampling; EM algorithm; Genome sequencing; Response-selective sampling; Semiparametric efficiency; Sieve approximation.
236	Mao L, Lin DY, Zeng D. Semiparametric regression analysis of interval-censored competing risks data. Biometrics 2017;73:857-865. [PMID: 28211951 DOI: 10.1111/biom.12664] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/01/2016] [Revised: 11/01/2016] [Accepted: 12/01/2016] [Indexed: 11/30/2022] Abstract: Interval-censored competing risks data arise when each study subject may experience an event or failure from one of several causes and the failure time is not observed directly but rather is known to lie in an interval between two examinations. We formulate the effects of possibly time-varying (external) covariates on the cumulative incidence or sub-distribution function of competing risks (i.e., the marginal probability of failure from a specific cause) through a broad class of semiparametric regression models that captures both proportional and non-proportional hazards structures for the sub-distribution. We allow each subject to have an arbitrary number of examinations and accommodate missing information on the cause of failure. We consider nonparametric maximum likelihood estimation and devise a fast and stable EM-type algorithm for its computation. We then establish the consistency, asymptotic normality, and semiparametric efficiency of the resulting estimators for the regression parameters by appealing to modern empirical process theory. In addition, we show through extensive simulation studies that the proposed methods perform well in realistic situations. Finally, we provide an application to a study on HIV-1 infection with different viral subtypes. Key Words: Cumulative incidence; Interval censoring; Nonparametric maximum likelihood estimation; Self-consistency algorithm; Time-varying covariates; Transformation models.
237	Chen H, Zeng D, Wang Y. Penalized nonlinear mixed effects model to identify biomarkers that predict disease progression. Biometrics 2017;73:1343-1354. [PMID: 28182831 DOI: 10.1111/biom.12663] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 01/01/2016] [Revised: 12/01/2016] [Accepted: 01/01/2017] [Indexed: 12/29/2022] Abstract: Precise modeling of disease progression in neurodegenerative disorders may enable early intervention before clinical manifestation of a disease, which is crucial since early intervention at the premanifest stage is expected to be more effective. Neuroimaging biomarkers are indicative of the underlying disease pathology and may be used to predict future disease occurrence at the premanifest stage. As observed in many pivotal studies, longitudinal measurements of clinical outcomes, such as motor or cognitive symptoms, often present nonlinear sigmoid shapes over time, where the inflection points of the trajectories mark a meaningful time in disease progression. Therefore, to identify neuroimaging biomarkers predicting disease progression, we propose a nonlinear mixed effects model based on a sigmoid function to predict longitudinal clinical outcomes, and associate a linear combination of neuroimaging biomarkers with subject-specific inflection points. Based on an expectation-maximization (EM) algorithm, we propose a method that can fit a nonlinear model with many potentially correlated biomarkers for random inflection points while achieving computational stability. Variable selection is introduced in the algorithm in order to identify important biomarkers of disease progression and to reduce prediction variability.
We apply the proposed method to the data from the Predictors of Huntington's Disease study to select brain subcortical regional volumes predictive of the inflection points of the motor and cognitive function trajectories. Our results reveal that brain atrophy in the striatum and expansion of the ventricular system are highly predictive of the inflection points. Furthermore, these inflection points may precede clinically defined disease onset by as early as a decade and thus may be useful biomarkers as early signs of Huntington's Disease onset. Key Words: EM algorithm; Inflection point; Neuroimaging biomarkers; Nonlinear mixed model; Sigmoid function; Variable selection.
238	Gao F, Liu G, Zeng D, Diao G, Heyse JF, Ibrahim JG. On inference of control-based imputation for analysis of repeated binary outcomes with missing data. J Biopharm Stat 2017;27:358-372. [PMID: 28287873 DOI: 10.1080/10543406.2017.1289957] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022] Abstract: Missing data are common in longitudinal clinical trials. How to handle missing data is critical for both sponsors and regulatory agencies to assess treatment effect from the trials. Recently, a control-based imputation has been proposed, where the missing data are imputed based on the assumption that patients who discontinued the test drug will have a similar response profile to the patients in the control group. Under control-based imputation, variance estimation using Rubin's formula may be biased, which could produce biased statistical inferences. We evaluate several statistical methods for obtaining appropriate variances under control-based imputation for analysis of repeated binary outcomes with monotone missing data and show that both the analytical method developed by Robins & Wang and the nonparametric bootstrap method provide more appropriate variance estimates under various simulation settings. We use the methods in an application of an antidepressant Phase III clinical trial and give discussion and recommendations on method performance and preference. Key Words: Bootstrap; control-based imputation; missing data; multiple imputation; repeated outcomes.
239	Li F, Zhang Y, Zeng D, Xia Y, Fan X, Tan Y, Kou J, Yu B. The Combination of Three Components Derived from Sheng MaiSan Protects Myocardial Ischemic Diseases and Inhibits Oxidative Stress via Modulating MAPKs and JAK2-STAT3 Signaling Pathways Based on Bioinformatics Approach. Front Pharmacol 2017;8:21. [PMID: 28197101 PMCID: PMC5282471 DOI: 10.3389/fphar.2017.00021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/25/2016] [Accepted: 01/11/2017] [Indexed: 01/25/2023] Open Abstract: GRS is a drug combination of three components including ginsenoside Rb1, ruscogenin and schisandrin. It is derived from the well-known TCM formula Sheng MaiSan, a traditional Chinese medicine widely used in the clinic for the treatment of cardiovascular diseases. The present study illuminates its underlying mechanisms against myocardial ischemic diseases based on the combined methods of bioinformatic prediction and experimental verification. A protein database was established through constructing the drug-protein network, and the target-pathway interaction network clustered the potential signaling pathways and targets of GRS in the treatment of myocardial ischemic diseases. Several target proteins, such as NFKB1, STAT3 and MAPK14, were identified as the candidate key proteins, and the MAPKs and JAK-STAT signaling pathways were suggested as the most related pathways, which was in accordance with the gene ontology analysis. Then, the predictive results were further validated and we found that GRS treatment alleviated hypoxia/reoxygenation (H/R)-induced cardiomyocyte injury via suppression of MDA levels and ROS generation, and the potential mechanisms might be related to the suppression of activation of the MAPKs and JAK2-STAT3 signaling pathways.
In conclusion, our results offer evidence that GRS attenuates myocardial ischemia injury via regulating oxidative stress and the MAPKs and JAK2-STAT3 signaling pathways, which supplies some new insights for the prevention and treatment of myocardial ischemic diseases. Key Words: GRS; JAK-STAT signaling pathway; MAPKs signaling pathway; Sheng MaiSan; bioinformatics approach; myocardial ischemic diseases; oxidative stress.
240	Chen G, Zeng D, Kosorok MR. Personalized Dose Finding Using Outcome Weighted Learning. J Am Stat Assoc 2017;111:1509-1521. [PMID: 28255189 PMCID: PMC5327863 DOI: 10.1080/01621459.2016.1148611] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Received: 12/01/2014] [Revised: 12/01/2015] [Indexed: 10/22/2022] Abstract: In dose-finding clinical trials, it is becoming increasingly important to account for individual level heterogeneity while searching for optimal doses to ensure an optimal individualized dose rule (IDR) maximizes the expected beneficial clinical outcome for each individual. In this paper, we advocate a randomized trial design where candidate dose levels assigned to study subjects are randomly chosen from a continuous distribution within a safe range. To estimate the optimal IDR using such data, we propose an outcome weighted learning method based on a nonconvex loss function, which can be solved efficiently using a difference of convex functions algorithm. The consistency and convergence rate for the estimated IDR are derived, and its small-sample performance is evaluated via simulation studies. We demonstrate that the proposed method outperforms competing approaches. Finally, we illustrate this method using data from a cohort study for Warfarin (an anti-thrombotic drug) dosing. Key Words: DC Algorithm; Dose Finding; Individualized Dose Rule; Risk Bound; Weighted Support Vector Regression. Grants: R01 CA082659 NCI NIH HHS; R01 GM047845 NIGMS NIH HHS; R01 NS073671 NINDS NIH HHS; P01 CA142538 NCI NIH HHS; U01 NS082062 NINDS NIH HHS.
241	Chen G, Zeng D, Kosorok MR. Rejoinder. J Am Stat Assoc 2017. [DOI: 10.1080/01621459.2016.1250573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
242	Song R, Luo S, Zeng D, Zhang HH, Lu W, Li Z. Semiparametric Single-Index Model for Estimating Optimal Individualized Treatment Strategy. Electron J Stat 2017;11:364-384. [PMID: 28959371 PMCID: PMC5612500 DOI: 10.1214/17-ejs1226] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022] Abstract: Different from the standard treatment discovery framework which is used for finding single treatments for a homogeneous group of patients, personalized medicine involves finding therapies that are tailored to each individual in a heterogeneous group. In this paper, we propose a new semiparametric additive single-index model for estimating individualized treatment strategy. The model assumes a flexible and nonparametric link function for the interaction between treatment and predictive covariates. We estimate the rule via monotone B-splines and establish the asymptotic properties of the estimators. Both simulations and a real data application demonstrate that the proposed method has a competitive performance. Key Words: Personalized medicine; Semiparametric inference; Single index model. Grants: P01 CA142538 NCI NIH HHS; R01 GM047845 NIGMS NIH HHS; R01 NS073671 NINDS NIH HHS; U01 NS082062 NINDS NIH HHS.
243	Liang B, Tong X, Zeng D, Wang Y. Semiparametric regression analysis of repeated current status data. Stat Sin 2017;27:1079-1100. [PMID: 28959115 DOI: 10.5705/ss.202014.0153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Abstract: In many clinical studies, patients may be asked to report their medication adherence, presence of side effects, substance use, and hospitalization information during the study period. However, the exact occurrence time of these recurrent events may not be available due to privacy protection, recall difficulty, or incomplete medical records. Instead, the only available information is whether the events of interest have occurred during the past period. In this paper, we refer to these incomplete recurrent events as repeated current status data. Currently, there are no valid standard methods for this kind of data. We propose to use the Andersen-Gill proportional intensity assumption to analyze such data. Specifically, we propose a maximum sieve likelihood approach for inference and we show that the proposed estimators for regression coefficients are consistent, asymptotically normal and attain semiparametric efficiency bounds. Simulation studies show that the proposed approach performs well with small sample sizes. Finally, our method is applied to study medication adherence in a clinical trial on non-psychotic major depressive disorder. Key Words: Andersen-Gill model; Current status data; Recurrent events; Semiparametric efficiency; Sieve estimation.
244	Zhao Y, Yang C, Chen X, Peng M, Chen X, Zeng D. Effects of cryopreservation on ultrastructural morphology of white shrimp (Litopenaeus vannamei) sperm. Cryo Letters 2017;38:357-363. [PMID: 29734402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023] Abstract: BACKGROUND: The structural integrity of a spermatozoon is very important for the processes of fertilization and embryo development. OBJECTIVE: To provide valuable data for developing better cryopreservation techniques for shrimp sperm. MATERIALS AND METHODS: Using scanning electron microscopy and transmission electron microscopy, we examined the morphological alteration of Litopenaeus vannamei sperm after cryopreservation with different concentrations of dimethylsulfoxide (DMSO). RESULTS: We found that the damaged post-thaw sperm presented vesiculated acrosomal contents, wrinkled membranes, perforated membranes, or loss of the acrosomal spike. The seriously damaged sperm showed missing acrosomal spikes, deformed nuclei, burst nuclear membranes, and vacuolated nuclei. In addition, we found that the post-thaw sperm stored with 5% DMSO had the highest viability rate and lowest DNA damage coefficient by eosin-nigrosin staining and comet assay. CONCLUSION: Our results suggested that cryopreservation has deleterious effects on ultrastructural morphology of L. vannamei sperm, especially on acrosomal spikes and membranes. MESH Headings: Animals; Cell Survival/drug effects; Comet Assay; Cryopreservation/methods; DNA Damage/drug effects; Dimethyl Sulfoxide/pharmacology; Male; Microscopy, Fluorescence; Penaeidae/ultrastructure; Semen Preservation; Sperm Motility/drug effects; Spermatozoa/cytology; Spermatozoa/drug effects; Spermatozoa/ultrastructure.
245	Chen J, Liu Y, Zeng D, Song R, Zhao Y, Kosorok MR. Comment. J Am Stat Assoc 2016;111:942-947. [PMID: 28003710 DOI: 10.1080/01621459.2016.1200914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022] Abstract: Xu, Müller, Wahed, and Thall proposed a Bayesian model to analyze an acute leukemia study involving multi-stage chemotherapy regimes. We discuss two alternative methods, Q-learning and O-learning, to solve the same problem from the machine learning point of view. The numerical studies show that these methods can be flexible and have advantages in some situations to handle treatment heterogeneity while being robust to model misspecification. Key Words: Dynamic treatment regimes; Multi-stage chemotherapy regimes; O-learning; Q-learning.
246	Wang Y, Wu P, Liu Y, Weng C, Zeng D. Learning Optimal Individualized Treatment Rules from Electronic Health Record Data. IEEE International Conference on Healthcare Informatics 2016;2016:65-71. [PMID: 28503676 DOI: 10.1109/ichi.2016.13] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 12/21/2022] Abstract: Medical research is experiencing a paradigm shift from a "one-size-fits-all" strategy to a precision medicine approach where the right therapy, for the right patient, and at the right time, will be prescribed. We propose a statistical method to estimate the optimal individualized treatment rules (ITRs) that are tailored according to subject-specific features using electronic health records (EHR) data. Our approach merges statistical modeling and medical domain knowledge with machine learning algorithms to assist personalized medical decision making using EHR. We transform the estimation of optimal ITR into a classification problem and account for the non-experimental features of the EHR data and confounding by clinical indication. We create a broad range of feature variables that reflect both patient health status and the healthcare data collection process. Using EHR data collected at the Columbia University clinical data warehouse, we construct a decision tree for choosing the best second line therapy for treating type 2 diabetes patients.
247	Liu Y, Wang Y, Huang C, Zeng D. Estimating personalized diagnostic rules depending on individualized characteristics. Stat Med 2016;36:1099-1117. [PMID: 27917508 DOI: 10.1002/sim.7182] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/26/2015] [Revised: 07/25/2016] [Accepted: 10/27/2016] [Indexed: 11/10/2022] Abstract: There is an increasing demand for personalization of disease screening based on assessment of patient risk and other characteristics. For example, in breast cancer screening, advanced imaging technologies have made it possible to move away from 'one-size-fits-all' screening guidelines to targeted risk-based screening for those who are in need. Because diagnostic performance of various imaging modalities may vary across subjects, applying the most accurate modality to the patients who would benefit the most requires a personalized strategy. To address these needs, we propose novel machine learning methods to estimate personalized diagnostic rules for medical screening or diagnosis by maximizing a weighted combination of sensitivity and specificity across subgroups of subjects. We first develop methods that can be applied when competing modalities or screening strategies are observed on the same subject (paired design). Next, we present methods for studies where not all subjects receive both modalities (unpaired design). We study theoretical properties including consistency and risk bound of the personalized diagnostic rules and conduct simulation studies to examine performance of the proposed methods. Lastly, we analyze data collected from a brain imaging study of Parkinson's disease using positron emission tomography and diffusion tensor imaging with paired and unpaired designs.
Our results show that in some cases, a personalized modality assignment is estimated to improve empirical area under the receiver operating curve compared with a 'one-size-fits-all' assignment strategy. Copyright © 2016 John Wiley & Sons, Ltd. Key Words: Parkinson's disease; personalized screening; weighted support vector machine.
248	Cui Z, Stevens J, Truesdale KP, Zeng D, French S, Gordon-Larsen P. Prediction of Body Mass Index Using Concurrently Self-Reported or Previously Measured Height and Weight. PLoS One 2016;11:e0167288. [PMID: 27898706 PMCID: PMC5127553 DOI: 10.1371/journal.pone.0167288] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 09/11/2016] [Accepted: 11/11/2016] [Indexed: 11/19/2022] Open Abstract: Objective: To compare alternative models for the imputation of BMI_M (measured weight in kilograms/measured height in meters squared) in a longitudinal study. Methods: We used data from 11,008 adults examined at wave III (2001–2002) and wave IV (2007–2008) in the National Longitudinal Study of Adolescent to Adult Health. Participants were asked their height and weight before being measured. Equations to predict wave IV BMI_M were developed in an 80% random subsample and evaluated in the remaining participants. The validity of models that included BMI constructed from previously measured height and weight (BMI_PM) was compared to the validity of models that used BMI calculated from concurrently self-reported height and weight (BMI_SR). The usefulness of including demographics and perceived weight category in those models was also examined. Results: The model that used BMI_SR, compared to BMI_PM, as the only variable produced a larger R² (0.913 vs. 0.693), a smaller root mean square error (2.07 vs. 3.90 kg/m²) and a lower bias between normal-weight participants and those with obesity (0.98 vs. 4.24 kg/m²). The performance of the model containing BMI_SR alone was not substantially improved by the addition of demographics, perceived weight category or BMI_PM.
Conclusions: Our work is the first to show that concurrent self-reports of height and weight may be more useful than previously measured height and weight for imputation of missing BMI_M when the time interval between measures is relatively long. Other time frames and alternatives to in-person collection of self-reported data need to be examined. MESH Headings: Adolescent; Adult; Body Height/physiology; Body Mass Index; Body Weight/physiology; Female; Humans; Longitudinal Studies; Male; Models, Theoretical; Self Report; Surveys and Questionnaires; Young Adult. Grants: P01 HD031921 NICHD NIH HHS; P2C HD050924 NICHD NIH HHS; R01 HD057194 NICHD NIH HHS; U01 HL103561 NHLBI NIH HHS; National Heart, Lung, and Blood Institute; Eunice Kennedy Shriver National Institute of Child Health and Human Development; Office of Behavioral and Social Sciences Research.
249	Ou FS, Zeng D, Cai J. Quantile Regression Models for Current Status Data. J Stat Plan Inference 2016;178:112-127. [PMID: 27994307 PMCID: PMC5160027 DOI: 10.1016/j.jspi.2016.06.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/31/2023] Abstract: Current status data arise frequently in demography, epidemiology, and econometrics where the exact failure time cannot be determined but is only known to have occurred before or after a known observation time. We propose a quantile regression model to analyze current status data, because it does not require distributional assumptions and the coefficients can be interpreted as direct regression effects on the distribution of failure time in the original time scale. Our model assumes that the conditional quantile of failure time is a linear function of covariates. We assume conditional independence between the failure time and observation time. An M-estimator is developed for parameter estimation; it is computed using the concave-convex procedure, and its confidence intervals are constructed using a subsampling method. Asymptotic properties for the estimator are derived and proven using modern empirical process theory. The small sample performance of the proposed method is demonstrated via simulation studies. Finally, we apply the proposed method to analyze data from the Mayo Clinic Study of Aging. Key Words: Concave-convex procedure; Current status data; M-estimation; Quantile regression; Subsampling. Grants: P01 CA142538 NCI NIH HHS; R01 CA082659 NCI NIH HHS; R01 GM047845 NIGMS NIH HHS; U01 NS082062 NINDS NIH HHS.
250	Du C, Xu Y, Yang K, Chen S, Wang X, Wang S, Wang C, Shen M, Chen F, Chen M, Zeng D, Li F, Wang T, Wang F, Zhao J, Ai G, Cheng T, Su Y, Wang J. Estrogen promotes megakaryocyte polyploidization via estrogen receptor beta-mediated transcription of GATA1. Leukemia 2016;31:945-956. [DOI: 10.1038/leu.2016.285] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 04/07/2016] [Revised: 09/13/2016] [Accepted: 09/14/2016] [Indexed: 12/21/2022]