1
Shrestha P, Graff M, Gu Y, Wang Y, Avery CL, Ginnis J, Simancas-Pallares MA, Ferreira Zandoná AG, Ahn HS, Nguyen KN, Lin DY, Preisser JS, Slade GD, Marazita ML, North KE, Divaris K. Multi-ancestry Genome-Wide Association Study of Early Childhood Caries. medRxiv 2024:2024.03.12.24303742. [PMID: 38562815 PMCID: PMC10984042 DOI: 10.1101/2024.03.12.24303742] [Indexed: 04/04/2024]
Abstract
Early childhood caries (ECC) is the most common non-communicable childhood disease. It is an important health problem with known environmental and social/behavioral influences that lacks evidence for specific associated genetic risk loci. To address this knowledge gap, we conducted a genome-wide association study of ECC in a multi-ancestry population of U.S. preschool-age children (n=6,103) participating in a community-based epidemiologic study of early childhood oral health. Calibrated examiners used ICDAS criteria to measure ECC, with the primary trait defined by the dmfs index and decay classified as macroscopic enamel loss (ICDAS ≥3). We estimated heritability and concordance rates, and we conducted genome-wide association analyses to estimate overall genetic effects; effects stratified by sex, household water fluoride, and dietary sugar; and combined gene/gene-environment effects using the 2-degree-of-freedom (2df) joint test. Common genetic variants explained 24% of the phenotypic variance (heritability) of the primary ECC trait, and the concordance rate was higher with a higher degree of relatedness. We identified 21 novel non-overlapping genome-wide significant loci for ECC. Two loci, namely RP11-856F16.2 (rs74606067) and SLC41A3 (rs71327750), showed evidence of association with dental caries in external cohorts, namely the GLIDE consortium adult cohort (n≈487,000) and the GLIDE pediatric cohort (n=19,000), respectively. The gene-based tests identified TAAR6 as a genome-wide significant gene. Implicated genes have relevant biological functions, including roles in tooth development and taste. These novel associations expand the genomics knowledge base for this common childhood disease and underscore the importance of accounting for sex and pertinent environmental exposures in genetic investigations of oral health.
2
Xu Y, Zeng D, Lin DY. Marginal proportional hazards models for multivariate interval-censored data. Biometrika 2023; 110:815-830. [PMID: 37601305 PMCID: PMC10434824 DOI: 10.1093/biomet/asac059] [Indexed: 08/22/2023]
Abstract
Multivariate interval-censored data arise when there are multiple types of events or clusters of study subjects, such that the event times are potentially correlated and when each event is only known to occur over a particular time interval. We formulate the effects of potentially time-varying covariates on the multivariate event times through marginal proportional hazards models while leaving the dependence structures of the related event times unspecified. We construct the nonparametric pseudolikelihood under the working assumption that all event times are independent, and we provide a simple and stable EM-type algorithm. The resulting nonparametric maximum pseudolikelihood estimators for the regression parameters are shown to be consistent and asymptotically normal, with a limiting covariance matrix that can be consistently estimated by a sandwich estimator under arbitrary dependence structures for the related event times. We evaluate the performance of the proposed methods through extensive simulation studies and present an application to data from the Atherosclerosis Risk in Communities Study.
Affiliation(s)
- Yangjianchen Xu
- Department of Biostatistics, University of North Carolina, 3101E McGavran-Greenberg Hall, Chapel Hill, North Carolina 27599, U.S.A.
- Donglin Zeng
- Department of Biostatistics, University of North Carolina, 3101E McGavran-Greenberg Hall, Chapel Hill, North Carolina 27599, U.S.A.
- D Y Lin
- Department of Biostatistics, University of North Carolina, 3101E McGavran-Greenberg Hall, Chapel Hill, North Carolina 27599, U.S.A.
3
Yang H, Lin DY, Li Q. An Efficient Greedy Search Algorithm for High-dimensional Linear Discriminant Analysis. Stat Sin 2023; 33:1343-1364. [PMID: 37455685 PMCID: PMC10348717 DOI: 10.5705/ss.202021.0028] [Indexed: 07/20/2023]
Abstract
High-dimensional classification is an important statistical problem that has applications in many areas. One widely used classifier is Linear Discriminant Analysis (LDA). In recent years, many regularized LDA classifiers have been proposed to solve the problem of high-dimensional classification. However, these methods rely on inverting a large matrix or solving large-scale optimization problems to render classification rules, and such methods are computationally prohibitive when the dimension is ultra-high. With the emergence of big data, it is increasingly important to develop more efficient algorithms to solve the high-dimensional LDA problem. In this paper, we propose an efficient greedy search algorithm that depends solely on closed-form formulae to learn a high-dimensional LDA rule. We establish theoretical guarantees of its statistical properties in terms of variable selection and error rate consistency; in addition, we provide an explicit interpretation of the extra information brought by an additional feature in an LDA problem under some mild distributional assumptions. We demonstrate that this new algorithm drastically improves computational speed compared with other high-dimensional LDA methods, while maintaining comparable or even better classification performance.
Affiliation(s)
- Hannan Yang
- Department of Biostatistics, University of North Carolina, Chapel Hill
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill
- Quefeng Li
- Department of Biostatistics, University of North Carolina, Chapel Hill
4
Wang J, Zeng D, Lin DY. Semiparametric single-index models for optimal treatment regimens with censored outcomes. Lifetime Data Anal 2022; 28:744-763. [PMID: 35939142 DOI: 10.1007/s10985-022-09566-4] [Received: 09/26/2021] [Accepted: 06/07/2022] [Indexed: 06/15/2023]
Abstract
There is a growing interest in precision medicine, where a potentially censored survival time is often the most important outcome of interest. To discover optimal treatment regimens for such an outcome, we propose a semiparametric proportional hazards model by incorporating the interaction between treatment and a single index of covariates through an unknown monotone link function. This model is flexible enough to allow non-linear treatment-covariate interactions and yet provides a clinically interpretable linear rule for treatment decision. We propose a sieve maximum likelihood estimation approach, under which the baseline hazard function is estimated nonparametrically and the unknown link function is estimated via monotone quadratic B-splines. We show that the resulting estimators are consistent and asymptotically normal with a covariance matrix that attains the semiparametric efficiency bound. The optimal treatment rule follows naturally as a linear combination of the maximum likelihood estimators of the model parameters. Through extensive simulation studies and an application to an AIDS clinical trial, we demonstrate that the treatment rule derived from the single-index model outperforms the treatment rule under the standard Cox proportional hazards model.
Affiliation(s)
- Jin Wang
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, United States
- Donglin Zeng
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, United States
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, United States
5
Zeng D, Lin DY. Maximum Likelihood Estimation for Semiparametric Regression Models With Panel Count Data. Biometrika 2021; 108:947-963. [PMID: 34949875 DOI: 10.1093/biomet/asaa091] [Indexed: 11/14/2022]
Abstract
Panel count data, in which the observation for each study subject consists of the number of recurrent events between successive examinations, are commonly encountered in industrial reliability testing, medical research, and various other scientific investigations. We formulate the effects of potentially time-dependent covariates on one or more types of recurrent events through non-homogeneous Poisson processes with random effects. We adopt nonparametric maximum likelihood estimation under arbitrary examination schemes and develop a simple and stable EM algorithm. We show that the resulting estimators of the regression parameters are consistent and asymptotically normal, with a covariance matrix that achieves the semiparametric efficiency bound and can be estimated through profile likelihood. We evaluate the performance of the proposed methods through extensive simulation studies and present an application to a skin cancer clinical trial.
Affiliation(s)
- Donglin Zeng
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA
6
Jian MJ, Chung HY, Chang CK, Lin JC, Yeh KM, Chen CW, Lin DY, Chang FY, Hung KS, Perng CL, Shang HS. SARS-CoV-2 Variants with T135I Nucleocapsid Mutations may Affect Antigen Test Performance. Int J Infect Dis 2021; 114:112-114. [PMID: 34758391 PMCID: PMC8572148 DOI: 10.1016/j.ijid.2021.11.006] [Received: 10/08/2021] [Revised: 11/01/2021] [Accepted: 11/02/2021] [Indexed: 11/10/2022]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a pandemic. Diagnostic testing for SARS-CoV-2 has continuously been challenged by several variants with diverse spike (S) and nucleocapsid (N) protein mutations. SARS-CoV-2 variant proliferation potentially affects N protein-targeted rapid antigen testing. In this study, rapid antigen and reverse transcription PCR (RT-PCR) tests were performed simultaneously in patients with suspected coronavirus disease 2019 (COVID-19). Direct whole genome sequencing was performed to determine the N protein variations, and the viral assemblies were uploaded to GISAID. The genomes were then compared with those of global virus strains from GISAID. These isolates belonged to the B.1.1.7 variant, exhibiting several amino acid substitutions, including D3L, R203K, G204R, and S235F N protein mutations. The T135I mutation was also identified in one variant case in which the rapid antigen test and RT-PCR test were discordantly negative and positive, respectively. These findings suggest that the variants undetected by the Panbio COVID-19 rapid antigen test may be due to the T135I mutation in the N protein, posing a potential diagnostic risk for commercially available antigen tests. Hence, we recommend concomitant paired rapid antigen tests and molecular diagnostic methods to detect SARS-CoV-2. False-negative results could be rapidly corrected using confirmatory RT-PCR results to prevent future COVID-19 outbreaks.
Affiliation(s)
- Ming-Jr Jian
- Division of Clinical Pathology, Department of Pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- Hsing-Yi Chung
- Division of Clinical Pathology, Department of Pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- Chih-Kai Chang
- Division of Clinical Pathology, Department of Pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- Jung-Chung Lin
- Division of Infectious Diseases and Tropical Medicine, Department of Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- Kuo-Ming Yeh
- Division of Infectious Diseases and Tropical Medicine, Department of Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- Chien-Wen Chen
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- De-Yu Lin
- Division of Infectious Diseases and Tropical Medicine, Department of Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- Feng-Yee Chang
- Division of Infectious Diseases and Tropical Medicine, Department of Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- Kuo-Sheng Hung
- Center for Precision Medicine and Genomics, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- Cherng-Lih Perng
- Division of Clinical Pathology, Department of Pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
- Hung-Sheng Shang
- Division of Clinical Pathology, Department of Pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan, ROC
7
Affiliation(s)
- Jean Pan
- Amgen Inc, Thousand Oaks, CA, USA
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
8
Yang T, Zhao YL, Li WP, Yu CY, Luan JH, Lin DY, Fan L, Jiao ZB, Liu WH, Liu XJ, Kai JJ, Huang JC, Liu CT. Ultrahigh-strength and ductile superlattice alloys with nanoscale disordered interfaces. Science 2020; 369:427-432. [PMID: 32703875 DOI: 10.1126/science.abb6830] [Received: 03/11/2020] [Accepted: 06/08/2020] [Indexed: 12/24/2022]
Abstract
Alloys that have high strengths at high temperatures are crucial for a variety of important industries including aerospace. Alloys with ordered superlattice structures are attractive for this purpose but generally suffer from poor ductility and rapid grain coarsening. We discovered that nanoscale disordered interfaces can effectively overcome these problems. Interfacial disordering is driven by multielement cosegregation that creates a distinctive nanolayer between adjacent micrometer-scale superlattice grains. This nanolayer acts as a sustainable ductilizing source, which prevents brittle intergranular fractures by enhancing dislocation mobilities. Our superlattice materials have ultrahigh strengths of 1.6 gigapascals with tensile ductilities of 25% at ambient temperature. Simultaneously, we achieved negligible grain coarsening with exceptional softening resistance at elevated temperatures. Designing similar nanolayers may open a pathway for further optimization of alloy properties.
Affiliation(s)
- T Yang
- Department of Mechanical Engineering, City University of Hong Kong, Hong Kong, China; Hong Kong Institute for Advanced Study, City University of Hong Kong, Hong Kong, China
- Y L Zhao
- Department of Mechanical Engineering, City University of Hong Kong, Hong Kong, China; Department of Materials Science and Engineering, City University of Hong Kong, Hong Kong, China
- W P Li
- Department of Materials Science and Engineering, City University of Hong Kong, Hong Kong, China
- C Y Yu
- College of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen, China
- J H Luan
- Department of Materials Science and Engineering, City University of Hong Kong, Hong Kong, China
- D Y Lin
- Software Center for High Performance Numerical Simulation and Institute of Applied Physics and Computational Mathematics, Chinese Academy of Engineering Physics, Beijing, China
- L Fan
- Department of Mechanical Engineering, The Hong Kong Polytechnic University, Hong Kong, China
- Z B Jiao
- Department of Mechanical Engineering, The Hong Kong Polytechnic University, Hong Kong, China
- W H Liu
- School of Materials Science and Engineering, Harbin Institute of Technology, Shenzhen, China
- X J Liu
- School of Materials Science and Engineering, Harbin Institute of Technology, Shenzhen, China; Institute of Materials Genome and Big Data, Harbin Institute of Technology, Shenzhen, China
- J J Kai
- Department of Mechanical Engineering, City University of Hong Kong, Hong Kong, China; Department of Materials Science and Engineering, City University of Hong Kong, Hong Kong, China
- J C Huang
- Hong Kong Institute for Advanced Study, City University of Hong Kong, Hong Kong, China; Department of Materials Science and Engineering, City University of Hong Kong, Hong Kong, China
- C T Liu
- Department of Mechanical Engineering, City University of Hong Kong, Hong Kong, China; Hong Kong Institute for Advanced Study, City University of Hong Kong, Hong Kong, China; Department of Materials Science and Engineering, City University of Hong Kong, Hong Kong, China
9
Zeng D, Pan Z, Lin DY. Design and analysis of bridging studies with prior probabilities on the null and alternative hypotheses. Biometrics 2019; 76:224-234. [PMID: 31724739 DOI: 10.1111/biom.13175] [Received: 10/09/2018] [Revised: 05/21/2019] [Accepted: 08/06/2019] [Indexed: 11/28/2022]
Abstract
The pharmaceutical industry and regulatory agencies are increasingly interested in conducting bridging studies in order to bring an approved drug product from the original region (e.g., United States or European Union) to a new region (e.g., Asian-Pacific countries). In this article, we provide a new methodology for the design and analysis of bridging studies by assuming prior knowledge on how the null and alternative hypotheses in the original, foreign study are related to the null and alternative hypotheses in the bridging study and setting the type I error for the bridging study according to the strength of the foreign-study evidence. The new methodology accounts for randomness in the foreign-study evidence and controls the average type I error of the bridging study over all possibilities of the foreign-study evidence. In addition, the new methodology increases statistical power, when compared to approaches that do not use foreign-study evidence, and it allows for the possibility of not conducting the bridging study when the foreign-study evidence is unfavorable. Finally, we conducted extensive simulation studies to demonstrate the usefulness of the proposed methodology.
Affiliation(s)
- Donglin Zeng
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina
- Zhiying Pan
- Amgen Inc, 1 Amgen Center Dr, Thousand Oaks, California
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina
10
Abstract
1. Percoll™ is one of the most widely used colloids for animal sperm preparation. The aim of this study was to evaluate whether Percoll™ colloid centrifugation could be a practical way to improve cockerel sperm quality, and to compare the effects of Percoll™ single layer centrifugation (SLC) and density gradient centrifugation (DGC) in order to obtain the optimal protocol for cockerel semen. 2. In the experiment with Percoll™ SLC for fresh semen, the increase in motile sperm after Percoll™ 80% SLC and 90% SLC was 28.8% and 30.2%, respectively (P < 0.01). The increase in progressively motile sperm after Percoll™ 80% SLC and 90% SLC was 177.2% and 202.4%, respectively (P < 0.01). Meanwhile, for semen stored at 4°C for 24 h, the increase in motile sperm after Percoll™ 70% SLC and 80% SLC was 41.2% and 44.0% (P < 0.01), and the increase in progressive sperm after Percoll™ 70% SLC and 80% SLC was 71.3% and 83.1%, respectively (P < 0.01). Both the percentage of motile sperm and the percentage of progressive sperm in fresh and stored cockerel semen were significantly enhanced after appropriate Percoll™ SLC. 3. Sperm membrane integrity did not show any decrease after Percoll™ centrifugation compared with non-centrifuged semen, which suggested that the Percoll™ centrifugation treatment in this study did not cause damage to cockerel sperm membranes. 4. In the experiment comparing Percoll™ SLC and DGC with fresh semen, the increase in motile sperm after Percoll™ 80% SLC, 90% SLC and 40%/80% DGC was 29.5%, 36.4%, and 25.0%, respectively, and the increase in progressive sperm was 44.7%, 58.5%, and 54.7%, respectively. For semen stored at 4°C for 24 h, the increase in motile sperm after Percoll™ 70% SLC, 80% SLC and 35%/70% DGC was 41.2%, 44.0%, and 26.4%, and the increase in progressive sperm was 71.3%, 83.1%, and 43.7%, respectively. There were no significant differences in the increase in sperm motility among Percoll™ 80% SLC, 90% SLC and 40%/80% DGC in fresh cockerel semen, nor among Percoll™ 70% SLC, 80% SLC and 35%/70% DGC in stored cockerel semen. There was a tendency for sperm recovery rates with Percoll™ SLC to be higher than with Percoll™ DGC, although this did not reach statistical significance in this study. 5. It was concluded that Percoll™ SLC was more suitable for cockerel sperm separation than Percoll™ DGC. The results suggested that Percoll™ 80% SLC was the optimal procedure to separate fresh cockerel sperm and Percoll™ 70% SLC was the optimal procedure to separate stored cockerel sperm. Percoll™ SLC is simpler, more user-friendly, more economical and less time-consuming than DGC for cockerel semen processing.
Affiliation(s)
- H L Lin
- Physiology Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- Y H Chen
- Physiology Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- D Y Lin
- Breeding and Genetic Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- Y Y Lai
- Breeding and Genetic Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- M C Wu
- Breeding and Genetic Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- L R Chen
- Physiology Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan; Institute of Biotechnology, National Cheng Kung University, Tainan, Taiwan
11
Fu CD, Lin DY, Liang CK, Qiu XL, Sun SS, Feng Q, Liu HX. [Validation and optimization of the indicator system of risk assessment for mechanical cuts]. Zhonghua Lao Dong Wei Sheng Zhi Ye Bing Za Zhi 2019; 37:449-452. [PMID: 31256529 DOI: 10.3760/cma.j.issn.1001-9391.2019.06.011] [Indexed: 11/05/2022]
Abstract
Objective: To validate and optimize the indicator system of risk assessment for mechanical cuts. Methods: The previously established risk assessment indicator system for mechanical cutting injury was used to assess risk in 40 cases of mechanical cutting injury registered from January 2015 to December 2017 and in 40 similar positions without accidents in the same period. Multiple stepwise regression analysis was used to screen the indicators and to adjust the weight coefficient of each indicator. The total coincidence rate and Kappa value were compared before and after optimization. Results: The new indicator system has 3 first-class indicators, 10 second-class indicators and 14 third-class indicators, fewer than the old indicator system, which has 3 first-class indicators, 10 second-class indicators and 34 third-class indicators. The three first-class indicators were revised. The total coincidence rates before and after optimization were 67.50% and 90.00%, respectively, and the difference was statistically significant (P<0.01). The Kappa values were 0.35 and 0.80, respectively. Conclusion: The evaluation results of the new indicator system are more consistent with actual hazard detection than those of the old indicator system. The new system is scientific, reasonable and practical, and it can be used for the risk assessment of mechanical cutting injuries.
Affiliation(s)
- C D Fu
- Guangdong Work Injury Rehabilitation Hospital, Guangzhou 510440, China
12
Abstract
Analysis of genomic data is often complicated by the presence of missing values, which may arise due to cost or other reasons. The prevailing approach of single imputation is generally invalid if the imputation model is misspecified. In this paper, we propose a robust score statistic based on imputed data for testing the association between a phenotype and a genomic variable with (partially) missing values. We fit a semiparametric regression model for the genomic variable against an arbitrary function of the linear predictor in the phenotype model and impute each missing value by its estimated posterior expectation. We show that the score statistic with such imputed values is asymptotically unbiased under general missing-data mechanisms, even when the imputation model is misspecified. We develop a spline-based method to estimate the semiparametric imputation model and derive the asymptotic distribution of the corresponding score statistic with a consistent variance estimator using sieve approximation theory and empirical process theory. The proposed test is computationally feasible regardless of the number of independent variables in the imputation model. We demonstrate the advantages of the proposed method over existing methods through extensive simulation studies and provide an application to a major cancer genomics study.
Affiliation(s)
- Kin Yau Wong
- Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong
- Donglin Zeng
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
13
Abstract
Health sciences research often involves both right- and interval-censored events because the occurrence of a symptomatic disease can only be observed up to the end of follow-up, while the occurrence of an asymptomatic disease can only be detected through periodic examinations. We formulate the effects of potentially time-dependent covariates on the joint distribution of multiple right- and interval-censored events through semiparametric proportional hazards models with random effects that capture the dependence both within and between the two types of events. We consider nonparametric maximum likelihood estimation and develop a simple and stable EM algorithm for computation. We show that the resulting estimators are consistent and the parametric components are asymptotically normal and efficient with a covariance matrix that can be consistently estimated by profile likelihood or nonparametric bootstrap. In addition, we leverage the joint modelling to provide dynamic prediction of disease incidence based on the evolving event history. Furthermore, we assess the performance of the proposed methods through extensive simulation studies. Finally, we provide an application to a major epidemiological cohort study. Supplementary materials for this article are available online.
Affiliation(s)
- Fei Gao
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC
- Donglin Zeng
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC
- David Couper
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC
14
Abstract
Structural equation modeling is commonly used to capture complex structures of relationships among multiple variables, both latent and observed. We propose a general class of structural equation models with a semiparametric component for potentially censored survival times. We consider nonparametric maximum likelihood estimation and devise a combined Expectation-Maximization and Newton-Raphson algorithm for its implementation. We establish conditions for model identifiability and prove the consistency, asymptotic normality, and semiparametric efficiency of the estimators. Finally, we demonstrate the satisfactory performance of the proposed methods through simulation studies and provide an application to a motivating cancer study that contains a variety of genomic variables. Supplementary materials for this article are available online.
Affiliation(s)
- Kin Yau Wong
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599
- Donglin Zeng
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599
15
Lin DY. Discussion of the Paper by R. L. Prentice and Y. Huang - Optimal Designs and Efficient Inference for Biomarker Studies. Stat Theory Relat Fields 2018; 2:21-22. [PMID: 30662976 PMCID: PMC6333203 DOI: 10.1080/24754269.2018.1493630] [Indexed: 06/09/2023]
Affiliation(s)
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, U.S.A.
16
Lin HL, Liaw RB, Chen YH, Kang TC, Lin DY, Chen LR, Wu MC. Evaluation of cockerel spermatozoa viability and motility by a novel enzyme based cell viability assay. Br Poult Sci 2018; 60:467-471. [PMID: 29355473 DOI: 10.1080/00071668.2018.1426832] [Indexed: 10/18/2022]
Abstract
1. The results of spermatozoa assessment by the WST-8 (2-[2-methoxy-4-nitrophenyl]-3-[4-nitrophenyl]-5-[2,4-disulfophenyl]-2H-tetrazolium, monosodium salt) assay, flow cytometry (FC) or computer-assisted sperm analysis (CASA) were compared. 2. Different live/killed ratios of cockerel semen were serially diluted to 120, 60, and 30 × 10⁶ cells/ml, and each sample was analysed by (1) WST-8 assay at 0, 10, 20, 30, 40, 50, 60 min, (2) viability with FC, and (3) motility with CASA. 3. The WST-8 reduction rate was closely correlated with spermatozoa viability and motility. The optimal semen concentration for the WST-8 assay was 120 × 10⁶ cells/ml, and the standard curves for spermatozoa viability and motility predictions, respectively, were y_viability60 = 162.8x + 104.96 (R² = 0.9594) after 60 min of incubation and y_motility40 = 225.09x + 96.299 (R² = 0.8475) after 40 min of incubation. 4. It was concluded that the WST-8 assay is useful for the practical evaluation of cockerel spermatozoa viability and motility. Compared to FC and CASA, the WST-8 assay does not require expensive and complex instrumentation in the lab. Furthermore, one well of the WST-8 reaction can be used to predict spermatozoa viability and motility at the same time, which makes the assay efficient and economical for semen quality assessment.
Affiliation(s)
- H L Lin: Breeding and Genetic Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- R B Liaw: Breeding and Genetic Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- Y H Chen: Physiology Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- T C Kang: Physiology Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- D Y Lin: Breeding and Genetic Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
- L R Chen: Physiology Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan; Institute of Biotechnology, National Cheng Kung University, Tainan, Taiwan
- M C Wu: Breeding and Genetic Division, Livestock Research Institute, Council of Agriculture, Tainan, Taiwan
17
Zeng D, Pan J, Hu K, Chi E, Lin DY. Improving the power to establish clinical similarity in a Phase 3 efficacy trial by incorporating prior evidence of analytical and pharmacokinetic similarity. J Biopharm Stat 2017; 28:320-332. [PMID: 29173074 DOI: 10.1080/10543406.2017.1397012] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 10/18/2022]
Abstract
To improve patients' access to safe and effective biological medicines, abbreviated licensure pathways for biosimilar and interchangeable biological products have been established in the US, Europe, and other countries around the world. The US Food and Drug Administration and European Medicines Agency have published various guidance documents on the development and approval of biosimilars, which recommend a "totality-of-the-evidence" approach with a stepwise process to demonstrate biosimilarity. The approach relies on comprehensive comparability studies ranging from analytical and nonclinical studies to clinical pharmacokinetic/pharmacodynamic (PK/PD) and efficacy studies. A clinical efficacy study may be necessary to address residual uncertainty about the biosimilarity of the proposed product to the reference product and support a demonstration that there are no clinically meaningful differences. In this article, we propose a statistical strategy that takes into account the similarity evidence from analytical assessments and PK studies in the design and analysis of the clinical efficacy study in order to address residual uncertainty and enhance statistical power and precision. We assume that if the proposed biosimilar product and the reference product are shown to be highly similar with respect to the analytical and PK parameters, then they should also be similar with respect to the efficacy parameters. We show that the proposed methods provide correct control of the type I error and improve the power and precision of the efficacy study upon the standard analysis that disregards the prior evidence. We confirm and illustrate the theoretical results through simulation studies based on the biosimilars development experience of many different products.
Affiliation(s)
- Donglin Zeng: Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
- Jean Pan: Amgen Inc, Thousand Oaks, CA, USA
- Eric Chi: Amgen Inc, Thousand Oaks, CA, USA
- D Y Lin: Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
18
Wu PY, Cheng CY, Liu CE, Lee YC, Yang CJ, Tsai MS, Cheng SH, Lin SP, Lin DY, Wang NC, Lee YC, Sun HY, Tang HJ, Hung CC. Multicenter study of skin rashes and hepatotoxicity in antiretroviral-naïve HIV-positive patients receiving non-nucleoside reverse-transcriptase inhibitor plus nucleoside reverse-transcriptase inhibitors in Taiwan. PLoS One 2017; 12:e0171596. [PMID: 28222098 PMCID: PMC5319792 DOI: 10.1371/journal.pone.0171596] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 09/28/2016] [Accepted: 01/23/2017] [Indexed: 11/29/2022] Open
Abstract
OBJECTIVES Two nucleos(t)ide reverse-transcriptase inhibitors (NRTIs) plus 1 non-NRTI (nNRTI) remain the preferred or alternative combination antiretroviral therapy (cART) for antiretroviral-naive HIV-positive patients in Taiwan. The three most commonly used nNRTIs are nevirapine (NVP), efavirenz (EFV) and rilpivirine (RPV). This study aimed to determine the incidences of hepatotoxicity and skin rashes within 4 weeks of initiation of cART containing 1 nNRTI plus 2 NRTIs. METHODS Between June, 2012 and November, 2015, all antiretroviral-naive HIV-positive adult patients initiating nNRTI-containing cART at 8 designated hospitals for HIV care were included in this retrospective observational study. According to the national HIV treatment guidelines, patients were assessed at baseline, 2 and 4 weeks of cART initiation, and subsequently every 8 to 12 weeks. Plasma HIV RNA load, CD4 cell count and aminotransferases were determined. The toxicity grading scale of the Division of AIDS (DAIDS) 2014 was used for reporting clinical and laboratory adverse events. RESULTS During the 3.5-year study period, 2,341 patients initiated nNRTI-containing cART: NVP in 629 patients, EFV 1,363 patients, and RPV 349 patients. Rash of any grade occurred in 14.1% (n = 331) of the patients. In multiple logistic regression analysis, baseline CD4 cell counts (per 100-cell/μl increase, adjusted odds ratio [AOR], 1.125; 95% confidence interval [95% CI], 1.031-1.228) and use of NVP (AOR, 2.443; 95% CI, 1.816-3.286) (compared with efavirenz) were independently associated with the development of skin rashes. Among the 1,455 patients (62.2%) with aminotransferase data both at baseline and week 4, 72 (4.9%) developed grade 2 or greater hepatotoxicity. 
In multiple logistic regression analysis, presence of antibody for hepatitis C virus (HCV) (AOR, 2.865; 95% CI, 1.439-5.704) or hepatitis B surface antigen (AOR, 2.397; 95% CI, 1.150-4.997), and development of skin rashes (AOR, 2.811; 95% CI, 1.051-7.521) were independently associated with the development of hepatotoxicity. CONCLUSIONS The baseline CD4 cell counts and use of NVP were associated with increased risk of skin rashes, while hepatotoxicity was independently associated with HCV or hepatitis B virus coinfection, and development of skin rashes in antiretroviral-naïve HIV-positive Taiwanese patients within 4 weeks of initiation of nNRTI-containing regimens.
Affiliation(s)
- Pei-Ying Wu: Center of Infection Control, National Taiwan University Hospital, Taipei, Taiwan
- Chien-Yu Cheng: Department of Internal Medicine, Taoyuan General Hospital, Ministry of Health and Welfare, Tao-Yuan, Taiwan; School of Public Health, National Yang-Ming University, Taipei, Taiwan
- Chun-Eng Liu: Department of Internal Medicine, Changhua Christian Hospital, Changhua, Taiwan
- Yi-Chien Lee: Department of Internal Medicine, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chia-Yi, Taiwan
- Chia-Jui Yang: School of Medicine, National Yang-Ming University, Taipei, Taiwan; Department of Internal Medicine, Far Eastern Memorial Hospital, New Taipei City, Taiwan
- Mao-Song Tsai: Department of Internal Medicine, Far Eastern Memorial Hospital, New Taipei City, Taiwan
- Shu-Hsing Cheng: Department of Internal Medicine, Taoyuan General Hospital, Ministry of Health and Welfare, Tao-Yuan, Taiwan; School of Public Health, College of Public Health and Nutrition, Taipei Medical University, Taipei, Taiwan
- Shih-Ping Lin: Department of Internal Medicine, Taichung Veterans General Hospital, Taichung, Taiwan
- De-Yu Lin: Department of Internal Medicine, Tri-Service General Hospital and National Defense Medical College, Taipei, Taiwan
- Ning-Chi Wang: Department of Internal Medicine, Tri-Service General Hospital and National Defense Medical College, Taipei, Taiwan
- Yi-Chieh Lee: Department of Internal Medicine, Lotung Poh-Ai Hospital, Medical Lo-Hsu Foundation, I-Lan, Taiwan
- Hsin-Yun Sun: Department of Internal Medicine, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan
- Hung-Jen Tang: Department of Internal Medicine, Chi Mei Medical Center, Tainan, Taiwan; Department of Health and Nutrition, Chia Nan University of Pharmacy and Sciences, Tainan, Taiwan
- Chien-Ching Hung: Department of Internal Medicine, National Taiwan University Hospital and National Taiwan University College of Medicine, Taipei, Taiwan; Department of Parasitology, National Taiwan University College of Medicine, Taipei, Taiwan
19
Abstract
Interval censoring arises frequently in clinical, epidemiological, financial and sociological studies, where the event or failure of interest is known only to occur within an interval induced by periodic monitoring. We formulate the effects of potentially time-dependent covariates on the interval-censored failure time through a broad class of semiparametric transformation models that encompasses proportional hazards and proportional odds models. We consider nonparametric maximum likelihood estimation for this class of models with an arbitrary number of monitoring times for each subject. We devise an EM-type algorithm that converges stably, even in the presence of time-dependent covariates, and show that the estimators for the regression parameters are consistent, asymptotically normal, and asymptotically efficient with an easily estimated covariance matrix. Finally, we demonstrate the performance of our procedures through simulation studies and application to an HIV/AIDS study conducted in Thailand.
Affiliation(s)
- Donglin Zeng: Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599, U.S.A.
- Lu Mao: Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599, U.S.A.
- D Y Lin: Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599, U.S.A.
20
Mao L, Lin DY. Efficient Estimation of Semiparametric Transformation Models for the Cumulative Incidence of Competing Risks. J R Stat Soc Series B Stat Methodol 2016; 79:573-587. [PMID: 28239261 DOI: 10.1111/rssb.12177] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
Abstract
The cumulative incidence is the probability of failure from the cause of interest over a certain time period in the presence of other risks. A semiparametric regression model proposed by Fine and Gray (1999) has become the method of choice for formulating the effects of covariates on the cumulative incidence. Its estimation, however, requires modeling of the censoring distribution and is not statistically efficient. In this paper, we present a broad class of semiparametric transformation models which extends the Fine and Gray model, and we allow for unknown causes of failure. We derive the nonparametric maximum likelihood estimators (NPMLEs) and develop simple and fast numerical algorithms using the profile likelihood. We establish the consistency, asymptotic normality, and semiparametric efficiency of the NPMLEs. In addition, we construct graphical and numerical procedures to evaluate and select models. Finally, we demonstrate the advantages of the proposed methods over the existing ones through extensive simulation studies and an application to a major study on bone marrow transplantation.
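For orientation, the cumulative incidence defined in this abstract can be estimated nonparametrically when there are no covariates. The sketch below illustrates that standard estimator (often called the Aalen-Johansen estimator); it is not the semiparametric transformation model or the NPMLE algorithm proposed in the paper. The function name `cumulative_incidence` and the coding of `causes` (0 for a censored subject) are assumptions made for this example.

```python
from typing import List, Tuple


def cumulative_incidence(
    times: List[float], causes: List[int], cause: int
) -> List[Tuple[float, float]]:
    """Nonparametric cumulative incidence for one cause of failure.

    causes[i] is 0 if subject i is censored, otherwise the cause of failure
    (hypothetical coding chosen for this sketch).
    Returns (time, estimate) at each distinct time with at least one failure.
    """
    n_at_risk = len(times)
    surv = 1.0  # probability of no failure from any cause before the current time
    cif = 0.0
    out: List[Tuple[float, float]] = []
    for t in sorted(set(times)):
        at_t = [c for s, c in zip(times, causes) if s == t]
        d_all = sum(c != 0 for c in at_t)        # failures from any cause at t
        d_cause = sum(c == cause for c in at_t)  # failures from the cause of interest
        if d_all > 0:
            # Increment: P(event-free just before t) * cause-specific hazard at t.
            cif += surv * d_cause / n_at_risk
            surv *= 1.0 - d_all / n_at_risk
            out.append((t, cif))
        n_at_risk -= len(at_t)  # failures and censored subjects leave the risk set
    return out
```

For four subjects with times 1, 2, 3, 4 and causes 1, 2, 0, 1, the estimate for cause 1 steps to 0.25 at time 1 and to 0.75 at time 4. Treating failures from other causes as censoring in a Kaplan-Meier estimator would instead overstate the probability, which is why the overall event-free survival enters each increment.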
Affiliation(s)
- Lu Mao: Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA
- D Y Lin: Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA
21
Mao L, Lin DY. Semiparametric regression for the weighted composite endpoint of recurrent and terminal events. Biostatistics 2015; 17:390-403. [PMID: 26668069 DOI: 10.1093/biostatistics/kxv050] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 06/29/2015] [Accepted: 11/02/2015] [Indexed: 11/12/2022] Open
Abstract
Recurrent event data are commonly encountered in clinical and epidemiological studies. A major complication arises when recurrent events are terminated by death. To assess the overall effects of covariates on the two types of events, we define a weighted composite endpoint as the cumulative number of recurrent and terminal events properly weighted by the relative severity of each event. We propose a semiparametric proportional rates model which specifies that the (possibly time-varying) covariates have multiplicative effects on the rate function of the weighted composite endpoint while leaving the form of the rate function and the dependence among recurrent and terminal events completely unspecified. We construct appropriate estimators for the regression parameters and the cumulative frequency function. We show that the estimators are consistent and asymptotically normal with variances that can be consistently estimated. We also develop graphical and numerical procedures for checking the adequacy of the model. We then demonstrate the usefulness of the proposed methods in simulation studies. Finally, we provide an application to a major cardiovascular clinical trial.
Affiliation(s)
- Lu Mao: Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA
- D Y Lin: Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA
22
Abstract
Meta-analysis plays an important role in summarizing and synthesizing scientific evidence derived from multiple studies. With high-dimensional data, the incorporation of variable selection into meta-analysis improves model interpretation and prediction. Existing variable selection methods require direct access to raw data, which may not be available in practical situations. We propose a new approach, sparse meta-analysis (SMA), in which variable selection for meta-analysis is based solely on summary statistics and the effect sizes of each covariate are allowed to vary among studies. We show that the SMA enjoys the oracle property if the estimated covariance matrix of the parameter estimators from each study is available. We also show that our approach achieves selection consistency and estimation consistency even when summary statistics include only the variance estimators or no variance/covariance information at all. Simulation studies and applications to high-throughput genomics studies demonstrate the usefulness of our approach.
Affiliation(s)
- Qianchuan He: Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
- Hao Helen Zhang: Department of Mathematics, The University of Arizona, Tucson, AZ 85721, USA
- Christy L Avery: Department of Epidemiology, University of North Carolina, Chapel Hill, NC 27599, USA
- D Y Lin: Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
23
Abstract
Meta-analysis is widely used to compare and combine the results of multiple independent studies. To account for between-study heterogeneity, investigators often employ random-effects models, under which the effect sizes of interest are assumed to follow a normal distribution. It is common to estimate the mean effect size by a weighted linear combination of study-specific estimators, with the weight for each study being inversely proportional to the sum of the variance of the effect-size estimator and the estimated variance component of the random-effects distribution. Because the estimator of the variance component involved in the weights is random and correlated with study-specific effect-size estimators, the commonly adopted asymptotic normal approximation to the meta-analysis estimator is grossly inaccurate unless the number of studies is large. When individual participant data are available, one can also estimate the mean effect size by maximizing the joint likelihood. We establish the asymptotic properties of the meta-analysis estimator and the joint maximum likelihood estimator when the number of studies is either fixed or increases at a slower rate than the study sizes and we discover a surprising result: the former estimator is always at least as efficient as the latter. We also develop a novel resampling technique that improves the accuracy of statistical inference. We demonstrate the benefits of the proposed inference procedures using simulated and empirical data.
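The weighted estimator described in this abstract can be sketched in a few lines. The example below weights each study by the inverse of its effect-size variance plus an estimated variance component. The abstract does not say which estimator of the variance component is used, so the DerSimonian-Laird moment estimator here is an assumption, as is the function name `random_effects_mean`. The sketch does not implement the paper's resampling technique.

```python
from typing import Sequence, Tuple


def random_effects_mean(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Return (pooled mean, tau2) with weights proportional to 1 / (v_k + tau2)."""
    k = len(estimates)
    # Fixed-effect weights and pooled estimate, needed for Cochran's Q statistic.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    # DerSimonian-Laird moment estimator of the variance component
    # (assumed choice for this sketch), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights: inverse of within-study variance plus tau2.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    return pooled, tau2
```

Because `tau2` is computed from the same study-specific estimates that it then reweights, the weights are random and correlated with those estimates, which is the source of the inaccuracy the abstract attributes to the usual normal approximation when the number of studies is small.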
Affiliation(s)
- D Zeng: Department of Biostatistics, CB #7420, University of North Carolina, Chapel Hill, North Carolina 27599, U.S.A.
- D Y Lin: Department of Biostatistics, CB #7420, University of North Carolina, Chapel Hill, North Carolina 27599, U.S.A.
24
Hu YJ, Lin DY, Sun W, Zeng D. A Likelihood-Based Framework for Association Analysis of Allele-Specific Copy Numbers. J Am Stat Assoc 2015; 109:1533-1545. [PMID: 25663726 DOI: 10.1080/01621459.2014.908777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
Copy number variants (CNVs) and single nucleotide polymorphisms (SNPs) co-exist throughout the human genome and jointly contribute to phenotypic variations. Thus, it is desirable to consider both types of variants, as characterized by allele-specific copy numbers (ASCNs), in association studies of complex human diseases. Current SNP genotyping technologies capture the CNV and SNP information simultaneously via fluorescent intensity measurements. The common practice of calling ASCNs from the intensity measurements and then using the ASCN calls in downstream association analysis has important limitations. First, the association tests are prone to false-positive findings when differential measurement errors between cases and controls arise from differences in DNA quality or handling. Second, the uncertainties in the ASCN calls are ignored. We present a general framework for the integrated analysis of CNVs and SNPs, including the analysis of total copy numbers as a special case. Our approach combines the ASCN calling and the association analysis into a single step while allowing for differential measurement errors. We construct likelihood functions that properly account for case-control sampling and measurement errors. We establish the asymptotic properties of the maximum likelihood estimators and develop EM algorithms to implement the corresponding inference procedures. The advantages of the proposed methods over the existing ones are demonstrated through realistic simulation studies and an application to a genome-wide association study of schizophrenia. Extensions to next-generation sequencing data are discussed.
25
Lin HA, Yang YS, Wang JX, Lin HC, Lin DY, Chiu CH, Yeh KM, Lin JC, Chang FY. Comparison of the effectiveness and antibiotic cost among ceftriaxone, ertapenem, and levofloxacin in treatment of community-acquired complicated urinary tract infections. J Microbiol Immunol Infect 2015; 49:237-42. [PMID: 25661278 DOI: 10.1016/j.jmii.2014.12.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 06/15/2014] [Revised: 12/15/2014] [Accepted: 12/23/2014] [Indexed: 11/26/2022]
Abstract
PURPOSE To study the characteristics of patients with community-acquired complicated urinary tract infections (cUTIs) and to compare the effectiveness and antibiotic cost of treatment with ceftriaxone (CRO), levofloxacin (LVX), and ertapenem (ETP). METHODS This retrospective study enrolled patients with community-acquired cUTIs admitted to the Division of Infectious Diseases at a single medical center from January 2011 to March 2013. Effectiveness, antibiotic cost, and clinical characteristics were compared among patients treated with CRO, LVX, and ETP. RESULTS There were 358 eligible cases, including 139 who received CRO, 128 treated with ETP, and 91 with LVX. The most common pathogen was Escherichia coli. Isolates were more often susceptible to these three agents than to first-line antibiotics. Treatment with ETP was associated with a significantly shorter time to defervescence after admission (CRO: 39 hours, ETP: 30 hours, and LVX: 38 hours; p = 0.031) and a shorter hospital stay (CRO: 4 days, ETP: 3 days, and LVX: 4 days; p < 0.001). However, the average antibiotic cost in the CRO group was significantly lower than in the other two groups [CRO: 62.4 United States dollars (USD), ETP: 185.33 USD, and LVX: 204.85 USD; p < 0.001]. CONCLUSION Resistance of cUTI isolates to first-line antibiotics is high. ETP, CRO, and LVX can be suggested for the treatment of cUTIs to achieve a good clinical response. Among the three agents, ETP had better susceptibility than CRO and LVX, reached defervescence sooner, and was associated with shorter hospital stays. However, CRO was less expensive than the other two agents.
Affiliation(s)
- Hsin-An Lin: Division of Infectious Diseases and Tropical Medicine, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan; Department of Internal Medicine, Songshan Branch of Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Ya-Sung Yang: Division of Infectious Diseases and Tropical Medicine, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Jing-Xun Wang: Division of Infectious Diseases and Tropical Medicine, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Hsin-Chung Lin: Division of Clinical Pathology, Department of Pathology, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- De-Yu Lin: Division of Infectious Diseases and Tropical Medicine, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Chun-Hsiang Chiu: Division of Infectious Diseases and Tropical Medicine, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Kuo-Ming Yeh: Division of Infectious Diseases and Tropical Medicine, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Jung-Chung Lin: Division of Infectious Diseases and Tropical Medicine, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
- Feng-Yee Chang: Division of Infectious Diseases and Tropical Medicine, Department of Internal Medicine, Tri-Service General Hospital, National Defense Medical Center, Taipei, Taiwan
26
Lin DY, Chiang TY, Huang CC, Lin HD, Tzeng SJ, Kang SR, Sung HM, Wu MC. Polymorphic microsatellite loci isolated from Cervus unicolor (Cervidae) show inbreeding in a domesticated population of Taiwan Sambar deer. Genet Mol Res 2014; 13:3967-71. [PMID: 24938607 DOI: 10.4238/2014.may.23.7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/03/2022]
Abstract
Primers for eight microsatellites were developed; they successfully amplified DNA from 20 domesticated Formosan Sambar deer (Cervus unicolor swinhoei). All loci were polymorphic, with 10-19 alleles per locus. The average observed heterozygosity across loci and samples was 0.310, ranging from 0 to 0.750 at each locus. All loci but one, CU18, deviated from Hardy-Weinberg equilibrium due to excessive homozygosity in these domesticated broodstocks, reflecting inbreeding. These microsatellite loci will be useful, not only for assessment of population structure and genetic variability, but also for conservation of wild deer populations in Taiwan.
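The comparison behind the homozygosity finding can be made concrete with a small sketch. It computes observed heterozygosity and the heterozygosity expected under Hardy-Weinberg equilibrium for a single locus. The genotype encoding (a pair of allele labels per animal) and the function names are assumptions for illustration, no small-sample correction is applied, and the example genotypes in the note that follows are made up, not data from the study.

```python
from collections import Counter
from typing import List, Tuple

Genotype = Tuple[str, str]  # two allele labels per individual (hypothetical encoding)


def observed_heterozygosity(genotypes: List[Genotype]) -> float:
    """Fraction of individuals carrying two different alleles at the locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes: List[Genotype]) -> float:
    """1 - sum of squared allele frequencies, the Hardy-Weinberg expectation."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())
```

For the made-up genotypes AA, AA, BB, AB, the observed value is 0.25 while the expected value is 0.46875. An observed value well below the expected one is the kind of homozygote excess the abstract interprets as a sign of inbreeding.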
Affiliation(s)
- D Y Lin: Department of Life Sciences, Cheng Kung University, Tainan, Taiwan
- T Y Chiang: Department of Life Sciences, Cheng Kung University, Tainan, Taiwan
- C C Huang: Kinmen National Park, Jinning Shiang, Kinmen, Taiwan
- H D Lin: Department of Life Sciences, Cheng Kung University, Tainan, Taiwan
- S J Tzeng: Department of Medical Laboratory Science and Biotechnology, Chung Hwa University of Medical Technology, Rende, Tainan, Taiwan
- S R Kang: Kaohsiung Animal Propagation Station, COA-LRI, Pingtung, Taiwan
- H M Sung: Department of Life Sciences, Cheng Kung University, Tainan, Taiwan
- M C Wu: Division of Breeding and Genetics, COA-LRI, Muchang, Xinhua, Tainan, Taiwan
27
Abstract
Under two-phase cohort designs, such as case-cohort and nested case-control sampling, information on observed event times, event indicators, and inexpensive covariates is collected in the first phase, and the first-phase information is used to select subjects for measurements of expensive covariates in the second phase; inexpensive covariates are also used in the data analysis to control for confounding and to evaluate interactions. This paper provides efficient estimation of semiparametric transformation models for such designs, accommodating both discrete and continuous covariates and allowing inexpensive and expensive covariates to be correlated. The estimation is based on the maximization of a modified nonparametric likelihood function through a generalization of the expectation-maximization algorithm. The resulting estimators are shown to be consistent, asymptotically normal and asymptotically efficient with easily estimated variances. Simulation studies demonstrate that the asymptotic approximations are accurate in practical situations. Empirical data from Wilms' tumor studies and the Atherosclerosis Risk in Communities (ARIC) study are presented.
Affiliation(s)
- Donglin Zeng: Department of Biostatistics, CB#7420, University of North Carolina, Chapel Hill, NC 27599-7420
- D Y Lin: Department of Biostatistics, CB#7420, University of North Carolina, Chapel Hill, NC 27599-7420
28
Lin DY. Survival analysis with incomplete genetic data. Lifetime Data Anal 2014; 20:16-22. [PMID: 23722305 PMCID: PMC3806886 DOI: 10.1007/s10985-013-9262-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2013] [Accepted: 05/11/2013] [Indexed: 06/02/2023]
Abstract
Genetic data are now collected frequently in clinical studies and epidemiological cohort studies. For a large study, it may be prohibitively expensive to genotype all study subjects, especially with the next-generation sequencing technology. Two-phase sampling, such as case-cohort and nested case-control sampling, is cost-effective in such settings but entails considerable analysis challenges, especially if efficient estimators are desired. Another type of missing data arises when the investigators are interested in the haplotypes or the genetic markers that are not on the genotyping platform used for the current study. Valid and efficient analysis of such missing data is also interesting and challenging. This article provides an overview of these issues and outlines some directions for future research.
Affiliation(s)
- D Y Lin: Department of Biostatistics, University of North Carolina, CB#7420, Chapel Hill, NC, 27599-7420, USA
29
Abstract
Ross Prentice's work has had the most profound impact on the theory and practice of statistics. His research interests range from survival analysis, longitudinal data analysis, epidemiologic designs and analysis, to genomic studies. His contributions are so broad and so deep that it would be impossible to provide a comprehensive review in any limited amount of space. In this commentary, I will attempt to give a brief tour of some of his statistical work, focusing on ten of my favorite papers of his. I will describe the main ideas in those papers and their influence on the directions of statistical research and on the designs and analysis of medical studies. I will mention a few stories along the way.
Affiliation(s)
- D Y Lin: Department of Biostatistics, University of North Carolina at Chapel Hill
30
Abstract
We propose a graphical measure, the generalized negative predictive function, to quantify the predictive accuracy of covariates for survival time or recurrent event times. This new measure characterizes the event-free probabilities over time conditional on a thresholded linear combination of covariates and has direct clinical utility. We show that this function is maximized at the set of covariates truly related to event times and thus can be used to compare the predictive accuracy of different sets of covariates. We construct nonparametric estimators for this function under right censoring and prove that the proposed estimators, upon proper normalization, converge weakly to zero-mean Gaussian processes. To bypass the estimation of complex density functions involved in the asymptotic variances, we adopt the bootstrap approach and establish its validity. Simulation studies demonstrate that the proposed methods perform well in practical situations. Two clinical studies are presented.
Affiliation(s)
- Li Chen: Markey Cancer Center and Department of Biostatistics, University of Kentucky, Lexington, Kentucky 40536, U.S.A.
31
Pham MH, Berthouly-Salazar C, Tran XH, Chang WH, Crooijmans RPMA, Lin DY, Hoang VT, Lee YP, Tixier-Boichard M, Chen CF. Genetic diversity of Vietnamese domestic chicken populations as decision-making support for conservation strategies. Anim Genet 2013; 44:509-21. [PMID: 23714019 DOI: 10.1111/age.12045] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Accepted: 02/26/2013] [Indexed: 11/27/2022]
Abstract
The aims of this study were to assess the genetic diversity of 17 populations of Vietnamese local chickens (VNN) and one Red Jungle Fowl population, together with six chicken populations of Chinese origin (CNO), and to provide priorities supporting the conservation of genetic resources using 20 microsatellites. Consequently, the VNN populations exhibited a higher diversity than did CNO populations in terms of number of alleles but showed a slightly lower observed heterozygosity. The VNN populations showed in total seven private alleles, whereas no CNO private alleles were found. The expected heterozygosity of 0.576 in the VNN populations was higher than the observed heterozygosity of 0.490, leading to heterozygote deficiency within populations. This issue could be partly explained by the Wahlund effect due to fragmentation of several populations between chicken flocks. Molecular analysis of variance showed that most of genetic variation was found within VNN populations. The Bayesian clustering analysis showed that VNN and CNO chickens were separated into two distinct groups with little evidence for gene flow between them. Among the 24 populations, 13 were successfully assigned to their own cluster, whereas the structuring was not clear for the remaining 11 chicken populations. The contributions of 24 populations to the total genetic diversity were mostly consistent across two approaches, taking into account the within- and between-populations genetic diversity and allelic richness. The black H'mong, Lien Minh, Luong Phuong and Red Jungle Fowl were ranked with the highest priorities for conservation according to Caballero and Toro's and Petit's approaches. In conclusion, a national strategy needs to be set up for Vietnamese chicken populations, with three main components: conservation of high-priority breeds, within-breed management with animal exchanges between flocks to avoid Wahlund effect and monitoring of inbreeding rate.
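The heterozygote deficiency reported for the VNN populations can be summarized with the standard within-population fixation index, F = 1 - Ho/He. The abstract does not report this index, so the short sketch below is only an illustration of the arithmetic applied to the two heterozygosity values it does quote; the function name `inbreeding_coefficient` is an assumption.

```python
def inbreeding_coefficient(h_obs: float, h_exp: float) -> float:
    """Fixation index F = 1 - Ho/He; positive values indicate heterozygote deficiency."""
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


# Values quoted in the abstract for the VNN populations.
f_vnn = inbreeding_coefficient(h_obs=0.490, h_exp=0.576)
print(round(f_vnn, 3))  # 0.149
```

A positive value of this size is consistent with either inbreeding or population substructure; the abstract attributes part of it to the Wahlund effect, which the index alone cannot separate from inbreeding.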
Affiliation(s)
- M H Pham
- Department of Animal Science, National Chung-Hsing University, Taichung, Taiwan
|
32
|
Wu JS, Huang YK, Wu FL, Lin DY. Design and implementation of a versatile and variable-frequency piezoelectric coefficient measurement system. Rev Sci Instrum 2012; 83:085110. [PMID: 22938335 DOI: 10.1063/1.4746769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
We present a simple but versatile piezoelectric coefficient measurement system, which can measure the longitudinal and transverse piezoelectric coefficients in the pressing and bending modes, respectively, at different applied forces and a wide range of frequencies. The functionality of this measurement system has been demonstrated on three samples, including a PbZr₀.₅₂Ti₀.₄₈O₃ (PZT) piezoelectric ceramic bulk, a ZnO thin film, and a laminated piezoelectric film sensor. The static longitudinal piezoelectric coefficients of the PZT bulk and the ZnO film are estimated to be around 210 and 8.1 pC/N, respectively. The static transverse piezoelectric coefficients of the ZnO film and the piezoelectric film sensor are determined to be, respectively, -0.284 and -0.031 C/m².
Affiliation(s)
- J S Wu
- Department of Electronics Engineering, National Changhua University of Education, Changhua 500, Taiwan
|
33
|
Abstract
We propose a general strategy for variable selection in semiparametric regression models by penalizing appropriate estimating functions. Important applications include semiparametric linear regression with censored responses and semiparametric regression with missing predictors. Unlike the existing penalized maximum likelihood estimators, the proposed penalized estimating functions may not pertain to the derivatives of any objective functions and may be discrete in the regression coefficients. We establish a general asymptotic theory for penalized estimating functions and present suitable numerical algorithms to implement the proposed estimators. In addition, we develop a resampling technique to estimate the variances of the estimated regression coefficients when the asymptotic variances cannot be evaluated directly. Simulation studies demonstrate that the proposed methods perform well in variable selection and variance estimation. We illustrate our methods using data from the Paul Coverdell Stroke Registry.
Affiliation(s)
- Brent A Johnson
- Assistant Professor, Department of Biostatistics, Emory University, Atlanta, GA 30322
|
34
|
Abstract
Genomewide association studies have become the primary tool for discovering the genetic basis of complex human diseases. Such studies are susceptible to the confounding effects of population stratification, in that the combination of allele-frequency heterogeneity with disease-risk heterogeneity among different ancestral subpopulations can induce spurious associations between genetic variants and disease. This article provides a statistically rigorous and computationally feasible solution to this challenging problem of unmeasured confounders. We show that the odds ratio of disease with a genetic variant is identifiable if and only if the genotype is independent of the unknown population substructure conditional on a set of observed ancestry-informative markers in the disease-free population. Under this condition, the odds ratio of interest can be estimated by fitting a semiparametric logistic regression model with an arbitrary function of a propensity score relating the genotype probability to ancestry-informative markers. Approximating the unknown function of the propensity score by B-splines, we derive a consistent and asymptotically normal estimator for the odds ratio of interest with a consistent variance estimator. Simulation studies demonstrate that the proposed inference procedures perform well in realistic settings. An application to the well-known Wellcome Trust Case-Control Study is presented. Supplemental materials are available online.
Affiliation(s)
- D Y Lin
- Department of Biostatistics, CB#7420, University of North Carolina, Chapel Hill, NC 27599-7420
|
35
|
Abstract
Semiparametric transformation models provide a very general framework for studying the effects of (possibly time-dependent) covariates on survival time and recurrent event times. Assessing the adequacy of these models is an important task because model misspecification affects the validity of inference and the accuracy of prediction. In this paper, we introduce appropriate time-dependent residuals for these models and consider the cumulative sums of the residuals. Under the assumed model, the cumulative sum processes converge weakly to zero-mean Gaussian processes whose distributions can be approximated through Monte Carlo simulation. These results enable one to assess, both graphically and numerically, how unusual the observed residual patterns are in reference to their null distributions. The residual patterns can also be used to determine the nature of model misspecification. Extensive simulation studies demonstrate that the proposed methods perform well in practical situations. Three medical studies are provided for illustrations.
Affiliation(s)
- Li Chen
- Markey Cancer Center, University of Kentucky, Lexington, KY 40536, USA
|
36
|
Abstract
Analysis of untyped single nucleotide polymorphisms (SNPs) can facilitate the localization of disease-causing variants and permit meta-analysis of association studies with different genotyping platforms. We present two approaches for using the linkage disequilibrium structure of an external reference panel to infer the unknown value of an untyped SNP from the observed genotypes of typed SNPs. The maximum-likelihood approach integrates the prediction of untyped genotypes and estimation of association parameters into a single framework and yields consistent and efficient estimators of genetic effects and gene-environment interactions with proper variance estimators. The imputation approach is a two-stage strategy, which first imputes the untyped genotypes by either the most likely genotypes or the expected genotype counts and then uses the imputed values in a downstream association analysis. The latter approach has proper control of type I error in single-SNP tests with possible covariate adjustments even when the reference panel is misspecified; however, type I error may not be properly controlled in testing multiple-SNP effects or gene-environment interactions. In general, imputation yields biased estimators of genetic effects and gene-environment interactions, and the variances are underestimated. We conduct extensive simulation studies to compare the bias, type I error, power, and confidence interval coverage between the maximum likelihood and imputation approaches in the analysis of single-SNP effects, multiple-SNP effects, and gene-environment interactions under cross-sectional and case-control designs. In addition, we provide an illustration with genome-wide data from the Wellcome Trust Case-Control Consortium (WTCCC) [2007].
Affiliation(s)
- Y J Hu
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599-7420, USA
|
37
|
Abstract
Attributable fractions are commonly used to measure the impact of risk factors on disease incidence in the population. These static measures can be extended to functions of time when the time to disease occurrence or event time is of interest. The present paper deals with nonparametric and semiparametric estimation of attributable fraction functions for cohort studies with potentially censored event time data. The semiparametric models include the familiar proportional hazards model and a broad class of transformation models. The proposed estimators are shown to be consistent, asymptotically normal and asymptotically efficient. Extensive simulation studies demonstrate that the proposed methods perform well in practical situations. A cardiovascular health study is provided. Connections to causal inference are discussed.
Affiliation(s)
- Li Chen
- Department of Biostatistics, CB# 7420, University of North Carolina, Chapel Hill, North Carolina 27599-7420, U.S.A.
|
38
|
Abstract
Meta-analysis is widely used to synthesize the results of multiple studies. Although meta-analysis is traditionally carried out by combining the summary statistics of relevant studies, advances in technologies and communications have made it increasingly feasible to access the original data on individual participants. In the present paper, we investigate the relative efficiency of analyzing original data versus combining summary statistics. We show that, for all commonly used parametric and semiparametric models, there is no asymptotic efficiency gain by analyzing original data if the parameter of main interest has a common value across studies, the nuisance parameters have distinct values among studies, and the summary statistics are based on maximum likelihood. We also assess the relative efficiency of the two methods when the parameter of main interest has different values among studies or when there are common nuisance parameters across studies. We conduct simulation studies to confirm the theoretical results and provide empirical comparisons from a genetic association study.
Affiliation(s)
- D Y Lin
- Department of Biostatistics, CB# 7420, University of North Carolina, Chapel Hill, North Carolina 27599-7420, U.S.A.
|
39
|
Abstract
Many complex human diseases such as alcoholism and cancer are rated on ordinal scales. Well-developed statistical methods for the genetic mapping of quantitative traits may not be appropriate for ordinal traits. We propose a class of variance-component models for the joint linkage and association analysis of ordinal traits. The proposed models accommodate arbitrary pedigrees and allow covariates and gene-environment interactions. We develop efficient likelihood-based inference procedures under the proposed models. The maximum likelihood estimators are approximately unbiased, normally distributed, and statistically efficient. Extensive simulation studies demonstrate that the proposed methods perform well in practical situations. An application to data from the Collaborative Study on the Genetics of Alcoholism is provided.
Affiliation(s)
- G Diao
- Department of Statistics, George Mason University, MS 4A7, 4400 University Drive, Fairfax, VA 22030-4444, USA.
|
40
|
Abstract
Missing data arise in genetic association studies when genotypes are unknown or when haplotypes are of direct interest. We provide a general likelihood-based framework for making inference on genetic effects and gene-environment interactions with such missing data. We allow genetic and environmental variables to be correlated while leaving the distribution of environmental variables completely unspecified. We consider 3 major study designs (cross-sectional, case-control, and cohort designs) and construct appropriate likelihood functions for all common phenotypes (e.g., case-control status, quantitative traits, and potentially censored ages at onset of disease). The likelihood functions involve both finite- and infinite-dimensional parameters. The maximum likelihood estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Expectation-Maximization (EM) algorithms are developed to implement the corresponding inference procedures. Extensive simulation studies demonstrate that the proposed inferential and numerical methods perform well in practical settings. An illustration with a genome-wide association study of lung cancer is provided.
Affiliation(s)
- Y J Hu
- Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA
|
41
|
Abstract
To identify genetic variants with modest effects on complex human diseases, a growing number of networks or consortia are created for sharing data from multiple genome-wide association studies on the same disease or related disorders. A central question in this enterprise is whether to obtain summary results or individual participant data from relevant studies. We show theoretically and numerically that meta-analysis of summary results is statistically as efficient as joint analysis of individual participant data (provided that both analyses are performed properly under the same modeling assumptions). We illustrate this equivalence with case-control data from the Finland-United States Investigation of NIDDM Genetics (FUSION) study. Collating only summary results will increase the number and representativeness of available studies, simplify data collection and analysis, reduce resource utilization, and accelerate discovery.
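A minimal sketch of what a meta-analysis of summary results can look like, assuming the common inverse-variance fixed-effect estimator. The abstract does not specify the estimator used, and the function name and the input numbers below are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(estimates: Sequence[float],
                      std_errors: Sequence[float]) -> Tuple[float, float]:
    """Pool per-study effect estimates by inverse-variance weighting.

    Assumption: every study estimates the same underlying effect, so each
    estimate is weighted by 1 / SE^2 (a fixed-effect model).
    Returns the pooled estimate and its standard error.
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need one standard error per estimate")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log odds ratios and standard errors from two studies.
pooled, pooled_se = fixed_effect_meta([0.10, 0.20], [0.05, 0.05])
print(pooled, pooled_se)
```

Only the per-study estimate and its standard error are needed as inputs, which is why collating summary results is simpler than pooling individual participant data.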
Affiliation(s)
- D Y Lin
- Department of Biostatistics, University of North Carolina, McGavran-Greenberg Hall, CB #7420, Chapel Hill, NC 27599-7420, USA.
|
42
|
Lin DY, Villegas MS, Tan PL, Wang S, Shek LP. Severe Kikuchi's disease responsive to immune modulation. Singapore Med J 2010; 51:e18-e21. [PMID: 20200761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
Kikuchi's disease, although an uncommon entity, has been increasingly reported since it was first discovered in 1972. The most common manifestation of Kikuchi's disease, cervical lymphadenopathy, has no clinically distinguishable features. Therefore, a diagnosis of Kikuchi's disease has largely been based on clinical suspicion and histopathological confirmation. We present a 15-year-old Chinese girl with severe Kikuchi's disease, whose relapsing course was only responsive to high-dose steroid and intravenous immunoglobulin therapy.
Affiliation(s)
- D Y Lin
- Department of Paediatrics, National University Hospital, 5 Lower Kent Ridge Road, Singapore 119074
|
43
|
Abstract
Case-control association studies often collect extensive information on secondary phenotypes, which are quantitative or qualitative traits other than the case-control status. Exploring secondary phenotypes can yield valuable insights into biological pathways and identify genetic variants influencing phenotypes of direct interest. All publications on secondary phenotypes have used standard statistical methods, such as least-squares regression for quantitative traits. Because of unequal selection probabilities between cases and controls, the case-control sample is not a random sample from the general population. As a result, standard statistical analysis of secondary phenotype data can be extremely misleading. Although one may avoid the sampling bias by analyzing cases and controls separately or by including the case-control status as a covariate in the model, the associations between a secondary phenotype and a genetic variant in the case and control groups can be quite different from the association in the general population. In this article, we present novel statistical methods that properly reflect the case-control sampling in the analysis of secondary phenotype data. The new methods provide unbiased estimation of genetic effects and accurate control of false-positive rates while maximizing statistical power. We demonstrate the pitfalls of the standard methods and the advantages of the new methods both analytically and numerically. The relevant software is available at our website.
Affiliation(s)
- D Y Lin
- Department of Biostatistics, University of North Carolina, Chapel Hill, North Carolina 27599-7420, USA.
|
44
|
Abstract
We propose a broad class of semiparametric transformation models with random effects for the joint analysis of recurrent events and a terminal event. The transformation models include proportional hazards/intensity and proportional odds models. We estimate the model parameters by the nonparametric maximum likelihood approach. The estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Simple and stable numerical algorithms are provided to calculate the parameter estimators and to estimate their variances. Extensive simulation studies demonstrate that the proposed inference procedures perform well in realistic settings. Applications to two HIV/AIDS studies are presented.
Affiliation(s)
- Donglin Zeng
- Department of Biostatistics, CB 7420, University of North Carolina, Chapel Hill, North Carolina 27599-7420, USA.
|
45
|
Abstract
The analysis of genomewide association studies requires methods that are both computationally feasible and statistically powerful. Given the large-scale collection of single nucleotide polymorphisms (SNPs), it is desirable to explore the information contained in their interrelationships. In particular, utilizing haplotypes rather than individual SNPs and accounting for correlations of polymorphisms in adjustment for multiple testing can lead to increased power. We present a statistically powerful and numerically efficient method based on sliding windows of adjacent SNPs to detect haplotype-disease association in genomewide studies. This method consists of an efficient algorithm to calculate a proper likelihood-ratio statistic for any given window of SNPs, along with an accurate and efficient Monte Carlo procedure to adjust for multiple testing. Simulation studies using the HapMap data showed that the proposed method performs well in realistic situations. We applied the new method to a case-control study on rheumatoid arthritis and identified several loci worthy of further investigations.
Affiliation(s)
- B E Huang
- Department of Biostatistics, University of North Carolina, North Carolina 27599-7420, USA
|
46
|
Abstract
With the availability of high-throughput microarray technologies, investigators can simultaneously measure the expression levels of many thousands of genes in a short period. Although there are rich statistical methods for analyzing microarray data in the literature, limited work has been done in mapping expression quantitative trait loci (eQTL) that influence the variation in levels of gene expression. Most existing eQTL mapping methods assume that the expression phenotypes follow a normal distribution and violation of the normality assumption may lead to inflated type I error and reduced power. QTL analysis of expression data involves the mapping of many expression phenotypes at thousands or hundreds of thousands of marker loci across the whole genome. An appropriate procedure to adjust for multiple testing is essential for guarding against an abundance of false positive results. In this study, we applied a semiparametric quantitative trait loci (SQTL) mapping method to human gene expression data. The SQTL mapping method is rank-based and therefore robust to non-normality and outliers. Furthermore, we apply an efficient Monte Carlo procedure to account for multiple testing and assess the genome-wide significance level. Particularly, we apply the SQTL mapping method and the Monte-Carlo approach to the gene expression data provided by Genetic Analysis Workshop 15.
Affiliation(s)
- Guoqing Diao
- Department of Statistics, George Mason University, 4400 University Drive, MS 4A7, Fairfax, Virginia 22030, USA.
|
47
|
Abstract
In his discussion of Cox's (1972) paper on proportional hazards regression, Breslow (1972) provided the maximum likelihood estimator for the cumulative baseline hazard function. This estimator is commonly used in practice. The estimator has also been highly valuable in the further development of Cox regression and semiparametric inference with censored data. The present paper describes the Breslow estimator and its tremendous impact on the theory and practice of survival analysis.
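A minimal sketch of the Breslow estimator of the cumulative baseline hazard, assuming a single covariate and a regression coefficient that has already been estimated by Cox regression. At each distinct event time, the number of events is divided by the sum of exp(beta * x) over subjects still at risk, and the increments are accumulated. The function name and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def breslow_cumulative_hazard(times: Sequence[float],
                              events: Sequence[int],
                              covariates: Sequence[float],
                              beta: float) -> List[Tuple[float, float]]:
    """Return (event time, cumulative baseline hazard) pairs.

    events[i] is 1 for an observed event and 0 for a censored time.
    Assumption: one covariate per subject and a known coefficient beta.
    """
    risk_scores = [math.exp(beta * x) for x in covariates]
    event_times = sorted({t for t, e in zip(times, events) if e})
    cumulative = 0.0
    result = []
    for t in event_times:
        # Number of events observed exactly at time t (ties counted together).
        n_events = sum(1 for ti, ei in zip(times, events) if ei and ti == t)
        # Sum of risk scores over subjects still at risk just before t.
        at_risk = sum(r for ti, r in zip(times, risk_scores) if ti >= t)
        cumulative += n_events / at_risk
        result.append((t, cumulative))
    return result


# Toy data: three subjects, all with observed events, no covariate effect.
hazard = breslow_cumulative_hazard([1, 2, 3], [1, 1, 1], [0.0, 0.0, 0.0], beta=0.0)
print(hazard)
```

With beta set to 0 every risk score equals 1, so the estimator reduces to the Nelson-Aalen estimator of the cumulative hazard.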
Affiliation(s)
- D Y Lin
- Department of Biostatistics, University of North Carolina, CB#7420, Chapel Hill, NC 27599-7420, USA.
|
48
|
Abstract
We propose a simple and general resampling strategy to estimate variances for parameter estimators derived from nonsmooth estimating functions. This approach applies to a wide variety of semiparametric and nonparametric problems in biostatistics. It does not require solving estimating equations and is thus much faster than the existing resampling procedures. Its usefulness is illustrated with heteroscedastic quantile regression and censored data rank regression. Numerical results based on simulated and real data are provided.
Affiliation(s)
- Donglin Zeng
- Department of Biostatistics, CB 7420, University of North Carolina, Chapel Hill, NC 27599-7420, USA
|
49
|
|
50
|
Huang BE, Lin DY. Efficient association mapping of quantitative trait loci with selective genotyping. Am J Hum Genet 2007; 80:567-76. [PMID: 17273979 PMCID: PMC1821103 DOI: 10.1086/512727] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Received: 10/17/2006] [Accepted: 01/09/2007] [Indexed: 11/03/2022]
Abstract
Selective genotyping (i.e., genotyping only those individuals with extreme phenotypes) can greatly improve the power to detect and map quantitative trait loci in genetic association studies. Because selection depends on the phenotype, the resulting data cannot be properly analyzed by standard statistical methods. We provide appropriate likelihoods for assessing the effects of genotypes and haplotypes on quantitative traits under selective-genotyping designs. We demonstrate that the likelihood-based methods are highly effective in identifying causal variants and are substantially more powerful than existing methods.
Affiliation(s)
- B E Huang
- Department of Biostatistics, University of North Carolina, Chapel Hill 27599-7420, USA
|