1. Lee J, Cook RJ. The illness-death model for family studies. Biostatistics 2019; 22:482-503. [PMID: 31742352 DOI: 10.1093/biostatistics/kxz048] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/10/2018] [Revised: 10/14/2019] [Accepted: 10/18/2019] [Indexed: 11/12/2022]
Abstract
Family studies involve the selection of affected individuals from a disease registry who provide right-truncated ages of disease onset. Coarsened disease histories are then obtained from consenting family members, either through examining medical records, retrospective reporting, or clinical examination. Methods for dealing with such biased sampling schemes are available for continuous, binary, and failure time responses, but methods for more complex life history processes are less developed. We consider a simple joint model for clustered illness-death processes which we formulate to study covariate effects on the marginal intensity for disease onset and to study the within-family dependence in disease onset times. We construct likelihoods and composite likelihoods for family data obtained from biased sampling schemes. In settings where the disease is rare and data are insufficient to fit the model of interest, we show how auxiliary data can augment the composite likelihood to facilitate estimation. We apply the proposed methods to analyze data from a family study of psoriatic arthritis carried out at the University of Toronto Psoriatic Arthritis Registry.
Affiliation(s)
- Jooyoung Lee
  - Department of Statistics and Actuarial Science, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
- Richard J Cook
  - Department of Statistics and Actuarial Science, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada
2. Lee AJ, Marder K, Alcalay RN, Mejia-Santana H, Orr-Urtreger A, Giladi N, Bressman S, Wang Y. Estimation of genetic risk function with covariates in the presence of missing genotypes. Stat Med 2017; 36:3533-3546. [PMID: 28656686 PMCID: PMC5583003 DOI: 10.1002/sim.7376] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/30/2016] [Revised: 02/28/2017] [Accepted: 05/30/2017] [Indexed: 12/13/2022]
Abstract
In genetic epidemiological studies, family history data are collected on relatives of study participants and used to estimate the age-specific risk of disease for individuals who carry a causal mutation. However, a family member's genotype data may not be collected, either because of the high cost of an in-person interview to obtain a blood sample or because the relative has died. Previously, efficient nonparametric genotype-specific risk estimation in censored mixture data has been proposed without considering covariates. With multiple predictive risk factors available, risk estimation requires a multivariate model to account for additional covariates that may affect disease risk simultaneously. Therefore, it is important to consider the role of covariates in genotype-specific distribution estimation using family history data. We propose an estimation method that permits more precise risk prediction by controlling for individual characteristics and incorporating interaction effects with missing genotypes in relatives, so that gene-gene interactions and gene-environment interactions can be handled within the framework of a single model. We examine the performance of the proposed methods by simulations and apply them to estimate the age-specific cumulative risk of Parkinson's disease (PD) in carriers of the LRRK2 G2019S mutation using first-degree relatives who are at genetic risk for PD. The utility of estimated carrier risk is demonstrated through designing a future clinical trial under various assumptions. Such sample size estimation is seen in the Huntington's disease literature using the length of abnormal expansion of a CAG repeat in the HTT gene but is less common in the PD literature. Copyright © 2017 John Wiley & Sons, Ltd.
Affiliation(s)
- Annie J. Lee
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, U.S.A
- Karen Marder
  - Department of Neurology, College of Physicians and Surgeons, Columbia University, New York, NY, U.S.A
  - Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Columbia University, New York, NY, U.S.A
- Roy N. Alcalay
  - Department of Neurology, College of Physicians and Surgeons, Columbia University, New York, NY, U.S.A
  - Taub Institute for Research on Alzheimer’s Disease and the Aging Brain, Columbia University, New York, NY, U.S.A
- Helen Mejia-Santana
  - Department of Neurology, College of Physicians and Surgeons, Columbia University, New York, NY, U.S.A
- Avi Orr-Urtreger
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
  - Genetic Institute, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Nir Giladi
  - Sackler Faculty of Medicine, Sagol School for Neurosciences, Tel Aviv University, Tel Aviv, Israel
  - Neurological Institute, Tel Aviv Sourasky Medical Center, Tel Aviv, Israel
- Susan Bressman
  - Department of Neurology, Mount Sinai Beth Israel Medical Center, New York, NY, USA
- Yuanjia Wang
  - Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, NY, U.S.A
3. Wang Y, Liang B, Tong X, Marder K, Bressman S, Orr-Urtreger A, Giladi N, Zeng D. Efficient Estimation of Nonparametric Genetic Risk Function with Censored Data. Biometrika 2015; 102:515-532. [PMID: 26412864 PMCID: PMC4581539 DOI: 10.1093/biomet/asv030] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/23/2023]
Abstract
With an increasing number of causal genes discovered for complex human disorders, it is crucial to assess the genetic risk of disease onset for individuals who are carriers of these causal mutations and compare the distribution of age-at-onset with that in non-carriers. In many genetic epidemiological studies aiming at estimating causal gene effect on disease, the age-at-onset of disease is subject to censoring. In addition, some individuals' mutation carrier or non-carrier status can be unknown due to the high cost of in-person ascertainment to collect DNA samples or death in older individuals. Instead, the probability of these individuals' mutation status can be obtained from various sources. When mutation status is missing, the available data take the form of censored mixture data. Recently, various methods have been proposed for risk estimation from such data, but none is efficient for estimating a nonparametric distribution. We propose a fully efficient sieve maximum likelihood estimation method, in which we estimate the logarithm of the hazard ratio between genetic mutation groups using B-splines, while applying nonparametric maximum likelihood estimation for the reference baseline hazard function. Our estimator can be calculated via an expectation-maximization algorithm which is much faster than existing methods. We show that our estimator is consistent and semiparametrically efficient and establish its asymptotic distribution. Simulation studies demonstrate superior performance of the proposed method, which is applied to the estimation of the distribution of the age-at-onset of Parkinson's disease for carriers of mutations in the leucine-rich repeat kinase 2 gene.
Affiliation(s)
- Yuanjia Wang
  - Department of Biostatistics, Mailman School of Public Health, 722 W168th Street, New York 10032, U.S.A.
- Baosheng Liang
  - School of Mathematical Sciences, Beijing Normal University, Beijing 100875, China.
- Xingwei Tong
  - School of Mathematical Sciences, Beijing Normal University, Beijing 100875, China.
- Karen Marder
  - Department of Neurology and Psychiatry, College of Physicians and Surgeons, Columbia University, New York 10032, U.S.A.
- Susan Bressman
  - The Alan and Barbara Mirken Department of Neurology, Beth Israel Medical Center, New York, 10003, U.S.A.
- Avi Orr-Urtreger
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel.
- Nir Giladi
  - Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel.
- Donglin Zeng
  - Department of Biostatistics, CB # 7420, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599-7420, U.S.A.
4. Leclerc M, Antoniou AC, Simard J, Lakhal-Chaieb L. Analysis of multivariate failure times in the presence of selection bias with application to breast cancer. J R Stat Soc Ser C Appl Stat 2014. [DOI: 10.1111/rssc.12091] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/20/2022]
Affiliation(s)
- Jacques Simard
  - Centre Hospitalier Universitaire de Québec Research Center and Laval University; Québec Canada
5. Zhang H, Zeng D, Olschwang S, Yu K. Semiparametric inference on the penetrances of rare genetic mutations based on a case-family design. J Stat Plan Inference 2013; 143:368-377. [PMID: 23329866 PMCID: PMC3544474 DOI: 10.1016/j.jspi.2012.08.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
A formal semiparametric statistical inference framework is proposed for the evaluation of the age-dependent penetrance of a rare genetic mutation, using family data generated under a case-family design, where phenotype and genotype information are collected from first-degree relatives of case probands carrying the targeted mutation. The proposed approach allows for unobserved risk factors that are correlated among family members. Rigorous large-sample properties are established, which show that the proposed estimators are asymptotically semiparametric efficient. A simulation study is conducted to evaluate the performance of the new approach, which shows the robustness of the proposed semiparametric approach and its advantage over the corresponding parametric approach. As an illustration, the proposed approach is applied to estimating the age-dependent cancer risk among carriers of the MSH2 or MLH1 mutation.
Affiliation(s)
- Hong Zhang
  - Institute of Biostatistics, School of Life Science, Fudan University, P.R.C; Division of Cancer Epidemiology and Genetics, National Cancer Institute, U.S.A