1
Pedersen EM, Agerbo E, Plana-Ripoll O, Steinbach J, Krebs MD, Hougaard DM, Werge T, Nordentoft M, Børglum AD, Musliner KL, Ganna A, Schork AJ, Mortensen PB, McGrath JJ, Privé F, Vilhjálmsson BJ. ADuLT: An efficient and robust time-to-event GWAS. Nat Commun 2023; 14:5553. [PMID: 37689771 PMCID: PMC10492844 DOI: 10.1038/s41467-023-41210-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/31/2022] [Accepted: 08/28/2023] [Indexed: 09/11/2023]
Abstract
Proportional hazards models have been proposed to analyse time-to-event phenotypes in genome-wide association studies (GWAS). However, little is known about the ability of proportional hazards models to identify genetic associations under different generative models and when ascertainment is present. Here we propose the age-dependent liability threshold (ADuLT) model as an alternative to a Cox regression based GWAS, here represented by SPACox. We compare ADuLT, SPACox, and standard case-control GWAS in simulations under two generative models and with varying degrees of ascertainment as well as in the iPSYCH cohort. We find Cox regression GWAS to be underpowered when cases are strongly ascertained (cases are oversampled by a factor 5), regardless of the generative model used. ADuLT is robust to ascertainment in all simulated scenarios. Then, we analyse four psychiatric disorders in iPSYCH, ADHD, Autism, Depression, and Schizophrenia, with a strong case-ascertainment. Across these psychiatric disorders, ADuLT identifies 20 independent genome-wide significant associations, case-control GWAS finds 17, and SPACox finds 8, which is consistent with simulation results. As more genetic data are being linked to electronic health records, robust GWAS methods that can make use of age-of-onset information will help increase power in analyses for common health outcomes.
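As a rough illustration of what an age-dependent liability threshold can look like, the sketch below assumes a standard normal liability and assumes that the threshold at a given age is the point above which the liability mass equals the cumulative incidence up to that age. The function name and the incidence values are hypothetical; this is not the ADuLT implementation.

```python
from statistics import NormalDist


def liability_threshold(cumulative_incidence: float) -> float:
    """Return T such that P(liability > T) equals the cumulative incidence,
    for a standard normal liability (an assumption of this sketch)."""
    if not 0.0 < cumulative_incidence < 1.0:
        raise ValueError("cumulative incidence must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - cumulative_incidence)


# Hypothetical cumulative incidence by age (illustrative values only).
hypothetical_cip = {10: 0.005, 20: 0.02, 40: 0.05}

# The threshold falls as cumulative incidence rises with age, so an older
# unaffected control is more informative than a younger one.
thresholds = {age: liability_threshold(p) for age, p in hypothetical_cip.items()}
```

Under these assumptions a control's liability is only known to lie below the threshold for their current age, which is why young controls carry less information than a plain case-control label suggests.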
Affiliation(s)
- Emil M Pedersen
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Esben Agerbo
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
  - Centre for Integrated Register-based Research at Aarhus University, Aarhus, Denmark
- Oleguer Plana-Ripoll
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Department of Clinical Epidemiology, Aarhus University and Aarhus University Hospital, Aarhus, Denmark
- Jette Steinbach
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
- Morten D Krebs
  - Institute of Biological Psychiatry, Mental Health Center - Sct Hans, Copenhagen University Hospital - Mental Health Services CPH, Copenhagen, Denmark
- David M Hougaard
  - Department for Congenital Disorders, Statens Serum Institut, Copenhagen, Denmark
- Thomas Werge
  - Institute of Biological Psychiatry, Mental Health Center - Sct Hans, Copenhagen University Hospital - Mental Health Services CPH, Copenhagen, Denmark
  - Department of Clinical Sciences, Copenhagen University, Copenhagen, Denmark
  - Section for Geogenetics, GLOBE Institute, Faculty of Health and Medical Science, Copenhagen University, Copenhagen, Denmark
- Merete Nordentoft
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
  - CORE - Copenhagen Centre for Research in Mental Health, Mental Health Center-Copenhagen, Copenhagen University Hospital - Mental Health Services CPH, Copenhagen, Denmark
- Anders D Børglum
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
  - Department of Biomedicine and iSEQ Centre, Aarhus University, Aarhus, Denmark
  - Center for Genomics and Personalized Medicine, CGPM, Aarhus University, Aarhus, Denmark
- Katherine L Musliner
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Department of Affective Disorders, Aarhus University Hospital-Psychiatry, Aarhus, Denmark
  - Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
- Andrea Ganna
  - Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
- Andrew J Schork
  - Institute of Biological Psychiatry, Mental Health Center - Sct Hans, Copenhagen University Hospital - Mental Health Services CPH, Copenhagen, Denmark
  - Section for Geogenetics, GLOBE Institute, Faculty of Health and Medical Science, Copenhagen University, Copenhagen, Denmark
  - Neurogenomics Division, The Translational Genomics Research Institute (TGEN), Phoenix, AZ, USA
- Preben B Mortensen
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- John J McGrath
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Queensland Brain Institute, University of Queensland, St Lucia, QLD, Australia
  - Queensland Centre for Mental Health Research, The Park Centre for Mental Health, Wacol, QLD, Australia
- Florian Privé
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
- Bjarni J Vilhjálmsson
  - National Centre for Register-Based Research, Aarhus University, Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark
  - Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
  - Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, the Broad Institute of MIT and Harvard, Massachusetts, USA
2
Pedersen EM, Agerbo E, Plana-Ripoll O, Grove J, Dreier JW, Musliner KL, Bækvad-Hansen M, Athanasiadis G, Schork A, Bybjerg-Grauholm J, Hougaard DM, Werge T, Nordentoft M, Mors O, Dalsgaard S, Christensen J, Børglum AD, Mortensen PB, McGrath JJ, Privé F, Vilhjálmsson BJ. Accounting for age of onset and family history improves power in genome-wide association studies. Am J Hum Genet 2022; 109:417-432. [PMID: 35139346 PMCID: PMC8948165 DOI: 10.1016/j.ajhg.2022.01.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/16/2021] [Accepted: 01/07/2022] [Indexed: 11/01/2022]
Abstract
Genome-wide association studies (GWASs) have revolutionized human genetics, allowing researchers to identify thousands of disease-related genes and possible drug targets. However, case-control status does not account for the fact that not all controls may have lived through their period of risk for the disorder of interest. This can be quantified by examining the age-of-onset distribution and the age of the controls or the age of onset for cases. The age-of-onset distribution may also depend on information such as sex and birth year. In addition, family history is not routinely included in the assessment of control status. Here, we present LT-FH++, an extension of the liability threshold model conditioned on family history (LT-FH), which jointly accounts for age of onset and sex as well as family history. Using simulations, we show that, when family history and the age-of-onset distribution are available, the proposed approach yields statistically significant power gains over LT-FH and large power gains over genome-wide association study by proxy (GWAX). We applied our method to four psychiatric disorders available in the iPSYCH data and to mortality in the UK Biobank and found 20 genome-wide significant associations with LT-FH++, compared to ten for LT-FH and eight for a standard case-control GWAS. As more genetic data with linked electronic health records become available to researchers, we expect methods that account for additional health information, such as LT-FH++, to become even more beneficial.
Affiliation(s)
- Emil M Pedersen
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
- Esben Agerbo
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Centre for Integrated Register-Based Research at Aarhus University, 8210 Aarhus, Denmark
- Oleguer Plana-Ripoll
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
- Jakob Grove
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Bioinformatics Research Centre, Aarhus University, 8000 Aarhus, Denmark
  - Department of Biomedicine and Center for Integrative Sequencing, Aarhus University, 8000 Aarhus, Denmark
  - Center for Genomics and Personalized Medicine, Aarhus University, 8000 Aarhus, Denmark
- Julie W Dreier
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
  - Centre for Integrated Register-Based Research at Aarhus University, 8210 Aarhus, Denmark
- Katherine L Musliner
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Centre for Integrated Register-Based Research at Aarhus University, 8210 Aarhus, Denmark
- Marie Bækvad-Hansen
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, 2300 Copenhagen, Denmark
- Georgios Athanasiadis
  - Institute of Biological Psychiatry, MHC Sct. Hans, Mental Health Services Copenhagen, 4000 Roskilde, Denmark
- Andrew Schork
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Institute of Biological Psychiatry, MHC Sct. Hans, Mental Health Services Copenhagen, 4000 Roskilde, Denmark
- Jonas Bybjerg-Grauholm
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, 2300 Copenhagen, Denmark
- David M Hougaard
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Center for Neonatal Screening, Department for Congenital Disorders, Statens Serum Institut, 2300 Copenhagen, Denmark
- Thomas Werge
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Institute of Biological Psychiatry, MHC Sct. Hans, Mental Health Services Copenhagen, 4000 Roskilde, Denmark
  - Department of Clinical Medicine, University of Copenhagen, 2200 Copenhagen, Denmark
- Merete Nordentoft
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Mental Health Services in the Capital Region of Denmark, Mental Health Center Copenhagen, University of Copenhagen, 2100 Copenhagen, Denmark
- Ole Mors
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Psychosis Research Unit, Aarhus University Hospital, 8245 Risskov, Denmark
- Søren Dalsgaard
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
- Jakob Christensen
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
  - Department of Neurology, Aarhus University Hospital, 8200 Aarhus, Denmark
  - Department of Clinical Medicine, Aarhus University, 8200 Aarhus, Denmark
- Anders D Børglum
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Center for Genomics and Personalized Medicine, Aarhus University, 8000 Aarhus, Denmark
  - Department of Biomedicine - Human Genetics, Aarhus University, 8000 Aarhus, Denmark
- Preben B Mortensen
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Centre for Integrated Register-Based Research at Aarhus University, 8210 Aarhus, Denmark
- John J McGrath
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
  - Queensland Brain Institute, University of Queensland, St Lucia, QLD 4072, Australia
  - Queensland Centre for Mental Health Research, The Park Centre for Mental Health, Wacol, QLD 4076, Australia
- Florian Privé
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
- Bjarni J Vilhjálmsson
  - National Centre for Register-Based Research, Aarhus University, 8210 Aarhus, Denmark
  - Lundbeck Foundation Initiative for Integrative Psychiatric Research, 8210 Aarhus, Denmark
  - Bioinformatics Research Centre, Aarhus University, 8000 Aarhus, Denmark
3
Rao S, Yin L, Xiang Y, So HC. Analysis of genetic differences between psychiatric disorders: exploring pathways and cell types/tissues involved and ability to differentiate the disorders by polygenic scores. Transl Psychiatry 2021; 11:426. [PMID: 34389699 PMCID: PMC8363629 DOI: 10.1038/s41398-021-01545-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2020] [Revised: 07/13/2021] [Accepted: 08/02/2021] [Indexed: 02/07/2023]
Abstract
Although displaying genetic correlations, psychiatric disorders are clinically defined as categorical entities as they each have distinguishing clinical features and may involve different treatments. Identifying differential genetic variations between these disorders may reveal how the disorders differ biologically and help to guide more personalized treatment. Here we present a statistical framework and comprehensive analysis to identify genetic markers differentially associated with various psychiatric disorders/traits based on GWAS summary statistics, covering 18 psychiatric traits/disorders and 26 comparisons. We also conducted comprehensive analysis to unravel the genes, pathways and SNP functional categories involved, and the cell types and tissues implicated. We also assessed how well one could distinguish between psychiatric disorders by polygenic risk scores (PRS). SNP-based heritabilities (h2snp) were significantly larger than zero for most comparisons. Based on current GWAS data, PRS have mostly modest power to distinguish between psychiatric disorders. For example, we estimated that the AUCs for distinguishing schizophrenia from major depressive disorder (MDD), bipolar disorder (BPD) from MDD and schizophrenia from BPD were 0.694, 0.602 and 0.618, respectively, while the maximum AUCs (based on h2snp) were 0.763, 0.749 and 0.726, respectively. We also uncovered differences in each pair of studied traits in terms of their genetic correlations with comorbid traits. For example, clinically defined MDD appeared to be more strongly genetically correlated with other psychiatric disorders and heart disease, when compared to non-clinically defined depression in UK Biobank. Our findings highlight genetic differences between psychiatric disorders and the mechanisms involved. PRS may help differential diagnosis of selected psychiatric disorders in the future with larger GWAS samples.
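The AUC values quoted above can be read as the probability that a randomly chosen individual from one group has a higher score than a randomly chosen individual from the other. A minimal sketch of that rank-based definition follows; the function name and the scores are hypothetical and are not data from the study.

```python
def auc(case_scores, control_scores):
    """Probability that a random case outscores a random control.

    Ties count as half a win, which matches the usual rank-based
    (Mann-Whitney) definition of the area under the ROC curve.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical polygenic scores for two diagnostic groups.
hypothetical_cases = [0.8, 0.4, 0.6]
hypothetical_controls = [0.5, 0.2, 0.3]
example_auc = auc(hypothetical_cases, hypothetical_controls)
```

An AUC of 0.5 corresponds to scores that do not separate the groups at all, which gives a reference point for reading values such as 0.602 or 0.694.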
Affiliation(s)
- Shitao Rao
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
  - Department of Bioinformatics, Fujian Key Laboratory of Medical Bioinformatics, School of Medical Technology and Engineering, Fujian Medical University, Fuzhou, China
  - Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, China
- Liangying Yin
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
- Yong Xiang
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
- Hon-Cheong So
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong
  - KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology and The Chinese University of Hong Kong, Kunming, China
  - CUHK Shenzhen Research Institute, Shenzhen, China
  - Department of Psychiatry, The Chinese University of Hong Kong, Shatin, Hong Kong
  - Margaret K.L. Cheung Research Centre for Management of Parkinsonism, The Chinese University of Hong Kong, Shatin, Hong Kong
  - Brain and Mind Institute, The Chinese University of Hong Kong, Shatin, Hong Kong
  - Hong Kong Branch of the Chinese Academy of Sciences (CAS) Center for Excellence in Animal Evolution and Genetics, The Chinese University of Hong Kong, Shatin, Hong Kong
4
Leonenko G, Baker E, Stevenson-Hoare J, Sierksma A, Fiers M, Williams J, de Strooper B, Escott-Price V. Identifying individuals with high risk of Alzheimer's disease using polygenic risk scores. Nat Commun 2021; 12:4506. [PMID: 34301930 PMCID: PMC8302739 DOI: 10.1038/s41467-021-24082-z] [Citation(s) in RCA: 89] [Impact Index Per Article: 29.7] [Received: 12/28/2020] [Accepted: 06/02/2021] [Indexed: 11/09/2022]
Abstract
Polygenic Risk Scores (PRS) for AD offer unique possibilities for reliable identification of individuals at high and low risk of AD. However, there is little agreement in the field as to what approach should be used for genetic risk score calculations, how to model the effect of APOE, what the optimal p-value threshold (pT) for SNP selection is and how to compare scores between studies and methods. We show that the best prediction accuracy is achieved with a model with two predictors (APOE and PRS excluding APOE region) with pT<0.1 for SNP selection. Prediction accuracy in a sample across different PRS approaches is similar, but individuals' scores and their associated ranking differ. We show that standardising PRS against the population mean, as opposed to the sample mean, makes the individuals' scores comparable between studies. Our work highlights the best strategies for polygenic profiling when assessing individuals for AD risk.
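The point about standardising against the population mean rather than the sample mean can be illustrated with a small sketch. The score values, variable names and the choice of a population standard deviation are hypothetical assumptions for illustration, not the authors' procedure.

```python
from statistics import mean, pstdev


def standardise(scores, ref_mean, ref_sd):
    """Convert raw scores to z-scores relative to a chosen reference."""
    if ref_sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    return [(s - ref_mean) / ref_sd for s in scores]


# Hypothetical raw scores: a reference population and a risk-enriched sample.
population_scores = [0.0, 1.0, 2.0, 3.0, 4.0]
sample_scores = [3.0, 4.0, 5.0]

# Standardising within the sample always recentres it on zero, which hides
# the fact that this sample sits well above the population average.
z_sample = standardise(sample_scores, mean(sample_scores), pstdev(sample_scores))

# Standardising against the population keeps that shift visible, so the
# same raw score maps to the same z-score in every study.
z_population = standardise(sample_scores, mean(population_scores), pstdev(population_scores))
```

With a shared population reference, a raw score of 3.0 receives the same standardised value in any cohort, whereas sample-based standardisation depends on who else happens to be in the sample.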
Affiliation(s)
- Ganna Leonenko
  - UK Dementia Research Institute, Cardiff University, Cardiff, UK
- Emily Baker
  - UK Dementia Research Institute, Cardiff University, Cardiff, UK
- Annerieke Sierksma
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
- Mark Fiers
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
  - UK Dementia Research Institute, University College London, London, UK
- Julie Williams
  - UK Dementia Research Institute, Cardiff University, Cardiff, UK
  - Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff, UK
- Bart de Strooper
  - VIB Center for Brain & Disease Research, Leuven, Belgium
  - Laboratory for the Research of Neurodegenerative Diseases, Department of Neurosciences, Leuven Brain Institute (LBI), KU Leuven (University of Leuven), Leuven, Belgium
  - UK Dementia Research Institute, University College London, London, UK
- Valentina Escott-Price
  - UK Dementia Research Institute, Cardiff University, Cardiff, UK
  - Division of Psychological Medicine and Clinical Neurosciences, School of Medicine, Cardiff University, Cardiff, UK
5
Prediction of Early Childhood Caries Based on Single Nucleotide Polymorphisms Using Neural Networks. Genes (Basel) 2021; 12:462. [PMID: 33805090 PMCID: PMC8064067 DOI: 10.3390/genes12040462] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/05/2021] [Revised: 03/11/2021] [Accepted: 03/21/2021] [Indexed: 12/17/2022]
Abstract
Background: Several genes and single nucleotide polymorphisms (SNPs) have been associated with early childhood caries. However, they are highly age- and population-dependent, and the majority of existing caries prediction models are based on environmental and behavioral factors only and are scarce in infants. Methods: We examined 6 novel and 22 previously analyzed SNPs in a cohort of 95 Polish children (48 caries, 47 caries-free) aged 2–3 years. All polymorphisms were genotyped from DNA extracted from oral epithelium samples. We used Fisher’s exact test, receiver operating characteristic (ROC) curve and uni-/multi-variable logistic regression to test the association of SNPs with the disease, followed by the neural network (NN) analysis. Results: The logistic regression (LogReg) model showed 90% sensitivity and 96% specificity, overall accuracy of 93% (p < 0.0001), and the area under the curve (AUC) was 0.970 (95% CI: 0.912–0.994; p < 0.0001). We found 90.9–98.4% and 73.6–87.2% prediction accuracy in the test and validation predictions, respectively. The strongest predictors were: AMELX_rs17878486 and TUFT1_rs2337360 (in both LogReg and NN), MMP16_rs1042937 (in NN) and ENAM_rs12640848 (in LogReg). Conclusions: A neural network prediction model might be a substantial tool for screening/early preventive treatment of patients at high risk of caries development in early childhood. The knowledge of potential risk status could allow early targeted training in oral hygiene and modifications of eating habits.
6
Marshall DS, Butler CJ. Potential Distribution of the Biocontrol Agent Toxorhynchites rutilus By 2070. Journal of the American Mosquito Control Association 2020; 36:131-138. [PMID: 33600581 DOI: 10.2987/8756-971x-36.3.131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Climate change projections indicate that mosquito distributions will expand to include new areas of North America, increasing human exposure to mosquito-borne disease. Controlling these vectors is imperative, as mosquito-borne disease incidence will rise in response to expansion of mosquito range and increased seasonality. One means of mosquito control used in the USA is the biocontrol agent, Toxorhynchites rutilus. Climate change will open new habitats for its use by vector control organizations, but the extent of this change in habitat is currently unknown. We used a maximum entropy approach to create species distribution models for Tx. rutilus under 4 climate change scenarios by 2070. Mean temperature of warmest quarter (22.6°C to 29.1°C), annual precipitation (1,025.15 mm to 1,529.40 mm), and precipitation seasonality (≤17.86) are the most important bioclimatic variables for suitable habitat. The center of current possible habitat distribution of Tx. rutilus is in central Tennessee. Depending upon the scenario, we expect centroids to shift north-northeast by 97.68 km to 280.16 km by 2070. The extreme change in area of greater than 50% suitable habitat probability is 141.14% with 99.44% area retained. Our models indicate limited change in current habitat as well as creation of new habitat. These results are promising for North American mosquito control programs for the continued and potential combat of vector mosquitoes using Tx. rutilus.
Affiliation(s)
- Daniel S Marshall
  - Department of Biology, University of Central Oklahoma, 100 N University Drive Box 89, Edmond, OK 73034
- Christopher J Butler
  - Department of Biology, University of Central Oklahoma, 100 N University Drive Box 89, Edmond, OK 73034
7
Abstract
Risk prediction models have been developed in many contexts to classify individuals according to a single outcome, such as risk of a disease. Emerging “-omic” biomarkers provide panels of features that can simultaneously predict multiple outcomes from a single biological sample, creating issues of multiplicity reminiscent of exploratory hypothesis testing. Here I propose definitions of some basic criteria for evaluating prediction models of multiple outcomes. I define calibration in the multivariate setting and then distinguish between outcome-wise and individual-wise prediction, and within the latter between joint and panel-wise prediction. I give examples such as screening and early detection in which different senses of prediction may be more appropriate. In each case I propose definitions of sensitivity, specificity, concordance, positive and negative predictive value and relative utility. I link the definitions through a multivariate probit model, showing that the accuracy of a multivariate prediction model can be summarised by its covariance with a liability vector. I illustrate the concepts on a biomarker panel for early detection of eight cancers, and on polygenic risk scores for six common diseases.
Affiliation(s)
- Frank Dudbridge
  - Department of Health Sciences, University of Leicester, Leicester LE1 7RH, UK
8
Lambert SA, Abraham G, Inouye M. Towards clinical utility of polygenic risk scores. Hum Mol Genet 2019; 28:R133-R142. [DOI: 10.1093/hmg/ddz187] [Citation(s) in RCA: 249] [Impact Index Per Article: 49.8] [Received: 07/11/2019] [Revised: 07/11/2019] [Accepted: 07/24/2019] [Indexed: 02/06/2023]
Abstract
Prediction of disease risk is an essential part of preventative medicine, often guiding clinical management. Risk prediction typically includes risk factors such as age, sex, family history of disease and lifestyle (e.g. smoking status); however, in recent years, there has been increasing interest to include genomic information into risk models. Polygenic risk scores (PRS) aggregate the effects of many genetic variants across the human genome into a single score and have recently been shown to have predictive value for multiple common diseases. In this review, we summarize the potential use cases for seven common diseases (breast cancer, prostate cancer, coronary artery disease, obesity, type 1 diabetes, type 2 diabetes and Alzheimer’s disease) where PRS has or could have clinical utility. PRS analysis for these diseases frequently revolved around (i) risk prediction performance of a PRS alone and in combination with other non-genetic risk factors, (ii) estimation of lifetime risk trajectories, (iii) the independent information of PRS and family history of disease or monogenic mutations and (iv) estimation of the value of adding a PRS to specific clinical risk prediction scenarios. We summarize open questions regarding PRS usability, ancestry bias and transferability, emphasizing the need for the next wave of studies to focus on the implementation and health-economic value of PRS testing. In conclusion, it is becoming clear that PRS have value in disease risk prediction and there are multiple areas where this may have clinical utility.
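The description of a polygenic risk score as an aggregate of many variant effects can be sketched as a weighted sum of allele counts. The weights and genotype below are hypothetical, and real pipelines typically add steps such as variant selection and adjustment for linkage disequilibrium that this sketch leaves out.

```python
def polygenic_score(dosages, weights):
    """Sum per-variant effect sizes weighted by the individual's allele counts."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


# Hypothetical per-allele effect sizes for three variants.
hypothetical_weights = [0.10, -0.05, 0.20]

# Hypothetical genotype: number of effect alleles (0, 1 or 2) at each variant.
hypothetical_dosages = [2, 1, 0]

score = polygenic_score(hypothetical_dosages, hypothetical_weights)
```

Here the score is 2 × 0.10 + 1 × (−0.05) + 0 × 0.20 = 0.15, a single number that summarises the individual's genotype across the listed variants.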
Affiliation(s)
- Samuel A Lambert
  - Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
  - Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
  - MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
  - Cambridge Substantive Site, Health Data Research UK, Wellcome Genome Campus, Hinxton, UK
- Gad Abraham
  - Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
  - Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
  - Department of Clinical Pathology, University of Melbourne, Parkville, VIC 3010, Australia
- Michael Inouye
  - Cambridge Baker Systems Genomics Initiative, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
  - Cambridge Baker Systems Genomics Initiative, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
  - MRC/BHF Cardiovascular Epidemiology Unit, Department of Public Health and Primary Care, University of Cambridge, Cambridge CB1 8RN, UK
  - Cambridge Substantive Site, Health Data Research UK, Wellcome Genome Campus, Hinxton, UK
  - Department of Clinical Pathology, University of Melbourne, Parkville, VIC 3010, Australia
9
Dudbridge F, Pashayan N, Yang J. Predictive accuracy of combined genetic and environmental risk scores. Genet Epidemiol 2018; 42:4-19. [PMID: 29178508 PMCID: PMC5847122 DOI: 10.1002/gepi.22092] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 01/16/2017] [Revised: 07/19/2017] [Accepted: 09/27/2017] [Indexed: 01/19/2023]
Abstract
The substantial heritability of most complex diseases suggests that genetic data could provide useful risk prediction. To date the performance of genetic risk scores has fallen short of the potential implied by heritability, but this can be explained by insufficient sample sizes for estimating highly polygenic models. When risk predictors already exist based on environment or lifestyle, two key questions are to what extent can they be improved by adding genetic information, and what is the ultimate potential of combined genetic and environmental risk scores? Here, we extend previous work on the predictive accuracy of polygenic scores to allow for an environmental score that may be correlated with the polygenic score, for example when the environmental factors mediate the genetic risk. We derive common measures of predictive accuracy and improvement as functions of the training sample size, chip heritabilities of disease and environmental score, and genetic correlation between disease and environmental risk factors. We consider simple addition of the two scores and a weighted sum that accounts for their correlation. Using examples from studies of cardiovascular disease and breast cancer, we show that improvements in discrimination are generally small but reasonable degrees of reclassification could be obtained with current sample sizes. Correlation between genetic and environmental scores has only minor effects on numerical results in realistic scenarios. In the longer term, as the accuracy of polygenic scores improves they will come to dominate the predictive accuracy compared to environmental scores.
Affiliation(s)
- Frank Dudbridge
- Department of Health SciencesUniversity of LeicesterLeicesterUnited Kingdom
- Department of Non‐Communicable Disease EpidemiologyLondon School of Hygiene and Tropical MedicineLondonUnited Kingdom
- Department of Public Health and Primary CareUniversity of CambridgeCambridgeUnited Kingdom
- MRC Biostatistics UnitUniversity of CambridgeCambridgeUnited Kingdom
| | - Nora Pashayan
- Department of Applied Health Research, University College London, London, United Kingdom
| | - Jian Yang
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
| |
|
10
|
So HC, Sham PC. Exploring the predictive power of polygenic scores derived from genome-wide association studies: a study of 10 complex traits. Bioinformatics 2017; 33:886-892. [PMID: 28065900 DOI: 10.1093/bioinformatics/btw745] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 07/28/2016] [Accepted: 11/21/2016] [Indexed: 12/30/2022] Open
Abstract
Motivation: It is hoped that advances in our knowledge in disease genomics will contribute to personalized medicine such as individualized preventive strategies or early diagnoses of diseases. With the growth of genome-wide association studies (GWAS) in the past decade, how far have we reached this goal? In this study we explored the predictive ability of polygenic risk scores (PRSs) derived from GWAS for a range of complex disease and traits. Results: We first proposed a new approach to evaluate predictive performances of PRS at arbitrary P-value thresholds. The method was based on corrected estimates of effect sizes, accounting for possible false positives and selection bias. This approach requires no distributional assumptions and only requires summary statistics as input. The validity of the approach was verified in simulations. We explored the predictive power of PRS for ten complex traits, including type 2 diabetes (DM), coronary artery disease (CAD), triglycerides, high- and low-density lipoprotein, total cholesterol, schizophrenia (SCZ), bipolar disorder (BD), major depressive disorder and anxiety disorders. We found that the predictive ability of PRS for CAD and DM was modest (best AUC = 0.608 and 0.607) while for lipid traits the prediction R-squared ranged from 16.1 to 29.8%. For psychiatric disorders, the predictive power for SCZ was estimated to be the highest (best AUC 0.820), followed by BD. Predictive performance of other psychiatric disorders ranged from 0.543 to 0.585. Psychiatric traits tend to have a more gradual rise in AUC when significance thresholds increase and achieve the best predictive power at higher P-values than cardiometabolic traits. Contact: hcso@cuhk.edu.hk; pcsham@hku.hk. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hon-Cheong So
- School of Biomedical Sciences, Chinese University of Hong Kong, Shatin, Hong Kong; KIZ-CUHK Joint Laboratory of Bioresources and Molecular Research of Common Diseases, Kunming Institute of Zoology and Chinese University of Hong Kong, Hong Kong
| | - Pak C Sham
- Department of Psychiatry, University of Hong Kong, PokFuLam, Hong Kong; Centre for Genomic Sciences, University of Hong Kong, PokFuLam, Hong Kong; State Key Laboratory for Cognitive and Brain Sciences, University of Hong Kong, PokFuLam, Hong Kong; Centre for Reproduction, Development and Growth, University of Hong Kong, PokFuLam, Hong Kong
| |
|
11
|
Demler OV, Pencina MJ, Cook NR, D'Agostino RB. Asymptotic distribution of ∆AUC, NRIs, and IDI based on theory of U-statistics. Stat Med 2017. [PMID: 28627112 DOI: 10.1002/sim.7333] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 12/30/2022]
Abstract
The change in area under the curve (∆AUC), the integrated discrimination improvement (IDI), and net reclassification index (NRI) are commonly used measures of risk prediction model performance. Some authors have reported good validity of associated methods of estimating their standard errors (SE) and construction of confidence intervals, whereas others have questioned their performance. To address these issues, we unite the ∆AUC, IDI, and three versions of the NRI under the umbrella of the U-statistics family. We rigorously show that the asymptotic behavior of ∆AUC, NRIs, and IDI fits the asymptotic distribution theory developed for U-statistics. We prove that the ∆AUC, NRIs, and IDI are asymptotically normal, unless they compare nested models under the null hypothesis. In the latter case, asymptotic normality and existing SE estimates cannot be applied to ∆AUC, NRIs, or IDI. In the former case, SE formulas proposed in the literature are equivalent to SE formulas obtained from U-statistics theory if we ignore adjustment for estimated parameters. We use Sukhatme-Randles-deWet condition to determine when adjustment for estimated parameters is necessary. We show that adjustment is not necessary for SEs of the ∆AUC and two versions of the NRI when added predictor variables are significant and normally distributed. The SEs of the IDI and three-category NRI should always be adjusted for estimated parameters. These results allow us to define when existing formulas for SE estimates can be used and when resampling methods such as the bootstrap should be used instead when comparing nested models. We also use the U-statistic theory to develop a new SE estimate of ∆AUC. Copyright © 2017 John Wiley & Sons, Ltd.
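As a rough illustration of why the AUC belongs to the U-statistics family, the sketch below computes it as the two-sample Mann-Whitney statistic: the fraction of case/control pairs in which the case has the higher score, with ties counted as one half. The function names `auc_u_statistic` and `delta_auc` are hypothetical and not taken from the paper, and the sketch gives only the point estimate, not the standard errors the abstract discusses.

```python
from itertools import product


def auc_u_statistic(case_scores, control_scores):
    """AUC as a two-sample U-statistic (illustrative helper, not from the paper).

    Averages a kernel over all case/control pairs: 1 if the case score is
    higher, 0.5 if the two scores tie, 0 otherwise.
    """
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    total = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            total += 1.0
        elif case == control:
            total += 0.5
    return total / (len(case_scores) * len(control_scores))


def delta_auc(old_cases, old_controls, new_cases, new_controls):
    """Change in AUC when moving from an old risk model to a new one."""
    return auc_u_statistic(new_cases, new_controls) - auc_u_statistic(
        old_cases, old_controls
    )
```

The pairwise loop is quadratic in sample size; it is written this way only to make the U-statistic kernel explicit.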
Affiliation(s)
- Olga V Demler
- Division of Preventive Medicine, Brigham and Women's Hospital, 900 Commonwealth Avenue, Boston, MA, 02115, U.S.A
| | - Michael J Pencina
- Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, 27708, U.S.A
| | - Nancy R Cook
- Division of Preventive Medicine, Brigham and Women's Hospital, 900 Commonwealth Avenue, Boston, MA, 02115, U.S.A
| | - Ralph B D'Agostino
- Department of Mathematics and Statistics, Boston University, 111 Cummington Mall, Boston, MA, 02215, U.S.A
| |
|
12
|
Butler CJ, Stanila BD, Iverson JB, Stone PA, Bryson M. Projected changes in climatic suitability for Kinosternon turtles by 2050 and 2070. Ecol Evol 2016; 6:7690-7705. [PMID: 27891218 PMCID: PMC5114705 DOI: 10.1002/ece3.2492] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 04/13/2016] [Revised: 08/15/2016] [Accepted: 08/24/2016] [Indexed: 01/09/2023] Open
Abstract
Chelonians are expected to be negatively impacted by climate change due to limited vagility and temperature‐dependent sex determination. However, few studies have examined how freshwater turtle distributions may shift under different climate change scenarios. We used a maximum entropy approach to model the distribution of five widespread North American Kinosternon species (K. baurii, K. flavescens, K. hirtipes, K. sonoriense, and K. subrubrum) under four climate change scenarios. We found that areas with suitable climatic conditions for K. baurii and K. hirtipes are expected to decline substantially during the 21st century. In contrast, the area with suitable climate for K. sonoriense will remain essentially unchanged, while areas suitable for K. flavescens and K. subrubrum are expected to substantially increase. The centroid for the distribution of four of the five species shifted northward, while the centroid for K. sonoriense shifted slightly southward. Overall, centroids shifted at a median rate of 37.5 km per decade across all scenarios. Given the limited dispersal ability of turtles, it appears unlikely that range shifts will occur rapidly enough to keep pace with climate change during the 21st century. The ability of chelonians to modify behavioral and physiological responses in response to unfavorable conditions may allow turtles to persist for a time in areas that have become increasingly unsuitable, but this plasticity will likely only delay local extinctions.
Affiliation(s)
| | - Brian D Stanila
- Department of Biology University of Central Oklahoma Edmond OK USA
| | | | - Paul A Stone
- Department of Biology University of Central Oklahoma Edmond OK USA
| | - Matthew Bryson
- Department of Biology University of Central Oklahoma Edmond OK USA
| |
|
13
|
Yung G, Lin X. Validity of using ad hoc methods to analyze secondary traits in case-control association studies. Genet Epidemiol 2016; 40:732-743. [PMID: 27670932 DOI: 10.1002/gepi.21994] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/11/2015] [Revised: 06/23/2016] [Accepted: 06/26/2016] [Indexed: 11/10/2022]
Abstract
Case-control association studies often collect from their subjects information on secondary phenotypes. Reusing the data and studying the association between genes and secondary phenotypes provide an attractive and cost-effective approach that can lead to discovery of new genetic associations. A number of approaches have been proposed, including simple and computationally efficient ad hoc methods that ignore ascertainment or stratify on case-control status. Justification for these approaches relies on the assumption of no covariates and the correct specification of the primary disease model as a logistic model. Both might not be true in practice, for example, in the presence of population stratification or the primary disease model following a probit model. In this paper, we investigate the validity of ad hoc methods in the presence of covariates and possible disease model misspecification. We show that in taking an ad hoc approach, it may be desirable to include covariates that affect the primary disease in the secondary phenotype model, even though these covariates are not necessarily associated with the secondary phenotype. We also show that when the disease is rare, ad hoc methods can lead to severely biased estimation and inference if the true disease model follows a probit model instead of a logistic model. Our results are justified theoretically and via simulations. Applied to real data analysis of genetic associations with cigarette smoking, ad hoc methods collectively identified as highly significant (P < 10^-5) single nucleotide polymorphisms from over 10 genes, genes that were identified in previous studies of smoking cessation.
Affiliation(s)
- Godwin Yung
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
| | - Xihong Lin
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
| |
|
14
|
Zhao Y, Chen G, Yu H, Hu L, Bian Y, Yun D, Chen J, Mao Y, Chen H, Lu D. Development of risk prediction models for glioma based on genome-wide association study findings and comprehensive evaluation of predictive performances. Oncotarget 2016; 9:8311-8325. [PMID: 29492197 PMCID: PMC5823595 DOI: 10.18632/oncotarget.10882] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/16/2015] [Accepted: 06/29/2016] [Indexed: 12/17/2022] Open
Abstract
Over 14 common single nucleotide polymorphisms (SNP) have been consistently identified from genome-wide association studies (GWAS) as associated with glioma risk in European background. The extent to which and how these genetic variants can improve the prediction of glioma risk has not been investigated. In this study, we employed three independent case-control datasets in Chinese populations, tested GWAS signals in dataset1, validated association results in dataset2, developed prediction models in dataset2 for the consistently replicated SNPs, refined the consistently replicated SNPs in dataset3 and developed tailored models for Chinese populations. For model construction, we aggregated the contribution of multiple SNPs into genetic risk scores (count GRS and weighted GRS) or predicted risks from logistic regression analyses (PRFLR). In dataset2, the area under receiver operating characteristic curves (AUC) of the 5 consistently replicated SNPs by PRFLR(SNPs) was 0.615, higher than those of all GRSs (ranging from 0.607 to 0.611, all P>0.05). The AUC of genetic profile significantly exceeded that of family history (fmc) alone (AUC=0.535, all P<0.001). The best model in our study comprised “PRURA +fmc” (AUC=0.646) in dataset3. Further model assessment analyses provided additional evidence. This study indicates that genetic markers have potential value for risk prediction of glioma.
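The two genetic risk scores named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: genotypes are coded as risk-allele counts (0, 1 or 2), and the weighted score uses the log odds ratio of each SNP as its weight, which is a common convention but is not confirmed by the abstract. The function names are hypothetical.

```python
import math


def count_grs(risk_allele_counts):
    """Count GRS: total number of risk alleles carried across the SNPs."""
    return sum(risk_allele_counts)


def weighted_grs(risk_allele_counts, odds_ratios):
    """Weighted GRS: risk-allele counts weighted by per-SNP log odds ratios.

    Using log(OR) as the weight is an assumption made for this sketch.
    """
    if len(risk_allele_counts) != len(odds_ratios):
        raise ValueError("one odds ratio is required per SNP")
    return sum(
        count * math.log(odds_ratio)
        for count, odds_ratio in zip(risk_allele_counts, odds_ratios)
    )
```

The count score treats every SNP as equally informative, whereas the weighted score lets SNPs with larger estimated effects contribute more.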
Affiliation(s)
- Yingjie Zhao
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
| | - Gong Chen
- Neurosurgery Department of Huashan Hospital, Fudan University, Shanghai, China
| | - Hongjie Yu
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China.,Center for Genetic Epidemiology, School of Life Sciences, Fudan University, Shanghai, China
| | - Lingna Hu
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
| | - Yunmeng Bian
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
| | - Dapeng Yun
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
| | - Juxiang Chen
- Department of Neurosurgery, Changzheng Hospital, Second Military Medical University, Shanghai, China
| | - Ying Mao
- Neurosurgery Department of Huashan Hospital, Fudan University, Shanghai, China
| | - Hongyan Chen
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
| | - Daru Lu
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
| |
|
15
|
Pencina KM, Pencina MJ, D'Agostino RB. What to expect from net reclassification improvement with three categories. Stat Med 2014; 33:4975-87. [PMID: 25176621 DOI: 10.1002/sim.6286] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 10/01/2013] [Revised: 07/22/2014] [Accepted: 07/23/2014] [Indexed: 02/02/2023]
Abstract
The net reclassification improvement (NRI) has become a popular measure of incremental usefulness of markers added to risk prediction models. However, the expected magnitude of the three-category NRI is not well understood, leading researchers to rely on statistical significance. In this paper, we describe a slight modification to the original definition of the NRI, which weighs each reclassification by the number of categories by which a given individual is reclassified. This modification resolves some recent criticisms of the three-category NRI and at the same time has a minimal impact on its magnitude. Then we show that using this modified definition, the event and nonevent NRIs have simple interpretations as sums of changes in sensitivities and specificities calculated at the risk thresholds. We exploit this relationship to arrive at closed-form solutions for the NRI under normality within the event and nonevent subgroups. We observe that the size of the intermediate risk category and the event rate have limited impact on the magnitude of the NRI. As expected, the NRI increases with the strength of the added marker, and this relationship appears fairly proportional for markers with non-weak net effect size (above 0.25). Furthermore, we conclude that using the NRI as a metric, it is harder to improve models that already perform well.
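The modified definition described above, in which each reclassification is weighted by the number of categories moved, can be sketched as below. This is an illustrative reading of the abstract rather than the authors' code: risk categories are assumed to be coded 0, 1, 2 (low, intermediate, high), and the function name `weighted_nri` is hypothetical.

```python
def weighted_nri(old_categories, new_categories, is_event):
    """Category-weighted NRI (sketch; category coding 0/1/2 is an assumption).

    For events, upward moves count positively; for nonevents, downward moves
    count positively. Each move is weighted by the number of categories
    crossed. Returns (event NRI, nonevent NRI, overall NRI).
    """
    if not (len(old_categories) == len(new_categories) == len(is_event)):
        raise ValueError("inputs must have equal length")
    event_sum = nonevent_sum = 0
    n_events = n_nonevents = 0
    for old, new, event in zip(old_categories, new_categories, is_event):
        if event:
            n_events += 1
            event_sum += new - old      # +2 for low -> high, -1 for high -> intermediate
        else:
            n_nonevents += 1
            nonevent_sum += old - new   # moving down is an improvement for nonevents
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one nonevent")
    event_nri = event_sum / n_events
    nonevent_nri = nonevent_sum / n_nonevents
    return event_nri, nonevent_nri, event_nri + nonevent_nri
```

For example, with three events reclassified by +2, 0 and -1 categories and three nonevents of which one moves down a category, both components equal 1/3 and the overall value is 2/3.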
Affiliation(s)
- Karol M Pencina
- Statistics and Consulting Unit, Department of Mathematics and Statistics, Boston University, Boston, 02215, MA, U.S.A
| | | | | |
|
16
|
York EM, Butler CJ, Lord WD. Global decline in suitable habitat for Angiostrongylus ( = Parastrongylus) cantonensis: the role of climate change. PLoS One 2014; 9:e103831. [PMID: 25122457 PMCID: PMC4133392 DOI: 10.1371/journal.pone.0103831] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 03/17/2014] [Accepted: 07/02/2014] [Indexed: 11/17/2022] Open
Abstract
Climate change is implicated in the alteration of the ranges of species worldwide. Such shifts in species distributions may introduce parasites/pathogens, hosts, and vectors associated with disease to new areas. The parasite Angiostrongylus ( = Parastrongylus) cantonensis is an invasive species that causes eosinophilic meningitis in humans and neurological abnormalities in domestic/wild animals. Although native to southeastern Asia, A. cantonensis has now been reported from more than 30 countries worldwide. Given the health risks, it is important to describe areas with potentially favorable climate for the establishment of A. cantonensis, as well as areas where this pathogen might become established in the future. We used the program Maxent to develop an ecological niche model for A. cantonensis based on 86 localities obtained from published literature. We then modeled areas of potential A. cantonensis distribution as well as areas projected to have suitable climatic conditions under four Representative Concentration Pathways (RCP) scenarios by the 2050s and the 2070s. The best model contained three bioclimatic variables: mean diurnal temperature range, minimum temperature of coldest month and precipitation of warmest quarter. Potentially suitable habitat for A. cantonensis was located worldwide in tropical and subtropical regions. Under all climate change RCP scenarios, the center of the projected distribution shifted away from the equator at a rate of 68–152 km per decade. However, the extent of areas with highly suitable habitat (>50%) declined by 10.66–15.66% by the 2050s and 13.11–16.11% by the 2070s. These results conflict with previous studies, which have generally found that the prevalence of tropical pathogens will increase during the 21st century. Moreover, it is likely that A. cantonensis will continue to expand its current range in the near future due to introductions and host expansion, whereas climate change will reduce the total geographic area of most suitable climatic conditions during the coming decades.
Affiliation(s)
- Emily M York
- W. Roger Webb Forensic Science Institute, University of Central Oklahoma, Edmond, Oklahoma, United States of America; Department of Biology, University of Central Oklahoma, Edmond, Oklahoma, United States of America
| | - Christopher J Butler
- Department of Biology, University of Central Oklahoma, Edmond, Oklahoma, United States of America
| | - Wayne D Lord
- W. Roger Webb Forensic Science Institute, University of Central Oklahoma, Edmond, Oklahoma, United States of America; Department of Biology, University of Central Oklahoma, Edmond, Oklahoma, United States of America
| |
|
17
|
Iyegbe C, Campbell D, Butler A, Ajnakina O, Sham P. The emerging molecular architecture of schizophrenia, polygenic risk scores and the clinical implications for GxE research. Soc Psychiatry Psychiatr Epidemiol 2014; 49:169-82. [PMID: 24435092 DOI: 10.1007/s00127-014-0823-2] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.2] [Received: 12/10/2013] [Accepted: 01/08/2014] [Indexed: 02/07/2023]
Abstract
Schizophrenia is a devastating mental disorder. The level of risk in the general population is sustained by the persistence of social, environmental and biological factors, as well as their interactions. Socio-environmental risk factors for schizophrenia are well established and robust. The same can belatedly be said of genetic risk factors for the disorder. Recent progress in schizophrenia genetics is primarily fuelled by genome-wide association, which is able to leverage substantial proportions of additional explained variance previously classified as 'missing'. Here, we provide an outline of the emerging genetic landscape of schizophrenia and demonstrate how this knowledge can be turned into a simple empirical measure of genetic risk, known as a polygenic risk score. We highlight the statistical framework used to assess the clinical potential of the new score and finally, draw relevance to and discuss the clinical implications for the study of gene-environment interaction.
Affiliation(s)
- Conrad Iyegbe
- Department of Psychosis Studies, Institute of Psychiatry, King's College, London, UK,
| | | | | | | | | |
|
18
|
Yin X, Wineinger NE, Cheng H, Cui Y, Zhou F, Zuo X, Zheng X, Yang S, Schork NJ, Zhang X. Common variants explain a large fraction of the variability in the liability to psoriasis in a Han Chinese population. BMC Genomics 2014; 15:87. [PMID: 24479639 PMCID: PMC3909441 DOI: 10.1186/1471-2164-15-87] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/19/2013] [Accepted: 01/27/2014] [Indexed: 12/14/2022] Open
Abstract
Background: Psoriasis is a common inflammatory skin disease with a known genetic component. Our previously published psoriasis genome-wide association study identified dozens of novel susceptibility loci in Han Chinese. However, these markers explained only a small fraction of the estimated heritable component of psoriasis. To better understand the unknown yet likely polygenic architecture in psoriasis, we applied a linear mixed model to quantify the variation in the liability to psoriasis explained by common genetic markers (minor allele frequency > 0.01) in a Han Chinese population. Results: We explored the polygenic genetic architecture of psoriasis using genome-wide association data from 2,271 Han Chinese individuals. We estimated that 34.9% (s.e. = 6.0%, P = 9 × 10^-9) of the variation in the liability to psoriasis is captured by common genotyped and imputed variants. We discuss these results in the context of the strong association between HLA variants and psoriasis. We also show that the variance explained by each chromosome is linearly correlated to its length (R^2 = 0.27, P = 0.01), and quantify the impact of a polygenic effect on the prediction and diagnosis of psoriasis. Conclusions: Our results suggest that psoriasis has a substantial polygenic component, which not only has implications for the development of genetic diagnostics and prognostics for psoriasis, but also suggests that more individual variants contributing to psoriasis may be detected if sample sizes in future association studies are increased.
Affiliation(s)
| | | | | | | | | | | | | | | | - Nicholas J Schork
- Institute of Dermatology, Department of Dermatology, The First Affiliated Hospital, Anhui Medical University, Hefei, Anhui Province 230032, China.
| | | |
|
19
|
Ajnakina O, Borges S, Di Forti M, Patel Y, Xu X, Green P, Stilo SA, Kolliakou A, Sood P, Marques TR, David AS, Prata D, Dazzan P, Powell J, Pariante C, Mondelli V, Morgan C, Murray RM, Fisher HL, Iyegbe C. Role of Environmental Confounding in the Association between FKBP5 and First-Episode Psychosis. Front Psychiatry 2014; 5:84. [PMID: 25101008 PMCID: PMC4101879 DOI: 10.3389/fpsyt.2014.00084] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 05/16/2014] [Accepted: 07/03/2014] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND Failure to account for the etiological diversity that typically occurs in psychiatric cohorts may increase the potential for confounding as a proportion of genetic variance will be specific to exposures that have varying distributions in cases. This study investigated whether minimizing the potential for such confounding strengthened the evidence for a genetic candidate currently unsupported at the genome-wide level. METHODS Two hundred and ninety-one first-episode psychosis cases from South London, UK and 218 unaffected controls were evaluated for a functional polymorphism at the rs1360780 locus in FKBP5. The relationship between FKBP5 and psychosis was modeled using logistic regression. Cannabis use (Cannabis Experiences Questionnaire) and parental separation (Childhood Experience of Care and Abuse Questionnaire) were included as confounders in the analysis. RESULTS Association at rs1360780 was not detected until the effects of the two environmental factors had been adjusted for in the model (OR = 2.81, 95% CI 1.23-6.43, p = 0.02). A statistical interaction between rs1360780 and parental separation was confirmed by stratified tests (OR = 2.8, p = 0.02 vs. OR = 0.89, p = 0.80). The genetic main effect was directionally consistent with findings in other (stress-related) clinical phenotypes. Moreover, the variation in effect magnitude was explained by the level of power associated with different cannabis constructs used in the model (r = 0.95). CONCLUSION Our results suggest that the extent to which genetic variants in FKBP5 can influence susceptibility to psychosis may depend on other etiological factors. This finding requires further validation in large independent cohorts. 
Potentially this work could have translational implications; the ability to discriminate between genetic etiologies based on a case-by-case understanding of previous environmental exposures would confer an important clinical advantage that would benefit the delivery of personalizable treatment strategies.
Affiliation(s)
- Olesya Ajnakina
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - Susana Borges
- Department of Health Services and Population Research, Institute of Psychiatry, King's College London , London , UK
| | - Marta Di Forti
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - Yogen Patel
- Department of Neuroscience, Institute of Psychiatry, King's College London , London , UK
| | - Xiaohui Xu
- MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London , London , UK
| | - Priscilla Green
- Department of Neuroscience, Institute of Psychiatry, King's College London , London , UK
| | - Simona A Stilo
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - Anna Kolliakou
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - Poonam Sood
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - Tiago Reis Marques
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - Anthony S David
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - Diana Prata
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - Paola Dazzan
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - John Powell
- Department of Neuroscience, Institute of Psychiatry, King's College London , London , UK
| | - Carmine Pariante
- Department of Psychological Medicine, Institute of Psychiatry, King's College London , London , UK
| | - Valeria Mondelli
- Department of Psychological Medicine, Institute of Psychiatry, King's College London , London , UK
| | - Craig Morgan
- Department of Health Services and Population Research, Institute of Psychiatry, King's College London , London , UK
| | - Robin M Murray
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK
| | - Helen L Fisher
- MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London , London , UK
| | - Conrad Iyegbe
- Department of Psychosis Studies, Institute of Psychiatry, King's College London , London , UK ; MRC Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King's College London , London , UK
| |
|
20
|
Lee SH, Wray NR. Novel genetic analysis for case-control genome-wide association studies: quantification of power and genomic prediction accuracy. PLoS One 2013; 8:e71494. [PMID: 23977056 PMCID: PMC3747270 DOI: 10.1371/journal.pone.0071494] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 05/07/2013] [Accepted: 07/05/2013] [Indexed: 11/19/2022] Open
Abstract
Genome-wide association studies (GWAS) are routinely conducted for both quantitative and binary (disease) traits. We present two analytical tools for use in the experimental design of GWAS. Firstly, we present power calculations quantifying power in a unified framework for a range of scenarios. In this context we consider the utility of quantitative scores (e.g. endophenotypes) that may be available on cases only or both cases and controls. Secondly, we consider the accuracy of prediction of genetic risk from genome-wide SNPs and derive an expression for genomic prediction accuracy using a liability threshold model for disease traits in a case-control design. The expected values based on our derived equations for both power and prediction accuracy agree well with observed estimates from simulations.
Affiliation(s)
- Sang Hong Lee
- Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, Australia
| | - Naomi R. Wray
- Queensland Brain Institute, The University of Queensland, Brisbane, Queensland, Australia
| |
|
21
|
Utility in prognostic value added by molecular profiles for diffuse large B-cell lymphoma. Blood 2013; 121:3052-4. [PMID: 23580636 DOI: 10.1182/blood-2013-01-477521] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/17/2022] Open
|
22
|
Abstract
Polygenic scores have recently been used to summarise genetic effects among an ensemble of markers that do not individually achieve significance in a large-scale association study. Markers are selected using an initial training sample and used to construct a score in an independent replication sample by forming the weighted sum of associated alleles within each subject. Association between a trait and this composite score implies that a genetic signal is present among the selected markers, and the score can then be used for prediction of individual trait values. This approach has been used to obtain evidence of a genetic effect when no single markers are significant, to establish a common genetic basis for related disorders, and to construct risk prediction models. In some cases, however, the desired association or prediction has not been achieved. Here, the power and predictive accuracy of a polygenic score are derived from a quantitative genetics model as a function of the sizes of the two samples, explained genetic variance, selection thresholds for including a marker in the score, and methods for weighting effect sizes in the score. Expressions are derived for quantitative and discrete traits, the latter allowing for case/control sampling. A novel approach to estimating the variance explained by a marker panel is also proposed. It is shown that published studies with significant association of polygenic scores have been well powered, whereas those with negative results can be explained by low sample size. It is also shown that useful levels of prediction may only be approached when predictors are estimated from very large samples, up to an order of magnitude greater than currently available. Therefore, polygenic scores currently have more utility for association testing than predicting complex traits, but prediction will become more feasible as sample sizes continue to grow. 
Recently there has been much interest in combining multiple genetic markers into a single score for predicting disease risk. Even if many of the individual markers have no detected effect, the combined score could be a strong predictor of disease. This has allowed researchers to demonstrate that some diseases have a strong genetic basis, even if few actual genes have been identified, and it has also revealed a common genetic basis for distinct diseases. These analyses have so far been performed opportunistically, with mixed results. Here I derive formulae based on the heritability of disease and size of the study, allowing researchers to plan their analyses from a more informed position. I show that discouraging results in some previous studies were due to the low number of subjects studied, but a modest increase in study size would allow more successful analysis. However, I also show that, for genetics to become useful for predicting individual risk of disease, hundreds of thousands of subjects may be needed to estimate the gene effects. This is larger than most existing studies, but will become more common in the near future, so that gene scores will become more useful for predicting disease than has appeared to date.
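The score construction described in this abstract (select markers in a training sample, then form the weighted sum of associated alleles within each replication subject) can be sketched as below. This is a minimal illustration only, not the author's implementation: the function name `polygenic_score`, the p-value selection rule, and all numbers are hypothetical.

```python
from typing import Sequence


def polygenic_score(
    allele_counts: Sequence[int],
    weights: Sequence[float],
    p_values: Sequence[float],
    threshold: float,
) -> float:
    """Weighted sum of allele counts over the markers selected in training.

    A marker is included when its training-sample p-value is at or below
    `threshold` (an assumed selection rule for this sketch).
    """
    if not (len(allele_counts) == len(weights) == len(p_values)):
        raise ValueError("all inputs must have one entry per marker")
    return sum(
        w * g
        for g, w, p in zip(allele_counts, weights, p_values)
        if p <= threshold
    )


# Hypothetical training-sample effect estimates and p-values for four markers.
weights = [0.5, -0.25, 0.125, 0.75]
p_values = [0.001, 0.2, 0.04, 0.6]

# Hypothetical allele counts (0, 1 or 2) for two replication-sample subjects.
subjects = [[2, 1, 0, 1], [0, 2, 2, 0]]

# With a threshold of 0.05 only the first and third markers enter the score.
scores = [polygenic_score(g, weights, p_values, threshold=0.05) for g in subjects]
print(scores)
```

Raising the threshold admits more markers into the score, which is the selection trade-off the abstract refers to when it lists "selection thresholds for including a marker in the score" among the determinants of power.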
Affiliation(s)
- Frank Dudbridge
- Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, United Kingdom.
|
23
|
Elosua R, Lucas G, Lluis-Ganella C. Genetics and Cardiovascular Risk Prediction: A Step Toward Personalized Medicine? Current Cardiovascular Risk Reports 2013. [DOI: 10.1007/s12170-012-0285-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/19/2022]
|
24
|
Informed conditioning on clinical covariates increases power in case-control association studies. PLoS Genet 2012; 8:e1003032. [PMID: 23144628 PMCID: PMC3493452 DOI: 10.1371/journal.pgen.1003032] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Received: 06/06/2012] [Accepted: 08/26/2012] [Indexed: 01/23/2023] Open
Abstract
Genetic case-control association studies often include data on clinical covariates, such as body mass index (BMI), smoking status, or age, that may modify the underlying genetic risk of case or control samples. For example, in type 2 diabetes, odds ratios for established variants estimated from low–BMI cases are larger than those estimated from high–BMI cases. An unanswered question is how to use this information to maximize statistical power in case-control studies that ascertain individuals on the basis of phenotype (case-control ascertainment) or phenotype and clinical covariates (case-control-covariate ascertainment). While current approaches improve power in studies with random ascertainment, they often lose power under case-control ascertainment and fail to capture available power increases under case-control-covariate ascertainment. We show that an informed conditioning approach, based on the liability threshold model with parameters informed by external epidemiological information, fully accounts for disease prevalence and non-random ascertainment of phenotype as well as covariates and provides a substantial increase in power while maintaining a properly controlled false-positive rate. Our method outperforms standard case-control association tests with or without covariates, tests of gene x covariate interaction, and previously proposed tests for dealing with covariates in ascertained data, with especially large improvements in the case of case-control-covariate ascertainment. We investigate empirical case-control studies of type 2 diabetes, prostate cancer, lung cancer, breast cancer, rheumatoid arthritis, age-related macular degeneration, and end-stage kidney disease over a total of 89,726 samples. In these datasets, informed conditioning outperforms logistic regression for 115 of the 157 known associated variants investigated (P-value = 1×10⁻⁹). The improvement varied across diseases with a 16% median increase in χ² test statistics and a commensurate increase in power. This suggests that applying our method to existing and future association studies of these diseases may identify novel disease loci.
This work describes a new methodology for analyzing genome-wide case-control association studies of diseases with strong correlations to clinical covariates, such as age in prostate cancer and body mass index in type 2 diabetes. Currently, researchers either ignore these clinical covariates or apply approaches that ignore the disease's prevalence and the study's ascertainment strategy. We take an alternative approach, leveraging external prevalence information from the epidemiological literature and constructing a statistic based on the classic liability threshold model of disease. Our approach not only improves the power of studies that ascertain individuals randomly or based on the disease phenotype, but also improves the power of studies that ascertain individuals based on both the disease phenotype and clinical covariates. We apply our statistic to seven datasets over six different diseases and a variety of clinical covariates. We found that there was a substantial improvement in test statistics relative to current approaches at known associated variants. This suggests that novel loci may be identified by applying our method to existing and future association studies of these diseases.
|
25
|
Wei C, Schaid DJ, Lu Q. Trees Assembling Mann-Whitney approach for detecting genome-wide joint association among low-marginal-effect loci. Genet Epidemiol 2012; 37:84-91. [PMID: 23135745 DOI: 10.1002/gepi.21693] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 04/27/2012] [Revised: 07/17/2012] [Accepted: 09/28/2012] [Indexed: 11/07/2022]
Abstract
Common complex diseases are likely influenced by the interplay of hundreds, or even thousands, of genetic variants. Converging evidence shows that genetic variants with low marginal effects (LMEs) play an important role in disease development. Despite their potential significance, discovering LME genetic variants and assessing their joint association on high-dimensional data (e.g., genome-wide data) remain a great challenge. To facilitate joint association analysis among a large ensemble of LME genetic variants, we proposed a computationally efficient and powerful approach, which we call Trees Assembling Mann-Whitney (TAMW). Through simulation studies and an empirical data application, we found that TAMW outperformed multifactor dimensionality reduction (MDR) and the likelihood ratio-based Mann-Whitney approach (LRMW) when the underlying complex disease involves multiple LME loci and their interactions. For instance, in a simulation with 20 interacting LME loci, TAMW attained a higher power (power = 0.931) than both MDR (power = 0.599) and LRMW (power = 0.704). In an empirical study of 29 known Crohn's disease (CD) loci, TAMW also identified a stronger joint association with CD than those detected by MDR and LRMW. Finally, we applied TAMW to Wellcome Trust CD GWAS to conduct a genome-wide analysis. The analysis of 459K single nucleotide polymorphisms was completed in 40 hrs using parallel computing, and revealed a joint association predisposing to CD (P-value = 2.763 × 10⁻¹⁹). Further analysis of the newly discovered association suggested that 13 genes, such as ATG16L1 and LACC1, may play an important role in CD pathophysiological and etiological processes.
Affiliation(s)
- Changshuai Wei
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan 48824, USA
|
26
|
Do CB, Hinds DA, Francke U, Eriksson N. Comparison of family history and SNPs for predicting risk of complex disease. PLoS Genet 2012; 8:e1002973. [PMID: 23071447 PMCID: PMC3469463 DOI: 10.1371/journal.pgen.1002973] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Received: 02/24/2012] [Accepted: 08/08/2012] [Indexed: 12/18/2022] Open
Abstract
The clinical utility of family history and genetic tests is generally well understood for simple Mendelian disorders and rare subforms of complex diseases that are directly attributable to highly penetrant genetic variants. However, little is presently known regarding the performance of these methods in situations where disease susceptibility depends on the cumulative contribution of multiple genetic factors of moderate or low penetrance. Using quantitative genetic theory, we develop a model for studying the predictive ability of family history and single nucleotide polymorphism (SNP)–based methods for assessing risk of polygenic disorders. We show that family history is most useful for highly common, heritable conditions (e.g., coronary artery disease), where it explains roughly 20%–30% of disease heritability, on par with the most successful SNP models based on associations discovered to date. In contrast, we find that for diseases of moderate or low frequency (e.g., Crohn disease) family history accounts for less than 4% of disease heritability, substantially lagging behind SNPs in almost all cases. These results indicate that, for a broad range of diseases, already identified SNP associations may be better predictors of risk than their family history–based counterparts, despite the large fraction of missing heritability that remains to be explained. Our model illustrates the difficulty of using either family history or SNPs for standalone disease prediction. On the other hand, we show that, unlike family history, SNP–based tests can reveal extreme likelihood ratios for a relatively large percentage of individuals, thus providing potentially valuable adjunctive evidence in a differential diagnosis.
In clinical practice, obtaining a detailed family history is often considered the standard-of-care for characterizing the inherited component of an individual's disease risk. Recently, genetic risk assessments based on the cumulative effect of known single nucleotide polymorphism (SNP) disease associations have been proposed as another potentially useful source of information. To date, however, little is known regarding the predictive power of each approach. In this study, we develop models based on quantitative genetic theory to analyze and compare family history and SNP–based models. Our models explain the impact of disease frequency and heritability on performance for each method, and reveal a wide range of scenarios (16 out of the 23 diseases considered) where SNP associations may already be better predictors of risk than family history. Our results confirm the difficulty of obtaining accurate prediction when SNP or family history–based methods are used alone, and they show the benefits of combining information from the two approaches. They also suggest that, in some situations, SNP associations may be potentially useful as supporting evidence alongside other types of clinical information. To our knowledge, this study is the first broad comparison of family history– and SNP–based methods across a wide range of health conditions.
|
27
|
Golan D, Rosset S. Comment on "the predictive capacity of personal genome sequencing". Sci Transl Med 2012; 4:135le4; author reply 135lr3. [PMID: 22623737 DOI: 10.1126/scitranslmed.3004197] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022]
|
28
|
Dreyfuss JM, Levner D, Galagan JE, Church GM, Ramoni MF. How accurate can genetic predictions be? BMC Genomics 2012; 13:340. [PMID: 22827772 PMCID: PMC3534619 DOI: 10.1186/1471-2164-13-340] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 11/04/2011] [Accepted: 07/01/2012] [Indexed: 01/08/2023] Open
Abstract
Background: Pre-symptomatic prediction of disease and drug response based on genetic testing is a critical component of personalized medicine. Previous work has demonstrated that the predictive capacity of genetic testing is constrained by the heritability and prevalence of the tested trait, although these constraints have only been approximated under the assumption of a normally distributed genetic risk distribution. Results: Here, we mathematically derive the absolute limits that these factors impose on test accuracy in the absence of any distributional assumptions on risk. We present these limits in terms of the best-case receiver-operating characteristic (ROC) curve, consisting of the best-case test sensitivities and specificities, and the AUC (area under the curve) measure of accuracy. We apply our method to genetic prediction of type 2 diabetes and breast cancer, and we additionally show the best possible accuracy that can be obtained from integrated predictors, which can incorporate non-genetic features. Conclusion: Knowledge of such limits is valuable in understanding the implications of genetic testing even before additional associations are identified.
|
29
|
Aschard H, Chen J, Cornelis MC, Chibnik LB, Karlson EW, Kraft P. Inclusion of gene-gene and gene-environment interactions unlikely to dramatically improve risk prediction for complex diseases. Am J Hum Genet 2012; 90:962-72. [PMID: 22633398 PMCID: PMC3370279 DOI: 10.1016/j.ajhg.2012.04.017] [Citation(s) in RCA: 80] [Impact Index Per Article: 6.7] [Received: 11/08/2011] [Revised: 03/23/2012] [Accepted: 04/10/2012] [Indexed: 11/25/2022] Open
Abstract
Genome-wide association studies have identified hundreds of common genetic variants associated with the risk of multifactorial diseases. However, their impact on discrimination and risk prediction is limited. It has been suggested that the identification of gene-gene (G-G) and gene-environment (G-E) interactions would improve disease prediction and facilitate prevention. We conducted a simulation study to explore the potential improvement in discrimination if G-G and G-E interactions exist and are known. We used three diseases (breast cancer, type 2 diabetes, and rheumatoid arthritis) as motivating examples. We show that the inclusion of G-G and G-E interaction effects in risk-prediction models is unlikely to dramatically improve the discrimination ability of these models.
Affiliation(s)
- Hugues Aschard
- Program in Molecular and Genetic Epidemiology, Harvard School of Public Health, Boston, MA 02115, USA.
|
30
|
Darabi H, Czene K, Zhao W, Liu J, Hall P, Humphreys K. Breast cancer risk prediction and individualised screening based on common genetic variation and breast density measurement. Breast Cancer Res 2012; 14:R25. [PMID: 22314178 PMCID: PMC3496143 DOI: 10.1186/bcr3110] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.1] [Received: 09/23/2011] [Revised: 01/12/2012] [Accepted: 02/07/2012] [Indexed: 02/06/2023] Open
Abstract
Introduction: Over the last decade several breast cancer risk alleles have been identified which has led to an increased interest in individualised risk prediction for clinical purposes. Methods: We investigate the performance of an up-to-date 18 breast cancer risk single-nucleotide polymorphisms (SNPs), together with mammographic percentage density (PD), body mass index (BMI) and clinical risk factors in predicting absolute risk of breast cancer, empirically, in a well characterised Swedish case-control study of postmenopausal women. We examined the efficiency of various prediction models at a population level for individualised screening by extending a recently proposed analytical approach for estimating number of cases captured. Results: The performance of a risk prediction model based on an initial set of seven breast cancer risk SNPs is improved by additionally including eleven more recently established breast cancer risk SNPs (P = 4.69 × 10⁻⁴). Adding mammographic PD, BMI and all 18 SNPs to a Swedish Gail model improved the discriminatory accuracy (the AUC statistic) from 55% to 62%. The net reclassification improvement was used to assess improvement in classification of women into low, intermediate, and high categories of 5-year risk (P = 8.93 × 10⁻⁹). For scenarios we considered, we estimated that an individualised screening strategy based on risk models incorporating clinical risk factors, mammographic density and SNPs, captures 10% more cases than a screening strategy using the same resources, based on age alone. Estimates of numbers of cases captured by screening stratified by age provide insight into how individualised screening programs might appear in practice. Conclusions: Taken together, genetic risk factors and mammographic density offer moderate improvements to clinical risk factor models for predicting breast cancer.
Affiliation(s)
- Hatef Darabi
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, P.O. Box 281, Stockholm 177 71, Sweden.
|
31
|
Abstract
Biomarkers are the measurable characteristics of an individual that may represent risk factors for a disease or outcome, or that may be indicators of disease progression or of treatment-associated changes. In general, the process by which biomarkers, once identified, might be translated into clinical practice has received scant attention in recent psychiatric literature. A body of work in diagnostic development suggests a framework for evaluating and validating novel biomarkers, but this work may be unfamiliar to clinical and translational researchers in psychiatry. Therefore, this review focuses on the steps that might follow the identification of putative biomarkers. It first addresses standard approaches to characterizing biomarker performance, followed by demonstrations of how a putative biomarker might be shown to have clinical relevance. Finally, it addresses ways in which a biomarker-based test might be validated for clinical application in terms of efficacy and cost-effectiveness.
|
32
|
Torkamani A, Scott-Van Zeeland AA, Topol EJ, Schork NJ. Annotating individual human genomes. Genomics 2011; 98:233-41. [PMID: 21839162 DOI: 10.1016/j.ygeno.2011.07.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 05/25/2011] [Accepted: 07/26/2011] [Indexed: 02/03/2023]
Abstract
Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants.
|
33
|
So HC, Kwan JSH, Cherny SS, Sham PC. Risk prediction of complex diseases from family history and known susceptibility loci, with applications for cancer screening. Am J Hum Genet 2011; 88:548-65. [PMID: 21529750 DOI: 10.1016/j.ajhg.2011.04.001] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 11/30/2010] [Revised: 03/27/2011] [Accepted: 04/04/2011] [Indexed: 01/18/2023] Open
Abstract
Risk prediction based on genomic profiles has raised a lot of attention recently. However, family history is usually ignored in genetic risk prediction. In this study we proposed a statistical framework for risk prediction given an individual's genotype profile and family history. Genotype information about the relatives can also be incorporated. We allow risk prediction given the current age and follow-up period and consider competing risks of mortality. The framework allows easy extension to any family size and structure. In addition, the predicted risk at any percentile and the risk distribution graphs can be computed analytically. We applied the method to risk prediction for breast and prostate cancers by using known susceptibility loci from genome-wide association studies. For breast cancer, in the population the 10-year risk at age 50 ranged from 1.1% at the 5th percentile to 4.7% at the 95th percentile. If we consider the average 10-year risk at age 50 (2.39%) as the threshold for screening, the screening age ranged from 62 at the 20th percentile to 38 at the 95th percentile (and some never reach the threshold). For women with one affected first-degree relative, the 10-year risks ranged from 2.6% (at the 5th percentile) to 8.1% (at the 95th percentile). For prostate cancer, the corresponding 10-year risks at age 60 varied from 1.8% to 14.9% in the population and from 4.2% to 23.2% in those with an affected first-degree relative. We suggest that for some diseases genetic testing that incorporates family history can stratify people into diverse risk categories and might be useful in targeted prevention and screening.
Affiliation(s)
- Hon-Cheong So
- Department of Psychiatry, University of Hong Kong, Hong Kong SAR, China
|