1. Liyanage JSS, Estepp JH, Srivastava K, Li Y, Mori M, Kang G. GMEPS: a fast and efficient likelihood approach for genome-wide mediation analysis under extreme phenotype sequencing. Stat Appl Genet Mol Biol 2022; 21:sagmb-2021-0071. [PMID: 35266368 DOI: 10.1515/sagmb-2021-0071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Accepted: 02/17/2022] [Indexed: 11/15/2022]
Abstract
Due to many advantages such as higher statistical power of detecting the association of genetic variants in human disorders and cost saving, extreme phenotype sequencing (EPS) is a rapidly emerging study design in epidemiological and clinical studies investigating how genetic variations associate with complex phenotypes. However, the investigation of the mediation effect of genetic variants on phenotypes is strictly restrictive under the EPS design because existing methods cannot well accommodate the non-random extreme tails sampling process incurred by the EPS design. In this paper, we propose a likelihood approach for testing the mediation effect of genetic variants through continuous and binary mediators on a continuous phenotype under the EPS design (GMEPS). Besides implementing in EPS design, it can also be utilized as a general mediation analysis procedure. Extensive simulations and two real data applications of a genome-wide association study of benign ethnic neutropenia under EPS design and a candidate-gene study of neurocognitive performance in patients with sickle cell disease under random sampling design demonstrate the superiority of GMEPS under the EPS design over widely used mediation analysis procedures, while demonstrating compatible capabilities under the general random sampling framework.
Affiliation(s)
- Janaka S S Liyanage
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis 38105, TN, USA
- Jeremie H Estepp
  - Departments of Global Pediatric Medicine and Hematology, St. Jude Children's Research Hospital, Memphis 38105, TN, USA
- Kumar Srivastava
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis 38105, TN, USA
- Yun Li
  - Department of Biostatistics, Department of Genetics, Department of Computer Science, The University of North Carolina at Chapel Hill, Chapel Hill 27599, NC, USA
- Motomi Mori
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis 38105, TN, USA
- Guolian Kang
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis 38105, TN, USA
2. Onifade M, Roy-Gagnon MH, Parent MÉ, Burkett KM. Comparison of mixed model based approaches for correcting for population substructure with application to extreme phenotype sampling. BMC Genomics 2022; 23:98. [PMID: 35120458 PMCID: PMC8815214 DOI: 10.1186/s12864-022-08297-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2021] [Accepted: 01/06/2022] [Indexed: 11/10/2022] Open
Abstract
Background: Mixed models are used to correct for confounding due to population stratification and hidden relatedness in genome-wide association studies. This class of models includes linear mixed models and generalized linear mixed models. Existing mixed model approaches to correct for population substructure have been previously investigated with both continuous and case-control response variables. However, they have not been investigated in the context of extreme phenotype sampling (EPS), where genetic covariates are only collected on samples having extreme response variable values. In this work, we compare the performance of existing binary trait mixed model approaches (GMMAT, LEAP and CARAT) on EPS data. Since linear mixed models are commonly used even with binary traits, we also evaluate the performance of a popular linear mixed model implementation (GEMMA).
Results: We used simulation studies to estimate the type I error rate and power of all approaches assuming a population with substructure. Our simulation results show that for a common candidate variant, both LEAP and GMMAT control the type I error rate while CARAT’s rate remains inflated. We applied all methods to a real dataset from a Québec, Canada, case-control study that is known to have population substructure. We observe similar type I error control with the analysis on the Québec dataset. For rare variants, the false positive rate remains inflated even after correction with mixed model approaches. For methods that control the type I error rate, the estimated power is comparable.
Conclusions: The methods compared in this study differ in their type I error control. Therefore, when data are from an EPS study, care should be taken to ensure that the models underlying the methodology are suitable to the sampling strategy and to the minor allele frequency of the candidate SNPs.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08297-y.
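As a rough illustration of how a simulation study of this kind turns per-replicate p-values into an estimated type I error rate or power, the sketch below computes the fraction of replicates rejected at a chosen significance level. The function name `rejection_rate`, the 0.05 threshold and the toy p-values are assumptions made for illustration; none of them come from the paper.

```python
from typing import Sequence


def rejection_rate(p_values: Sequence[float], alpha: float = 0.05) -> float:
    """Fraction of simulation replicates whose p-value falls below alpha.

    With replicates simulated under the null hypothesis this estimates the
    type I error rate; with replicates simulated under an alternative it
    estimates power.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    rejections = sum(1 for p in p_values if p < alpha)
    return rejections / len(p_values)


# Hypothetical p-values from 8 null replicates (illustrative only).
null_p = [0.51, 0.03, 0.77, 0.20, 0.64, 0.09, 0.92, 0.35]
type_i_error = rejection_rate(null_p, alpha=0.05)
print(type_i_error)
```

A method would be described as controlling the type I error rate when this fraction stays close to the nominal `alpha` across many null replicates, and as inflated when it is clearly above it.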
Affiliation(s)
- Maryam Onifade
  - Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada
- Marie-Élise Parent
  - Centre Armand-Frappier Santé Biotechnologie, Institut national de la recherche scientifique, Université du Québec, Laval, Canada
- Kelly M Burkett
  - Department of Mathematics and Statistics, University of Ottawa, Ottawa, Canada
3. Liu Y, Xia J, McKay J, Tsavachidis S, Xiao X, Spitz MR, Cheng C, Byun J, Hong W, Li Y, Zhu D, Song Z, Rosenberg SM, Scheurer ME, Kheradmand F, Pikielny CW, Lusk CM, Schwartz AG, Wistuba II, Cho MH, Silverman EK, Bailey-Wilson J, Pinney SM, Anderson M, Kupert E, Gaba C, Mandal D, You M, de Andrade M, Yang P, Liloglou T, Davies MPA, Lissowska J, Swiatkowska B, Zaridze D, Mukeria A, Janout V, Holcatova I, Mates D, Stojsic J, Scelo G, Brennan P, Liu G, Field JK, Hung RJ, Christiani DC, Amos CI. Rare deleterious germline variants and risk of lung cancer. NPJ Precis Oncol 2021; 5:12. [PMID: 33594163 PMCID: PMC7887261 DOI: 10.1038/s41698-021-00146-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 05/27/2020] [Accepted: 12/11/2020] [Indexed: 01/19/2023] Open
Abstract
Recent studies suggest that rare variants exhibit stronger effect sizes and might play a crucial role in the etiology of lung cancers (LC). Whole exome plus targeted sequencing of germline DNA was performed on 1045 LC cases and 885 controls in the discovery set. To unveil the inherited causal variants, we focused on rare and predicted deleterious variants and small indels enriched in cases or controls. Promising candidates were further validated in a series of 26,803 LCs and 555,107 controls. During discovery, we identified 25 rare deleterious variants associated with LC susceptibility, including 13 reported in ClinVar. Of the five validated candidates, we discovered two pathogenic variants in known LC susceptibility loci, ATM p.V2716A (Odds Ratio [OR] 19.55, 95%CI 5.04-75.6) and MPZL2 p.I24M frameshift deletion (OR 3.88, 95%CI 1.71-8.8); and three in novel LC susceptibility genes, POMC c.*28delT at 3' UTR (OR 4.33, 95%CI 2.03-9.24), STAU2 p.N364M frameshift deletion (OR 4.48, 95%CI 1.73-11.55), and MLNR p.Q334V frameshift deletion (OR 2.69, 95%CI 1.33-5.43). The potential cancer-promoting role of selected candidate genes and variants was further supported by endogenous DNA damage assays. Our analyses led to the identification of new rare deleterious variants with LC susceptibility. However, in-depth mechanistic studies are still needed to evaluate the pathogenic effects of these specific alleles.
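The abstract does not say how its confidence intervals were computed. If they are Wald-type intervals built on the log-odds scale (an assumption, not something the source confirms), each odds ratio should sit near the geometric mean of its limits, and the standard error of log(OR) can be recovered from the interval width. The sketch below checks this for the five validated variants; the function names are hypothetical.

```python
import math


def log_or_standard_error(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(OR) implied by a Wald-type 95% interval (assumed)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def implied_odds_ratio(lower: float, upper: float) -> float:
    """Geometric mean of the limits, i.e. the interval midpoint on the log scale."""
    return math.sqrt(lower * upper)


# (reported OR, lower limit, upper limit) as quoted in the abstract.
reported = {
    "ATM p.V2716A": (19.55, 5.04, 75.6),
    "MPZL2 p.I24M": (3.88, 1.71, 8.8),
    "POMC c.*28delT": (4.33, 2.03, 9.24),
    "STAU2 p.N364M": (4.48, 1.73, 11.55),
    "MLNR p.Q334V": (2.69, 1.33, 5.43),
}

for name, (odds_ratio, lower, upper) in reported.items():
    implied = implied_odds_ratio(lower, upper)
    se = log_or_standard_error(lower, upper)
    print(f"{name}: reported OR {odds_ratio}, implied {implied:.2f}, SE(log OR) {se:.3f}")
```

Under that assumption, a wide interval such as the one for ATM p.V2716A corresponds to a large standard error on the log scale, which is what one would expect for a very rare variant.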
Grants
- R01 CA060691 NCI NIH HHS
- U19 CA203654 NCI NIH HHS
- R01 CA084354 NCI NIH HHS
- R01 HL110883 NHLBI NIH HHS
- U01 CA076293 NCI NIH HHS
- R01 CA080127 NCI NIH HHS
- R01 CA141769 NCI NIH HHS
- P30 ES006096 NIEHS NIH HHS
- P50 CA090578 NCI NIH HHS
- P30 CA022453 NCI NIH HHS
- S10 RR024574 NCRR NIH HHS
- HHSN261201300011C NCI NIH HHS
- R01 CA134682 NCI NIH HHS
- R01 CA134433 NCI NIH HHS
- R01 HL113264 NHLBI NIH HHS
- R01 HL082487 NHLBI NIH HHS
- R01 CA250905 NCI NIH HHS
- U19 CA148127 NCI NIH HHS
- P20 GM103534 NIGMS NIH HHS
- R01 CA092824 NCI NIH HHS
- R01 CA087895 NCI NIH HHS
- U01 HL089897 NHLBI NIH HHS
- K07 CA181480 NCI NIH HHS
- HHSN268201100011I NHLBI NIH HHS
- HHSN268201100011C NHLBI NIH HHS
- R01 CA127219 NCI NIH HHS
- R01 CA074386 NCI NIH HHS
- P30 CA023108 NCI NIH HHS
- U01 HL089856 NHLBI NIH HHS
- P30 ES030285 NIEHS NIH HHS
- P30 CA125123 NCI NIH HHS
- DP1 AG072751 NIA NIH HHS
- U01 CA243483 NCI NIH HHS
- HHSN268200782096C NHLBI NIH HHS
- HHSN268201200007C NHLBI NIH HHS
- N01HG65404 NHGRI NIH HHS
- R35 GM122598 NIGMS NIH HHS
- U01 CA209414 NCI NIH HHS
- R03 CA077118 NCI NIH HHS
- 001 World Health Organization
- DP1 CA174424 NCI NIH HHS
- This work was supported by grants from the National Institutes of Health (R01CA127219, R01CA141769, R01CA060691, R01CA87895, R01CA80127, R01CA84354, R01CA134682, R01CA134433, R01CA074386, R01CA092824, R01CA250905, R01HL113264, R01HL082487, R01HL110883, R03CA77118, P20GM103534, P30CA125123, P30CA023108, P30CA022453, P30ES006096, P50CA090578, U01CA243483, U01HL089856, U01HL089897, U01CA76293, U19CA148127, U01CA209414, K07CA181480, N01-HG-65404, HHSN268200782096C, HHSN261201300011I, HHSN268201100011, HHSN268201200007C, DP1-CA174424, DP1-AG072751, CA125123, RR024574), the Intramural Research Program of the National Human Genome Research Institute (JEB-W), and the Herrick Foundation. Dr. Amos is an Established Research Scholar of the Cancer Prevention Research Institute of Texas (RR170048). We also want to acknowledge the Cytometry and Cell Sorting Core support by the Cancer Prevention and Research Institute of Texas Core Facility (RP180672). At Toronto, the study is supported by The Canadian Cancer Society Research Institute (# 020214) to R. H., the Ontario Institute for Cancer Research to R. H., and the Alan Brown Chair to G. L. and Lusi Wong Programs at the Princess Margaret Hospital Foundation. The Liverpool Lung Project is supported by the Roy Castle Lung Cancer Foundation.
Affiliation(s)
- Yanhong Liu
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
- Jun Xia
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- James McKay
  - International Agency for Research on Cancer, Lyon, France
- Spiridon Tsavachidis
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
- Xiangjun Xiao
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- Margaret R Spitz
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
- Chao Cheng
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- Jinyoung Byun
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- Wei Hong
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- Yafang Li
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- Dakai Zhu
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- Zhuoyi Song
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
- Susan M Rosenberg
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
- Michael E Scheurer
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
  - Department of Pediatrics, Baylor College of Medicine, Houston, TX, USA
- Farrah Kheradmand
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
  - Michael E. DeBakey Veterans Affairs Medical Center, Houston, TX, USA
- Claudio W Pikielny
  - Department of Biomedical Data Science, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA
- Christine M Lusk
  - Karmanos Cancer Institute, Wayne State University, Detroit, MI, USA
- Ann G Schwartz
  - Karmanos Cancer Institute, Wayne State University, Detroit, MI, USA
- Ignacio I Wistuba
  - Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Michael H Cho
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Edwin K Silverman
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Susan M Pinney
  - University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Elena Kupert
  - University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Colette Gaba
  - The University of Toledo College of Medicine, Toledo, OH, USA
- Diptasri Mandal
  - Louisiana State University Health Sciences Center, New Orleans, LA, USA
- Ming You
  - Medical College of Wisconsin, Milwaukee, WI, USA
- Ping Yang
  - Mayo Clinic College of Medicine, Scottsdale, AZ, USA
- Triantafillos Liloglou
  - Roy Castle Lung Cancer Research Programme, The University of Liverpool, Department of Molecular and Clinical Cancer Medicine, Liverpool, UK
- Michael P A Davies
  - Roy Castle Lung Cancer Research Programme, The University of Liverpool, Department of Molecular and Clinical Cancer Medicine, Liverpool, UK
- Jolanta Lissowska
  - M. Sklodowska-Curie National Research Institute of Oncology, Warsaw, Poland
- Beata Swiatkowska
  - Nofer Institute of Occupational Medicine, Department of Environmental Epidemiology, Lodz, Poland
- David Zaridze
  - Russian N.N. Blokhin Cancer Research Centre, Moscow, Russian Federation
- Anush Mukeria
  - Russian N.N. Blokhin Cancer Research Centre, Moscow, Russian Federation
- Vladimir Janout
  - Faculty of Health Sciences, Palacky University, Olomouc, Czech Republic
- Ivana Holcatova
  - Institute of Public Health and Preventive Medicine, Charles University, 2nd Faculty of Medicine, Prague, Czech Republic
- Dana Mates
  - National Institute of Public Health, Bucharest, Romania
- Jelena Stojsic
  - Department of Thoracopulmonary Pathology, Service of Pathology, Clinical Center of Serbia, Belgrade, Serbia
- Paul Brennan
  - International Agency for Research on Cancer, Lyon, France
- Geoffrey Liu
  - Princess Margaret Cancer Center, Toronto, ON, Canada
- John K Field
  - Roy Castle Lung Cancer Research Programme, The University of Liverpool, Department of Molecular and Clinical Cancer Medicine, Liverpool, UK
- Rayjean J Hung
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada
- Christopher I Amos
  - Dan L. Duncan Comprehensive Cancer Center, Department of Medicine, Baylor College of Medicine, Houston, TX, USA
  - Institute for Clinical and Translational Research, Baylor College of Medicine, Houston, TX, USA
4. Amanat S, Requena T, Lopez-Escamez JA. A Systematic Review of Extreme Phenotype Strategies to Search for Rare Variants in Genetic Studies of Complex Disorders. Genes (Basel) 2020; 11:genes11090987. [PMID: 32854191 PMCID: PMC7564972 DOI: 10.3390/genes11090987] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 07/17/2020] [Revised: 08/14/2020] [Accepted: 08/19/2020] [Indexed: 12/12/2022] Open
Abstract
Exome sequencing has been commonly used to characterize rare diseases by selecting multiplex families or singletons with an extreme phenotype (EP) and searching for rare variants in coding regions. The EP strategy covers both extreme ends of a disease spectrum and it has been also used to investigate the contribution of rare variants to the heritability of complex clinical traits. We conducted a systematic review to find evidence supporting the use of EP strategies in the search for rare variants in genetic studies of complex diseases and highlight the contribution of rare variations to the genetic structure of polygenic conditions. After assessing the quality of the retrieved records, we selected 19 genetic studies considering EPs to demonstrate genetic association. All studies successfully identified several rare or de novo variants, and many novel candidate genes were also identified by selecting an EP. There is enough evidence to support that the EP approach for patients with an early onset of a disease can contribute to the identification of rare variants in candidate genes or pathways involved in complex diseases. EP patients may contribute to a better understanding of the underlying genetic architecture of common heterogeneous disorders such as tinnitus or age-related hearing loss.
Affiliation(s)
- Sana Amanat
  - Otology & Neurotology Group CTS495, Department of Genomic Medicine, GENYO—Centre for Genomics and Oncological Research—Pfizer/University of Granada/Junta de Andalucía, PTS, 18016 Granada, Spain
- Teresa Requena
  - Centre for Discovery Brain Sciences, Edinburgh Medical School: Biomedical Sciences, University of Edinburgh, Edinburgh EH8 9JZ, UK
- Jose Antonio Lopez-Escamez
  - Otology & Neurotology Group CTS495, Department of Genomic Medicine, GENYO—Centre for Genomics and Oncological Research—Pfizer/University of Granada/Junta de Andalucía, PTS, 18016 Granada, Spain
  - Department of Otolaryngology, Instituto de Investigación Biosanitaria ibs.GRANADA, Hospital Universitario Virgen de las Nieves, Universidad de Granada, 18016 Granada, Spain
  - Department of Surgery, Division of Otolaryngology, Universidad de Granada, 18016 Granada, Spain
  - Correspondence: Tel.: +34-958-715-500-160
5. Bi W, Li Y, Smeltzer MP, Gao G, Zhao S, Kang G. STEPS: an efficient prospective likelihood approach to genetic association analyses of secondary traits in extreme phenotype sequencing. Biostatistics 2020; 21:33-49. [PMID: 30007308 PMCID: PMC8559722 DOI: 10.1093/biostatistics/kxy030] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/20/2017] [Revised: 05/16/2018] [Accepted: 06/02/2018] [Indexed: 11/13/2022] Open
Abstract
It has been well acknowledged that methods for secondary trait (ST) association analyses under a case-control design (ST$_{\text{CC}}$) should carefully consider the sampling process to avoid biased risk estimates. A similar situation also exists in the extreme phenotype sequencing (EPS) designs, which is to select subjects with extreme values of continuous primary phenotype for sequencing. EPS designs are commonly used in modern epidemiological and clinical studies such as the well-known National Heart, Lung, and Blood Institute Exome Sequencing Project. Although naïve generalized regression or ST$_{\text{CC}}$ method could be applied, their validity is questionable due to difference in statistical designs. Herein, we propose a general prospective likelihood framework to perform association testing for binary and continuous STs under EPS designs (STEPS), which can also incorporate covariates and interaction terms. We provide a computationally efficient and robust algorithm to obtain the maximum likelihood estimates. We also present two empirical mathematical formulas for power/sample size calculations to facilitate planning of binary/continuous STs association analyses under EPS designs. Extensive simulations and application to a genome-wide association study of benign ethnic neutropenia under an EPS design demonstrate the superiority of STEPS over all its alternatives above.
Affiliation(s)
- Wenjian Bi
  - Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Yun Li
  - Department of Genetics, University of North Carolina, Chapel Hill, NC 27599, USA
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599, USA
  - Department of Computer Science, University of North Carolina, Chapel Hill, NC 27599, USA
- Matthew P Smeltzer
  - Division of Epidemiology, Biostatistics, and Environmental Health, School of Public Health, University of Memphis, Memphis, TN 38152, USA
- Guimin Gao
  - Department of Public Health Sciences, University of Chicago, Chicago, IL 60637, USA
- Shengli Zhao
  - School of Statistics, Qufu Normal University, Qufu 273165, PR China
- Guolian Kang
  - Department of Biostatistics, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
6. Roca-Ayats N, Martínez-Gil N, Cozar M, Gerousi M, Garcia-Giralt N, Ovejero D, Mellibovsky L, Nogués X, Díez-Pérez A, Grinberg D, Balcells S. Functional characterization of the C7ORF76 genomic region, a prominent GWAS signal for osteoporosis in 7q21.3. Bone 2019; 123:39-47. [PMID: 30878523 DOI: 10.1016/j.bone.2019.03.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/29/2019] [Revised: 03/04/2019] [Accepted: 03/12/2019] [Indexed: 12/21/2022]
Abstract
Genome-wide association studies (GWAS) have repeatedly identified genetic variants associated with bone mineral density (BMD) and osteoporotic fracture in non-coding regions of C7ORF76, a poorly studied gene of unknown function. The aim of the present study was to elucidate the causality and molecular mechanisms underlying the association. We re-sequenced the genomic region in two extreme BMD groups from the BARCOS cohort of postmenopausal women to search for functionally relevant variants. Eight selected variants were tested for association in the complete cohort and 2 of them (rs4342521 and rs10085588) were found significantly associated with lumbar spine BMD and nominally associated with osteoporotic fracture. cis-eQTL analyses of these 2 SNPs, together with SNP rs4727338 (GWAS lead SNP in Estrada et al., Nat Genet. 44:491-501, 2012), performed in human primary osteoblasts, disclosed a statistically significant influence on the expression of the proximal neighbouring gene SLC25A13 and a tendency on the distal SHFM1. We then studied the functionality of a putative upstream regulatory element (UPE), containing rs10085588. Luciferase reporter assays showed transactivation capability with a strong allele-dependent effect. Finally, 4C-seq experiments in osteoblastic cell lines showed that the UPE interacted with different tissue-specific enhancers and a lncRNA (LOC100506136) in the region. In summary, this study is the first one to analyse in depth the functionality of C7ORF76 genomic region. We provide functional regulatory evidence for the rs10085588, which may be a causal SNP within the 7q21.3 GWAS signal for osteoporosis.
Affiliation(s)
- Neus Roca-Ayats
  - Department of Genetics, Microbiology and Statistics, Facultat de Biologia, Universitat de Barcelona, Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, IBUB, IRSJD, Barcelona, Catalonia, Spain
- Núria Martínez-Gil
  - Department of Genetics, Microbiology and Statistics, Facultat de Biologia, Universitat de Barcelona, Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, IBUB, IRSJD, Barcelona, Catalonia, Spain
- Mónica Cozar
  - Department of Genetics, Microbiology and Statistics, Facultat de Biologia, Universitat de Barcelona, Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, IBUB, IRSJD, Barcelona, Catalonia, Spain
- Marina Gerousi
  - Department of Genetics, Microbiology and Statistics, Facultat de Biologia, Universitat de Barcelona, Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, IBUB, IRSJD, Barcelona, Catalonia, Spain
- Natàlia Garcia-Giralt
  - Musculoskeletal Research Group, IMIM (Hospital del Mar Medical Research Institute), Centro de Investigación Biomédica en Red en Fragilidad y Envejecimiento Saludable (CIBERFES), ISCIII, Barcelona, Catalonia, Spain
- Diana Ovejero
  - National Research Council, Institute of Clinical Physiology, Lecce, Italy
- Leonardo Mellibovsky
  - Musculoskeletal Research Group, IMIM (Hospital del Mar Medical Research Institute), Centro de Investigación Biomédica en Red en Fragilidad y Envejecimiento Saludable (CIBERFES), ISCIII, Barcelona, Catalonia, Spain
- Xavier Nogués
  - Musculoskeletal Research Group, IMIM (Hospital del Mar Medical Research Institute), Centro de Investigación Biomédica en Red en Fragilidad y Envejecimiento Saludable (CIBERFES), ISCIII, Barcelona, Catalonia, Spain
- Adolfo Díez-Pérez
  - Musculoskeletal Research Group, IMIM (Hospital del Mar Medical Research Institute), Centro de Investigación Biomédica en Red en Fragilidad y Envejecimiento Saludable (CIBERFES), ISCIII, Barcelona, Catalonia, Spain
- Daniel Grinberg
  - Department of Genetics, Microbiology and Statistics, Facultat de Biologia, Universitat de Barcelona, Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, IBUB, IRSJD, Barcelona, Catalonia, Spain
- Susanna Balcells
  - Department of Genetics, Microbiology and Statistics, Facultat de Biologia, Universitat de Barcelona, Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), ISCIII, IBUB, IRSJD, Barcelona, Catalonia, Spain
7. Luo Z, Li X, Zhu M, Tang J, Li Z, Zhou X, Song G, Liu Z, Zhou H, Zhang W. Identification of novel variants associated with warfarin stable dosage by use of a two-stage extreme phenotype strategy. J Thromb Haemost 2017; 15:28-37. [PMID: 27740732 DOI: 10.1111/jth.13542] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 04/17/2016] [Indexed: 12/25/2022]
Abstract
Essentials
- Required warfarin doses for mechanical heart valves vary greatly.
- A two-stage extreme phenotype design was used to identify novel warfarin dose associated mutation.
- We identified a group of variants significantly associated with extreme warfarin dose.
- Four novel identified mutations account for 2.2% of warfarin dose discrepancies.
Summary
Background: The variation among patients in warfarin response complicates the management of warfarin therapy, and an improper therapeutic dose usually results in serious adverse events.
Objective: To use a two-stage extreme phenotype strategy in order to discover novel warfarin dose-associated mutations in heart valve replacement patients.
Patients/method: A total of 1617 stable-dose patients were enrolled and divided randomly into two cohorts. Stage I patients were genotyped into three groups on the basis of VKORC1-1639G>A and CYP2C9*3 polymorphisms; only patients with the therapeutic dose at the upper or lower 5% of each genotype group were selected as extreme-dose patients for resequencing of the targeted regions. Evaluation of the accuracy of the sequence data and the potential value of the stage I-identified significant mutations were conducted in a validation cohort of 420 subjects.
Results: A group of mutations were found to be significantly associated with the extreme warfarin dose. The validation work finally identified four novel mutations, i.e. DNMT3A rs2304429 (24.74%), CYP1A1 rs3826041 (47.35%), STX1B rs72800847 (7.01%), and NQO1 rs10517 (36.11%), which independently and significantly contributed to the overall variability in the warfarin dose. After addition of these four mutations, the estimated regression equation was able to account for 56.2% (adjusted R² = 0.562) of the overall variability in the warfarin maintenance dose, with a predictive accuracy of 62.4%.
Conclusion: Our study provides evidence linking genetic variations in STX1B, DNMT3A and CYP1A1 to warfarin maintenance dose. The newly identified mutations together account for 2.2% of warfarin dose discrepancy.
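The stage I selection rule described above (keep only patients whose dose lies in the upper or lower 5% of their own genotype group) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `select_extremes`, the rule of keeping at least one subject per tail, and the toy doses and group labels are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def select_extremes(
    doses: Sequence[float], groups: Sequence[str], tail: float = 0.05
) -> List[int]:
    """Indices of subjects in the lowest or highest `tail` fraction of dose,
    with the ranking done separately inside each genotype group."""
    by_group: Dict[str, List[int]] = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)

    selected = set()
    for members in by_group.values():
        ranked = sorted(members, key=lambda i: doses[i])
        # Assumed rule: take at least one subject from each tail.
        k = max(1, int(len(ranked) * tail))
        selected.update(ranked[:k])   # lower tail
        selected.update(ranked[-k:])  # upper tail
    return sorted(selected)


# Toy data: two hypothetical genotype groups of 20 subjects each.
doses = list(range(1, 21)) + list(range(101, 121))
groups = ["A"] * 20 + ["B"] * 20
extreme_idx = select_extremes(doses, groups, tail=0.05)
print(extreme_idx)
```

Ranking within each group rather than across the whole cohort keeps the known genotype effect from deciding who counts as extreme, which appears to be the point of stratifying on VKORC1 and CYP2C9 before selection.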
Affiliation(s)
- Z Luo
  - Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, China
  - Hunan Key Laboratory of Pharmacogenetics, Institute of Clinical Pharmacology, Central South University, Changsha, China
- X Li
  - Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, China
- M Zhu
  - School of Mathematics and Statistics, Central South University, Changsha, China
- J Tang
  - Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, China
  - Hunan Key Laboratory of Pharmacogenetics, Institute of Clinical Pharmacology, Central South University, Changsha, China
- Z Li
  - Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, China
  - Hunan Key Laboratory of Pharmacogenetics, Institute of Clinical Pharmacology, Central South University, Changsha, China
- X Zhou
  - Department of Cardio-Thoracic Surgery, the Second Xiangya Hospital of Central South University, Changsha, China
- G Song
  - Department of Cardio-Thoracic Surgery, the Second Xiangya Hospital of Central South University, Changsha, China
- Z Liu
  - Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, China
  - Hunan Key Laboratory of Pharmacogenetics, Institute of Clinical Pharmacology, Central South University, Changsha, China
- H Zhou
  - Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, China
  - Hunan Key Laboratory of Pharmacogenetics, Institute of Clinical Pharmacology, Central South University, Changsha, China
- W Zhang
  - Department of Clinical Pharmacology, Xiangya Hospital, Central South University, Changsha, China
  - Hunan Key Laboratory of Pharmacogenetics, Institute of Clinical Pharmacology, Central South University, Changsha, China
8. Xu C, Wu K, Zhang JG, Shen H, Deng HW. Low-, high-coverage, and two-stage DNA sequencing in the design of the genetic association study. Genet Epidemiol 2016; 41:187-197. [PMID: 27813156 DOI: 10.1002/gepi.22015] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 04/17/2016] [Revised: 08/22/2016] [Accepted: 09/19/2016] [Indexed: 12/29/2022]
Abstract
Next-generation sequencing-based genetic association study (GAS) is a powerful tool to identify candidate disease variants and genomic regions. Low-coverage sequencing is inexpensive but inadequate for calling rare variants, whereas high-coverage sequencing detects essentially every variant but at a high cost. Two-stage sequencing may be an economical way to conduct GAS without losing power. In two-stage sequencing, an affordable number of samples are sequenced at high coverage to serve as the reference panel, which is then used for imputation in a larger sample sequenced at low coverage. As unit sequencing costs continue to decrease, investigators can now conduct GAS with more flexible sequencing depths. Here, we systematically evaluate the effect of the read depth and sample size on the variant discovery power and association power for study designs using low-coverage, high-coverage, and two-stage sequencing. We consider 12 low-coverage, 12 high-coverage, and 51 two-stage design scenarios with the read depth varying from 0.5× to 80×. With state-of-the-art simulation and analysis packages and in-house scripts, we simulate the complete study process from DNA sequencing to SNP (single nucleotide polymorphism) calling and association testing. Our results show that with appropriate allocation of sequencing effort, two-stage sequencing is an effective approach for conducting GAS. We provide practical guidelines for investigators to plan the optimum sequencing-based GAS including two-stage sequencing design given their specific constraints of sequencing investment.
Affiliation(s)
- Chao Xu
- Department of Biostatistics and Bioinformatics, Center for Bioinformatics and Genomics, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
- Kehao Wu
- Department of Biostatistics and Bioinformatics, Center for Bioinformatics and Genomics, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
- Ji-Gang Zhang
- Department of Biostatistics and Bioinformatics, Center for Bioinformatics and Genomics, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
- Hui Shen
- Department of Biostatistics and Bioinformatics, Center for Bioinformatics and Genomics, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
- Hong-Wen Deng
- Department of Biostatistics and Bioinformatics, Center for Bioinformatics and Genomics, School of Public Health and Tropical Medicine, Tulane University, New Orleans, LA, USA
|
9
|
Focused Analysis of Exome Sequencing Data for Rare Germline Mutations in Familial and Sporadic Lung Cancer. J Thorac Oncol 2016; 11:52-61. [PMID: 26762739 DOI: 10.1016/j.jtho.2015.09.015] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 07/21/2015] [Revised: 09/21/2015] [Accepted: 09/25/2015] [Indexed: 01/05/2023]
Abstract
INTRODUCTION: The association between smoking-induced chronic obstructive pulmonary disease (COPD) and lung cancer (LC) is well documented. Recent genome-wide association studies (GWAS) have identified 28 susceptibility loci for LC, 10 for COPD, 32 for smoking behavior, and 63 for pulmonary function, totaling 107 nonoverlapping loci. Given that common variants have been found to be associated with LC in GWAS, exome sequencing of these high-priority regions has great potential to identify novel rare causal variants. METHODS: To search for disease-causing rare germline mutations, we used a variation of the extreme phenotype approach to select 48 patients with sporadic LC who reported histories of heavy smoking, 37 of whom also exhibited carefully documented severe COPD (in whom smoking is considered the overwhelming determinant), and 54 unique familial LC cases from families with at least three first-degree relatives with LC (who are likely enriched for genomic effects). RESULTS: By focusing on exome profiles of the 107 target loci, we identified two key rare mutations. A heterozygous p.Arg696Cys variant in the coiled-coil domain containing 147 (CCDC147) gene at 10q25.1 was identified in one sporadic and two familial cases. The minor allele frequency (MAF) of this variant in the 1000 Genomes database is 0.0026. The p.Val26Met variant in the dopamine β-hydroxylase (DBH) gene at 9q34.2 was identified in two sporadic cases; the MAF of this mutation is 0.0034 according to the 1000 Genomes database. We also observed three suggestive rare mutations on 15q25.1: iron-responsive element binding protein neuronal 2 (IREB2); cholinergic receptor, nicotinic, alpha 5 (neuronal) (CHRNA5); and cholinergic receptor, nicotinic, beta 4 (CHRNB4). CONCLUSIONS: Our results demonstrated highly disruptive risk-conferring CCDC147 and DBH mutations.
|
10
|
Boggis EM, Milo M, Walters K. eQuIPS: eQTL Analysis Using Informed Partitioning of SNPs - A Fully Bayesian Approach. Genet Epidemiol 2016; 40:273-83. [DOI: 10.1002/gepi.21961] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/29/2015] [Revised: 12/18/2015] [Accepted: 12/18/2015] [Indexed: 11/11/2022]
Affiliation(s)
- E. M. Boggis
- School of Mathematics and Statistics, University of Sheffield, Sheffield, United Kingdom
- M. Milo
- Department of Biomedical Science, University of Sheffield, Sheffield, United Kingdom
- K. Walters
- School of Mathematics and Statistics, University of Sheffield, Sheffield, United Kingdom
|
11
|
Evaluation of a two-step iterative resampling procedure for internal validation of genome-wide association studies. J Hum Genet 2015; 60:729-38. [PMID: 26377241 DOI: 10.1038/jhg.2015.110] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 03/19/2015] [Revised: 06/14/2015] [Accepted: 08/09/2015] [Indexed: 12/31/2022]
Abstract
Genome-wide association studies (GWAS) have successfully identified many common genetic variants associated with complex diseases over the past decade. The 'gold standard' method for validating the top single nucleotide polymorphisms (SNPs) identified in GWAS is to independently replicate the findings in similar or diverse large-scale external cohorts. However, for rare diseases, it can be difficult to find an external validation cohort within a reasonable timeframe. In such situations, resampling methods, such as the two-step iterative resampling (TSIR) approach have been used to identify SNPs associated with the outcome of interest. However, the TSIR approach involves choosing several parameters in each step, which can influence the performance of the approach. In this paper, we undertook extensive simulation studies to assess the effect of choice of different parameters on the type I error and power for both binary and continuous phenotypes and also compared the TSIR approach with the traditional one-stage (OS) and two-stage (TS) GWAS analysis. We illustrate the usefulness of the TSIR approach by applying it to a GWAS of childhood cancer survivors. Our results indicate that the TSIR approach with an at least 70:30 split and a cutoff of discovering and replicating SNPs at least 20 times in 100 replications provides conservative type I error control and has near 'optimal' power for internally validated SNPs. Its performance is comparable with the TS GWAS for which an external validation cohort is available with only slight reduction in power in some situations. It has almost the same power as OS GWAS with conservative type I error which leads to fewer false positive findings. TSIR is a powerful and efficient method for identifying and internally validating SNPs for GWAS when independent cohorts for external validation may not be available.
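The resampling scheme summarized in this abstract (repeated 70:30 splits, keeping SNPs that are discovered and replicated at least 20 times in 100 replications) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `assoc_p` and `tsir`, the p-value thresholds `p_discover` and `p_replicate`, the correlation-based test, and the simulated data are all assumptions made for illustration only.

```python
import math
import random

def assoc_p(x, y):
    """Two-sided p-value for a Pearson correlation (Fisher z, normal approximation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:          # monomorphic SNP or constant phenotype
        return 1.0
    r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / math.sqrt(sxx * syy)
    r = max(min(r, 0.999999), -0.999999)
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

def tsir(genotypes, phenotype, n_reps=100, split=0.7, min_hits=20,
         p_discover=1e-3, p_replicate=0.05, seed=0):
    """Return indices of SNPs discovered AND replicated in >= min_hits of n_reps splits.

    p_discover and p_replicate are hypothetical thresholds, not values from the paper.
    """
    rng = random.Random(seed)
    idx = list(range(len(phenotype)))
    cut = int(split * len(idx))
    hits = [0] * len(genotypes)
    for _ in range(n_reps):
        rng.shuffle(idx)
        disc, rep = idx[:cut], idx[cut:]          # discovery / replication subsets
        y_disc = [phenotype[i] for i in disc]
        y_rep = [phenotype[i] for i in rep]
        for j, g in enumerate(genotypes):
            if (assoc_p([g[i] for i in disc], y_disc) < p_discover
                    and assoc_p([g[i] for i in rep], y_rep) < p_replicate):
                hits[j] += 1
    return [j for j, h in enumerate(hits) if h >= min_hits]

# Simulated toy data (assumption): SNP 0 is causal, SNPs 1-5 are null.
sim = random.Random(1)
n = 300
genotypes = [[(sim.random() < 0.3) + (sim.random() < 0.3) for _ in range(n)]
             for _ in range(6)]
phenotype = [1.0 * genotypes[0][i] + sim.gauss(0, 1) for i in range(n)]
selected = tsir(genotypes, phenotype)
```

In this toy setting the strongly associated SNP 0 should be retained; the thresholds would need to be tuned to the study at hand, which is the parameter-choice question the abstract says the paper evaluates.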
|
12
|
Phenotypic extremes in rare variant study designs. Eur J Hum Genet 2015; 24:924-30. [PMID: 26350511 DOI: 10.1038/ejhg.2015.197] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 04/04/2015] [Revised: 07/17/2015] [Accepted: 08/04/2015] [Indexed: 12/16/2022] Open
Abstract
Currently, next-generation sequencing studies aim to identify rare and low-frequency variation that may contribute to disease. For a given effect size, as the allele frequency decreases, the power to detect genes or variants of interest also decreases. Although many methods have been proposed for the analysis of such data, study design and analytic issues still persist in data interpretation. In this study we present sequencing data for ABCA1 that has known rare variants associated with high-density lipoprotein cholesterol (HDL-C). We contrast empirical findings from two study designs: a phenotypic extreme sample and a population-based random sample. We found differing strengths of association with HDL-C across the two study designs (P=0.0006 with n=701 phenotypic extremes vs P=0.03 with n=1600 randomly sampled individuals). To explore this apparent difference in evidence for association, we performed a simulation study focused on the impact of phenotypic selection on power. We demonstrate that the power gain for an extreme phenotypic selection study design is much greater in rare variant studies than for studies of common variants. Our study confirms that studying phenotypic extremes is critical in rare variant studies because it boosts power in two ways: the typical increases from extreme sampling and increasing the proportion of relevant functional variants ascertained and thereby tested for association. Furthermore, we show that when combining statistical evidence through meta-analysis from an extreme-selected sample and a second separate population-based random sample, power is lower when a traditional sample size weighting is used compared with weighting by the noncentrality parameter.
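The closing point about meta-analysis weights can be made concrete with a weighted Z-score (Stouffer-style) combination. The sketch below assumes 1-df tests, for which each study's Z statistic has expectation of about the square root of its noncentrality parameter (NCP) under the alternative. The function names and the NCP values are hypothetical; only the two sample sizes are taken from the abstract.

```python
import math

def weighted_z(z_scores, weights):
    """Combine per-study Z-scores: sum(w_i * z_i) / sqrt(sum(w_i^2))."""
    num = sum(w * z for w, z in zip(weights, z_scores))
    den = math.sqrt(sum(w * w for w in weights))
    return num / den

def expected_combined_z(ncps, weights):
    """Expected combined Z when each study's expected Z is sqrt(NCP)."""
    return weighted_z([math.sqrt(v) for v in ncps], weights)

# Sample sizes from the abstract; NCP values are illustrative assumptions
# in which the smaller extreme-selected sample is the more informative one.
n = [701, 1600]
ncp = [12.0, 5.0]

z_by_n = expected_combined_z(ncp, [math.sqrt(v) for v in n])      # sample-size weights
z_by_ncp = expected_combined_z(ncp, [math.sqrt(v) for v in ncp])  # NCP weights
```

By the Cauchy-Schwarz inequality, weights proportional to sqrt(NCP) maximize the expected combined Z, so `z_by_ncp` is at least as large as `z_by_n`. Sample-size weights lose power here because they down-weight the extreme-selected study relative to its information content.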
|
13
|
Kang G. Power and sample size of two-stage extreme phenotype sequencing design for next generation sequencing studies. BMC Bioinformatics 2013. [PMCID: PMC3856562 DOI: 10.1186/1471-2105-14-s17-a16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
|
14
|
Abstract
Genetic variation influences the response of an individual to drug treatments. Understanding this variation has the potential to make therapy safer and more effective by determining selection and dosing of drugs for an individual patient. In the context of cancer, tumours may have specific disease-defining mutations, but a patient's germline genetic variation will also affect drug response (both efficacy and toxicity), and here we focus on how to study this variation. Advances in sequencing technologies, statistical genetics analysis methods and clinical trial designs have shown promise for the discovery of variants associated with drug response. We discuss the application of germline genetics analysis methods to cancer pharmacogenomics with a focus on the special considerations for study design.
|