1
Li CW, Sachidanandam R, Jayaprakash A, Yi Z, Zhang W, Stefan-Lifshitz M, Concepcion E, Tomer Y. Identification of New Rare Variants Associated With Familial Autoimmune Thyroid Diseases by Deep Sequencing of Linked Loci. J Clin Endocrinol Metab 2021; 106:e4680-e4687. [PMID: 34143178 PMCID: PMC8530708 DOI: 10.1210/clinem/dgab440] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/30/2021] [Indexed: 11/19/2022]
Abstract
CONTEXT: Genetic risk factors play a major role in the pathoetiology of autoimmune thyroid diseases (AITD). So far, only common risk variants have been identified in AITD susceptibility genes. Recently, rare genetic variants have emerged as important contributors to complex diseases, and we hypothesized that rare variants play a key role in the genetic susceptibility to AITD. OBJECTIVE: We aimed to identify new rare variants that are associated with familial AITD. METHODS: We performed deep sequencing of 3 previously mapped AITD-linked loci (10q, 12q, and 14q) in a dataset of 34 families in which AITD clustered (familial AITD). RESULTS: We identified 13 rare variants, located in the inositol polyphosphate multikinase (IPMK) gene, that were associated with AITD (i.e., both Graves' disease [GD] and Hashimoto's thyroiditis [HT]); 2 rare variants, within the dihydrolipoamide S-succinyltransferase (DLST) and zinc-finger FYVE domain-containing protein (ZFYVE1) genes, that were associated with GD only; and 3 rare variants, within the phosphoglycerate mutase 1 pseudogene 5 (PGAM1P5), LOC105369879, and methionine aminopeptidase 2 (METAP2) genes, that were associated with HT only. CONCLUSION: Our study demonstrates that, in addition to common variants, rare variants also contribute to the genetic susceptibility to AITD. We identified new rare variants in 6 AITD susceptibility genes that predispose to familial AITD. Of these, 3 genes, IPMK, ZFYVE1, and METAP2, are mechanistically involved in immune pathways and have been previously shown to be associated with autoimmunity. These genes predispose to thyroid autoimmunity and may serve as potential therapeutic targets in the future.
Affiliation(s)
- Cheuk Wun Li
  - Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Ravi Sachidanandam
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Anitha Jayaprakash
  - Department of Oncological Sciences, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Zhengzi Yi
  - Department of Medicine Bioinformatics Core, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Weijia Zhang
  - Department of Medicine Bioinformatics Core, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Erlinda Concepcion
  - Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA
- Yaron Tomer
  - Department of Medicine, Albert Einstein College of Medicine, Bronx, NY 10461, USA
  - Correspondence: Yaron Tomer, MD, Department of Medicine, Albert Einstein College of Medicine, 1300 Morris Park Ave, Bronx, NY 10461, USA.
2
Gao C, Sha Q, Zhang S, Zhang K. MF-TOWmuT: Testing an optimally weighted combination of common and rare variants with multiple traits using family data. Genet Epidemiol 2020; 45:64-81. [PMID: 33047835 DOI: 10.1002/gepi.22355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2020] [Revised: 08/03/2020] [Accepted: 08/18/2020] [Indexed: 11/11/2022]
Abstract
With rapid advances in sequencing technologies and the accumulation of electronic health records, a large number of genetic variants and multiple correlated human complex traits have become available in many genetic association studies. Thus, it becomes necessary and important to develop new methods that can jointly analyze the association between multiple genetic variants and multiple traits. Compared with methods that only use a single marker or trait, the joint analysis of multiple genetic variants and multiple traits is more powerful since such an analysis can fully incorporate the correlation structure of genetic variants and/or traits and their mutual dependence patterns. However, most existing methods that simultaneously analyze multiple genetic variants and multiple traits are only applicable to unrelated samples. We develop a new method called MF-TOWmuT to detect association of multiple phenotypes and multiple genetic variants in a genomic region with family samples. MF-TOWmuT is based on an optimally weighted combination of variants. Our method can be applied to both rare and common variants and both qualitative and quantitative traits. Our simulation results show that (1) the type I error of MF-TOWmuT is preserved; (2) MF-TOWmuT outperforms two existing methods, the Multiple Family-based Quasi-Likelihood Score Test and the Multivariate Family-based Rare Variant Association Test, in terms of power. We also illustrate the usefulness of MF-TOWmuT by analyzing genotypic and phenotypic data from the Genetics of Kidneys in Diabetes study. An R program is available at https://github.com/gaochengPRC/MF-TOWmuT.
Affiliation(s)
- Cheng Gao
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
- Qiuying Sha
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
- Shuanglin Zhang
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
- Kui Zhang
  - Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan, USA
3
Turkmen AS, Lin S. Detecting X-linked common and rare variant effects in family-based sequencing studies. Genet Epidemiol 2020; 45:36-45. [PMID: 32864779 DOI: 10.1002/gepi.22352] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/23/2019] [Revised: 06/26/2020] [Accepted: 08/03/2020] [Indexed: 11/08/2022]
Abstract
The breakthroughs in next-generation sequencing have allowed us to access data consisting of both common and rare variants, and in particular to investigate the impact of rare genetic variation on complex diseases. Although rare genetic variants are thought to be important components in explaining genetic mechanisms of many diseases, discovering these variants remains challenging, and most studies are restricted to population-based designs. Further, despite the shift in the field of genome-wide association studies (GWAS) towards studying rare variants due to the "missing heritability" phenomenon, little is known about rare X-linked variants associated with complex diseases. For instance, there is evidence that X-linked genes are highly involved in brain development and cognition when compared with autosomal genes; however, like most GWAS for other complex traits, previous GWAS for mental diseases have provided poor resources for identifying rare variant associations on the X chromosome. In this paper, we address the two issues described above by proposing a method that can be used to test X-linked variants using sequencing data on families. Our method is much more general than existing methods, as it can be applied to detect both common and rare variants, and is applicable to autosomes as well. Our simulation study shows that the method is efficient and exhibits good operating characteristics. An application to the University of Miami Study on Genetics of Autism and Related Disorders also yielded encouraging results.
Affiliation(s)
- Asuman S Turkmen
  - Statistics Department, The Ohio State University, Columbus, Ohio
  - Statistics Department, The Ohio State University, Newark, Ohio
- Shili Lin
  - Statistics Department, The Ohio State University, Columbus, Ohio
4
Alfares A, Alsubaie L, Aloraini T, Alaskar A, Althagafi A, Alahmad A, Rashid M, Alswaid A, Alothaim A, Eyaid W, Ababneh F, Albalwi M, Alotaibi R, Almutairi M, Altharawi N, Alsamer A, Abdelhakim M, Kafkas S, Mineta K, Cheung N, Abdallah AM, Büchmann-Møller S, Fukasawa Y, Zhao X, Rajan I, Hoehndorf R, Al Mutairi F, Gojobori T, Alfadhel M. What is the right sequencing approach? Solo VS extended family analysis in consanguineous populations. BMC Med Genomics 2020; 13:103. [PMID: 32680510 PMCID: PMC7368798 DOI: 10.1186/s12920-020-00743-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/08/2019] [Accepted: 06/19/2020] [Indexed: 02/04/2023] Open
Abstract
Background: Testing strategies are crucial for genetics clinics and testing laboratories. In this study, we compared the hit rate between solo, trio, and trio plus testing and between trio and sibship testing. Finally, we studied the impact of extended family analysis, mainly in complex and unsolved cases. Methods: Three cohorts were used for this analysis: one cohort to assess the hit rate between solo, trio and trio plus testing, another cohort to examine the impact of the testing strategy of sibship genome vs trio-based analysis, and a third cohort to test the impact of an extended family analysis of up to eight family members to lower the number of candidate variants. Results: The hit rates in solo, trio and trio plus testing were 39, 40, and 41%, respectively. The total number of candidate variants in the sibship testing strategy was 117 variants compared to 59 variants in the trio-based analysis. The average number of candidate variants in trio-based analysis was 1192 coding variants and 26,454 noncoding variants, and this number was lowered by 50–75% after adding additional family members, with up to two coding and 66 noncoding homozygous variants only, in families with eight family members. Conclusion: There was no difference in the hit rate between solo and extended family testing. Trio-based analysis was a better approach than sibship testing, even in a consanguineous population. Finally, each additional family member helped to narrow down the number of variants by 50–75%. Our findings could help clinicians, researchers and testing laboratories select the most cost-effective and appropriate sequencing approach for their patients. Furthermore, extended family analysis is a very useful tool for complex cases with novel genes.
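The reported 50–75% reduction per additional family member can be turned into a rough back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that the reduction compounds independently with each added member and that going from a trio to eight members adds five people; neither assumption is stated in the abstract, and the function name `remaining_candidates` is hypothetical.

```python
def remaining_candidates(start: float, added_members: int, reduction: float) -> float:
    """Candidates left if every added member removes the same fraction (an assumption)."""
    return start * (1.0 - reduction) ** added_members


# Average number of coding candidates in trio-based analysis, taken from the abstract.
trio_coding = 1192

# Assumption: an eight-member family is a trio plus five additional members.
added = 8 - 3

# Bracket the outcome with the two ends of the reported 50-75% range.
low = remaining_candidates(trio_coding, added, 0.75)
high = remaining_candidates(trio_coding, added, 0.50)

print(f"Roughly {low:.1f} to {high:.1f} coding candidates would remain")
```

Under these assumptions the lower end of the bracket is close to the "up to two coding" variants reported for eight-member families, which suggests the per-member figure is at least roughly consistent with the final counts.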
Affiliation(s)
- Ahmed Alfares
  - Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
  - Department of Pediatrics, College of Medicine, Qassim University, Qassim, Saudi Arabia
  - Qassim University, Department of Pediatrics, Almulyda, Saudi Arabia
- Lamia Alsubaie
  - Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
  - King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
- Taghrid Aloraini
  - Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Aljoharah Alaskar
  - Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Azza Althagafi
  - Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Ahmed Alahmad
  - Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Mamoon Rashid
  - King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
- Abdulrahman Alswaid
  - Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
  - King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Ali Alothaim
  - Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
  - King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Wafaa Eyaid
  - Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
  - King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Faroug Ababneh
  - Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
  - King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
- Mohammed Albalwi
  - Department of Pathology and Laboratory Medicine, King Abdulaziz Medical City, Riyadh, Saudi Arabia
  - King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Raniah Alotaibi
  - King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- Mashael Almutairi
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- Nouf Altharawi
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- Alhanouf Alsamer
  - Department of Clinical Laboratory Sciences, College of Applied Medical Sciences, King Saud bin Abdulaziz University for Health Sciences, Riyadh, Saudi Arabia
- Marwa Abdelhakim
  - Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Senay Kafkas
  - Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Katsuhiko Mineta
  - Computer, Electrical & Mathematical Sciences and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Nicole Cheung
  - King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
- Abdallah M Abdallah
  - Department of Basic Medical Sciences, College of Medicine, QU Health, Qatar University, Doha, Qatar
- Stine Büchmann-Møller
  - King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
- Yoshinori Fukasawa
  - King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
- Xiang Zhao
  - King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
- Issaac Rajan
  - King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
- Robert Hoehndorf
  - King Abdullah University of Science and Technology (KAUST), Core Labs, Thuwal, 23955-6900, Saudi Arabia
- Fuad Al Mutairi
  - Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
  - King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
- Takashi Gojobori
  - Biological and Environmental Science and Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Majid Alfadhel
  - Division of Genetics, Department of Pediatrics, King Abdulaziz Medical City, Riyadh, Saudi Arabia
  - King Abdullah International Medical Research Center, Riyadh, Saudi Arabia
  - King Saud bin Abdulaziz University for Health Sciences, King Abdulaziz Medical City, Riyadh, Saudi Arabia
5
Chiu CY, Zhang B, Wang S, Shao J, Lakhal-Chaieb ML, Cook RJ, Wilson AF, Bailey-Wilson JE, Xiong M, Fan R. Gene-based association analysis of survival traits via functional regression-based mixed effect cox models for related samples. Genet Epidemiol 2019; 43:952-965. [PMID: 31502722 DOI: 10.1002/gepi.22254] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/14/2019] [Revised: 06/26/2019] [Accepted: 07/16/2019] [Indexed: 01/09/2023]
Abstract
The importance of integrating survival analysis into genetics and genomics is widely recognized, but only a small number of statisticians have produced relevant work in this direction. For unrelated population data, functional regression (FR) models have been developed to test for association between a quantitative/dichotomous/survival trait and genetic variants in a gene region. In major gene association analysis, these models have higher power than sequence kernel association tests. In this paper, we extend this approach to analyze censored traits for family data or related samples using FR-based mixed effect Cox models (FamCoxME). The FamCoxME models the effect of a major gene as a fixed mean via functional data analysis techniques, the local gene or polygene variations or both as random effects, and the correlation of pedigree members by kinship coefficients or a genetic relationship matrix or both. The association between the censored trait and the major gene is tested by likelihood ratio tests (FamCoxME FR LRT). Simulation results indicate that the LRTs control the type I error rates accurately/conservatively and have good power levels when both local gene and polygene variations are modeled. The proposed methods were applied to analyze a breast cancer data set from the Consortium of Investigators of Modifiers of BRCA1 and BRCA2 (CIMBA). The FamCoxME provides a new tool for gene-based analysis of family-based studies or related samples.
Affiliation(s)
- Chi-Yang Chiu
  - Division of Biostatistics, Department of Preventive Medicine, University of Tennessee Health Science Center, Memphis, Tennessee
- Bingsong Zhang
  - Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, District of Columbia
- Shuqi Wang
  - Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, District of Columbia
- Jingyi Shao
  - Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, District of Columbia
- Richard J Cook
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Ontario, Canada
- Alexander F Wilson
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
- Joan E Bailey-Wilson
  - Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland
- Momiao Xiong
  - Department of Biostatistics, Human Genetics Center, University of Texas-Houston, Houston, Texas
- Ruzong Fan
  - Department of Biostatistics, Bioinformatics, and Biomathematics, Georgetown University Medical Center, Washington, District of Columbia
6
Guo Y, Zhou Y. A modified association test for rare and common variants based on affected sib-pair design. J Theor Biol 2019; 467:1-6. [PMID: 30707975 DOI: 10.1016/j.jtbi.2019.01.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/01/2018] [Accepted: 01/08/2019] [Indexed: 11/18/2022]
Abstract
Current genome-wide association analysis has identified a great number of rare and common variants associated with common complex traits; however, more effective approaches for detecting associations of rare and common variants with common diseases are still needed. Approaches designed for rare variant association analysis lose power when detecting the effects of rare and common variants simultaneously. In this paper, we extend an existing method of testing for rare variant association based on affected sib pairs (TOW-sib) and propose a variable weight test for rare and common variant association based on affected sib pairs (abbreviated as VW-TOWsib). The VW-TOWsib can be used to detect the association of rare and common variants with complex diseases. Simulation results in various scenarios show that our proposed method is more powerful than existing methods for detecting effects of rare and common variants. At the same time, the VW-TOWsib also performs well as a method for rare variant association analysis.
Affiliation(s)
- Yixing Guo
  - Department of Statistics, School of Mathematical Sciences, Heilongjiang University and Heilongjiang Provincial Key Laboratory of the Theory and Computation of Complex Systems, Harbin 150080, China
- Ying Zhou
  - Department of Statistics, School of Mathematical Sciences, Heilongjiang University and Heilongjiang Provincial Key Laboratory of the Theory and Computation of Complex Systems, Harbin 150080, China
7
Lee S, Choi S, Qiao D, Cho M, Silverman EK, Park T, Won S. WISARD: workbench for integrated superfast association studies for related datasets. BMC Med Genomics 2018; 11:39. [PMID: 29697360 PMCID: PMC5918457 DOI: 10.1186/s12920-018-0345-y] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND: Mendelian transmission produces phenotypic and genetic relatedness between family members, giving family-based analytical methods an important role in genetic epidemiological studies, from heritability estimations to genetic association analyses. With the advance in genotyping technologies, whole-genome sequence data can be utilized for genetic epidemiological studies, and family-based samples may become more useful for detecting de novo mutations. However, genetic analyses employing family-based samples usually suffer from the complexity of the computational/statistical algorithms, and certain types of family designs, such as incorporating data from extended families, have rarely been used. RESULTS: We present a Workbench for Integrated Superfast Association studies for Related Data (WISARD) programmed in C/C++. WISARD enables fast and comprehensive analysis of SNP-chip and next-generation sequencing data on extended families, with applications from designing genetic studies to summarizing analysis results. In addition, WISARD can automatically be run in a fully multithreaded manner, and the integration of R software for visualization makes it more accessible to non-experts. CONCLUSIONS: Comparison with existing toolsets showed that WISARD is computationally suitable for integrated analysis of related subjects and that it outperforms existing toolsets. WISARD has also been successfully used to analyze a large-scale sequencing dataset for chronic obstructive pulmonary disease (COPD), in which we identified multiple genes associated with COPD, demonstrating its practical value.
Affiliation(s)
- Sungyoung Lee
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
- Sungkyoung Choi
  - Department of Pharmacology, Yonsei University College of Medicine, Seoul, South Korea
- Dandi Qiao
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Michael Cho
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Edwin K Silverman
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
  - Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Taesung Park
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
  - Department of Statistics, Seoul National University, 1 Kwanak-ro, Kwanak-gu, Seoul, 151-742, South Korea
- Sungho Won
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, South Korea
  - Department of Public Health Sciences, Graduate School of Public Health, Seoul National University, 1 Kwanak-ro, Kwanak-gu, Seoul, 151-742, South Korea
  - Institute of Health and Environment, Seoul National University, Seoul, South Korea
8
Zhou H, Blangero J, Dyer TD, Chan KHK, Lange K, Sobel EM. Fast Genome-Wide QTL Association Mapping on Pedigree and Population Data. Genet Epidemiol 2017; 41:174-186. [PMID: 27943406 PMCID: PMC5340631 DOI: 10.1002/gepi.21988] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 08/12/2015] [Revised: 05/02/2016] [Accepted: 05/08/2016] [Indexed: 01/14/2023]
Abstract
Since most analysis software for genome-wide association studies (GWAS) currently exploits only unrelated individuals, there is a need for efficient applications that can handle general pedigree data or mixtures of both population and pedigree data. Even datasets thought to consist of only unrelated individuals may include cryptic relationships that can lead to false positives if not discovered and controlled for. In addition, family designs possess compelling advantages. They are better equipped to detect rare variants, control for population stratification, and facilitate the study of parent-of-origin effects. Pedigrees selected for extreme trait values often segregate a single gene with strong effect. Finally, many pedigrees are available as an important legacy from the era of linkage analysis. Unfortunately, pedigree likelihoods are notoriously hard to compute. In this paper, we reexamine the computational bottlenecks and implement ultra-fast pedigree-based GWAS analysis. Kinship coefficients can either be based on explicitly provided pedigrees or automatically estimated from dense markers. Our strategy (a) works for random sample data, pedigree data, or a mix of both; (b) entails no loss of power; (c) allows for any number of covariate adjustments, including correction for population stratification; (d) allows for testing SNPs under additive, dominant, and recessive models; and (e) accommodates both univariate and multivariate quantitative traits. On a typical personal computer (six CPU cores at 2.67 GHz), analyzing a univariate HDL (high-density lipoprotein) trait from the San Antonio Family Heart Study (935,392 SNPs on 1,388 individuals in 124 pedigrees) takes less than 2 min and 1.5 GB of memory. Complete multivariate QTL analysis of the three time-points of the longitudinal HDL multivariate trait takes less than 5 min and 1.5 GB of memory. The algorithm is implemented as the Ped-GWAS Analysis (Option 29) in the Mendel statistical genetics package, which is freely available for Macintosh, Linux, and Windows platforms from http://genetics.ucla.edu/software/mendel.
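Kinship coefficients based on an explicitly provided pedigree can be computed with the standard textbook recursion over parents. The sketch below illustrates that general recursion only; it is not the implementation used in the Mendel package, and the function name `kinship`, the pedigree encoding, and the example family are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Each person maps to (father, mother); None marks an unknown parent (a founder).
Pedigree = Dict[str, Tuple[Optional[str], Optional[str]]]


def kinship(pedigree: Pedigree, order: List[str]) -> Dict[Tuple[str, str], float]:
    """Pedigree kinship coefficients; `order` must list parents before their children."""
    phi: Dict[Tuple[str, str], float] = {}

    def get(a: Optional[str], b: Optional[str]) -> float:
        # Unknown parents are treated as unrelated to everyone.
        if a is None or b is None:
            return 0.0
        return phi[(a, b)]

    for idx, person in enumerate(order):
        father, mother = pedigree[person]
        # Kinship with anyone earlier in the order is the average over the two parents.
        for other in order[:idx]:
            value = 0.5 * (get(father, other) + get(mother, other))
            phi[(person, other)] = value
            phi[(other, person)] = value
        # Self-kinship is 1/2 plus half the kinship between the parents.
        phi[(person, person)] = 0.5 * (1.0 + get(father, mother))
    return phi


# Hypothetical nuclear family: two founders and two children.
family: Pedigree = {
    "F": (None, None),
    "M": (None, None),
    "C1": ("F", "M"),
    "C2": ("F", "M"),
}
phi = kinship(family, ["F", "M", "C1", "C2"])
print(phi[("C1", "C2")], phi[("C1", "F")], phi[("F", "M")])
```

For full siblings and for parent-offspring pairs with unrelated parents, the recursion gives the familiar value of one quarter.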
Affiliation(s)
- Hua Zhou
  - Department of Biostatistics, University of California, Los Angeles, California, United States of America
- John Blangero
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Texas, United States of America
- Thomas D Dyer
  - South Texas Diabetes and Obesity Institute, University of Texas Rio Grande Valley, Texas, United States of America
- Kei-Hang K Chan
  - Department of Human Genetics, University of California, Los Angeles, California, United States of America
  - Department of Epidemiology, University of California, Los Angeles, California, United States of America
- Kenneth Lange
  - Department of Human Genetics, University of California, Los Angeles, California, United States of America
  - Department of Biomathematics, University of California, Los Angeles, California, United States of America
  - Department of Statistics, University of California, Los Angeles, California, United States of America
- Eric M Sobel
  - Department of Human Genetics, University of California, Los Angeles, California, United States of America
9
Zhu H, Wang Z, Wang X, Sha Q. A novel statistical method for rare-variant association studies in general pedigrees. BMC Proc 2016; 10:193-196. [PMID: 27980635 PMCID: PMC5133499 DOI: 10.1186/s12919-016-0029-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022] Open
Abstract
Both population-based and family-based designs are commonly used in genetic association studies to identify rare variants that underlie complex diseases. For any type of study design, the statistical power will be improved if rare variants can be enriched in the samples. Family-based designs, with ascertainment based on phenotype, may enrich the sample for causal rare variants and thus can be more powerful than population-based designs. Therefore, it is important to develop family-based statistical methods that can account for ascertainment. In this paper, we develop a novel statistical method for rare-variant association studies in general pedigrees for quantitative traits. This method uses a retrospective view that treats the traits as fixed and the genotypes as random, which allows us to account for complex and undefined ascertainment of families. We then apply the newly developed method to the Genetic Analysis Workshop 19 data set and compare the power of the new method with two other methods for general pedigrees. The results show that the newly proposed method has higher power than the other two methods in most of the cases we consider.
Affiliation(s)
- Huanhuan Zhu
  - Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931 USA
- Zhenchuan Wang
  - Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931 USA
- Xuexia Wang
  - Department of Mathematics, University of North Texas, 1155 Union Circle #311430, Denton, TX 76203-5017 USA
- Qiuying Sha
  - Department of Mathematical Sciences, Michigan Technological University, 1400 Townsend Drive, Houghton, MI 49931 USA
10
Valcarcel A, Grinde K, Cook K, Green A, Tintle N. A multistep approach to single nucleotide polymorphism-set analysis: an evaluation of power and type I error of gene-based tests of association after pathway-based association tests. BMC Proc 2016; 10:349-355. [PMID: 27980661 PMCID: PMC5133510 DOI: 10.1186/s12919-016-0055-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022] Open
Abstract
The aggregation of functionally associated variants given a priori biological information can aid in the discovery of rare variants associated with complex diseases. Many methods exist that aggregate rare variants into a set and compute a single p value summarizing association between the set of rare variants and a phenotype of interest. These methods are often called gene-based, rare variant tests of association because the variants in the set are often all contained within the same gene. A reasonable extension of these approaches involves aggregating variants across an even larger set of variants (e.g., all variants contained in genes within a pathway). Testing sets of variants such as pathways for association with a disease phenotype reduces multiple testing penalties, may increase power, and allows for straightforward biological interpretation. However, a significant variant-set association test does not indicate precisely which variants contained within that set are causal. Because pathways often contain many variants, it may be helpful to follow up significant pathway tests by conducting gene-based tests on each gene in that pathway to narrow in on the region of causal variants. In this paper, we propose such a multistep approach for variant-set analysis that can also account for covariates and complex pedigree structure. We demonstrate this approach on simulated phenotypes from Genetic Analysis Workshop 19. We find generally better power for the multistep approach when compared to a more conventional, single-step approach that simply runs gene-based tests of association on each gene across the genome. Further work is necessary to evaluate the multistep approach on different data sets with different characteristics.
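The two-stage logic described above (test pathways first, then test genes only inside significant pathways) can be sketched as follows. The p values are assumed to come from some upstream set-based test that is not shown; the function name `multistep_screen`, the Bonferroni cutoffs at each stage, and all example identifiers and numbers are illustrative assumptions rather than the authors' procedure.

```python
from typing import Dict, List


def multistep_screen(
    pathway_p: Dict[str, float],
    pathway_genes: Dict[str, List[str]],
    gene_p: Dict[str, float],
    alpha: float = 0.05,
) -> Dict[str, List[str]]:
    """Return, for each significant pathway, the genes that pass the follow-up test."""
    # Step 1: pathway-level tests, corrected for the number of pathways (assumed Bonferroni).
    step1_cutoff = alpha / len(pathway_p)
    hits: Dict[str, List[str]] = {}
    for pathway, p_value in pathway_p.items():
        if p_value > step1_cutoff:
            continue
        # Step 2: gene-based tests only within this pathway, corrected for its gene count.
        genes = pathway_genes[pathway]
        step2_cutoff = alpha / len(genes)
        hits[pathway] = [gene for gene in genes if gene_p[gene] <= step2_cutoff]
    return hits


# Hypothetical p values, for illustration only.
pathway_p = {"P1": 0.001, "P2": 0.40}
pathway_genes = {"P1": ["G1", "G2", "G3"], "P2": ["G4", "G5"]}
gene_p = {"G1": 0.004, "G2": 0.30, "G3": 0.20, "G4": 0.01, "G5": 0.50}

result = multistep_screen(pathway_p, pathway_genes, gene_p)
print(result)
```

Note that gene G4 has a small p value but is never examined, because its pathway fails the first step; this is the source of the reduced multiple-testing burden, and also of the risk of missing genes in pathways with diluted signal.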
Affiliation(s)
- Alessandra Valcarcel
- Department of Statistics, University of Connecticut, 2390 Alumni Drive, Storrs, CT 06269 USA
- Kelsey Grinde
- Department of Biostatistics, University of Washington, NE Pacific St, Seattle, WA 98195 USA
- Kaitlyn Cook
- Department of Mathematics and Statistics, Carleton College, 1 N College St, Northfield, MN 55057 USA
- Alden Green
- Department of Statistics, Harvard University, Massachusetts Hall, Cambridge, MA 02138 USA
- Nathan Tintle
- Department of Mathematics, Statistics and Computer Science, Dordt College, 498 4th Ave. NE, Sioux Center, IA 51250 USA
|
11
|
Detecting multiple variants associated with disease based on sequencing data of case-parent trios. J Hum Genet 2016; 61:851-860. [PMID: 27278787 DOI: 10.1038/jhg.2016.63] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/17/2016] [Revised: 05/02/2016] [Accepted: 05/03/2016] [Indexed: 01/13/2023]
Abstract
With the advance of next-generation sequencing technology, rare variants join common ones in explaining a larger proportion of heritability. The coexistence of common with rare, causal with neutral, and deleterious with protective variants is the norm and should be appropriately addressed. Some existing methods suffer from low power when one or more forms of coexistence are present, impeding their application in practice. In this paper, for case-parent trios, pseudocontrols are constructed using the nontransmitted alleles of the parents. The Kullback-Leibler divergence is utilized to measure the difference between the distributions of variants in a genetic region for the affected children and pseudocontrols, and two nonparametric test statistics, KLTT and cKLTT, are proposed. Extensive simulations show that they are robust to opposite directions of effect among the causal variants and to the number of neutral variants, and that they outperform the existing methods when both rare and common variants are involved. Furthermore, their efficiency is demonstrated in an application to data from the Framingham Heart Study.
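To make the idea of comparing variant distributions with a Kullback-Leibler divergence concrete, here is a minimal sketch. It is not the KLTT or cKLTT statistic from the paper, whose exact form is not given in the abstract. The function names, the tuple encoding of multi-variant patterns, the additive smoothing constant `pseudo`, and the symmetrised form of the divergence are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) for two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def smoothed_freqs(
    patterns: List[Hashable], support: List[Hashable], pseudo: float
) -> List[float]:
    """Empirical frequencies with additive smoothing so no cell is zero."""
    counts = Counter(patterns)
    total = len(patterns) + pseudo * len(support)
    return [(counts[s] + pseudo) / total for s in support]


def symmetric_kl(
    case_patterns: List[Hashable],
    pseudo_patterns: List[Hashable],
    pseudo: float = 0.5,
) -> float:
    """Symmetrised KL divergence between the pattern distributions of
    affected children and pseudocontrols (illustrative only)."""
    support = sorted(set(case_patterns) | set(pseudo_patterns))
    p = smoothed_freqs(case_patterns, support, pseudo)
    q = smoothed_freqs(pseudo_patterns, support, pseudo)
    return kl_divergence(p, q) + kl_divergence(q, p)


# Toy patterns over two variant sites (1 = minor allele present).
children = [(1, 0), (1, 0), (1, 1), (0, 0)]
pseudocontrols = [(0, 0), (0, 0), (0, 1), (0, 0)]
toy_divergence = symmetric_kl(children, pseudocontrols)
```

Because the divergence depends only on how the two distributions differ, not on the sign of any single variant's effect, a statistic of this kind does not cancel when causal variants act in opposite directions.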
|
12
|
Wang L, Choi S, Lee S, Park T, Won S. Comparing family-based rare variant association tests for dichotomous phenotypes. BMC Proc 2016; 10:181-186. [PMID: 27980633 PMCID: PMC5133528 DOI: 10.1186/s12919-016-0027-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 02/06/2023]
Abstract
Background It has been repeatedly stressed that family-based samples suffer less from genetic heterogeneity and that association analyses with family-based samples are expected to be powerful for detecting susceptibility loci for rare disease. Various approaches for rare-variant analysis with family-based samples have been proposed. Methods In this report, the performance of existing methods was compared using the simulated data set provided as part of Genetic Analysis Workshop 19 (GAW19). We considered the rare variant transmission disequilibrium test (RV-TDT), generalized estimating equations-based kernel association (GEE-KM) test, an extended combined multivariate and collapsing test for pedigree data (known as Pedigree Combined Multivariate and Collapsing [PedCMC]), gene-level kernel and burden association tests with disease status for pedigree data (PedGene), and the family-based rare variant association test (FARVAT). Results The results show that PedGene and FARVAT are usually the most efficient, and the optimal test statistic provided by FARVAT is robust under different disease models. Furthermore, FARVAT is implemented in C++ and is computationally faster than the other methods. Conclusions Considering both statistical and computational efficiency, we conclude that FARVAT is a good choice for rare-variant analysis with extended families.
Affiliation(s)
- Longfei Wang
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742 Korea
- Sungkyoung Choi
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742 Korea
- Sungyoung Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742 Korea
- Taesung Park
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742 Korea; Department of Statistics, Seoul National University, Seoul, 151-742 Korea
- Sungho Won
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 151-742 Korea; Department of Public Health Science, Seoul National University, Seoul, 151-742 Korea; Institute of Health Environment, Seoul National University, Seoul, 151-742 Korea
|
13
|
Sippy R, Kolesar JM, Darst BF, Engelman CD. Prioritization of family member sequencing for the detection of rare variants. BMC Proc 2016; 10:227-231. [PMID: 27980641 PMCID: PMC5133500 DOI: 10.1186/s12919-016-0035-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]
Abstract
BACKGROUND The advent of affordable sequencing has enabled researchers to discover many variants contributing to disease, including rare variants. There are methods for determining the most informative individuals for sequencing, but the application of these methods is more complex when working with families. Sets of large families can be beneficial in finding rare variants, but it may be unfeasible to sequence all members of these family sets. METHODS Using simulated data from the Genetic Analysis Workshop 19, we apply multiple regression to identify cases and controls. To find the best controls for each case, we used kinship coefficients to match within families. Selected cases and controls were analyzed for rare variants, collapsed by gene, associated with hypertension using the family-based rare variant association test (FARVAT). RESULTS The gene with the strongest simulated effect, MAP4, did not meet the Bonferroni corrected significance threshold. However, analysis of cases and controls using our selection method substantially improved the significance of MAP4, despite the reduction in sample size. CONCLUSIONS Taking the additional steps to select the optimal cases and controls from large family data sets can help ensure that only informative individuals are included in analysis and may improve the ability to detect rare variants.
Affiliation(s)
- Rachel Sippy
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 WARF Building, Madison, WI 53726 USA
- Jill M Kolesar
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 WARF Building, Madison, WI 53726 USA
- School of Pharmacy, University of Wisconsin-Madison, 777 Highland Avenue, Madison, WI 53705 USA
- Burcu F Darst
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 WARF Building, Madison, WI 53726 USA
- Corinne D Engelman
- Department of Population Health Sciences, University of Wisconsin-Madison, 610 WARF Building, Madison, WI 53726 USA
|
14
|
Increasing Generality and Power of Rare-Variant Tests by Utilizing Extended Pedigrees. Am J Hum Genet 2016; 99:846-859. [PMID: 27666371 DOI: 10.1016/j.ajhg.2016.08.015] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 02/13/2016] [Accepted: 08/17/2016] [Indexed: 11/24/2022]
Abstract
Recently, multiple studies have performed whole-exome or whole-genome sequencing to identify groups of rare variants associated with complex traits and diseases. They have primarily utilized case-control study designs that often require thousands of individuals to reach acceptable statistical power. Family-based studies can be more powerful because a rare variant can be enriched in an extended pedigree and segregate with the phenotype. Although many methods have been proposed for using family data to discover rare variants involved in a disease, a majority of them focus on a specific pedigree structure and are designed to analyze either binary or continuously measured outcomes. In this article, we propose RareIBD, a general and powerful approach to identifying rare variants involved in disease susceptibility. Our method can be applied to large extended families of arbitrary structure, including pedigrees with only affected individuals. The method accommodates both binary and quantitative traits. A series of simulation experiments suggest that RareIBD is a powerful test that outperforms existing approaches. In addition, our method accounts for individuals in top generations, which are not usually genotyped in extended families. In contrast to available statistical tests, RareIBD generates accurate p values even when genetic data from these individuals are missing. We applied RareIBD, as well as other methods, to two extended family datasets generated by different genotyping technologies and representing different ethnicities. The analysis of real data confirmed that RareIBD is the only method that properly controls type I error.
|
15
|
The impact of genotype calling errors on family-based studies. Sci Rep 2016; 6:28323. [PMID: 27328765 PMCID: PMC4916415 DOI: 10.1038/srep28323] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 03/31/2016] [Accepted: 05/31/2016] [Indexed: 02/07/2023]
Abstract
Family-based sequencing studies have unique advantages in enriching rare variants, controlling population stratification, and improving genotype calling. Standard genotype calling algorithms are less likely to call rare variants correctly, often mistakenly calling heterozygotes as reference homozygotes. The consequences of such non-random errors on association tests for rare variants are unclear, particularly in transmission-based tests. In this study, we investigated the impact of genotyping errors on rare variant association tests of family-based sequence data. We performed a comprehensive analysis to study how genotype calling errors affect type I error and statistical power of transmission-based association tests using a variety of realistic parameters in family-based sequencing studies. In simulation studies, we found that biased genotype calling errors yielded not only an inflation of type I error but also a power loss of association tests. We further confirmed our observation using exome sequence data from an autism project. We concluded that non-symmetric genotype calling errors need careful consideration in the analysis of family-based sequence data and we provided practical guidance on ameliorating the test bias.
|
16
|
Choi S, Lee S, Qiao D, Hardin M, Cho MH, Silverman EK, Park T, Won S. FARVATX: Family-Based Rare Variant Association Test for X-Linked Genes. Genet Epidemiol 2016; 40:475-85. [PMID: 27325607 DOI: 10.1002/gepi.21979] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 07/30/2015] [Revised: 03/05/2016] [Accepted: 04/04/2016] [Indexed: 11/06/2022]
Abstract
Although the X chromosome has many genes that are functionally related to human diseases, the complicated biological properties of the X chromosome have prevented efficient genetic association analyses, and only a few significantly associated X-linked variants have been reported for complex traits. For instance, dosage compensation of X-linked genes is often achieved via the inactivation of one allele in each X-linked variant in females; however, some X-linked variants can escape this X chromosome inactivation. Efficient genetic analyses cannot be conducted without prior knowledge about the gene expression process of X-linked variants, and misspecified information can lead to power loss. In this report, we propose new statistical methods for rare X-linked variant genetic association analysis of dichotomous phenotypes with family-based samples. The proposed methods are computationally efficient and can complete X-linked analyses within a few hours. Simulation studies demonstrate the statistical efficiency of the proposed methods, which were then applied to rare-variant association analysis of the X chromosome in chronic obstructive pulmonary disease. Some promising significant X-linked genes were identified, illustrating the practical importance of the proposed methods.
Affiliation(s)
- Sungkyoung Choi
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea
- Sungyoung Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea
- Dandi Qiao
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
- Megan Hardin
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts, United States of America
- Michael H Cho
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts, United States of America
- Edwin K Silverman
- Channing Division of Network Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts, United States of America; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, Massachusetts, United States of America
- Taesung Park
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea; Department of Statistics, Seoul National University, Seoul, Korea
- Sungho Won
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea; Department of Public Health Science, Seoul National University, Seoul, Korea; Institute of Health and Environment, Seoul National University, Seoul, Korea
|
17
|
Vsevolozhskaya OA, Zaykin DV, Barondess DA, Tong X, Jadhav S, Lu Q. Uncovering Local Trends in Genetic Effects of Multiple Phenotypes via Functional Linear Models. Genet Epidemiol 2016; 40:210-221. [PMID: 27027515 PMCID: PMC4817279 DOI: 10.1002/gepi.21955] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 09/07/2015] [Revised: 12/04/2015] [Accepted: 12/14/2015] [Indexed: 12/27/2022]
Abstract
Recent technological advances equipped researchers with capabilities that go beyond traditional genotyping of loci known to be polymorphic in a general population. Genetic sequences of study participants can now be assessed directly. This capability removed technology-driven bias toward scoring predominantly common polymorphisms and let researchers reveal a wealth of rare and sample-specific variants. Although the relative contributions of rare and common polymorphisms to trait variation are being debated, researchers are faced with the need for new statistical tools for simultaneous evaluation of all variants within a region. Several research groups demonstrated flexibility and good statistical power of the functional linear model approach. In this work we extend previous developments to allow inclusion of multiple traits and adjustment for additional covariates. Our functional approach is unique in that it provides a nuanced depiction of effects and interactions for the variables in the model by representing them as curves varying over a genetic region. We demonstrate flexibility and competitive power of our approach by contrasting its performance with commonly used statistical tools and illustrate its potential for discovery and characterization of genetic architecture of complex traits using sequencing data from the Dallas Heart Study.
Affiliation(s)
- Dmitri V. Zaykin
- National Institute of Environmental Health Sciences, National Institutes of Health, USA
- David A. Barondess
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, USA
- Xiaoren Tong
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, USA
- Sneha Jadhav
- Department of Statistics, Michigan State University, East Lansing, USA
- Qing Lu
- Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, USA
|
18
|
Abstract
Empirical studies and evolutionary theory support a role for rare variants in the etiology of complex traits. Given this motivation and increasing affordability of whole-exome and whole-genome sequencing, methods for rare variant association have been an active area of research for the past decade. Here, we provide a survey of the current literature and developments from the Genetic Analysis Workshop 19 (GAW19) Collapsing Rare Variants working group. In particular, we present the generalized linear regression framework and associated score statistic for the 2 major types of methods: burden and variance components methods. We further show that by simply modifying weights within these frameworks we arrive at many of the popular existing methods, for example, the cohort allelic sums test and sequence kernel association test. Meta-analysis techniques are also described. Next, we describe the 6 contributions from the GAW19 Collapsing Rare Variants working group. These included development of new methods, such as a retrospective likelihood for family data, a method using genomic structure to compare cases and controls, a haplotype-based meta-analysis, and a permutation-based method for combining different statistical tests. In addition, one contribution compared a mega-analysis of family-based and population-based data to meta-analysis. Finally, the power of existing family-based methods for binary traits was compared. We conclude with suggestions for open research questions.
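The contrast between the two families of score statistics named in this abstract can be illustrated with a small sketch. It assumes unrelated individuals, no covariates other than an intercept, and user-supplied weights; it computes only the test statistics, not p values. The function names are hypothetical, and the formulas are the textbook forms (squared weighted sum of per-variant scores for a burden test, weighted sum of squared scores for a variance-component test), not code taken from the surveyed papers.

```python
from typing import List, Sequence


def variant_scores(
    genotypes: Sequence[Sequence[int]], phenotype: Sequence[float]
) -> List[float]:
    """Per-variant score S_j = sum_i g_ij * (y_i - mean(y)).

    Assumes an intercept-only null model, so the residual is simply
    the phenotype minus its mean.
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    resid = [y - mean_y for y in phenotype]
    n_variants = len(genotypes[0])
    return [
        sum(genotypes[i][j] * resid[i] for i in range(n))
        for j in range(n_variants)
    ]


def burden_statistic(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Burden form: collapse first, then square: (sum_j w_j S_j)^2."""
    return sum(w * s for w, s in zip(weights, scores)) ** 2


def variance_component_statistic(
    scores: Sequence[float], weights: Sequence[float]
) -> float:
    """Variance-component form: square first, then sum: sum_j (w_j S_j)^2."""
    return sum((w * s) ** 2 for w, s in zip(weights, scores))


# Toy data: 4 individuals, 2 variants with opposite directions of effect.
toy_genotypes = [[1, 0], [1, 0], [0, 1], [0, 1]]
toy_phenotype = [1.0, 1.0, 0.0, 0.0]
toy_weights = [1.0, 1.0]

toy_scores = variant_scores(toy_genotypes, toy_phenotype)
toy_burden = burden_statistic(toy_scores, toy_weights)
toy_vc = variance_component_statistic(toy_scores, toy_weights)
```

In this toy example one variant is carried only by cases and the other only by controls, so the per-variant scores cancel in the burden statistic but not in the variance-component statistic. Changing `toy_weights` (for example, to down-weight more common variants) changes which published test the two forms resemble.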
Affiliation(s)
- Stephanie A Santorico
- Department of Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO, 80217-3364, USA.
- Audrey E Hendricks
- Department of Mathematical and Statistical Sciences, University of Colorado Denver, Denver, CO, 80217-3364, USA.
|
19
|
Lakhal-Chaieb L, Oualkacha K, Richards BJ, Greenwood CM. A rare variant association test in family-based designs and non-normal quantitative traits. Stat Med 2015; 35:905-21. [DOI: 10.1002/sim.6750] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/18/2014] [Revised: 09/04/2015] [Accepted: 09/05/2015] [Indexed: 12/13/2022]
Affiliation(s)
- Lajmi Lakhal-Chaieb
- Département de mathématiques et statistique; Université Laval; Québec G1V 0A6 Québec Canada
- Karim Oualkacha
- Département de mathématiques; Université de Québec À Montréal; Montreal Québec Canada
- Brent J. Richards
- Lady Davis Institute for Medical Research; Jewish General Hospital; Montreal Québec Canada
- Department of Epidemiology, Biostatistics and Occupational Health; McGill University; Montreal Québec Canada
- Department of Twin Research; King's College London; London U.K
- Celia M.T. Greenwood
- Lady Davis Institute for Medical Research; Jewish General Hospital; Montreal Québec Canada
- Department of Epidemiology, Biostatistics and Occupational Health; McGill University; Montreal Québec Canada
- Departments of Oncology and Human Genetics; McGill University; Montreal Québec Canada
|
20
|
A statistical approach for rare-variant association testing in affected sibships. Am J Hum Genet 2015; 96:543-54. [PMID: 25799106 DOI: 10.1016/j.ajhg.2015.01.020] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 10/08/2014] [Accepted: 01/30/2015] [Indexed: 11/21/2022]
Abstract
Sequencing and exome-chip technologies have motivated development of novel statistical tests to identify rare genetic variation that influences complex diseases. Although many rare-variant association tests exist for case-control or cross-sectional studies, far fewer methods exist for testing association in families. This is unfortunate, because cosegregation of rare variation and disease status in families can amplify association signals for rare variants. Many researchers have begun sequencing (or genotyping via exome chips) familial samples that were either recently collected or previously collected for linkage studies. Because many linkage studies of complex diseases sampled affected sibships, we propose a strategy for association testing of rare variants for use in this study design. The logic behind our approach is that rare susceptibility variants should be found more often on regions shared identical by descent by affected sibling pairs than on regions not shared identical by descent. We propose both burden and variance-component tests of rare variation that are applicable to affected sibships of arbitrary size and that do not require genotype information from unaffected siblings or independent controls. Our approaches are robust to population stratification and produce analytic p values, thereby enabling our approach to scale easily to genome-wide studies of rare variation. We illustrate our methods by using simulated data and exome chip data from sibships ascertained for hypertension collected as part of the Genetic Epidemiology Network of Arteriopathy (GENOA) study.
|
21
|
Uh HW, Beekman M, Meulenbelt I, Houwing-Duistermaat JJ. Genotype-Based Score Test for Association Testing in Families. Statistics in Biosciences 2015; 7:394-416. [PMID: 26473021 PMCID: PMC4596911 DOI: 10.1007/s12561-015-9128-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/28/2013] [Revised: 11/20/2014] [Accepted: 02/17/2015] [Indexed: 11/29/2022]
Abstract
The multiplex-case and control design in which multiple cases are sampled from the same family is considered. In such studies phenotype information of the un-genotyped relatives might be available. We intend to use additional family information when performing genetic association tests. A score test is revisited to provide a flexible framework to accommodate various genetic models and to improve power of the association test by adding available family information. The proposed test accounts for correlations induced by multiple cases from the same pedigree, directly deals with X-linked SNPs in mixed-sex-related samples, and incorporates additional phenotypic information such as the number of (un-genotyped) siblings and parents with similar symptoms by assigning the weights to (genotyped) multiplex cases. In addition, the score test directly incorporates posterior probabilities of imputed genotypes, which leads to an efficiency measure that reflects imputation uncertainty on the test conducted. The proposed test is applied to real applications for illustration. Its efficiency is demonstrated via simulations.
Affiliation(s)
- Hae-Won Uh
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, S-5-P, P.O. Box 9600, 2300 RC Leiden, The Netherlands
- Marian Beekman
- Department of Molecular Epidemiology, Leiden University Medical Center, S-5-P, P.O. Box 9600, 2300 RC Leiden, The Netherlands
- Ingrid Meulenbelt
- Department of Molecular Epidemiology, Leiden University Medical Center, S-5-P, P.O. Box 9600, 2300 RC Leiden, The Netherlands
- Jeanine J Houwing-Duistermaat
- Department of Medical Statistics and Bioinformatics, Leiden University Medical Center, S-5-P, P.O. Box 9600, 2300 RC Leiden, The Netherlands
|
22
|
Abstract
Family data and rare variants are two key features of whole genome sequencing analysis for hunting the missing heritability of common human diseases. Recently, Zhu and Xiong proposed the generalized T² tests that combine rare variant analysis and family data analysis. In similar fashion, we developed the extended T² tests for longitudinal whole genome sequencing data for family-based association studies. The new methods simultaneously incorporate three correlation sources: from linkage disequilibrium, from pedigree structure, and from the repeated measures of covariates. We assess and compare these methods using the simulated data from Genetic Analysis Workshop 18. We show that, in general, the extended T² tests incorporating longitudinal repeated measures have higher power than the single-time-point T² tests in detecting hypertension-associated genome segments.
Affiliation(s)
- Yiwei Liu
- Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609-2280, USA
- Jing Xuan
- Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609-2280, USA
- Zheyang Wu
- Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institute Road, Worcester, MA 01609-2280, USA
|
23
|
Chen H, Malzahn D, Balliu B, Li C, Bailey JN. Testing genetic association with rare and common variants in family data. Genet Epidemiol 2014; 38 Suppl 1:S37-43. [PMID: 25112186 DOI: 10.1002/gepi.21823] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 12/25/2022]
Abstract
With the advance of next-generation sequencing technologies in recent years, rare genetic variant data have now become available for genetic epidemiology studies. For family samples, however, only a few statistical methods for association analysis of rare genetic variants have been developed. Rare variant approaches are of great interest, particularly for family data, because samples enriched for trait-relevant variants can be ascertained and rare variants are putatively enriched through segregation. To facilitate the evaluation of existing and new rare variant testing approaches for analyzing family data, Genetic Analysis Workshop 18 (GAW18) provided genotype and next-generation sequencing data and longitudinal blood pressure traits from extended pedigrees of Mexican American families from the San Antonio Family Study. Our GAW18 group members analyzed real and simulated phenotype data from GAW18 by using generalized linear mixed-effects models or principal components to adjust for familial correlation or by testing binary traits using a correction factor for familial effects. With one exception, approaches dealt with the extended pedigrees in their original state using information based on the kinship matrix or alternative genetic similarity measures. For simulated data our group demonstrated that the family-based kernel machine score test is superior in power to family-based single-marker or burden tests, except in a few specific scenarios. For real data three contributions identified significant associations. They substantially reduced the number of tests before performing the association analysis. We conclude from our real data analyses that further development of strategies for targeted testing or more focused screening of genetic variants is strongly desirable.
Affiliation(s)
- Han Chen
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, United States of America
|
24
|
Affiliation(s)
- Jennifer H Barrett
- Section of Epidemiology and Biostatistics, Leeds Institute of Cancer and Pathology, University of Leeds, Leeds, UK
|
25
|
Choi S, Lee S, Cichon S, Nöthen MM, Lange C, Park T, Won S. FARVAT: a family-based rare variant association test. Bioinformatics 2014; 30:3197-205. [PMID: 25075118 DOI: 10.1093/bioinformatics/btu496] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 01/09/2023]
Abstract
MOTIVATION Individuals in each family are genetically more homogeneous than unrelated individuals, and family-based designs are often recommended for the analysis of rare variants. However, despite the importance of family-based samples analysis, few statistical methods for rare variant association analysis are available. RESULTS In this report, we propose a FAmily-based Rare Variant Association Test (FARVAT). FARVAT is based on the quasi-likelihood of whole families, and is statistically and computationally efficient for the extended families. FARVAT assumed that families were ascertained with the disease status of family members, and incorporation of the estimated genetic relationship matrix to the proposed method provided robustness under the presence of the population substructure. Depending on the choice of working matrix, our method could be a burden test or a variance component test, and could be extended to the SKAT-O-type statistic. FARVAT was implemented in C++, and application of the proposed method to schizophrenia data and simulated data for GAW17 illustrated its practical importance. AVAILABILITY The software calculates various statistics for the analysis of related samples, and it is freely downloadable from http://healthstats.snu.ac.kr/software/farvat. CONTACT won1@snu.ac.kr or tspark@stats.snu.ac.kr SUPPLEMENTARY INFORMATION supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sungkyoung Choi, Sungyoung Lee, Sven Cichon, Markus M Nöthen, Christoph Lange, Taesung Park, Sungho Won
- Interdisciplinary Program in Bioinformatics, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea; Institute of Human Genetics, University of Bonn, D-53127 Bonn, Germany; Department of Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA; Harvard Medical School, 25 Shattuck St, Boston, MA 02115, USA; Center for Genomic Medicine, Brigham and Women's Hospital, 75 Francis Street, Boston, MA 02115, USA; Department of Biostatistics, Harvard School of Public Health, 667 Huntington Ave, Boston, MA 02115, USA; Institute for Genomic Mathematics, University of Bonn, D-53127 Bonn, Germany; German Center for Neurodegenerative Diseases, D-53127 Bonn, Germany; Department of Statistics, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea; and Department of Public Health Science, Seoul National University, 1 Kwanak-ro Kwanak-gu, Seoul 151-742, Korea
|
26
|
Guo W, Shugart YY. The power comparison of the haplotype-based collapsing tests and the variant-based collapsing tests for detecting rare variants in pedigrees. BMC Genomics 2014; 15:632. [PMID: 25070353 PMCID: PMC4131059 DOI: 10.1186/1471-2164-15-632] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 10/18/2013] [Accepted: 07/18/2014] [Indexed: 11/20/2022] Open
Abstract
Background Both common and rare genetic variants have been shown to contribute to the etiology of complex diseases. Recent genome-wide association studies (GWAS) have successfully investigated how common variants contribute to the genetic factors associated with common human diseases. However, understanding the impact of rare variants, which are abundant in the human population (one in every 17 bases), remains challenging. A number of statistical tests have been developed to analyze collapsed rare variants identified by association tests. Here, we propose a haplotype-based approach. This work was inspired by the existing statistical framework of the pedigree disequilibrium test (PDT), which uses genetic data to assess the effects of variants in general pedigrees. We aim to compare the performance of the haplotype-based approach and the rare variant-based approach for detecting rare causal variants in pedigrees. Results Extensive simulations in the sequencing setting were carried out to evaluate and compare the haplotype-based approach with rare variant methods that draw on a more conventional collapsing strategy. Across a variety of scenarios, the haplotype-based pedigree tests had greater statistical power than the rare variant-based pedigree tests when the disease of interest was mainly caused by rare haplotypes (with multiple rare alleles), and vice versa when disease was caused by rare variants acting independently. In most other situations, when disease was caused both by haplotypes with multiple rare alleles and by rare variants with similar effects, the two approaches provided similar power in testing for association. Conclusions The haplotype-based approach was designed to assess the role of rare and potentially causal haplotypes. The proposed rare variant-based pedigree tests were designed to assess the role of rare and potentially causal variants. This study documents the situations in which each method performs better than the other. All tests have been implemented in software submitted to the Comprehensive R Archive Network (CRAN) for general use under the name rvHPDT.
Affiliation(s)
- Yin Yao Shugart
- Division of Intramural Division Program, National Institute of Mental Health, National Institutes of Health, 35 Convent Drive, Bethesda, MD 20892, USA.
|
27
|
Ding X, Su S, Nandakumar K, Wang X, Fardo DW. A 2-step penalized regression method for family-based next-generation sequencing association studies. BMC Proc 2014; 8:S25. [PMID: 25519315 PMCID: PMC4143756 DOI: 10.1186/1753-6561-8-s1-s25] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open
Abstract
Large-scale genetic studies are often composed of related participants, and utilizing familial relationships can be cumbersome and computationally challenging. We present an approach to efficiently handle sequencing data from complex pedigrees that incorporates information from rare variants as well as common variants. Our method employs a 2-step procedure that sequentially regresses out correlation from familial relatedness and then uses the resulting phenotypic residuals in a penalized regression framework to test for associations with variants within genetic units. The operating characteristics of this approach are detailed using simulation data based on a large, multigenerational cohort.
Affiliation(s)
- Xiuhua Ding
- Department of Biostatistics, University of Kentucky College of Public Health, 111 Washington Ave, Lexington, KY 40536-0003, USA
- Shaoyong Su
- Institute of Public and Preventive Health, Department of Pediatrics, Georgia Health Sciences University, School of Medicine, Georgia Prevention Center, HS-1640, Augusta, GA 30912, USA
- Kannabiran Nandakumar
- Department of Biostatistics, University of Kentucky College of Public Health, 111 Washington Ave, Lexington, KY 40536-0003, USA
- Xiaoling Wang
- Institute of Public and Preventive Health, Department of Pediatrics, Georgia Health Sciences University, School of Medicine, Georgia Prevention Center, HS-1640, Augusta, GA 30912, USA
- David W Fardo
- Department of Biostatistics, University of Kentucky College of Public Health, 111 Washington Ave, Lexington, KY 40536-0003, USA
|
28
|
Hainline A, Alvarez C, Luedtke A, Greco B, Beck A, Tintle NL. Evaluation of the power and type I error of recently proposed family-based tests of association for rare variants. BMC Proc 2014; 8:S36. [PMID: 25519321 PMCID: PMC4143711 DOI: 10.1186/1753-6561-8-s1-s36] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/03/2022] Open
Abstract
Until very recently, few methods existed to analyze rare-variant association with binary phenotypes in complex pedigrees. We consider a set of recently proposed methods applied to the simulated and real hypertension phenotype as part of the Genetic Analysis Workshop 18. Minimal power of the methods is observed for genes containing variants with weak effects on the phenotype. Application of the methods to the real hypertension phenotype yielded no genes meeting a strict Bonferroni cutoff of significance. Some prior literature connects 3 of the 5 most associated genes (p < 1 × 10⁻⁴) to hypertension or related phenotypes. Further methodological development is needed to extend these methods to handle covariates, and to explore more powerful test alternatives.
Affiliation(s)
- Allison Hainline
- Department of Statistics, Baylor University, 1311 S 5th St., Waco, TX 76798, USA
- Carolina Alvarez
- Department of Biostatistics, Florida International University, 11200 SW 8th St., Miami, FL 33199, USA
- Alexander Luedtke
- Division of Biostatistics, University of California, Berkeley, 101 Sproul Hall, Berkeley, CA 94720, USA
- Brian Greco
- Department of Mathematics and Statistics, Grinnell College, 733 Broad St., Grinnell, IA 50112, USA
- Andrew Beck
- Department of Mathematics, Loyola University Chicago, 1032 W. Sheridan Rd, Chicago, IL 60660, USA
- Nathan L Tintle
- Department of Mathematics, Statistics and Computer Science, 498 4th Ave. NE, Dordt College, Sioux Center, IA 51250, USA
|
29
|
Greco B, Luedtke A, Hainline A, Alvarez C, Beck A, Tintle NL. Application of family-based tests of association for rare variants to pathways. BMC Proc 2014; 8:S105. [PMID: 25519359 PMCID: PMC4143675 DOI: 10.1186/1753-6561-8-s1-s105] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022] Open
Abstract
Pathway analysis approaches for sequence data typically operate either in a single stage (all variants within all genes in the pathway are combined into a single, very large set of variants that can then be analyzed using standard "gene-based" test statistics) or in 2 stages (gene-based p values are computed for all genes in the pathway, and then the gene-based p values are combined into a single pathway p value). To date, little consideration has been given to the performance of gene-based tests (typically designed for a smaller number of single-nucleotide variants [SNVs]) when the number of SNVs in the gene or in the pathway is very large and the genotypes come from sequence data organized in large pedigrees. We consider recently proposed gene-based tests for rare variants from complex pedigrees that test for association between a large set of SNVs and a qualitative phenotype of interest (1-stage analyses) as well as 2-stage approaches. We find that many of these methods show inflated type I errors when the number of SNVs in the gene or the pathway is large (>200 SNVs) and when using standard approaches to estimate the genotype covariance matrix. Alternative methods are needed when testing very large sets of SNVs in 1-stage approaches.
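The second stage of a 2-stage pathway analysis, combining gene-based p values into one pathway p value, can be sketched as follows. The abstract does not say which combination rule was used. The example assumes Fisher's method, one common choice, and also assumes that the gene-based p values are independent, which may not hold for genes in linkage disequilibrium or for related samples. The function name is illustrative.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null. For even degrees of freedom
    the upper tail has a closed form:
        P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # equals X / 2
    term = 1.0   # (x/2)^i / i! for i = 0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical gene-based p values for the genes in one pathway.
gene_p_values = [0.04, 0.20, 0.65, 0.01]
pathway_p = fisher_combined_p(gene_p_values)
```

A single p value is returned unchanged, which is a quick sanity check on the closed-form tail.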
Affiliation(s)
- Brian Greco
- Department of Mathematics and Statistics, Grinnell College, 1115 8th Ave, Grinnell, IA 50112, USA
- Alexander Luedtke
- Division of Biostatistics, UC Berkeley, 367 Evans Hall, Berkeley, CA 94720, USA
- Allison Hainline
- Department of Statistics, Baylor University, 1511 S. 5th St, Waco, TX 76798, USA
- Carolina Alvarez
- Department of Biostatistics, Florida International University, 11200 SW 8th St., Miami, FL 33199, USA
- Andrew Beck
- Department of Mathematics, Loyola University Chicago, 1052 W Loyola Ave, Chicago, IL 60660, USA
- Nathan L Tintle
- Department of Mathematics, Statistics and Computer Science, 498 4th Ave. NE, Dordt College, Sioux Center, IA 51250, USA
|
30
|
Abstract
Genome-wide association studies are very powerful in determining the genetic variants affecting complex diseases. Most of the available methods are very useful in detecting association between common variants and complex diseases. Recently, methods to detect rare variants in association with complex diseases have been developed with the increasingly available sequencing data from next-generation sequencing. In this paper, we evaluate and compare several of these recent methods for performing statistical association using whole genome sequencing data in pedigrees. Specifically, functional principal component analysis (FPCA), the extended combined multivariate and collapsing (CMC) method for families, a generalized T² method, and the chi-square minimum approach were compared by analyzing all the genetic variants, common and rare, of both the real data set and the simulated data set provided as part of Genetic Analysis Workshop 18.
Affiliation(s)
- George Mathew
- Department of Mathematics, Missouri State University, 901 South National Avenue, Springfield, Missouri 65897, USA
- Varghese George
- Department of Biostatistics & Epidemiology, Georgia Regents University, 1469 Laney Walker Boulevard, Augusta, Georgia 30912-4900, USA
- Hongyan Xu
- Department of Biostatistics & Epidemiology, Georgia Regents University, 1469 Laney Walker Boulevard, Augusta, Georgia 30912-4900, USA
|
31
|
Sampson JN, Wheeler B, Li P, Shi J. Leveraging local identity-by-descent increases the power of case/control GWAS with related individuals. Ann Appl Stat 2014; 8:974-998. [DOI: 10.1214/14-aoas715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
32
|
Affiliation(s)
- Harvey L Levy
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts; Department of Pediatrics, Harvard Medical School, Boston, Massachusetts
|
33
|
Test of rare variant association based on affected sib-pairs. Eur J Hum Genet 2014; 23:229-37. [PMID: 24667785 DOI: 10.1038/ejhg.2014.43] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/07/2013] [Revised: 11/06/2013] [Accepted: 12/30/2013] [Indexed: 11/08/2022] Open
Abstract
With the development of sequencing techniques, there is increasing interest in detecting associations between rare variants and complex traits. Quite a few statistical methods for detecting such associations have been developed for unrelated individuals. Statistical methods for detecting rare variant associations under family-based designs have not received as much attention. Recent studies show that rare disease variants will be enriched in family data, and thus family-based designs may improve power to detect rare variant associations. In this article, we propose a novel test of association between an optimally weighted combination of variants and the trait of interest in affected sib-pairs. The optimal weights are analytically derived and can be calculated from sampled genotypes and phenotypes. Based on the optimal weights, the proposed method is robust to the directions of the effects of causal variants and is less affected by neutral variants than existing methods are. Our simulation results show that, in all cases, the proposed method is substantially more powerful than existing methods based on unrelated individuals and existing methods based on affected sib-pairs.
|
34
|
Preston MD, Dudbridge F. Utilising family-based designs for detecting rare variant disease associations. Ann Hum Genet 2014; 78:129-40. [PMID: 24571231 PMCID: PMC4292528 DOI: 10.1111/ahg.12051] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 06/04/2013] [Accepted: 11/17/2013] [Indexed: 01/04/2023]
Abstract
Rare genetic variants are thought to be important components in the causality of many diseases but discovering these associations is challenging. We demonstrate how best to use family-based designs to improve the power to detect rare variant disease associations. We show that using genetic data from enriched families (those pedigrees with greater than one affected member) increases the power and sensitivity of existing case-control rare variant tests. However, we show that transmission- (or within-family-) based tests do not benefit from this enrichment. This means that, in studies where a limited amount of genotyping is available, choosing a single case from each of many pedigrees has greater power than selecting multiple cases from fewer pedigrees. Finally, we show how a pseudo-case-control design allows a greater range of statistical tests to be applied to family data.
Affiliation(s)
- Mark D Preston
- London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Frank Dudbridge
- London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
|
35
|
Li B, Liu DJ, Leal SM. Identifying rare variants associated with complex traits via sequencing. Curr Protoc Hum Genet 2014; Chapter 1:Unit 1.26. [PMID: 23853079 DOI: 10.1002/0471142905.hg0126s78] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 12/12/2022]
Abstract
Although genome-wide association studies have been successful in detecting associations with common variants, there is currently an increasing interest in identifying low-frequency and rare variants associated with complex traits. Next-generation sequencing technologies make it feasible to survey the full spectrum of genetic variation in coding regions or the entire genome. However, association analysis for rare variants is challenging, and traditional methods are ineffective because of the low frequency of rare variants coupled with allelic heterogeneity. Recently a battery of new statistical methods has been proposed for identifying rare variants associated with complex traits. These methods test for associations by aggregating multiple rare variants across a gene or a genomic region or among a group of variants in the genome. In this unit, we describe key concepts for rare variant association for complex traits, survey some of the recent methods, discuss their statistical power under various scenarios, and provide practical guidance on analyzing next-generation sequencing data for identifying rare variants associated with complex traits.
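The idea of aggregating multiple rare variants across a gene can be sketched with a simple collapsing step. The example below is only an illustration and is not taken from the unit: it assumes genotypes coded as minor-allele counts (0, 1, 2), a hypothetical minor allele frequency (MAF) threshold of 0.01, and a carrier indicator as the aggregate. The resulting indicator could then be compared between cases and controls with any standard test.

```python
def collapse_rare_variants(genotypes, maf, threshold=0.01):
    """Collapse rare variants in a region into one carrier indicator per person.

    genotypes: list of rows, one per individual, each a list of minor-allele
               counts (0, 1, or 2) for the variants in the region.
    maf:       minor allele frequency of each variant, in column order.
    Returns a list with 1 if the individual carries at least one rare
    variant (MAF below the threshold) and 0 otherwise.
    """
    rare_columns = [j for j, freq in enumerate(maf) if freq < threshold]
    return [int(any(row[j] > 0 for j in rare_columns)) for row in genotypes]


# Hypothetical data: 4 individuals, 3 variants (the first one is common).
genotypes = [
    [0, 1, 0],
    [0, 0, 2],
    [1, 0, 0],
    [0, 0, 0],
]
maf = [0.2, 0.005, 0.003]
carrier = collapse_rare_variants(genotypes, maf)
```

Collapsing trades resolution for power: a single variant with a handful of carriers is nearly untestable, while the pooled indicator can have enough carriers to compare groups.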
Affiliation(s)
- Bingshan Li
- Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, USA
|
36
|
Seneviratne C, Franklin J, Beckett K, Ma JZ, Ait-Daoud N, Payne TJ, Johnson BA, Li MD. Association, interaction, and replication analysis of genes encoding serotonin transporter and 5-HT3 receptor subunits A and B in alcohol dependence. Hum Genet 2013; 132:1165-76. [PMID: 23757001 PMCID: PMC3775919 DOI: 10.1007/s00439-013-1319-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 01/17/2013] [Accepted: 05/26/2013] [Indexed: 12/12/2022]
Abstract
On the basis of the converging evidence showing regulation of drinking behavior by 5-HT3AB receptors and the serotonin transporter, we hypothesized that the interactive effects of genetic variations in the genes HTR3A, HTR3B, and SLC6A4 confer greater susceptibility to alcohol dependence (AD) than do their effects individually. We examined the associations of AD with 22 SNPs across HTR3A, HTR3B, and two functional variants in SLC6A4 in 500 AD and 280 healthy control individuals of European descent. We found that the alleles of the low-frequency SNPs rs33940208:T in HTR3A and rs2276305:A in HTR3B were inversely and nominally significantly associated with AD, with odds ratios (OR) and 95% confidence intervals of 0.212 (0.073, 0.616; P = 0.004) and 0.261 (0.088, 0.777; P = 0.016), respectively. Further, our gene-by-gene interaction analysis revealed that two four-variant models that differed by only one SNP carried a risk for AD (empirical P < 1 × 10⁻⁶ for prediction accuracy of the two models based on 10⁶ permutations). Subsequent analysis of these two interaction models revealed an OR of 2.71 and 2.80, respectively, for AD (P < 0.001) in carriers of genotype combinations 5'-HTTLPR:LL/LS(SLC6A4)-rs1042173:TT/TG(SLC6A4)-rs1176744:AC(HTR3B)-rs3782025:AG(HTR3B) and 5'-HTTLPR:LL/LS(SLC6A4)-rs10160548:GT/TT(HTR3A)-rs1176744:AC(HTR3B)-rs3782025:AG(HTR3B). Combining all five genotypes resulted in an OR of 3.095 (P = 2.0 × 10⁻⁴) for AD. Inspired by these findings, we conducted the analysis in an independent sample, OZ-ALC-GWAS (N = 6699), obtained from the NIH dbGaP database, which confirmed the findings, not only for all three risk genotype combinations (Z = 4.384, P = 1.0 × 10⁻⁵; Z = 3.155, P = 1.6 × 10⁻³; and Z = 3.389, P = 7.0 × 10⁻⁴, respectively), but also protective effects for rs33940208:T (χ² = 3.316, P = 0.0686) and rs2276305:A (χ² = 7.224, P = 0.007). These findings reveal significant interactive effects among variants in SLC6A4-HTR3A-HTR3B affecting AD. Further studies are needed to confirm these findings and characterize the molecular mechanisms underlying these effects.
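For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from carrier counts, a minimal sketch follows. It assumes a plain 2x2 table and the Wald interval on the log-odds scale. The abstract does not state which estimator the authors used, and the counts below are hypothetical rather than the study's data.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: cases carrying the allele,    b: cases not carrying it
    c: controls carrying the allele, d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical carrier counts for a low-frequency allele.
estimate, ci_lower, ci_upper = odds_ratio_with_ci(5, 495, 13, 267)
```

An interval that lies entirely below 1, as with the two protective alleles reported above, indicates an inverse association at the chosen confidence level.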
Affiliation(s)
- Chamindi Seneviratne
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Jason Franklin
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Katherine Beckett
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Jennie Z. Ma
- Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
- Nassima Ait-Daoud
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Thomas J. Payne
- ACT Center for Tobacco Treatment, Education and Research, Department of Otolaryngology, University of Mississippi Medical Center, Jackson, USA
- Bankole A. Johnson
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
- Ming D. Li
- Department of Psychiatry and Neurobehavioral Sciences, University of Virginia, 1670 Discovery Drive, Charlottesville, VA 22911, USA
|
37
|
Fang S, Zhang S, Sha Q. Detecting association of rare variants by testing an optimally weighted combination of variants for quantitative traits in general families. Ann Hum Genet 2013; 77:524-34. [PMID: 23968488 DOI: 10.1111/ahg.12038] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/09/2012] [Accepted: 07/10/2013] [Indexed: 12/01/2022]
Abstract
Although next-generation sequencing technology allows sequencing the whole genome of large groups of individuals, the development of powerful statistical methods for rare variant association studies is still underway. Even though many statistical methods have been developed for mapping rare variants, most of these methods are for unrelated individuals only, whereas family data have been shown to improve power to detect rare variants. Most existing methods for unrelated individuals essentially test the effect of a weighted combination of variants under different weighting schemes, and their performance depends on the weights used. Recently, researchers proposed a test for Testing the effect of an Optimally Weighted combination of variants (TOW) for unrelated individuals. In this article, we extend our previously developed TOW for unrelated individuals to family-based data and propose a novel test for Testing the effect of an Optimally Weighted combination of variants for Family-based designs (TOW-F). The optimal weights are analytically derived. The results of extensive simulation studies show that TOW-F is robust to population stratification in a wide range of population structures, is robust to the direction and magnitude of the effects of causal variants, and is relatively robust to the percentage of neutral variants.
Affiliation(s)
- Shurong Fang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, USA
|
38
|
Matullo G, Di Gaetano C, Guarrera S. Next generation sequencing and rare genetic variants: from human population studies to medical genetics. Environmental and Molecular Mutagenesis 2013; 54:518-532. [PMID: 23922201 DOI: 10.1002/em.21799] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/20/2013] [Revised: 05/31/2013] [Accepted: 06/09/2013] [Indexed: 06/02/2023]
Abstract
The allelic frequency spectrum emerging from several Next Generation Sequencing (NGS) projects is revealing important details about evolutionary and demographic forces that shaped the human genome. Herein, we discuss some of the achievements of the use of low-frequency and rare variants from NGS studies. The majority of variants that affect protein-coding regions are recent and rare. Often, the novel rare variants are enriched for deleterious alleles and are population-specific, making them suitable for the study of disease susceptibility. To investigate this kind of variation and its effects in association studies, very large sample sizes will be necessary to achieve sufficient statistical power. Moreover, as these variants are typically population-specific, the replication of disease associations across populations could be very difficult due to population stratification. Therefore, the design of experiments focusing on the identification of rare variants and their effects should be carefully planned. Although several successes have already been achieved through NGS for genetic epidemiology, pharmacogenetic and clinical purposes, with improvements of the sequencing technology and decreased costs, further advances are expected in the near future.
Affiliation(s)
- Giuseppe Matullo
- Dipartimento di Scienze Mediche, Università di Torino, Torino, Italy.
|
39
|
DeRycke MS, Gunawardena SR, Middha S, Asmann YW, Schaid DJ, McDonnell SK, Riska SM, Eckloff BW, Cunningham JM, Fridley BL, Serie DJ, Bamlet WR, Cicek MS, Jenkins MA, Duggan DJ, Buchanan D, Clendenning M, Haile RW, Woods MO, Gallinger SN, Casey G, Potter JD, Newcomb PA, Le Marchand L, Lindor NM, Thibodeau SN, Goode EL. Identification of novel variants in colorectal cancer families by high-throughput exome sequencing. Cancer Epidemiol Biomarkers Prev 2013; 22:1239-51. [PMID: 23637064 PMCID: PMC3704223 DOI: 10.1158/1055-9965.epi-12-1226] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND Colorectal cancer (CRC) in densely affected families without Lynch Syndrome may be due to mutations in undiscovered genetic loci. Familial linkage analyses have yielded disparate results; the use of exome sequencing in coding regions may identify novel segregating variants. METHODS We completed exome sequencing on 40 affected cases from 16 multicase pedigrees to identify novel loci. Variants shared among all sequenced cases within each family were identified and filtered to exclude common variants and single-nucleotide variants (SNV) predicted to be benign. RESULTS We identified 32 nonsense or splice-site SNVs, 375 missense SNVs, 1,394 synonymous or noncoding SNVs, and 50 indels in the 16 families. Of particular interest are two validated and replicated missense variants in CENPE and KIF23, which are both located within previously reported CRC linkage regions, on chromosomes 1 and 15, respectively. CONCLUSIONS Whole-exome sequencing identified DNA variants in multiple genes. Additional sequencing of these genes in additional samples will further elucidate the role of variants in these regions in CRC susceptibility. IMPACT Exome sequencing of familial CRC cases can identify novel rare variants that may influence disease risk.
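The filtering step described in METHODS, keeping variants shared by all sequenced cases within a family and excluding common variants, can be sketched with set intersections. This is an illustration only, not the authors' pipeline: the data layout, the function name, the 0.01 frequency cutoff, and the treatment of variants with unknown frequency as novel are all assumptions, and the filter on SNVs predicted to be benign is omitted.

```python
def shared_candidate_variants(cases_by_family, population_freq, max_freq=0.01):
    """Per family, keep variants carried by every sequenced case and not common.

    cases_by_family: {family_id: {case_id: iterable of variant ids}}
    population_freq: {variant id: population allele frequency}; variants
                     missing from this map are treated as novel (frequency 0).
    Returns {family_id: set of candidate variant ids}.
    """
    candidates = {}
    for family_id, case_variants in cases_by_family.items():
        variant_sets = [set(variants) for variants in case_variants.values()]
        shared = set.intersection(*variant_sets) if variant_sets else set()
        candidates[family_id] = {
            v for v in shared if population_freq.get(v, 0.0) <= max_freq
        }
    return candidates


# Hypothetical variant calls for two families.
cases_by_family = {
    "F1": {"case1": ["v1", "v2", "v3"], "case2": ["v2", "v3", "v4"]},
    "F2": {"case3": ["v1"], "case4": ["v1", "v5"]},
}
population_freq = {"v1": 0.2, "v2": 0.001, "v3": 0.3}
candidates = shared_candidate_variants(cases_by_family, population_freq)
```

Requiring sharing among all affected relatives is strict: one phenocopy or one missed variant call in a family removes a true segregating variant, so relaxed sharing rules are sometimes preferred.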
Affiliation(s)
- Melissa S. DeRycke
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Shanaka R. Gunawardena
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Sumit Middha
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Yan W Asmann
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Daniel J. Schaid
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Shannon K. McDonnell
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Shaun M. Riska
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Bruce W Eckloff
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Julie M. Cunningham
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Brooke L. Fridley
  - Department of Biostatistics, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Daniel J. Serie
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- William R. Bamlet
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Mine S. Cicek
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Mark A. Jenkins
  - Centre for Molecular, Environmental, Genetic and Analytic Epidemiology, University of Melbourne, Victoria 3010, Australia
- David J. Duggan
  - Translational Genomics Research Institute, Phoenix, AZ, 85004, USA
- Daniel Buchanan
  - Cancer and Population Studies Group, Queensland Institute of Medical Research, Queensland, Australia
- Mark Clendenning
  - Cancer and Population Studies Group, Queensland Institute of Medical Research, Queensland, Australia
- Robert W. Haile
  - Department of Preventive Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- Michael O. Woods
  - Discipline of Genetics, Faculty of Medicine, Memorial University of Newfoundland, St. Johns, NL, Canada
- Graham Casey
  - Department of Preventive Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- John D. Potter
  - Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Polly A. Newcomb
  - Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA, 98109, USA
- Loic Le Marchand
  - Department of Epidemiology, University of Hawaii, Honolulu, HI, USA
- Noralane M. Lindor
  - Department of Health Sciences Research, Mayo Clinic, Scottsdale, AZ 85259, USA
- Stephen N. Thibodeau
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
- Ellen L. Goode
  - Departments of Health Sciences Research, Biomedical Statistics and Informatics, Laboratory Medicine and Pathology, Medical Genetics, Medical Genomics Technology and Advanced Genomics Technology Center, Mayo Clinic College of Medicine, Rochester, MN, 55905, USA
40
Thomas DC. Some surprising twists on the road to discovering the contribution of rare variants to complex diseases. Hum Hered 2013; 74:113-7. [PMID: 23594489 DOI: 10.1159/000347020] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022]
41
Trakadis YJ. Patient-controlled encrypted genomic data: an approach to advance clinical genomics. BMC Med Genomics 2012; 5:31. [PMID: 22818218 PMCID: PMC3439266 DOI: 10.1186/1755-8794-5-31] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/15/2011] [Accepted: 06/30/2012] [Indexed: 12/21/2022]
Abstract
Background The revolution in DNA sequencing technologies over the past decade has made it feasible to sequence an individual’s whole genome at a relatively low cost. The potential value of the information generated by genomic technologies for medicine and society is enormous. However, in order for exome sequencing, and eventually whole genome sequencing, to be implemented clinically, a number of major challenges need to be overcome. For instance, obtaining meaningful informed-consent, managing incidental findings and the great volume of data generated (including multiple findings with uncertain clinical significance), re-interpreting the genomic data and providing additional counselling to patients as genetic knowledge evolves are issues that need to be addressed. It appears that medical genetics is shifting from the present “phenotype-first” medical model to a “data-first” model which leads to multiple complexities. Discussion This manuscript discusses the different challenges associated with integrating genomic technologies into clinical practice and describes a “phenotype-first” approach, namely, “Individualized Mutation-weighed Phenotype Search”, and its benefits. The proposed approach allows for a more efficient prioritization of the genes to be tested in a clinical lab based on both the patient’s phenotype and his/her entire genomic data. It simplifies “informed-consent” for clinical use of genomic technologies and helps to protect the patient’s autonomy and privacy. Overall, this approach could potentially render widespread use of genomic technologies, in the immediate future, practical, ethical and clinically useful. Summary The “Individualized Mutation-weighed Phenotype Search” approach allows for an incremental integration of genomic technologies into clinical practice. It ensures that we do not over-medicalize genomic data but, rather, continue our current medical model which is based on serving the patient’s concerns. 
Service should not be solely driven by technology but rather by the medical needs and the extent to which a technology can be safely and effectively utilized.
Affiliation(s)
- Yannis J Trakadis
- Department of Medical Genetics, Montreal Children's Hospital-McGill University Health Centre, 2300 Tupper, Montreal, QC, Canada.