51
Simulation of Finnish population history, guided by empirical genetic data, to assess power of rare-variant tests in Finland. Am J Hum Genet 2014; 94:710-20. [PMID: 24768551 DOI: 10.1016/j.ajhg.2014.03.019] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 11/27/2013] [Accepted: 03/27/2014] [Indexed: 12/18/2022] Open
Abstract
Finnish samples have been extensively utilized in studying single-gene disorders, where the founder effect has clearly aided in discovery, and more recently in genome-wide association studies of complex traits, where the founder effect has had less obvious impacts. As the field starts to explore rare variants' contribution to polygenic traits, it is of great importance to characterize and confirm the Finnish founder effect in sequencing data and to assess its implications for rare-variant association studies. Here, we employ forward simulation, guided by empirical deep resequencing data, to model the genetic architecture of quantitative polygenic traits in both the general European and the Finnish populations simultaneously. We demonstrate that power of rare-variant association tests is higher in the Finnish population, especially when variants' phenotypic effects are tightly coupled with fitness effects and therefore reflect a greater contribution of rarer variants. SKAT-O, variable-threshold tests, and single-variant tests are more powerful than other rare-variant methods in the Finnish population across a range of genetic models. We also compare the relative power and efficiency of exome array genotyping to those of high-coverage exome sequencing. At a fixed cost, less expensive genotyping strategies have far greater power than sequencing; in a fixed number of samples, however, genotyping arrays miss a substantial portion of genetic signals detected in sequencing, even in the Finnish founder population. As genetic studies probe sequence variation at greater depth in more diverse populations, our simulation approach provides a framework for evaluating various study designs for gene discovery.
52
Moore BS, Mirshahi UL, Yost EA, Stepanchick AN, Bedrin MD, Styer AM, Jackson KK, Still CD, Breitwieser GE, Gerhard GS, Carey DJ, Mirshahi T. Long-term weight-loss in gastric bypass patients carrying melanocortin 4 receptor variants. PLoS One 2014; 9:e93629. [PMID: 24705671 PMCID: PMC3976318 DOI: 10.1371/journal.pone.0093629] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 12/18/2013] [Accepted: 02/06/2014] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: The melanocortin 4 receptor (MC4R) critically regulates feeding and satiety. Rare variants in MC4R are predominantly found in obese individuals. Though some rare variants in MC4R discovered in patients have defects in localization, ligand binding and signaling to cAMP, many have no recognized defects. SUBJECTS/METHODS: In our cohort of 1433 obese subjects that underwent Roux-en-Y Gastric Bypass (RYGB) surgery, we found fifteen variants of MC4R. We matched rare variant carriers to patients with the MC4R reference alleles for gender, age, starting BMI and T2D to determine the variant effect on weight-loss post-RYGB. In vitro, we determined expression of mutant receptors by ELISA and western blot, and cAMP production by microscopy. RESULTS: While carrying a rare MC4R allele is associated with obesity, carriers of rare variants exhibited comparable weight-loss after RYGB to non-carriers. However, subjects carrying three of these variants, V95I, I137T or L250Q, lost less weight after surgery. In vitro, the R305Q mutation caused a defect in cell surface expression while only the I137T and C326R mutations showed impaired cAMP signaling. Despite these apparent differences, there was no correlation between in vitro signaling and pre- or post-surgery clinical phenotype. CONCLUSIONS: These data suggest that subtle differences in receptor signaling conferred by rare MC4R variants combined with additional factors predispose carriers to obesity. In the absence of complete MC4R deficiency, these differences can be overcome by the powerful weight-reducing effects of bariatric surgery. In a complex disorder such as obesity, genetic variants that cause subtle defects that have cumulative effects can be overcome after appropriate clinical intervention.
Affiliation(s)
- Bryn S. Moore
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Uyenlinh L. Mirshahi
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Evan A. Yost
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Ann N. Stepanchick
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Michael D. Bedrin
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Amanda M. Styer
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Kathryn K. Jackson
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Christopher D. Still
- Geisinger Obesity Institute, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Gerda E. Breitwieser
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Glenn S. Gerhard
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- David J. Carey
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Tooraj Mirshahi
- Weis Center for Research, Geisinger Clinic, Danville, Pennsylvania, United States of America
- Geisinger Obesity Institute, Geisinger Clinic, Danville, Pennsylvania, United States of America
53
Kim MJ, Oksenberg N, Hoffmann TJ, Vaisse C, Ahituv N. Functional characterization of SIM1-associated enhancers. Hum Mol Genet 2014; 23:1700-8. [PMID: 24203700 PMCID: PMC3943516 DOI: 10.1093/hmg/ddt559] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/12/2013] [Revised: 10/15/2013] [Accepted: 10/31/2013] [Indexed: 12/20/2022] Open
Abstract
Haploinsufficiency of the single-minded homology 1 (SIM1) gene in humans and mice leads to severe obesity, suggesting that altered expression of SIM1, by way of regulatory elements such as enhancers, could predispose individuals to obesity. Here, we identified transcriptional enhancers that could regulate SIM1, using comparative genomics coupled with zebrafish and mouse transgenic enhancer assays. Owing to the dual role of Sim1 in hypothalamic development and in adult energy homeostasis, the enhancer activity of these sequences was annotated from embryonic to adult age. Of the seventeen tested sequences, two SIM1 candidate enhancers (SCE2 and SCE8) were found to have brain-enhancer activity in zebrafish. Both SCE2 and SCE8 also exhibited embryonic brain-enhancer expression in mice, and time course analysis of SCE2 activity showed overlapping expression with Sim1 from embryonic to adult age, notably in the hypothalamus in adult mice. Using a deletion series, we identified the critical region in SCE2 that is needed for enhancer activity in the developing brain. Sequencing this region in obese and lean cohorts revealed a higher prevalence of single nucleotide polymorphisms (SNPs) that were unique to obese individuals, with one variant reducing developmental-enhancer activity in zebrafish. In summary, we have characterized two brain enhancers in the SIM1 locus and identified a set of obesity-specific SNPs within one of them, which may predispose individuals to obesity.
Affiliation(s)
- Mee J. Kim
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
- Nir Oksenberg
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
- Thomas J. Hoffmann
- Institute for Human Genetics
- Department of Epidemiology and Biostatistics and
- Christian Vaisse
- Institute for Human Genetics
- Diabetes Center, University of California San Francisco, San Francisco, CA, USA
- Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences
- Institute for Human Genetics
54
Albuquerque D, Estévez MN, Víbora PB, Giralt PS, Balsera AM, Cortés PG, López MJ, Luego LM, Gervasini G, Hernández SB, Arroyo-Díez J, Vacas MA, Nóbrega C, Manco L, Rodríguez-López R. Novel Variants in the MC4R and LEPR Genes among Severely Obese Children from the Iberian Population. Ann Hum Genet 2014; 78:195-207. [DOI: 10.1111/ahg.12058] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 10/10/2013] [Accepted: 01/21/2014] [Indexed: 12/12/2022]
Affiliation(s)
- David Albuquerque
- Research Centre for Anthropology and Health (CIAS); Department of Life Sciences; University of Coimbra; Portugal
- Pilar Beato Víbora
- Department of Dietician; Endocrinologist Service; Infanta Cristina Hospital; Badajoz Spain
- Plácida Sánchez Giralt
- Department of Dietician; Endocrinologist Service; Infanta Cristina Hospital; Badajoz Spain
- Pedro Gil Cortés
- Department of Dietician; Endocrinologist Service; Infanta Cristina Hospital; Badajoz Spain
- Mercedes Jiménez López
- Department of Medical & Surgical Therapeutics; Medical School; University of Extremadura; Badajoz Spain
- Luis Miguel Luego
- Department of Dietician; Endocrinologist Service; Infanta Cristina Hospital; Badajoz Spain
- Guillermo Gervasini
- Department of Medical & Surgical Therapeutics; Medical School; University of Extremadura; Badajoz Spain
- Clévio Nóbrega
- Center for Neurosciences & Cell Biology; University of Coimbra; Portugal
- Licínio Manco
- Research Centre for Anthropology and Health (CIAS); Department of Life Sciences; University of Coimbra; Portugal
55
Li B, Liu DJ, Leal SM. Identifying rare variants associated with complex traits via sequencing. Curr Protoc Hum Genet 2014; Chapter 1:Unit 1.26. [PMID: 23853079 DOI: 10.1002/0471142905.hg0126s78] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 12/12/2022]
Abstract
Although genome-wide association studies have been successful in detecting associations with common variants, there is currently an increasing interest in identifying low-frequency and rare variants associated with complex traits. Next-generation sequencing technologies make it feasible to survey the full spectrum of genetic variation in coding regions or the entire genome. However, association analysis for rare variants is challenging, and traditional methods are ineffective because of the low frequency of rare variants coupled with allelic heterogeneity. Recently, a battery of new statistical methods has been proposed for identifying rare variants associated with complex traits. These methods test for associations by aggregating multiple rare variants across a gene or a genomic region or among a group of variants in the genome. In this unit, we describe key concepts for rare variant association for complex traits, survey some of the recent methods, discuss their statistical power under various scenarios, and provide practical guidance on analyzing next-generation sequencing data for identifying rare variants associated with complex traits.
Affiliation(s)
- Bingshan Li
- Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, USA
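The aggregation idea described in the abstract for this entry (pooling multiple rare variants across a gene before testing) can be illustrated with a minimal sketch. This is not the method of any cited paper: it is a simplified collapsing test under stated assumptions, and the function names, the 1% frequency threshold, and the toy data are all hypothetical.

```python
from math import sqrt

def burden_scores(genotypes, mafs, maf_threshold=0.01):
    """Per-individual count of minor alleles over variants rarer than the threshold.

    genotypes: list of rows, one per individual, each a list of 0/1/2 allele counts.
    mafs: minor allele frequency of each variant (assumed known; hypothetical values below).
    """
    rare = [j for j, f in enumerate(mafs) if f < maf_threshold]
    return [sum(row[j] for j in rare) for row in genotypes]

def carrier_z(scores, is_case):
    """Two-proportion z statistic: rare-variant carrier frequency in cases vs controls."""
    case = [s > 0 for s, c in zip(scores, is_case) if c]
    ctrl = [s > 0 for s, c in zip(scores, is_case) if not c]
    p_case = sum(case) / len(case)
    p_ctrl = sum(ctrl) / len(ctrl)
    pooled = (sum(case) + sum(ctrl)) / (len(case) + len(ctrl))
    se = sqrt(pooled * (1 - pooled) * (1 / len(case) + 1 / len(ctrl)))
    return (p_case - p_ctrl) / se if se > 0 else 0.0

# Toy data: 6 individuals x 3 variants; the middle variant is common and is ignored.
genotypes = [
    [1, 0, 0],
    [0, 1, 1],
    [0, 2, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
mafs = [0.004, 0.2, 0.008]
is_case = [True, True, True, False, False, False]

scores = burden_scores(genotypes, mafs)
z = carrier_z(scores, is_case)
print(scores, round(z, 3))
```

Collapsing to a carrier indicator assumes all rare variants act in the same direction; tests such as SKAT-O, mentioned in entry 51, are designed to relax that assumption.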
56
Zakharov S, Teoh GHK, Salim A, Thalamuthu A. A method to incorporate prior information into score test for genetic association studies. BMC Bioinformatics 2014; 15:24. [PMID: 24450486 PMCID: PMC3904928 DOI: 10.1186/1471-2105-15-24] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/03/2012] [Accepted: 01/17/2014] [Indexed: 12/13/2022] Open
Abstract
Background: The interest of the scientific community in investigating the impact of rare variants on complex traits has stimulated the development of novel statistical methodologies for association studies. The fact that many of the recently proposed methods for association studies suffer from low power to identify a genetic association motivates the incorporation of prior knowledge into statistical tests. Results: In this article we propose a methodology to incorporate prior information into the region-based score test. Within our framework, prior information is used to partition variants within a region into several groups, following which asymptotically independent group statistics are constructed and then combined into a global test statistic. Under the null hypothesis, the distribution of our test statistic has fewer degrees of freedom than that of the region-based score statistic. Theoretical power comparison, population genetics simulations and results from analysis of the GAW17 sequencing data set suggest that under some scenarios our method may perform as well as or outperform the score test and other competing methods. Conclusions: An approach which uses prior information to improve the power of the region-based score test is proposed. Theoretical power comparison, population genetics simulations and the results of GAW17 data analysis showed that for some scenarios the power of our method is on a par with or higher than that of the score test and other methods.
Affiliation(s)
- Sergii Zakharov
- Human Genetics, Genome Institute of Singapore, 60 Biopolis Street, #02-01 Genome, Singapore 138672, Singapore.
57
Searching for missing heritability: designing rare variant association studies. Proc Natl Acad Sci U S A 2014; 111:E455-64. [PMID: 24443550 DOI: 10.1073/pnas.1322563111] [Citation(s) in RCA: 418] [Impact Index Per Article: 41.8] [Indexed: 12/13/2022] Open
Abstract
Genetic studies have revealed thousands of loci predisposing to hundreds of human diseases and traits, revealing important biological pathways and defining novel therapeutic hypotheses. However, the genes discovered to date typically explain less than half of the apparent heritability. Because efforts have largely focused on common genetic variants, one hypothesis is that much of the missing heritability is due to rare genetic variants. Studies of common variants are typically referred to as genomewide association studies, whereas studies of rare variants are often simply called sequencing studies. Because they are actually closely related, we use the terms common variant association study (CVAS) and rare variant association study (RVAS). In this paper, we outline the similarities and differences between RVAS and CVAS and describe a conceptual framework for the design of RVAS. We apply the framework to address key questions about the sample sizes needed to detect association, the relative merits of testing disruptive alleles vs. missense alleles, frequency thresholds for filtering alleles, the value of predictors of the functional impact of missense alleles, the potential utility of isolated populations, the value of gene-set analysis, and the utility of de novo mutations. The optimal design depends critically on the selection coefficient against deleterious alleles and thus varies across genes. The analysis shows that common variant and rare variant studies require similarly large sample collections. In particular, a well-powered RVAS should involve discovery sets with at least 25,000 cases, together with a substantial replication set.
58
Lee EB, Mattson MP. The neuropathology of obesity: insights from human disease. Acta Neuropathol 2014; 127:3-28. [PMID: 24096619 DOI: 10.1007/s00401-013-1190-x] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 06/27/2013] [Revised: 09/27/2013] [Accepted: 09/28/2013] [Indexed: 02/06/2023]
Abstract
Obesity, a pathologic state defined by excess adipose tissue, is a significant public health problem as it affects a large proportion of individuals and is linked with increased risk for numerous chronic diseases. Obesity is the result of fundamental changes associated with modern society including overnutrition and sedentary lifestyles. Proper energy homeostasis is dependent on normal brain function as the master metabolic regulator, which integrates peripheral signals, modulates autonomic outflow and controls feeding behavior. Therefore, many human brain diseases are associated with obesity. This review explores the neuropathology of obesity by examining brain diseases which either cause or are influenced by obesity. First, several genetic and acquired brain diseases are discussed as a means to understand the central regulation of peripheral metabolism. These diseases range from monogenetic causes of obesity (leptin deficiency, MC4R deficiency, Bardet-Biedl syndrome and others) to complex neurodevelopmental disorders (Prader-Willi syndrome and Sim1 deficiency) and neurodegenerative conditions (frontotemporal dementia and Gourmand's syndrome) and serve to highlight the central regulatory mechanisms which have evolved to maintain energy homeostasis. Next, to examine the effect of obesity on the brain, chronic neuropathologic conditions (epilepsy, multiple sclerosis and Alzheimer's disease) are discussed as examples of obesity leading to maladaptive processes which exacerbate chronic disease. Thus, obesity is associated with multiple pathways including abnormal metabolism, altered hormonal signaling and increased inflammation which act in concert to promote downstream neuropathology. Finally, the effect of anti-obesity interventions is discussed in terms of brain structure and function. 
Together, understanding human diseases and anti-obesity interventions leads to insights into the bidirectional interaction between peripheral metabolism and central brain function, highlighting the need for continued clinicopathologic and mechanistic studies of the neuropathology of obesity.
59
Lohmueller KE, Sparsø T, Li Q, Andersson E, Korneliussen T, Albrechtsen A, Banasik K, Grarup N, Hallgrimsdottir I, Kiil K, Kilpeläinen TO, Krarup NT, Pers TH, Sanchez G, Hu Y, Degiorgio M, Jørgensen T, Sandbæk A, Lauritzen T, Brunak S, Kristiansen K, Li Y, Hansen T, Wang J, Nielsen R, Pedersen O. Whole-exome sequencing of 2,000 Danish individuals and the role of rare coding variants in type 2 diabetes. Am J Hum Genet 2013; 93:1072-86. [PMID: 24290377 DOI: 10.1016/j.ajhg.2013.11.005] [Citation(s) in RCA: 116] [Impact Index Per Article: 10.5] [Received: 06/07/2013] [Revised: 10/16/2013] [Accepted: 11/04/2013] [Indexed: 12/15/2022] Open
Abstract
It has been hypothesized that, in aggregate, rare variants in coding regions of genes explain a substantial fraction of the heritability of common diseases. We sequenced the exomes of 1,000 Danish cases with common forms of type 2 diabetes (including body mass index > 27.5 kg/m² and hypertension) and 1,000 healthy controls to an average depth of 56×. Our simulations suggest that our study had the statistical power to detect at least one causal gene (a gene containing causal mutations) if the heritability of these common diseases was explained by rare variants in the coding regions of a limited number of genes. We applied a series of gene-based tests to detect such susceptibility genes. However, no gene showed a significant association with disease risk after we corrected for the number of genes analyzed. Thus, we could reject a model for the genetic architecture of type 2 diabetes where rare nonsynonymous variants clustered in a modest number of genes (fewer than 20) are responsible for the majority of disease risk.
Affiliation(s)
- Kirk E Lohmueller
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA 94720, USA
60
Evaluating empirical bounds on complex disease genetic architecture. Nat Genet 2013; 45:1418-27. [PMID: 24141362 DOI: 10.1038/ng.2804] [Citation(s) in RCA: 106] [Impact Index Per Article: 9.6] [Received: 03/11/2013] [Accepted: 09/30/2013] [Indexed: 12/13/2022]
Abstract
The genetic architecture of human diseases governs the success of genetic mapping and the future of personalized medicine. Although numerous studies have queried the genetic basis of common disease, contradictory hypotheses have been advocated about features of genetic architecture (for example, the contribution of rare versus common variants). We developed an integrated simulation framework, calibrated to empirical data, to enable the systematic evaluation of such hypotheses. For type 2 diabetes (T2D), two simple parameters--(i) the target size for causal mutation and (ii) the coupling between selection and phenotypic effect--define a broad space of architectures. Whereas extreme models are excluded by the combination of epidemiology, linkage and genome-wide association studies, many models remain consistent, including those where rare variants explain either little (<25%) or most (>80%) of T2D heritability. Ongoing sequencing and genotyping studies will further constrain the space of possible architectures, but very large samples (for example, >250,000 unselected individuals) will be required to localize most of the heritability underlying T2D and other traits characterized by these models.
61
Kussmann M, Morine MJ, Hager J, Sonderegger B, Kaput J. Perspective: a systems approach to diabetes research. Front Genet 2013; 4:205. [PMID: 24187547 PMCID: PMC3807566 DOI: 10.3389/fgene.2013.00205] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.5] [Received: 04/10/2013] [Accepted: 09/24/2013] [Indexed: 12/17/2022] Open
Abstract
We review here the status of human type 2 diabetes studies from a genetic, epidemiological, and clinical (intervention) perspective. Most studies limit analyses to one or a few omic technologies providing data of components of physiological processes. Since all chronic diseases are multifactorial and arise from complex interactions between genetic makeup and environment, type 2 diabetes mellitus (T2DM) is a collection of sub-phenotypes resulting in high fasting glucose. The underlying gene–environment interactions that produce these classes of T2DM are imperfectly characterized. Based on assessments of the complexity of T2DM, we propose a systems biology approach to advance the understanding of origin, onset, development, prevention, and treatment of this complex disease. This systems-based strategy is based on new study design principles and the integrated application of omics technologies: we pursue longitudinal studies in which each subject is analyzed at both homeostasis and after (healthy and safe) challenges. Each enrolled subject functions thereby as their own case and control and this design avoids assigning the subjects a priori to case and control groups based on limited phenotyping. Analyses at different time points along this longitudinal investigation are performed with a comprehensive set of omics platforms. These data sets are generated in a biological context, rather than biochemical compound class-driven manner, which we term “systems omics.”
Affiliation(s)
- Martin Kussmann
- Nestlé Institute of Health Sciences SA Lausanne, Switzerland; Faculty of Life Sciences, Ecole Polytechnique Fédérale Lausanne, Switzerland; Faculty of Science, Aarhus University Aarhus, Denmark
62
Gerhard GS, Chu X, Wood GC, Gerhard GM, Benotti P, Petrick AT, Gabrielsen J, Strodel WE, Still CD, Argyropoulos G. Next-generation sequence analysis of genes associated with obesity and nonalcoholic fatty liver disease-related cirrhosis in extreme obesity. Hum Hered 2013; 75:144-51. [PMID: 24081230 DOI: 10.1159/000351719] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 12/13/2022] Open
Abstract
OBJECTIVES: Genome-wide association studies (GWAS) have led to the identification of single nucleotide polymorphisms in or near several loci that are associated with the risk of obesity and nonalcoholic fatty liver disease (NAFLD). We hypothesized that missense variants in GWAS and related candidate genes may underlie cases of extreme obesity and NAFLD-related cirrhosis, an extreme manifestation of NAFLD. METHODS: We performed whole-exome sequencing on 6 Caucasian patients with extreme obesity [mean body mass index (BMI) 84.4] and 4 obese Caucasian patients (mean BMI 57.0) with NAFLD-related cirrhosis. RESULTS: Sequence analysis was performed on 24 replicated GWAS and selected candidate obesity genes and 5 loci associated with NAFLD. No missense variants were identified in 19 of the 29 genes analyzed, although all patients carried at least 2 missense variants in the remaining genes without excess homozygosity. One patient with extreme obesity carried 2 novel damaging mutations in BBS1 and was homozygous for benign and damaging MC3R variants. In addition, 1 patient with NAFLD-related cirrhosis was compound heterozygous for rare damaging mutations in PNPLA3. CONCLUSIONS: These results indicate that analyzing candidate loci previously identified by GWAS analyses using whole-exome sequencing is an effective strategy to identify potentially causative missense variants underlying extreme obesity and NAFLD-related cirrhosis.
Affiliation(s)
- Glenn S Gerhard
- Geisinger Obesity Research Institute, Geisinger Clinic, Danville, Pa., USA
63
Cheng KF, Chen JH. Detecting rare variants in case-parents association studies. PLoS One 2013; 8:e74310. [PMID: 24086332 PMCID: PMC3784439 DOI: 10.1371/journal.pone.0074310] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 02/27/2013] [Accepted: 07/31/2013] [Indexed: 11/19/2022] Open
Abstract
Despite the success of genome-wide association studies (GWASs) in detecting common variants (minor allele frequency ≥0.05), many have suggested that rare variants also contribute to the genetic architecture of diseases. Recently, researchers demonstrated that rare variants can show a strong stratification which may not be corrected by using existing methods. In this paper, we focus on a case-parents study and consider methods for testing group-wise association between multiple rare (and common) variants in a gene region and a disease. All tests depend on the numbers of transmitted mutant alleles from parents to their diseased children across variants and hence they are robust to the effect of population stratification. We use extensive simulation studies to compare the performance of four competing tests: the largest single-variant transmission disequilibrium test (TDT), multivariable test, combined TDT, and a likelihood ratio test based on a random-effects model. We find that the likelihood ratio test is most powerful in a wide range of settings and there is no negative impact on its power performance when common variants are also included in the analysis. If deleterious and protective variants are simultaneously analyzed, the likelihood ratio test was generally insensitive to the effect directionality, unless the effects are extremely inconsistent in one direction.
Affiliation(s)
- Kuang-Fu Cheng
- Biostatistics Center and Department of Epidemiology, Taipei Medical University, Taipei, Taiwan
- Graduate Institute of Statistics, National Central University, Chungli, Taiwan
- Jin-Hua Chen
- Biostatistics Center and Department of Epidemiology, Taipei Medical University, Taipei, Taiwan
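The abstract for this entry says the tests depend on counts of mutant alleles transmitted from parents to affected children. As a rough illustration only, the sketch below computes the classic single-variant TDT statistic, (b - c)² / (b + c), where b and c are the transmitted and untransmitted counts from heterozygous parents. The pooled version shown here simply sums counts across variants; it is a simplifying assumption and may differ from the "combined TDT" evaluated in the paper. Function names and counts are hypothetical.

```python
def tdt_chi2(transmitted, untransmitted):
    """Classic TDT statistic (1 degree of freedom) from heterozygous-parent counts."""
    n = transmitted + untransmitted
    if n == 0:
        return 0.0  # no informative transmissions
    return (transmitted - untransmitted) ** 2 / n

def largest_single_variant_tdt(counts):
    """Maximum of the per-variant TDT statistics."""
    return max(tdt_chi2(b, c) for b, c in counts)

def pooled_tdt(counts):
    """TDT on counts summed over all variants in the region (simplified pooling)."""
    b = sum(t for t, _ in counts)
    c = sum(u for _, u in counts)
    return tdt_chi2(b, c)

# Hypothetical (transmitted, untransmitted) counts for three rare variants in one gene.
counts = [(3, 1), (2, 0), (4, 1)]

max_stat = largest_single_variant_tdt(counts)
pooled_stat = pooled_tdt(counts)
print(max_stat, round(pooled_stat, 3))
```

Because only within-family transmissions are counted, both statistics inherit the robustness to population stratification that the abstract describes. The maximum over variants would need a multiple-testing adjustment before it could be compared with a chi-square reference.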
64
Good DJ, Braun T. NHLH2: at the intersection of obesity and fertility. Trends Endocrinol Metab 2013; 24:385-90. [PMID: 23684566 PMCID: PMC3732504 DOI: 10.1016/j.tem.2013.04.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 02/18/2013] [Revised: 04/15/2013] [Accepted: 04/17/2013] [Indexed: 11/28/2022]
Abstract
Nescient helix-loop-helix 2 (NHLH2/NSCL2) is a neuronal transcription factor originally thought to be involved in neuronal development and childhood neuroblastomas. Accumulating evidence has since identified roles for NHLH2 in adult phenotypes of obesity and fertility. We summarize these findings here and attempt to link genotype with phenotype in mouse models and humans. In particular, NHLH2 (Nhlh2 in mice) is one of only two genes that are genetically linked to physical activity levels. Nhlh2 also controls obesity and fertility, with strong sexual dimorphism for both phenotypes in Nhlh2 mutant animals. We propose that Nhlh2 might function as a molecular sensor in different adult hypothalamic neurons to regulate energy balance, leading to normal body weight and reproduction.
Affiliation(s)
- Deborah J Good
- Department of Human Nutrition, Foods and Exercise, Virginia Tech University, Blacksburg, VA 24061, USA.
65
Navon O, Sul JH, Han B, Conde L, Bracci PM, Riby J, Skibola CF, Eskin E, Halperin E. Rare variant association testing under low-coverage sequencing. Genetics 2013; 194:769-79. [PMID: 23636738 PMCID: PMC3697979 DOI: 10.1534/genetics.113.150169] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 02/05/2013] [Accepted: 04/17/2013] [Indexed: 01/15/2023] Open
Abstract
Deep sequencing technologies enable the study of the effects of rare variants in disease risk. While methods have been developed to increase statistical power for detection of such effects, detecting subtle associations requires studies with hundreds or thousands of individuals, which is prohibitively costly. Recently, low-coverage sequencing has been shown to effectively reduce the cost of genome-wide association studies, using current sequencing technologies. However, current methods for disease association testing on rare variants cannot be applied directly to low-coverage sequencing data, as they require individual genotype data, which may not be called correctly due to low-coverage and inherent sequencing errors. In this article, we propose two novel methods for detecting association of rare variants with disease risk, using low coverage, error-prone sequencing. We show by simulation that our methods outperform previous methods under both low- and high-coverage sequencing and under different disease architectures. We use real data and simulation studies to demonstrate that to maximize the power to detect associations for a fixed budget, it is desirable to include more samples while lowering coverage and to perform an analysis using our suggested methods.
Affiliation(s)
- Oron Navon
  - Molecular Microbiology and Biotechnology Department, Tel-Aviv University, Tel Aviv 69978, Israel
- Jae Hoon Sul
  - Computer Science Department, University of California, Los Angeles, California 90095
- Buhm Han
  - Division of Genetics, Brigham & Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02115
  - Broad Institute of Harvard and Massachusetts Institute of Technology, Cambridge, Massachusetts 02142
- Lucia Conde
  - Department of Epidemiology, School of Public Health, and the Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, Alabama 35294
- Paige M. Bracci
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, California 94107
- Jacques Riby
  - Department of Epidemiology, School of Public Health, and the Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, Alabama 35294
- Christine F. Skibola
  - Department of Epidemiology, School of Public Health, and the Comprehensive Cancer Center, University of Alabama at Birmingham, Birmingham, Alabama 35294
- Eleazar Eskin
  - Computer Science Department, University of California, Los Angeles, California 90095
  - Department of Human Genetics, University of California, Los Angeles, California 90095
- Eran Halperin
  - Molecular Microbiology and Biotechnology Department, Tel-Aviv University, Tel Aviv 69978, Israel
  - The Blavatnik School of Computer Science, Tel-Aviv University, Tel Aviv 69978, Israel
  - International Computer Science Institute, Berkeley, California 94704
66
Ramachandrappa S, Raimondo A, Cali AM, Keogh JM, Henning E, Saeed S, Thompson A, Garg S, Bochukova EG, Brage S, Trowse V, Wheeler E, Sullivan AE, Dattani M, Clayton PE, Datta V, Bruning JB, Wareham NJ, O’Rahilly S, Peet DJ, Barroso I, Whitelaw ML, Farooqi IS. Rare variants in single-minded 1 (SIM1) are associated with severe obesity. J Clin Invest 2013; 123:3042-50. [PMID: 23778139 PMCID: PMC3696558 DOI: 10.1172/jci68016] [Citation(s) in RCA: 114] [Impact Index Per Article: 10.4] [Received: 11/27/2012] [Accepted: 04/18/2013] [Indexed: 02/02/2023] Open
Abstract
Single-minded 1 (SIM1) is a basic helix-loop-helix transcription factor involved in the development and function of the paraventricular nucleus of the hypothalamus. Obesity has been reported in Sim1 haploinsufficient mice and in a patient with a balanced translocation disrupting SIM1. We sequenced the coding region of SIM1 in 2,100 patients with severe, early onset obesity and in 1,680 controls. Thirteen different heterozygous variants in SIM1 were identified in 28 unrelated severely obese patients. Nine of the 13 variants significantly reduced the ability of SIM1 to activate a SIM1-responsive reporter gene when studied in stably transfected cells coexpressing the heterodimeric partners of SIM1 (ARNT or ARNT2). SIM1 variants with reduced activity cosegregated with obesity in extended family studies with variable penetrance. We studied the phenotype of patients carrying variants that exhibited reduced activity in vitro. Variant carriers exhibited increased ad libitum food intake at a test meal, normal basal metabolic rate, and evidence of autonomic dysfunction. Eleven of the 13 probands had evidence of a neurobehavioral phenotype. The phenotypic similarities between patients with SIM1 deficiency and melanocortin 4 receptor (MC4R) deficiency suggest that some of the effects of SIM1 deficiency on energy homeostasis are mediated by altered melanocortin signaling.
Affiliation(s)
- Shwetha Ramachandrappa, Anne Raimondo, Anna M.G. Cali, Julia M. Keogh, Elana Henning, Sadia Saeed, Amanda Thompson, Sumedha Garg, Elena G. Bochukova, Soren Brage, Victoria Trowse, Eleanor Wheeler, Adrienne E. Sullivan, Mehul Dattani, Peter E. Clayton, Vippan Datta, John B. Bruning, Nick J. Wareham, Stephen O’Rahilly, Daniel J. Peet, Ines Barroso, Murray L. Whitelaw, I. Sadaf Farooqi
  - University of Cambridge Metabolic Research Laboratories and NIHR Cambridge Biomedical Research Centre, Institute of Metabolic Science, Addenbrooke’s Hospital, Cambridge, United Kingdom.
  - Discipline of Biochemistry, School of Molecular and Biomedical Science and Australian Research Council Special Research Centre for the Molecular Genetics of Development, University of Adelaide, Adelaide, Australia.
  - Wellcome Trust Sanger Institute, Cambridge, United Kingdom.
  - MRC Epidemiology Unit, Institute of Metabolic Science, Addenbrooke’s Hospital, Cambridge, United Kingdom.
  - Clinical and Molecular Genetics Unit, University College London Institute of Child Health, London, United Kingdom.
  - Manchester Academic Health Sciences Centre, Royal Manchester Children’s Hospital, Manchester, United Kingdom.
  - Norfolk and Norwich University Hospital NHS Foundation Trust, Norwich, United Kingdom
67
Bonnefond A, Raimondo A, Stutzmann F, Ghoussaini M, Ramachandrappa S, Bersten DC, Durand E, Vatin V, Balkau B, Lantieri O, Raverdy V, Pattou F, Van Hul W, Van Gaal L, Peet DJ, Weill J, Miller JL, Horber F, Goldstone AP, Driscoll DJ, Bruning JB, Meyre D, Whitelaw ML, Froguel P. Loss-of-function mutations in SIM1 contribute to obesity and Prader-Willi-like features. J Clin Invest 2013; 123:3037-41. [PMID: 23778136 DOI: 10.1172/jci68035] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 11/27/2012] [Accepted: 04/18/2013] [Indexed: 11/17/2022] Open
Abstract
Sim1 haploinsufficiency in mice induces hyperphagic obesity and developmental abnormalities of the brain. In humans, abnormalities in chromosome 6q16, a region that includes SIM1, were reported in obese children with a Prader-Willi-like syndrome; however, SIM1 involvement in obesity has never been conclusively demonstrated. Here, SIM1 was sequenced in 44 children with Prader-Willi-like syndrome features, 198 children with severe early-onset obesity, 568 morbidly obese adults, and 383 controls. We identified 4 rare variants (p.I128T, p.Q152E, p.R581G, and p.T714A) in 4 children with Prader-Willi-like syndrome features (including severe obesity) and 4 other rare variants (p.T46R, p.E62K, p.H323Y, and p.D740H) in 7 morbidly obese adults. By assessing the carriers' relatives, we found a significant contribution of SIM1 rare variants to intra-family risk for obesity. We then assessed functional effects of the 8 substitutions on SIM1 transcriptional activities in stable cell lines using luciferase gene reporter assays. Three mutations showed strong loss-of-function effects (p.T46R, p.H323Y, and p.T714A) and were associated with high intra-family risk for obesity, while the variants with mild or no effects on SIM1 activity were not associated with obesity within families. Our genetic and functional studies demonstrate a firm link between SIM1 loss of function and severe obesity associated with, or independent of, Prader-Willi-like features.
Affiliation(s)
- Amélie Bonnefond
- European Genomic Institute for Diabetes, Lille Pasteur Institute, Lille, France
68
Fang H, Hou B, Wang Q, Yang Y. Rare variants analysis by risk-based variable-threshold method. Comput Biol Chem 2013; 46:32-8. [PMID: 23764529 DOI: 10.1016/j.compbiolchem.2013.04.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/20/2012] [Revised: 04/03/2013] [Accepted: 04/10/2013] [Indexed: 11/17/2022]
Abstract
Genome-wide association studies, as a powerful approach for detecting common variants associated with diseases, have revealed many disease-associated loci. However, the traditional association analysis methods do not have enough power for detecting the effects of rare variants with limited sample size. As a solution to this problem, pooling rare variants by their functions into a composite variant provides an alternative way of identifying susceptibility genes. In this paper, we propose a new pooling method to test the variant-disease association and to identify the functional rare variants related to the disease. Variants with smaller and larger risk measures, defined as the ratio of allele frequencies between cases and controls, are pooled, and a chi-square test of the resulting pooled table is calculated. We vary the threshold of pooling over all possible values and use the maximal chi-square as the test statistic. The maximal chi-square is in fact the global maximum over all possible poolings. Our approach is similar to the existing variable-threshold method, but we threshold on the risk measure instead of allele frequencies of controls. Simulation results show that our method performs better in both association testing and variant selection.
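The pooling scan described in this abstract can be illustrated with a short sketch. This is one possible reading of the abstract, not the authors' implementation: the function names, the 0.5 pseudocount, the choice to pool variants whose risk measure is at or above the threshold, and the 2x2 layout of pooled minor alleles against all remaining alleles are all assumptions made here for illustration. Because the statistic is a maximum over thresholds, it would not follow a standard chi-square distribution, so some calibration (for example, permuting case/control labels) would presumably be needed; that step is not shown.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]], without continuity correction."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: treat as no evidence
    return n * (a * d - b * c) ** 2 / denom


def max_chi_square(case_counts, control_counts, n_case_alleles, n_control_alleles):
    """Scan thresholds on the risk measure; return (largest chi-square, its threshold).

    case_counts / control_counts: minor-allele counts per rare variant.
    n_case_alleles / n_control_alleles: total alleles sampled per group.
    Assumes the variants are rare enough that pooled counts stay below these totals.
    """
    # Risk measure per variant: case allele frequency / control allele frequency.
    # The 0.5 pseudocount is an assumption that avoids division by zero.
    risks = [
        ((x + 0.5) / n_case_alleles) / ((y + 0.5) / n_control_alleles)
        for x, y in zip(case_counts, control_counts)
    ]
    best_stat, best_threshold = 0.0, None
    for t in sorted(set(risks)):
        # Pool every variant whose risk measure is at least t.
        pooled_case = sum(x for x, r in zip(case_counts, risks) if r >= t)
        pooled_ctrl = sum(y for y, r in zip(control_counts, risks) if r >= t)
        stat = chi_square_2x2(
            pooled_case, n_case_alleles - pooled_case,
            pooled_ctrl, n_control_alleles - pooled_ctrl,
        )
        if stat > best_stat:
            best_stat, best_threshold = stat, t
    return best_stat, best_threshold


if __name__ == "__main__":
    # Hypothetical counts for four rare variants in 1,000 cases and 1,000 controls.
    print(max_chi_square([5, 1, 0, 3], [1, 1, 2, 0], 2000, 2000))
```

Thresholding on the case/control frequency ratio rather than on control allele frequency is the contrast the abstract draws with the existing variable-threshold method; the scan over candidate thresholds is otherwise the same in spirit.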
Affiliation(s)
- Hongyan Fang
- Department of Statistics and Finance, University of Science and Technology of China, Hefei, Anhui 230026, China
69
Wu G, Zhi D. Pathway-based approaches for sequencing-based genome-wide association studies. Genet Epidemiol 2013; 37:478-94. [PMID: 23650134 DOI: 10.1002/gepi.21728] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 12/29/2012] [Revised: 03/04/2013] [Accepted: 03/29/2013] [Indexed: 01/07/2023]
Abstract
For analyzing complex trait association with sequencing data, most current studies test aggregated effects of variants in a gene or genomic region. Although gene-based tests have insufficient power even for moderately sized samples, pathway-based analyses combine information across multiple genes in biological pathways and may offer additional insight. However, most existing pathway association methods were originally designed for genome-wide association studies and have not been comprehensively evaluated for sequencing data. Moreover, region-based rare variant association methods, although potentially applicable to pathway-based analysis by extending their region definition to gene sets, have never been rigorously tested. In the context of exome-based studies, we use simulated and real datasets to evaluate pathway-based association tests. Our simulation strategy adopts a genome-wide genetic model that distributes total genetic effects hierarchically into pathways, genes, and individual variants, allowing the evaluation of pathway-based methods with realistic quantifiable assumptions on the underlying genetic architectures. The results show that, although no single pathway-based association method offers superior performance in all simulated scenarios, a modification of the Gene Set Enrichment Analysis approach using statistics from single-marker tests without gene-level collapsing (weighted Kolmogorov-Smirnov [WKS]-Variant method) is consistently powerful. Interestingly, directly applying rare variant association tests (e.g., sequence kernel association test) to pathway analysis offers similar power, but the results are sensitive to assumptions about genetic architecture. We applied pathway association analysis to exome-sequencing data for chronic obstructive pulmonary disease and found that the WKS-Variant method confirms previously published associated genes.
Affiliation(s)
- Guodong Wu
- Department of Biostatistics, University of Alabama at Birmingham, Birmingham, Alabama 35294, USA
70
Shah KP, Douglas JA. A method to prioritize quantitative traits and individuals for sequencing in family-based studies. PLoS One 2013; 8:e62545. [PMID: 23626830 PMCID: PMC3633859 DOI: 10.1371/journal.pone.0062545] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/18/2012] [Accepted: 03/21/2013] [Indexed: 11/18/2022] Open
Abstract
Owing to recent advances in DNA sequencing, it is now technically feasible to evaluate the contribution of rare variation to complex traits and diseases. However, it is still cost prohibitive to sequence the whole genome (or exome) of all individuals in each study. For quantitative traits, one strategy to reduce cost is to sequence individuals in the tails of the trait distribution. However, the next challenge becomes how to prioritize traits and individuals for sequencing since individuals are often characterized for dozens of medically relevant traits. In this article, we describe a new method, the Rare Variant Kinship Test (RVKT), which leverages relationship information in family-based studies to identify quantitative traits that are likely influenced by rare variants. Conditional on nuclear families and extended pedigrees, we evaluate the power of the RVKT via simulation. Not unexpectedly, the power of our method depends strongly on effect size, and to a lesser extent, on the frequency of the rare variant and the number and type of relationships in the sample. As an illustration, we also apply our method to data from two genetic studies in the Old Order Amish, a founder population with extensive genealogical records. Remarkably, we implicate the presence of a rare variant that lowers fasting triglyceride levels in the Heredity and Phenotype Intervention (HAPI) Heart study (p = 0.044), consistent with the presence of a previously identified null mutation in the APOC3 gene that lowers fasting triglyceride levels in HAPI Heart study participants.
Affiliation(s)
- Kaanan P. Shah
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan, United States of America
- Julie A. Douglas
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan, United States of America
|
71
|
Davies RW, Lau P, Naing T, Nikpay M, Doelle H, Harper ME, Dent R, McPherson R. A 680 kb duplication at the FTO locus in a kindred with obesity and a distinct body fat distribution. Eur J Hum Genet 2013; 21:1417-22. [PMID: 23591406 DOI: 10.1038/ejhg.2013.63] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 08/22/2012] [Revised: 03/03/2013] [Accepted: 03/07/2013] [Indexed: 11/09/2022]
Abstract
Common intronic SNPs in the human fat mass and obesity associated (FTO) gene are strongly associated with body mass index (BMI). In mouse models, inactivation of the Fto gene results in a lean phenotype, whereas overexpression of Fto leads to increased food intake and obesity. The latter finding suggests that copy number variants at the FTO locus might be associated with extremes of adiposity. To address this question, we searched for rare, private or de novo copy number variation in a cohort of 985 obese and 869 lean subjects of European ancestry drawn from the extremes of the BMI distribution, genotyped on Affymetrix 6.0 arrays. A ∼680 kb duplication, confirmed by real-time PCR and G-to-FISH analyses, was observed between ∼rs11859825 and rs9932411 in a 68-year-old male with severe obesity. The duplicated region on chromosome 16 spans the entire genome-wide association studies risk locus for obesity, and further encompasses RBL2, AKTIP, RPGRIP1L and all but the last exon of the FTO gene. Affected family members exhibit a unique obesity phenotype, characterized by increased fat distribution in the shoulders and neck with a significantly increased neck circumference. This phenotype was accompanied by increased peripheral blood expression of RBL2 with no alteration in expression of FTO or other genes in the region. No other duplications or deletions in this region were identified in the cohort of obese and lean individuals or in a further survey of 4778 individuals, suggesting that large rare copy number variants surrounding the FTO gene are not a frequent cause of obesity.
Affiliation(s)
- Robert W Davies
- Atherogenomics Laboratory, University of Ottawa Heart Institute, Ottawa, Ontario, Canada
|
72
|
Analysis of rare, exonic variation amongst subjects with autism spectrum disorders and population controls. PLoS Genet 2013; 9:e1003443. [PMID: 23593035 PMCID: PMC3623759 DOI: 10.1371/journal.pgen.1003443] [Citation(s) in RCA: 116] [Impact Index Per Article: 10.5] [Received: 09/17/2012] [Accepted: 02/26/2013] [Indexed: 01/09/2023]
Abstract
We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analysis of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD.
This study evaluates association of rare variants and autism spectrum disorders (ASD) in case and control samples sequenced by two centers. Before doing association analyses, we studied how to combine information across studies. We first harmonized the whole-exome sequence (WES) data, across centers, in terms of the distribution of rare variation. Key features included filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. After filtering, the vast majority of variant calls from seven samples sequenced at both centers matched. We also evaluated whether one should combine summary statistics from data from each center (meta-analysis) or combine data and analyze it together (mega-analysis). For many gene-based tests, we showed that mega-analysis yields more power. After quality control of data from 1,039 ASD cases and 870 controls and a range of analyses, no gene showed exome-wide evidence of significant association. Our results comport with recent results demonstrating that hundreds of genes affect risk for ASD; they suggest that rare risk variants are scattered across these many genes, and thus larger samples will be required to identify those genes.
|
73
|
Brunham LR, Hayden MR. Hunting human disease genes: lessons from the past, challenges for the future. Hum Genet 2013; 132:603-17. [PMID: 23504071 PMCID: PMC3654184 DOI: 10.1007/s00439-013-1286-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Received: 11/27/2012] [Accepted: 02/23/2013] [Indexed: 12/30/2022]
Abstract
The concept that a specific alteration in an individual’s DNA can result in disease is central to our notion of molecular medicine. The molecular basis of more than 3,500 Mendelian disorders has now been identified. In contrast, the identification of genes for common disease has been much more challenging. We discuss historical and contemporary approaches to disease gene identification, focusing on novel opportunities such as the use of population extremes and the identification of rare variants. While our ability to sequence DNA has advanced dramatically, assigning function to a given sequence change remains a major challenge, highlighting the need for both bioinformatics and functional approaches to appropriately interpret these data. We review progress in mapping and identifying human disease genes and discuss future challenges and opportunities for the field.
Affiliation(s)
- Liam R. Brunham
- Department of Medicine, Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, Canada
- Translational Laboratory for Genetic Medicine, National University of Singapore and the Association for Science, Technology and Research (A*STAR), Singapore, Singapore
- Michael R. Hayden
- Department of Medical Genetics, Centre for Molecular Medicine and Therapeutics, Child and Family Research Institute, University of British Columbia, Vancouver, Canada
- Translational Laboratory for Genetic Medicine, National University of Singapore and the Association for Science, Technology and Research (A*STAR), Singapore, Singapore
|
74
|
McPherson R. From Genome-Wide Association Studies to Functional Genomics: New Insights Into Cardiovascular Disease. Can J Cardiol 2013. [DOI: 10.1016/j.cjca.2012.08.017] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 10/27/2022]
|
75
|
Melanocortin-4 Receptor in Energy Homeostasis and Obesity Pathogenesis. Progress in Molecular Biology and Translational Science 2013; 114:147-91. [DOI: 10.1016/b978-0-12-386933-3.00005-4] [Citation(s) in RCA: 115] [Impact Index Per Article: 10.5] [Indexed: 02/07/2023]
|
76
|
Guo Y, Lanktree MB, Taylor KC, Hakonarson H, Lange LA, Keating BJ. Gene-centric meta-analyses of 108 912 individuals confirm known body mass index loci and reveal three novel signals. Hum Mol Genet 2013; 22:184-201. [PMID: 23001569 PMCID: PMC3522401 DOI: 10.1093/hmg/dds396] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.9] [Received: 05/11/2012] [Revised: 08/04/2012] [Accepted: 09/06/2012] [Indexed: 12/18/2022]
Abstract
Recent genetic association studies have made progress in uncovering components of the genetic architecture of the body mass index (BMI). We used the ITMAT-Broad-Candidate Gene Association Resource (CARe) (IBC) array comprising up to 49 320 single nucleotide polymorphisms (SNPs) across ~2100 metabolic and cardiovascular-related loci to genotype up to 108 912 individuals of European ancestry (EA), African-Americans, Hispanics and East Asians, from 46 studies, to provide additional insight into SNPs underpinning BMI. We used a five-phase study design: Phase I focused on meta-analysis of EA studies providing individual level genotype data; Phase II performed a replication of cohorts providing summary level EA data; Phase III meta-analyzed results from the first two phases; associated SNPs from Phase III were used for replication in Phase IV; finally in Phase V, a multi-ethnic meta-analysis of all samples from four ethnicities was performed. At an array-wide significance (P < 2.40E-06), we identify novel BMI associations in loci translocase of outer mitochondrial membrane 40 homolog (yeast) - apolipoprotein E - apolipoprotein C-I (TOMM40-APOE-APOC1) (rs2075650, P = 2.95E-10), sterol regulatory element binding transcription factor 2 (SREBF2, rs5996074, P = 9.43E-07) and neurotrophic tyrosine kinase, receptor, type 2 [NTRK2, a brain-derived neurotrophic factor (BDNF) receptor gene, rs1211166, P = 1.04E-06] in the Phase IV meta-analysis. Of 10 loci with previous evidence for BMI association represented on the IBC array, eight were replicated, with the remaining two showing nominal significance. Conditional analyses revealed two independent BMI-associated signals in BDNF and melanocortin 4 receptor (MC4R) regions. Of the 11 array-wide significant SNPs, three are associated with gene expression levels in both primary B-cells and monocytes; with rs4788099 in SH2B adaptor protein 1 (SH2B1) notably being associated with the expression of multiple genes in cis. 
These multi-ethnic meta-analyses expand our knowledge of BMI genetics.
Affiliation(s)
- Yiran Guo
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Boulevard, Abramson Research Center, Suite 1014H, Philadelphia 19104, PA, USA
- BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
- Matthew B. Lanktree
- Department of Medicine and
- Department of Biochemistry, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Ontario, Canada
- Kira C. Taylor
- Department of Epidemiology and Population Health, School of Public Health and Information Sciences, University of Louisville, Louisville, KY 40292, USA and
- Epidemiology and
- Hakon Hakonarson
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Boulevard, Abramson Research Center, Suite 1014H, Philadelphia 19104, PA, USA
- Leslie A. Lange
- Genetics, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Brendan J. Keating
- Center for Applied Genomics, Children's Hospital of Philadelphia, 3615 Civic Center Boulevard, Abramson Research Center, Suite 1014H, Philadelphia 19104, PA, USA
|
77
|
Population-based rare variant detection via pooled exome or custom hybridization capture with or without individual indexing. BMC Genomics 2012; 13:683. [PMID: 23216810 PMCID: PMC3534616 DOI: 10.1186/1471-2164-13-683] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 05/29/2012] [Accepted: 11/23/2012] [Indexed: 11/15/2022]
Abstract
Background: Rare genetic variation in the human population is a major source of pathophysiological variability and has been implicated in a host of complex phenotypes and diseases. Finding disease-related genes harboring disparate functional rare variants requires sequencing of many individuals across many genomic regions and comparing against unaffected cohorts. However, despite persistent declines in sequencing costs, population-based rare variant detection across large genomic target regions remains cost prohibitive for most investigators. In addition, DNA samples are often precious and hybridization methods typically require large amounts of input DNA. Pooled sample DNA sequencing is a cost and time-efficient strategy for surveying populations of individuals for rare variants. We set out to 1) create a scalable, multiplexing method for custom capture with or without individual DNA indexing that was amenable to low amounts of input DNA and 2) expand the functionality of the SPLINTER algorithm for calling substitutions, insertions and deletions across either candidate genes or the entire exome by integrating the variant calling algorithm with the dynamic programming aligner, Novoalign.
Results: We report methodology for pooled hybridization capture with pre-enrichment, indexed multiplexing of up to 48 individuals or non-indexed pooled sequencing of up to 92 individuals with as little as 70 ng of DNA per person. Modified solid phase reversible immobilization bead purification strategies enable no sample transfers from sonication in 96-well plates through adapter ligation, resulting in 50% less library preparation reagent consumption. Custom Y-shaped adapters containing novel 7 base pair index sequences with a Hamming distance of ≥2 were directly ligated onto fragmented source DNA eliminating the need for PCR to incorporate indexes, and was followed by a custom blocking strategy using a single oligonucleotide regardless of index sequence. These results were obtained aligning raw reads against the entire genome using Novoalign followed by variant calling of non-indexed pools using SPLINTER or SAMtools for indexed samples. With these pipelines, we find sensitivity and specificity of 99.4% and 99.7% for pooled exome sequencing. Sensitivity, and to a lesser degree specificity, proved to be a function of coverage. For rare variants (≤2% minor allele frequency), we achieved sensitivity and specificity of ≥94.9% and ≥99.99% for custom capture of 2.5 Mb in multiplexed libraries of 22–48 individuals with only ≥5-fold coverage/chromosome, but these parameters improved to ≥98.7 and 100% with 20-fold coverage/chromosome.
Conclusions: This highly scalable methodology enables accurate rare variant detection, with or without individual DNA sample indexing, while reducing the amount of required source DNA and total costs through less hybridization reagent consumption, multi-sample sonication in a standard PCR plate, multiplexed pre-enrichment pooling with a single hybridization and lesser sequencing coverage required to obtain high sensitivity.
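The index design mentioned in this abstract (7 base pair index sequences with a pairwise Hamming distance of ≥2) can be checked with a few lines of code. The following is a minimal sketch only: the index sequences and function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("index sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_hamming(indexes):
    """Smallest Hamming distance over all pairs in a set of indexes."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


# Hypothetical 7-bp indexes, invented for illustration only.
example_indexes = ["ACGTACG", "ACGTTGG", "TCGAACG", "GGCTACA"]

# A set is usable under the ">= 2" rule if no pair differs at only one base,
# so a single sequencing error cannot turn one index into another.
set_is_valid = min_pairwise_hamming(example_indexes) >= 2
```

With a minimum distance of 2, a single-base error in an index read can be detected but not corrected, which is presumably why the distance is stated as a lower bound.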
|
78
|
Abstract
A new generation of genetic studies of diabetes is underway. Following from initial genome-wide association (GWA) studies, more recent approaches have used genotyping arrays of more densely spaced markers, imputation of ungenotyped variants based on improved reference haplotype panels, and sequencing of protein-coding exomes and whole genomes. Experimental and statistical advances make possible the identification of novel variants and loci contributing to trait variation and disease risk. Integration of sequence variants with functional analysis is critical to interpreting the consequences of identified variants. We briefly review these methods and technologies and describe how they will continue to expand our understanding of the genetic risk factors and underlying biology of diabetes.
Affiliation(s)
- Karen L. Mohlke
- 5096 Genetic Medicine, 120 Mason Farm Drive, University of North Carolina, Chapel Hill, NC 27599-7264, USA, Tel: 919-966-2913, Fax: 919-843-0291
- Laura J. Scott
- M4134 SPH II, 1415 Washington Heights, University of Michigan, Ann Arbor, MI 48109-2029, USA, Tel: 734-763-0006, Fax: 734-763-2215
|
79
|
Ferguson J, Wheeler W, Fu Y, Prokunina-Olsson L, Zhao H, Sampson J. Statistical tests for detecting associations with groups of genetic variants: generalization, evaluation, and implementation. Eur J Hum Genet 2012; 21:680-6. [PMID: 23092956 DOI: 10.1038/ejhg.2012.220] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/09/2022]
Abstract
With recent advances in sequencing, genotyping arrays, and imputation, GWAS now aim to identify associations with rare and uncommon genetic variants. Here, we describe and evaluate a class of statistics, generalized score statistics (GSS), that can test for an association between a group of genetic variants and a phenotype. GSS are a simple weighted sum of single-variant statistics and their cross-products. We show that the majority of statistics currently used to detect associations with rare variants are equivalent to choosing a specific set of weights within this framework. We then evaluate the power of various weighting schemes as a function of variant characteristics, such as MAF, the proportion associated with the phenotype, and the direction of effect. Ultimately, we find that two classical tests are robust and powerful, but details are provided as to when other GSS may perform favorably. The software package CRaVe is available at our website (http://dceg.cancer.gov/bb/tools/crave).
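The abstract's description of a statistic built as a weighted sum of single-variant statistics and their cross-products can be written as a quadratic form. The sketch below is an illustration under that reading, not the CRaVe implementation: the function name, the weight matrix `A`, and the example values are all assumptions.

```python
import numpy as np


def weighted_score_statistic(z, A):
    """Quadratic form z' A z.

    Diagonal entries of A weight the squared single-variant statistics;
    off-diagonal entries weight their cross-products.
    """
    z = np.asarray(z, dtype=float)
    A = np.asarray(A, dtype=float)
    return float(z @ A @ z)


# Hypothetical single-variant score statistics for three variants.
z_example = np.array([1.0, -1.0, 2.0])

# Identity weights: cross-products are ignored, giving a sum of squares.
sum_of_squares = weighted_score_statistic(z_example, np.eye(3))

# All-ones weights: cross-products count fully, giving (sum of z)^2,
# which behaves like a collapsing (burden-style) statistic.
squared_sum = weighted_score_statistic(z_example, np.ones((3, 3)))
```

The two weight choices show how one framework can cover statistics that are sensitive or insensitive to the direction of effect, depending on how the cross-products are weighted.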
Affiliation(s)
- John Ferguson
- Division of Biostatistics, Yale School of Public Health, New Haven, CT, USA
|
81
|
Al Rayyan N, Wankhade UD, Bush K, Good DJ. Two single nucleotide polymorphisms in the human nescient helix-loop-helix 2 (NHLH2) gene reduce mRNA stability and DNA binding. Gene 2012; 512:134-42. [PMID: 23026212 DOI: 10.1016/j.gene.2012.09.068] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 06/20/2012] [Revised: 08/07/2012] [Accepted: 09/12/2012] [Indexed: 01/17/2023]
Abstract
Nescient helix-loop-helix-2 (NHLH2) is a basic helix-loop-helix transcription factor, which has been implicated, using mouse knockouts, in adult body weight regulation and fertility. A scan of the known single nucleotide polymorphisms (SNPs) in the NHLH2 gene revealed one in the 3' untranslated region (3'UTR), which lies within an AUUUA RNA stability motif. A second SNP is nonsynonymous within the coding region of NHLH2, and was found in a genome-wide association study for obesity. Both of these SNPs were examined for their effect on NHLH2 by creating mouse mimics and examining mRNA stability, and protein function in mouse hypothalamic cell lines. The 3'UTR SNP causes increased instability and, when the SNP-containing Nhlh2 3'UTR is attached to luciferase mRNA, reduced protein levels in cells. The nonsynonymous SNP at position 83 in the protein changes an alanine residue, conserved in NHLH2 orthologs through the Drosophila sp., to a proline residue. This change affects migration of the protein on an SDS-PAGE gel, and appears to alter secondary structure of the protein, as predicted using in silico methods. These results provide functional information on two rare human SNPs in the NHLH2 gene. One of these has been linked to human obese phenotypes, while the other is present in a relatively high proportion of individuals. Given their effects on NHLH2 protein levels, both SNPs deserve further analysis of whether they are causative and/or additive for human body weight and fertility phenotypes.
Affiliation(s)
- Numan Al Rayyan
- Department of Human Nutrition, Foods and Exercise, Virginia Tech University, Blacksburg, VA 24061, USA
|
82
|
Sunyaev SR. Inferring causality and functional significance of human coding DNA variants. Hum Mol Genet 2012; 21:R10-7. [PMID: 22990389 DOI: 10.1093/hmg/dds385] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 12/22/2022]
Abstract
Sequencing technology enables the complete characterization of human genetic variation. Statistical genetics studies identify numerous loci linked to or associated with phenotypes of direct medical interest. The major remaining challenge is to characterize functionally significant alleles that are causally implicated in the genetic basis of human traits. Here, I review three sources of evidence for the functional significance of human DNA variants in protein-coding genes. These include (i) statistical genetics considerations such as co-segregation with the phenotype, allele frequency in unaffected controls and recurrence; (ii) in vitro functional assays and model organism experiments; and (iii) computational methods for predicting the functional effect of amino acid substitutions. In spite of many successes of recent studies, functional characterization of human allelic variants remains problematic.
Affiliation(s)
- Shamil R Sunyaev
- Genetics Division, Brigham and Women's Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA.
|
83
|
Mägi R, Asimit JL, Day-Williams AG, Zeggini E, Morris AP. Genome-wide association analysis of imputed rare variants: application to seven common complex diseases. Genet Epidemiol 2012; 36:785-96. [PMID: 22951892 PMCID: PMC3569874 DOI: 10.1002/gepi.21675] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 04/16/2012] [Revised: 07/23/2012] [Accepted: 07/27/2012] [Indexed: 12/21/2022]
Abstract
Genome-wide association studies have been successful in identifying loci contributing effects to a range of complex human traits. The majority of reproducible associations within these loci are with common variants, each of modest effect, which together explain only a small proportion of heritability. It has been suggested that much of the unexplained genetic component of complex traits can thus be attributed to rare variation. However, genome-wide association study genotyping chips have been designed primarily to capture common variation, and thus are underpowered to detect the effects of rare variants. Nevertheless, we demonstrate here, by simulation, that imputation from an existing scaffold of genome-wide genotype data up to high-density reference panels has the potential to identify rare variant associations with complex traits, without the need for costly re-sequencing experiments. By application of this approach to genome-wide association studies of seven common complex diseases, imputed up to publicly available reference panels, we identify genome-wide significant evidence of rare variant association in PRDM10 with coronary artery disease and multiple genes in the major histocompatibility complex (MHC) with type 1 diabetes. The results of our analyses highlight that genome-wide association studies have the potential to offer an exciting opportunity for gene discovery through association with rare variants, conceivably leading to substantial advancements in our understanding of the genetic architecture underlying complex human traits.
Affiliation(s)
- Reedik Mägi
- Estonian Genome Centre, University of Tartu, Tartu, Estonia
|
84
|
Lee S, Emond MJ, Bamshad MJ, Barnes KC, Rieder MJ, Nickerson DA, Christiani D, Wurfel M, Lin X. Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. Am J Hum Genet 2012; 91:224-37. [PMID: 22863193 DOI: 10.1016/j.ajhg.2012.06.007] [Citation(s) in RCA: 730] [Impact Index Per Article: 60.8] [Received: 03/28/2012] [Revised: 05/22/2012] [Accepted: 06/12/2012] [Indexed: 12/23/2022]
Abstract
We propose in this paper a unified approach for testing the association between rare variants and phenotypes in sequencing association studies. This approach maximizes power by adaptively using the data to optimally combine the burden test and the nonburden sequence kernel association test (SKAT). Burden tests are more powerful when most variants in a region are causal and the effects are in the same direction, whereas SKAT is more powerful when a large fraction of the variants in a region are noncausal or the effects of causal variants are in different directions. The proposed unified test maintains the power in both scenarios. We show that the unified test corresponds to the optimal test in an extended family of SKAT tests, which we refer to as SKAT-O. The second goal of this paper is to develop a small-sample adjustment procedure for the proposed methods for the correction of conservative type I error rates of SKAT family tests when the trait of interest is dichotomous and the sample size is small. Both small-sample-adjusted SKAT and the optimal unified test (SKAT-O) are computationally efficient and can easily be applied to genome-wide sequencing association studies. We evaluate the finite sample performance of the proposed methods using extensive simulation studies and illustrate their application using the acute-lung-injury exome-sequencing data of the National Heart, Lung, and Blood Institute Exome Sequencing Project.
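The contrast the abstract draws between burden tests and SKAT can be illustrated with the test statistics alone. The sketch below is a simplified illustration for a quantitative trait with no covariates: the function names and toy data are assumptions, p-value computation is omitted, and the fixed-`rho` combination shown is only the building block that an optimal test would evaluate over several values of `rho`.

```python
import numpy as np


def score_vector(G, y):
    """Per-variant score: genotype counts times mean-centered phenotype."""
    return G.T @ (y - y.mean())


def burden_stat(G, y, w):
    """Square of the weighted sum of scores; opposite effects can cancel."""
    return float((w @ score_vector(G, y)) ** 2)


def skat_stat(G, y, w):
    """Weighted sum of squared scores; effect direction does not matter."""
    return float(np.sum((w * score_vector(G, y)) ** 2))


def combined_stat(G, y, w, rho):
    """Linear combination: rho = 0 gives SKAT, rho = 1 gives the burden test."""
    return (1.0 - rho) * skat_stat(G, y, w) + rho * burden_stat(G, y, w)


# Toy data (hypothetical): six individuals, two rare variants.
# Carriers of variant 1 have raised trait values; carriers of variant 2, lowered.
G_toy = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [0, 0], [0, 0]], dtype=float)
y_toy = np.array([2.0, 2.0, -2.0, -2.0, 0.0, 0.0])
w_toy = np.ones(2)

q_burden = burden_stat(G_toy, y_toy, w_toy)
q_skat = skat_stat(G_toy, y_toy, w_toy)
q_half = combined_stat(G_toy, y_toy, w_toy, rho=0.5)
```

In this toy case the two variants act in opposite directions, so the burden statistic collapses to zero while the SKAT statistic stays large, which matches the scenario in which the abstract says SKAT is more powerful.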
|
85
|
Cheung YH, Wang G, Leal SM, Wang S. A fast and noise-resilient approach to detect rare-variant associations with deep sequencing data for complex disorders. Genet Epidemiol 2012; 36:675-85. [PMID: 22865616 DOI: 10.1002/gepi.21662] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 05/18/2012] [Accepted: 06/14/2012] [Indexed: 11/11/2022]
Abstract
Next generation sequencing technology has enabled the paradigm shift in genetic association studies from the common disease/common variant to common disease/rare-variant hypothesis. Analyzing individual rare variants is known to be underpowered; therefore association methods have been developed that aggregate variants across a genetic region, which for exome sequencing is usually a gene. The foreseeable widespread use of whole genome sequencing poses new challenges in statistical analysis. It calls for new rare-variant association methods that are statistically powerful, robust against high levels of noise due to inclusion of noncausal variants, and yet computationally efficient. We propose a simple and powerful statistic that combines the disease-associated P-values of individual variants using a weight that is the inverse of the expected standard deviation of the allele frequencies under the null. This approach, dubbed as Sigma-P method, is extremely robust to the inclusion of a high proportion of noncausal variants and is also powerful when both detrimental and protective variants are present within a genetic region. The performance of the Sigma-P method was tested using simulated data based on realistic population demographic and disease models and its power was compared to several previously published methods. The results demonstrate that this method generally outperforms other rare-variant association methods over a wide range of models. Additionally, sequence data on the ANGPTL family of genes from the Dallas Heart Study were tested for associations with nine metabolic traits and both known and novel putative associations were uncovered using the Sigma-P method.
Affiliation(s)
- Yee Him Cheung
- Department of Biostatistics, Mailman School of Public Health, Columbia University, New York, New York 10032, USA
|
86
|
Sha Q, Wang S, Zhang S. Adaptive clustering and adaptive weighting methods to detect disease associated rare variants. Eur J Hum Genet 2012; 21:332-7. [PMID: 22781093 DOI: 10.1038/ejhg.2012.143] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 02/04/2023]
Abstract
Current statistical methods to test association between rare variants and phenotypes are essentially group-wise methods that collapse or aggregate all variants in a predefined group into a single variant. Compared with variant-by-variant methods, the group-wise methods have their advantages. However, two factors may affect the power of these methods. One is that some of the causal variants may be protective. When both risk and protective variants are present, collapsing or aggregating all variants loses power because the effects of risk and protective variants counteract each other. The other is that not all variants in the group are causal; rather, a large proportion is believed to be neutral. When a large proportion of variants are neutral, collapsing or aggregating all variants may not be an optimal solution. We propose two alternative methods, the adaptive clustering (AC) method and the adaptive weighting (AW) method, aiming to test rare variant association in the presence of neutral and/or protective variants. Both AC and AW are applicable to quantitative traits as well as qualitative traits. Results of extensive simulation studies show that AC and AW have similar power and that both have clear advantages, from power to computational efficiency, compared with existing group-wise methods and existing data-driven methods that allow neutral and protective variants. We recommend the AW method because it is computationally more efficient than the AC method.
Affiliation(s)
- Qiuying Sha
- Department of Mathematical Sciences, Michigan Technological University, Houghton, MI, USA
|
87
|
A noncomplementation screen for quantitative trait alleles in Saccharomyces cerevisiae. G3: Genes, Genomes, Genetics 2012; 2:753-60. [PMID: 22870398 PMCID: PMC3385981 DOI: 10.1534/g3.112.002550] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 03/15/2012] [Accepted: 04/30/2012] [Indexed: 11/18/2022]
Abstract
Both linkage and linkage disequilibrium mapping provide well-defined approaches to mapping quantitative trait alleles. However, alleles of small effect are particularly difficult to refine to individual genes and causative mutations. Quantitative noncomplementation provides a means of directly testing individual genes for quantitative trait alleles in a fixed genetic background. Here, we implement a genome-wide noncomplementation screen for quantitative trait alleles that affect colony color or size by using the yeast deletion collection. As proof of principle, we find a previously known allele of CYS4 that affects colony color and a novel allele of CTT1 that affects resistance to hydrogen peroxide. To screen nearly 4700 genes in nine diverse yeast strains, we developed a high-throughput robotic plating assay to quantify colony color and size. Although we found hundreds of candidate alleles, reciprocal hemizygosity analysis of a select subset revealed that many of the candidates were false positives, in part the result of background-dependent haploinsufficiency or second-site mutations within the yeast deletion collection. Our results highlight the difficulty of identifying small-effect alleles but support the use of noncomplementation as a rapid means of identifying quantitative trait alleles of large effect.
|
88
|
Vallania F, Ramos E, Cresci S, Mitra RD, Druley TE. Detection of rare genomic variants from pooled sequencing using SPLINTER. J Vis Exp 2012:3943. [PMID: 22760212 PMCID: PMC3471313 DOI: 10.3791/3943] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 10/31/2022] Open
Abstract
As DNA sequencing technology has markedly advanced in recent years(2), it has become increasingly evident that the amount of genetic variation between any two individuals is greater than previously thought(3). In contrast, array-based genotyping has failed to identify a significant contribution of common sequence variants to the phenotypic variability of common disease(4,5). Taken together, these observations have led to the evolution of the Common Disease / Rare Variant hypothesis suggesting that the majority of the "missing heritability" in common and complex phenotypes is instead due to an individual's personal profile of rare or private DNA variants(6-8). However, characterizing how rare variation impacts complex phenotypes requires the analysis of many affected individuals at many genomic loci, and is ideally compared to a similar survey in an unaffected cohort. Despite the sequencing power offered by today's platforms, a population-based survey of many genomic loci and the subsequent computational analysis required remain prohibitive for many investigators. To address this need, we have developed a pooled sequencing approach(1,9) and a novel software package(1) for highly accurate rare variant detection from the resulting data. The ability to pool genomes from entire populations of affected individuals and survey the degree of genetic variation at multiple targeted regions in a single sequencing library provides excellent cost and time savings over traditional single-sample sequencing methodology. With a mean sequencing coverage per allele of 25-fold, our custom algorithm, SPLINTER, uses an internal variant calling control strategy to call insertions, deletions and substitutions up to four base pairs in length with high sensitivity and specificity from pools of up to 1 mutant allele in 500 individuals.
Here we describe the method for preparing the pooled sequencing library followed by step-by-step instructions on how to use the SPLINTER package for pooled sequencing analysis (http://www.ibridgenetwork.org/wustl/splinter). We show a comparison of pooled sequencing of 947 individuals, all of whom also underwent genome-wide array genotyping, at over 20 kb of sequencing per person. Concordance between genotyping of tagged and novel variants called in the pooled sample was excellent. This method can be easily scaled up to any number of genomic loci and any number of individuals. By incorporating the internal positive and negative amplicon controls at ratios that mimic the population under study, the algorithm can be calibrated for optimal performance. This strategy can also be modified for use with hybridization capture or individual-specific barcodes and can be applied to the sequencing of naturally heterogeneous samples, such as tumor DNA.
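The coverage figures in this abstract can be turned into a rough back-of-the-envelope calculation. The sketch below is not part of SPLINTER; it assumes diploid individuals, uniform sampling of alleles, error-free reads, and an arbitrary illustrative threshold of 10 supporting reads, none of which come from the abstract.

```python
from math import comb

def pooled_depth(n_individuals, coverage_per_allele, ploidy=2):
    """Return (number of alleles in the pool, total depth needed at one site)."""
    n_alleles = n_individuals * ploidy  # assumption: diploid individuals
    return n_alleles, n_alleles * coverage_per_allele

def prob_at_least_k_reads(total_depth, allele_fraction, k):
    """P(X >= k) for X ~ Binomial(total_depth, allele_fraction)."""
    p_fewer = sum(
        comb(total_depth, i)
        * allele_fraction ** i
        * (1.0 - allele_fraction) ** (total_depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer

# 500 individuals at a mean of 25-fold coverage per allele (figures from the abstract).
n_alleles, total_depth = pooled_depth(n_individuals=500, coverage_per_allele=25)

# A single mutant allele in the pool is expected to appear in depth / n_alleles reads.
expected_mutant_reads = total_depth / n_alleles

# Chance of seeing at least 10 mutant reads (hypothetical threshold, binomial model).
p_detectable = prob_at_least_k_reads(total_depth, 1 / n_alleles, k=10)

print(n_alleles, total_depth, expected_mutant_reads, p_detectable)
```

Under these assumptions, a pool of 500 individuals needs a total depth of 25,000 at each site, and a singleton allele is expected in about 25 reads. Actual sensitivity depends on sequencing error and the calling model, which this sketch ignores.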
Affiliation(s)
- Francesco Vallania
- Center for Genome Sciences and Systems Biology, Department of Genetics, Washington University School of Medicine
|
89
|
Freudenberg J, Gregersen PK, Freudenberg-Hua Y. A simple method for analyzing exome sequencing data shows distinct levels of nonsynonymous variation for human immune and nervous system genes. PLoS One 2012; 7:e38087. [PMID: 22701602 PMCID: PMC3368947 DOI: 10.1371/journal.pone.0038087] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/08/2012] [Accepted: 05/03/2012] [Indexed: 11/29/2022] Open
Abstract
To measure the strength of natural selection that acts upon single nucleotide variants (SNVs) in a set of human genes, we calculate the ratio between nonsynonymous SNVs (nsSNVs) per nonsynonymous site and synonymous SNVs (sSNVs) per synonymous site. We transform this ratio with a respective factor f that corrects for the bias of synonymous sites towards transitions in the genetic code and different mutation rates for transitions and transversions. This method approximates the relative density of nsSNVs (rdnsv) in comparison with the neutral expectation as inferred from the density of sSNVs. Using SNVs from a diploid genome and 200 exomes, we apply our method to immune system genes (ISGs), nervous system genes (NSGs), randomly sampled genes (RSGs), and gene ontology annotated genes. The estimate of rdnsv in an individual exome is around 20% for NSGs and 30-40% for ISGs and RSGs. This smaller rdnsv of NSGs indicates overall stronger purifying selection. To quantify the relative shift of nsSNVs towards rare variants, we next fit a linear regression model to the estimates of rdnsv over different SNV allele frequency bins. The obtained regression models show a negative slope for NSGs, ISGs and RSGs, supporting an influence of purifying selection on the frequency spectrum of segregating nsSNVs. The y-intercept of the model predicts rdnsv for an allele frequency close to 0. This parameter can be interpreted as the proportion of nonsynonymous sites where mutations are tolerated to segregate with an allele frequency notably greater than 0 in the population, given the performed normalization of the observed nsSNV to sSNV ratio. A smaller y-intercept is displayed by NSGs, indicating more nonsynonymous sites under strong negative selection. This predicts more monogenically inherited or de-novo mutation diseases that affect the nervous system.
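The two calculations described in this abstract, the normalized nonsynonymous-to-synonymous density ratio and the regression over allele-frequency bins, can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say how the factor f enters, so it is assumed here to be multiplicative, and all counts and bin values are made up.

```python
def rdnsv(ns_snvs, ns_sites, s_snvs, s_sites, f=1.0):
    """Relative density of nsSNVs versus the synonymous (neutral) expectation.

    Assumption: the correction factor f is applied as a simple multiplier.
    """
    ns_density = ns_snvs / ns_sites  # nsSNVs per nonsynonymous site
    s_density = s_snvs / s_sites     # sSNVs per synonymous site
    return f * ns_density / s_density

def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical counts for one gene set.
example = rdnsv(ns_snvs=30, ns_sites=300, s_snvs=50, s_sites=100)

# Hypothetical rdnsv estimates over three allele-frequency bins.
bin_midpoints = [0.0, 1.0, 2.0]
rdnsv_per_bin = [0.4, 0.3, 0.2]
slope, intercept = fit_line(bin_midpoints, rdnsv_per_bin)

print(example, slope, intercept)
```

A negative slope in such a fit corresponds to the shift of nsSNVs toward rare variants described in the abstract, and the intercept plays the role of the predicted rdnsv at an allele frequency close to 0.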
Affiliation(s)
- Jan Freudenberg
- Robert S. Boas Center for Human Genetics and Genomics, The Feinstein Institute for Medical Research, Northshore LIJ Healthsystem, Manhasset, New York, United States of America.
|
90
|
Friese RS, Ye C, Nievergelt CM, Schork AJ, Mahapatra NR, Rao F, Napolitan PS, Waalen J, Ehret GB, Munroe PB, Schmid-Schönbein GW, Eskin E, O'Connor DT. Integrated computational and experimental analysis of the neuroendocrine transcriptome in genetic hypertension identifies novel control points for the cardiometabolic syndrome. Circ Cardiovasc Genet 2012; 5:430-40. [PMID: 22670052 DOI: 10.1161/circgenetics.111.962415] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/05/2023]
Abstract
BACKGROUND Essential hypertension, a common complex disease, displays substantial genetic influence. Contemporary methods to dissect the genetic basis of complex diseases such as the genomewide association study are powerful, yet a large gap exists between the fraction of population trait variance explained by such associations and total disease heritability. METHODS AND RESULTS We developed a novel, integrative method (combining animal models, transcriptomics, bioinformatics, molecular biology, and trait-extreme phenotypes) to identify candidate genes for essential hypertension and the metabolic syndrome. We first undertook transcriptome profiling on adrenal glands from blood pressure extreme mouse strains: the hypertensive BPH (blood pressure high) and hypotensive BPL (blood pressure low). Microarray data clustering revealed a striking pattern of global underexpression of intermediary metabolism transcripts in BPH. The MITRA algorithm identified a conserved motif in the transcriptional regulatory regions of the underexpressed metabolic genes, and we then hypothesized that regulation through this motif contributed to the global underexpression. Luciferase reporter assays demonstrated transcriptional activity of the motif through transcription factors HOXA3, SRY, and YY1. We finally hypothesized that genetic variation at HOXA3, SRY, and YY1 might predict blood pressure and other metabolic syndrome traits in humans. Tagging variants for each locus were associated with blood pressure in a human population blood pressure extreme sample with the most extensive associations for YY1 tagging single nucleotide polymorphism rs11625658 on systolic blood pressure, diastolic blood pressure, body mass index, and fasting glucose. Meta-analysis extended the YY1 results into 2 additional large population samples with significant effects preserved on diastolic blood pressure, body mass index, and fasting glucose. 
CONCLUSIONS The results outline an innovative, systematic approach to the genetic pathogenesis of complex cardiovascular disease traits and point to transcription factor YY1 as a potential candidate gene involved in essential hypertension and the cardiometabolic syndrome.
Affiliation(s)
- Ryan S Friese
- Department of Bioengineering, University of California at San Diego School of Medicine, 9500 Gilman Drive, La Jolla, CA 92093-0838, USA
|
91
|
Fang S, Sha Q, Zhang S. Two adaptive weighting methods to test for rare variant associations in family-based designs. Genet Epidemiol 2012; 36:499-507. [PMID: 22674630 DOI: 10.1002/gepi.21646] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 11/17/2011] [Revised: 04/26/2012] [Accepted: 04/26/2012] [Indexed: 11/06/2022]
Abstract
Although next-generation DNA sequencing technologies have made rare variant association studies feasible and affordable, the development of powerful statistical methods for rare variant association studies is still under way. Most of the existing methods for rare variant association studies compare the number of rare mutations in a group of rare variants (in a gene or a pathway) between cases and controls. However, these methods assume that all causal variants increase disease risk. Recently, several methods that are robust to the direction and magnitude of effects of causal variants have been proposed. However, they are applicable to unrelated individuals only, whereas family data have been shown to improve power to detect rare variants. In this article, we propose two adaptive weighting methods for rare variant association studies based on family data for quantitative traits. Using extensive simulation studies, we evaluate and compare our proposed methods with two methods based on the weights proposed by Madsen and Browning. Our results show that both proposed methods are robust to population stratification, robust to the direction and magnitude of the effects of causal variants, and more powerful than the methods using weights suggested by Madsen and Browning, especially when both risk and protective variants are present.
Affiliation(s)
- Shurong Fang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan 49931, USA
|
92
|
Tranah GJ, Lam ET, Katzman SM, Nalls MA, Zhao Y, Evans DS, Yokoyama JS, Pawlikowska L, Kwok PY, Mooney S, Kritchevsky S, Goodpaster BH, Newman AB, Harris TB, Manini TM, Cummings SR. Mitochondrial DNA sequence variation is associated with free-living activity energy expenditure in the elderly. Biochimica et Biophysica Acta (Bioenergetics) 2012; 1817:1691-700. [PMID: 22659402 DOI: 10.1016/j.bbabio.2012.05.012] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 02/15/2012] [Revised: 05/19/2012] [Accepted: 05/24/2012] [Indexed: 01/11/2023]
Abstract
The decline in activity energy expenditure underlies a range of age-associated pathological conditions, neuromuscular and neurological impairments, disability, and mortality. The majority (90%) of the energy needs of the human body are met by mitochondrial oxidative phosphorylation (OXPHOS). OXPHOS is dependent on the coordinated expression and interaction of genes encoded in the nuclear and mitochondrial genomes. We examined the role of mitochondrial genomic variation in free-living activity energy expenditure (AEE) and physical activity levels (PAL) by sequencing the entire (~16.5 kilobases) mtDNA from 138 Health, Aging, and Body Composition Study participants. Among the common mtDNA variants, the hypervariable region 2 m.185G>A variant was significantly associated with AEE (p=0.001) and PAL (p=0.0005) after adjustment for multiple comparisons. Several unique nonsynonymous variants were identified in the extremes of AEE with some occurring at highly conserved sites predicted to affect protein structure and function. Of interest is the p.T194M, CytB substitution in the lower extreme of AEE occurring at a residue in the Qi site of complex III. Among participants with low activity levels, the burden of singleton variants was 30% higher across the entire mtDNA and OXPHOS complex I when compared to those having moderate to high activity levels. A significant pooled variant association across the hypervariable 2 region was observed for AEE and PAL. These results suggest that mtDNA variation is associated with free-living AEE in older persons and may generate new hypotheses by which specific mtDNA complexes, genes, and variants may contribute to the maintenance of activity levels in late life.
Affiliation(s)
- Gregory J Tranah
- California Pacific Medical Center Research Institute, San Francisco, CA 94107, USA.
|
93
|
|
94
|
Nelson MR, Wegmann D, Ehm MG, Kessner D, St Jean P, Verzilli C, Shen J, Tang Z, Bacanu SA, Fraser D, Warren L, Aponte J, Zawistowski M, Liu X, Zhang H, Zhang Y, Li J, Li Y, Li L, Woollard P, Topp S, Hall MD, Nangle K, Wang J, Abecasis G, Cardon LR, Zöllner S, Whittaker JC, Chissoe SL, Novembre J, Mooser V. An abundance of rare functional variants in 202 drug target genes sequenced in 14,002 people. Science 2012; 337:100-4. [PMID: 22604722 DOI: 10.1126/science.1217876] [Citation(s) in RCA: 483] [Impact Index Per Article: 40.3] [Indexed: 01/25/2023]
Abstract
Rare genetic variants contribute to complex disease risk; however, the abundance of rare variants in human populations remains unknown. We explored this spectrum of variation by sequencing 202 genes encoding drug targets in 14,002 individuals. We find rare variants are abundant (1 every 17 bases) and geographically localized, so that even with large sample sizes, rare variant catalogs will be largely incomplete. We used the observed patterns of variation to estimate population growth parameters, the proportion of variants in a given frequency class that are putatively deleterious, and mutation rates for each gene. We conclude that because of rapid population growth and weak purifying selection, human populations harbor an abundance of rare variants, many of which are deleterious and have relevance to understanding disease risk.
Affiliation(s)
- Matthew R Nelson
- Department of Quantitative Sciences, GlaxoSmithKline (GSK), Research Triangle Park, NC 27709, USA.
|
95
|
Liu DJ, Leal SM. SEQCHIP: a powerful method to integrate sequence and genotype data for the detection of rare variant associations. Bioinformatics 2012; 28:1745-51. [PMID: 22556370 DOI: 10.1093/bioinformatics/bts263] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/26/2022]
Abstract
MOTIVATION Next-generation sequencing greatly increases the capacity to detect rare-variant complex-trait associations. However, it is still expensive to sequence a large number of samples, and therefore small datasets are often used. Given cost constraints, a potentially more powerful two-step strategy is to sequence a subset of the sample to discover variants, and genotype the identified variants in the remaining sample. If only cases are sequenced, directly combining sequence and genotype data will lead to inflated type-I errors in rare-variant association analysis. Although several methods have been developed to correct for the bias, they are either underpowered or theoretically invalid. We propose a new method, SEQCHIP, to integrate genotype and sequence data, which can be used with most existing rare-variant tests. RESULTS It is demonstrated using both simulated and real datasets that the SEQCHIP method has controlled type-I errors, and is substantially more powerful than all other currently available methods. AVAILABILITY SEQCHIP is implemented in an R package and is available at http://linkage.rockefeller.edu/suzanne/seqchip/Seqchip.html.
Affiliation(s)
- Dajiang J Liu
- Department of Biostatistics, Center of Statistical Genetics, University of Michigan, Ann Arbor, MI 48109, USA.
|
96
|
Liu DJ, Leal SM. A unified framework for detecting rare variant quantitative trait associations in pedigree and unrelated individuals via sequence data. Hum Hered 2012; 73:105-22. [PMID: 22555759 DOI: 10.1159/000336293] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 10/17/2011] [Accepted: 01/07/2012] [Indexed: 11/19/2022] Open
Abstract
OBJECTIVES There is great interest in sequencing unrelated or pedigree samples to detect rare variant quantitative trait associations. To reduce the cost of sequencing and improve power, many studies sequence selected samples with extreme traits. Existing methods for detecting rare variant associations were developed for unrelated samples. Methods are needed to analyze (selected or randomly ascertained) pedigree samples. METHODS We propose a unified framework of modeling extreme trait genetic associations (MEGA) with rare variants. Using MEGA and appropriate permutation algorithms, many rare variant tests can be extended to family data. As an application, we compared study designs using both sib-pairs and unrelated individuals. Extensive simulations were carried out using realistic population genetic and complex trait models. RESULTS It is demonstrated that when extreme sampling is implemented within equal-sized cohorts of unrelated individuals or sib-pairs, analyzing unrelated individuals is consistently more powerful than studying sib-pairs. A higher proportion of rare variants can be identified through sequencing unrelated samples compared to sibs. Alternatively, if samples are ascertained using fixed thresholds from an infinite-sized population, sequencing one sib with the most extreme trait from each extreme concordant sib-pair is consistently the most powerful design. CONCLUSIONS MEGA will play an important role in the analysis of sequence-based genetic association studies.
Affiliation(s)
- Dajiang J Liu
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
|
97
|
Daye ZJ, Li H, Wei Z. A powerful test for multiple rare variants association studies that incorporates sequencing qualities. Nucleic Acids Res 2012; 40:e60. [PMID: 22262732 PMCID: PMC3340416 DOI: 10.1093/nar/gks024] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/24/2022] Open
Abstract
Next-generation sequencing data will soon become routinely available for association studies between complex traits and rare variants. Sequencing data, however, are characterized by the presence of sequencing errors at each individual genotype. This makes it especially challenging to perform association studies of rare variants, which, due to their low minor allele frequencies, can be easily perturbed by genotype errors. In this article, we develop the quality-weighted multivariate score association test (qMSAT), a new procedure that allows powerful association tests between complex traits and multiple rare variants in the presence of sequencing errors. Simulation results based on quality scores from real data show that the qMSAT often dominates current methods that do not utilize quality information. In particular, the qMSAT can dramatically increase power over existing methods under moderate sample sizes and relatively low coverage. Moreover, in an obesity data study, we used the qMSAT to identify two functional regions (MGLL promoter and MGLL 3′-untranslated region) where rare variants are associated with extreme obesity. Due to the high cost of sequencing data, the qMSAT is especially valuable for large-scale studies involving rare variants, as it can potentially increase power without additional experimental cost. qMSAT is freely available at http://qmsat.sourceforge.net/.
Affiliation(s)
- Z John Daye
- Department of Biostatistics and Epidemiology, University of Pennsylvania School of Medicine, Philadelphia, PA 19104, USA
|
98
|
Yi N, Liu N, Zhi D, Li J. Hierarchical generalized linear models for multiple groups of rare and common variants: jointly estimating group and individual-variant effects. PLoS Genet 2011; 7:e1002382. [PMID: 22144906 PMCID: PMC3228815 DOI: 10.1371/journal.pgen.1002382] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Received: 06/23/2011] [Accepted: 09/29/2011] [Indexed: 12/19/2022] Open
Abstract
Complex diseases and traits are likely influenced by many common and rare genetic variants and environmental factors. Detecting disease susceptibility variants is a challenging task, especially when their frequencies are low and/or their effects are small or moderate. We propose here a comprehensive hierarchical generalized linear model framework for simultaneously analyzing multiple groups of rare and common variants and relevant covariates. The proposed hierarchical generalized linear models introduce a group effect and a genetic score (i.e., a linear combination of main-effect predictors for genetic variants) for each group of variants, and jointly they estimate the group effects and the weights of the genetic scores. This framework includes various previous methods as special cases, and it can effectively deal with both risk and protective variants in a group and can simultaneously estimate the cumulative contribution of multiple variants and their relative importance. Our computational strategy is based on extending the standard procedure for fitting generalized linear models in the statistical software R to the proposed hierarchical models, leading to the development of stable and flexible tools. The methods are illustrated with sequence data in gene ANGPTL4 from the Dallas Heart Study. The performance of the proposed procedures is further assessed via simulation studies. The methods are implemented in a freely available R package BhGLM (http://www.ssg.uab.edu/bhglm/).
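The model structure described in this abstract, a group effect multiplying a genetic score that is itself a weighted sum of variant predictors, can be written out as a rough sketch. The notation below is assumed for illustration and is not taken from the paper: $h$ is a link function, $\mathbf{x}_i$ the covariates of individual $i$, $\gamma_k$ the effect of variant group $k$, $s_{ik}$ the genetic score, $w_{kj}$ the weight of variant $j$ in group $k$, and $z_{ikj}$ its main-effect predictor.

```latex
% Notation is assumed for illustration; it is not taken from the paper.
\begin{align}
  h\big(\mathbb{E}[y_i]\big) &= \beta_0 + \mathbf{x}_i^{\top}\boldsymbol{\beta}
      + \sum_{k=1}^{K} \gamma_k \, s_{ik}, \\
  s_{ik} &= \sum_{j=1}^{J_k} w_{kj}\, z_{ikj}.
\end{align}
```

Under this reading, weights of opposite sign within a group would let one score accommodate both risk and protective variants, in line with the abstract's description.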
Affiliation(s)
- Nengjun Yi
- Department of Biostatistics, Section on Statistical Genetics, University of Alabama at Birmingham, Birmingham, Alabama, USA.
|
99
|
Marini NJ, Hoffmann TJ, Lammer EJ, Hardin J, Lazaruk K, Stein JB, Gilbert DA, Wright C, Lipzen A, Pennacchio LA, Carmichael SL, Witte JS, Shaw GM, Rine J. A genetic signature of spina bifida risk from pathway-informed comprehensive gene-variant analysis. PLoS One 2011; 6:e28408. [PMID: 22140583 PMCID: PMC3227667 DOI: 10.1371/journal.pone.0028408] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 08/16/2011] [Accepted: 11/07/2011] [Indexed: 12/16/2022] Open
Abstract
Despite compelling epidemiological evidence that folic acid supplements reduce the frequency of neural tube defects (NTDs) in newborns, common variant association studies with folate metabolism genes have failed to explain the majority of NTD risk. The contribution of rare alleles as well as genetic interactions within the folate pathway have not been extensively studied in the context of NTDs. Thus, we sequenced the exons in 31 folate-related genes in a 480-member NTD case-control population to identify the full spectrum of allelic variation and determine whether rare alleles or obvious genetic interactions within this pathway affect NTD risk. We constructed a pathway model, predetermined independent of the data, which grouped genes into coherent sets reflecting the distinct metabolic compartments in the folate/one-carbon pathway (purine synthesis, pyrimidine synthesis, and homocysteine recycling to methionine). By integrating multiple variants based on these groupings, we uncovered two provocative, complex genetic risk signatures. Interestingly, these signatures differed by race/ethnicity: a Hispanic risk profile pointed to alterations in purine biosynthesis, whereas that in non-Hispanic whites implicated homocysteine metabolism. In contrast, parallel analyses that focused on individual alleles, or individual genes, as the units by which to assign risk revealed no compelling associations. These results suggest that the ability to layer pathway relationships onto clinical variant data can be uniquely informative for identifying genetic risk as well as for generating mechanistic hypotheses. Furthermore, the identification of ethnic-specific risk signatures for spina bifida resonated with epidemiological data suggesting that the underlying pathogenesis may differ between Hispanic and non-Hispanic groups.
Affiliation(s)
- Nicholas J. Marini
- Department of Molecular and Cellular Biology, California Institute for Quantitative Biosciences, University of California, Berkeley, California, United States of America
- * E-mail: (NJM); (JR)
- Thomas J. Hoffmann
- Department of Epidemiology and Biostatistics and Institute of Human Genetics, University of California San Francisco, San Francisco, California, United States of America
- Edward J. Lammer
- Children's Hospital Oakland Research Institute, Oakland, California, United States of America
- Jill Hardin
- VitaPath Genetics, Inc., Foster City, California, United States of America
- Katherine Lazaruk
- VitaPath Genetics, Inc., Foster City, California, United States of America
- Jason B. Stein
- VitaPath Genetics, Inc., Foster City, California, United States of America
- Dennis A. Gilbert
- VitaPath Genetics, Inc., Foster City, California, United States of America
- Crystal Wright
- Department of Energy, Joint Genome Institute, Walnut Creek, California, United States of America
- Anna Lipzen
- Department of Energy, Joint Genome Institute, Walnut Creek, California, United States of America
- Len A. Pennacchio
- Department of Energy, Joint Genome Institute, Walnut Creek, California, United States of America
- Suzan L. Carmichael
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, United States of America
- John S. Witte
- Department of Epidemiology and Biostatistics and Institute of Human Genetics, University of California San Francisco, San Francisco, California, United States of America
- Gary M. Shaw
- Department of Pediatrics, Stanford University School of Medicine, Stanford, California, United States of America
- Jasper Rine
- Department of Molecular and Cellular Biology, California Institute for Quantitative Biosciences, University of California, Berkeley, California, United States of America
- * E-mail: (NJM); (JR)
|
100
|
Zhang Q, Chung D, Kraja A, Borecki II, Province MA. Methods for adjusting population structure and familial relatedness in association test for collective effect of multiple rare variants on quantitative traits. BMC Proc 2011; 5 Suppl 9:S35. [PMID: 22373066 PMCID: PMC3287871 DOI: 10.1186/1753-6561-5-s9-s35] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open
Abstract
Because of the low frequency of rare genetic variants in observed data, the statistical power of detecting their associations with target traits is usually low. The collapsing test of collective effect of multiple rare variants is an important and useful strategy to increase the power; in addition, family data may be enriched with causal rare variants and therefore provide extra power. However, when family data are used, both population structure and familial relatedness must be adjusted for to avoid possible inflation of false positives. Using a unified mixed linear model and family data, we compared six methods to detect the association between multiple rare variants and quantitative traits. Through the analysis of 200 replications of the quantitative trait Q2 from the Genetic Analysis Workshop 17 data set simulated for 697 subjects from 8 extended families, and based on quantile-quantile plots under the null and receiver operating characteristic curves, we compared the false-positive rate and power of these methods. We observed that adjusting for pedigree-based kinship gives the best control for false-positive rate, whereas adjusting for marker-based identity by state performs slightly better in terms of power. An adjustment based on a principal components analysis slightly improves the false-positive rate and power. Taking into account type-1 error, power, and computational efficiency, we find that adjusting for pedigree-based kinship seems to be a good choice for the collective test of association between multiple rare variants and quantitative traits using family data.
Affiliation(s)
- Qunyuan Zhang
- Division of Statistical Genomics, Washington University School of Medicine, 4444 Forest Park Boulevard, St. Louis, MO 63108, USA.
|