51
Li MJ, Deng J, Wang P, Yang W, Ho SL, Sham PC, Wang J, Li M. wKGGSeq: A Comprehensive Strategy-Based and Disease-Targeted Online Framework to Facilitate Exome Sequencing Studies of Inherited Disorders. Hum Mutat 2015; 36:496-503. [PMID: 25676918 DOI: 10.1002/humu.22766] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 06/22/2014] [Accepted: 02/03/2015] [Indexed: 12/19/2022]
Abstract
With the rapid advances in high-throughput sequencing technologies, exome sequencing and targeted region sequencing have become routine approaches for identifying mutations of inherited disorders in both genetics research and molecular diagnosis. There is an urgent need for comprehensive and easy-to-use downstream analysis tools to isolate causal mutations in exome sequencing studies. We have developed a user-friendly online framework, wKGGSeq, to provide systematic annotation, filtration, prioritization, and visualization functions for characterizing causal mutation(s) in exome sequencing studies of inherited disorders. wKGGSeq provides: (1) a novel strategy-based procedure for downstream analysis of a large amount of exome sequencing data and (2) a disease-targeted analysis procedure to facilitate clinical diagnosis of well-studied genetic diseases. In addition, it is equipped with abundant online annotation functions for sequence variants. We demonstrate that wKGGSeq either outperforms or is comparable to two popular tools in several real exome sequencing samples. This tool will greatly facilitate the downstream analysis of exome sequencing data and can play a useful role for researchers and clinicians in identifying causal mutations of inherited disorders. wKGGSeq is freely available at http://statgenpro.psychiatry.hku.hk/wkggseq or http://jjwanglab.org/wkggseq, and will be updated frequently.
Affiliation(s)
- Mulin Jun Li
- Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR, China; Departments of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, SAR, China; Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, Guangdong, 518057, China
52
Yazdi FT, Clee SM, Meyre D. Obesity genetics in mouse and human: back and forth, and back again. PeerJ 2015; 3:e856. [PMID: 25825681 PMCID: PMC4375971 DOI: 10.7717/peerj.856] [Citation(s) in RCA: 93] [Impact Index Per Article: 10.3] [Received: 12/28/2014] [Accepted: 03/05/2015] [Indexed: 12/19/2022] [Open Access]
Abstract
Obesity is a major public health concern. This condition results from a constant and complex interplay between predisposing genes and environmental stimuli. Current attempts to manage obesity have been moderately effective and a better understanding of the etiology of obesity is required for the development of more successful and personalized prevention and treatment options. To that effect, mouse models have been an essential tool in expanding our understanding of obesity, due to the availability of their complete genome sequence, genetically identified and defined strains, various tools for genetic manipulation and the accessibility of target tissues for obesity that are not easily attainable from humans. Our knowledge of monogenic obesity in humans greatly benefited from the mouse obesity genetics field. Genes underlying highly penetrant forms of monogenic obesity are part of the leptin-melanocortin pathway in the hypothalamus. Recently, hypothesis-generating genome-wide association studies for polygenic obesity traits in humans have led to the identification of 119 common gene variants with modest effect, most of them having an unknown function. These discoveries have led to novel animal models and have illuminated new biologic pathways. Integrated mouse-human genetic approaches have firmly established new obesity candidate genes. Innovative strategies recently developed by scientists are described in this review to accelerate the identification of causal genes and deepen our understanding of obesity etiology. An exhaustive dissection of the molecular roots of obesity may ultimately help to tackle the growing obesity epidemic worldwide.
Affiliation(s)
- Fereshteh T. Yazdi
- Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON, Canada
- Susanne M. Clee
- Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, BC, Canada
- David Meyre
- Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON, Canada
- Department of Pathology and Molecular Medicine, McMaster University, Hamilton, ON, Canada
53
Porth I, El-Kassaby YA. Using Populus as a lignocellulosic feedstock for bioethanol. Biotechnol J 2015; 10:510-24. [PMID: 25676392 DOI: 10.1002/biot.201400194] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 08/01/2014] [Revised: 11/11/2014] [Accepted: 12/30/2014] [Indexed: 11/10/2022]
Abstract
Populus species along with species from the sister genus Salix will provide valuable feedstock resources for advanced second-generation biofuels. Their inherent fast growth characteristics can particularly be exploited for short rotation management, a time and energy saving cultivation alternative for lignocellulosic feedstock supply. Salicaceae possess inherent cell wall characteristics with favorable cellulose to lignin ratios for utilization as bioethanol crop. We review economically important traits relevant for intensively managed biofuel crop plantations, genomic and phenotypic resources available for Populus, breeding strategies for forest trees dedicated to bioenergy provision, and bioprocesses and downstream applications related to opportunities using Salicaceae as a renewable resource. Challenges need to be resolved for every single step of the conversion process chain, i.e., starting from tree domestication for improved performance as a bioenergy crop, bioconversion process, policy development for land use changes associated with advanced biofuels, and harvest and supply logistics associated with industrial-scale biorefinery plants using Populus as feedstock. Significant hurdles towards cost and energy efficiency, environmental friendliness, and yield maximization with regards to biomass pretreatment, saccharification, and fermentation of celluloses and the sustainability of biorefineries as a whole still need to be overcome.
Affiliation(s)
- Ilga Porth
- Forest and Conservation Sciences, University of British Columbia, Vancouver, Canada.
54
Rolph RC, Waltham M, Smith A, Kuivaniemi H. Expanding Horizons for Abdominal Aortic Aneurysms. Aorta 2015; 3:9-15. [PMID: 26798751 DOI: 10.12945/j.aorta.2015.14-041] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/15/2014] [Accepted: 09/29/2014] [Indexed: 11/18/2022]
Abstract
Recent technological advances have allowed researchers to interrogate the genetic basis of abdominal aortic aneurysms in great detail. The results from these studies are expected to transform our understanding of this complex disease with both multiple genetic and environmental risk factors. Clinicians need to keep abreast of these genetic findings and understand the implications for their practice. Patients will become increasingly informed on genetic risk, and a new era of individualized risk assessment for AAA is just beginning. This brief update aims to provide the clinician with a succinct précis of the recent progress in this area.
Affiliation(s)
- Rachel C Rolph
- King's College London, BHF Centre of Research Excellence & NIHR Biomedical Research Centre at King's Health Partners, Academic Department of Surgery, Cardiovascular Division and Division of Imaging Sciences, St Thomas' Hospital, London, UK
- Matthew Waltham
- King's College London, BHF Centre of Research Excellence & NIHR Biomedical Research Centre at King's Health Partners, Academic Department of Surgery, Cardiovascular Division and Division of Imaging Sciences, St Thomas' Hospital, London, UK
- Alberto Smith
- King's College London, BHF Centre of Research Excellence & NIHR Biomedical Research Centre at King's Health Partners, Academic Department of Surgery, Cardiovascular Division and Division of Imaging Sciences, St Thomas' Hospital, London, UK
- Helena Kuivaniemi
- The Sigfried and Janet Weis Center for Research, Geisinger Health System, Danville, Pennsylvania, USA; Department of Surgery, Temple University School of Medicine, Philadelphia, Pennsylvania, USA
55
Wang M, Lin S. Detecting associations of rare variants with common diseases: collapsing or haplotyping? Brief Bioinform 2015; 16:759-68. [PMID: 25596401 DOI: 10.1093/bib/bbu050] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 09/26/2014] [Indexed: 01/11/2023] [Open Access]
Abstract
In recent years, a myriad of new statistical methods have been proposed for detecting associations of rare single-nucleotide variants (SNVs) with common diseases. These methods can be generally classified as 'collapsing' or 'haplotyping' based. The former is the predominant class, composed of most of the rare variant association methods proposed to date. However, recent works have suggested that haplotyping-based methods may offer advantages and can even be more powerful than collapsing methods in certain situations. In this article, we review and compare collapsing- versus haplotyping-based methods/software in terms of both power and type I error. For collapsing methods, we consider three approaches: Combined Multivariate and Collapsing, Sequence Kernel Association Test and Family-Based Association Test (FBAT): the first two are population based and are among the most popular; the last test is family based, a modification from the popular FBAT to accommodate rare SNVs. For haplotyping-based methods, we include Logistic Bayesian Lasso (LBL) for population data and family-based LBL (famLBL) for family (trio) data. These two methods are selected, as they can be used to test association for specific rare and common haplotypes. Our results show that haplotype methods can be more powerful than collapsing methods if there are interacting SNVs leading to larger haplotype effects. Even if only common SNVs are genotyped, haplotype methods can still detect specific rare haplotypes that tag rare causal SNVs. As expected, family-based methods are robust, whereas population-based methods are susceptible, to population substructure. However, the population-based haplotype approach appears to have smaller inflation of type I error than its collapsing counterparts.
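To make the 'collapsing' idea in this abstract concrete, here is a minimal sketch, not the CMC, SKAT or FBAT procedures the review evaluates. It assumes a simple carrier-indicator collapse followed by a one-sided Fisher exact test; the function names, the 1% frequency threshold and the toy genotypes are all hypothetical.

```python
from math import comb


def collapse_carriers(genotypes, maf, maf_threshold=0.01):
    """Collapse rare variants in one region into a per-individual carrier flag.

    genotypes: one list per individual of minor-allele counts (0, 1 or 2),
               one entry per variant.
    maf:       minor allele frequency of each variant, in column order.
    The 1% threshold is an illustrative assumption, not a value from the review.
    """
    rare = [j for j, freq in enumerate(maf) if freq < maf_threshold]
    return [int(any(row[j] > 0 for j in rare)) for row in genotypes]


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher exact p-value for an excess of carriers among cases."""
    total = n_cases + n_controls
    carriers = case_carriers + control_carriers
    upper = min(n_cases, carriers)
    tail = sum(
        comb(n_cases, k) * comb(n_controls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / comb(total, carriers)


# Toy data (hypothetical): three variants, the second one is common.
maf = [0.005, 0.2, 0.002]
cases = [[1, 0, 0], [0, 1, 1], [0, 0, 1], [0, 2, 0]]
controls = [[0, 1, 0], [0, 0, 0], [0, 2, 0], [0, 0, 0]]

case_flags = collapse_carriers(cases, maf)
control_flags = collapse_carriers(controls, maf)
p_value = fisher_one_sided(sum(case_flags), len(cases),
                           sum(control_flags), len(controls))
```

Because the collapse discards which variant each person carries, a test like this loses power when the region mixes neutral, risk and protective variants, which is one motivation for the haplotype-based alternatives the review compares.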
56
Kessler T, Kaess B, Bourier F, Erdmann J, Schunkert H. [Genetic analyses as basis for a personalized medicine in patients with coronary artery disease]. Herz 2014; 39:186-93. [PMID: 24464254 DOI: 10.1007/s00059-013-4048-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/25/2022]
Abstract
Knowledge about the etiology of coronary artery disease (CAD) entered new dimensions using genome-wide association studies. The current situation is that 46 chromosomal loci have been identified to be associated with CAD with genome-wide significance, i.e. p < 5×10⁻⁸, in Western Europeans. As the individual DNA sequence remains unchanged after fertilization, the risk variants cannot occur due to confounders, such as secondary disease processes. Thus, it can be proposed that these variants are directly affecting a primary and thereby causal pathophysiological process in CAD. Interestingly, only 20% of the effects mediated by the identified loci can be explained by the influence of traditional risk factors. This implies that yet unknown mechanisms and, as a consequence, new therapeutic targets play an important role in the pathophysiology of CAD. However, the high allele frequency of risk loci was also surprising. In the diploid chromosome set Western European individuals carry on average 30-50 risk variants at the 46 loci. Considering this, every individual in the population carries a larger or smaller genetic predisposition for CAD. On the other hand, it is remarkable that many risk allele carriers seem to be able to compensate for the genetic risk: even in old age not everyone suffers from CAD. This indicates yet unknown gene-gene and gene-environment interactions and limits the current possibilities in individual risk prediction.
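The allele-frequency remark in this abstract can be checked with simple arithmetic. The sketch below assumes that each "risk variant" counted is one copy of a risk allele and that each of the 46 loci contributes one lead variant with two copies per person; neither assumption is stated in the abstract, and the function names are hypothetical.

```python
N_LOCI = 46  # number of CAD loci given in the abstract


def implied_mean_frequency(mean_risk_alleles, n_loci=N_LOCI):
    """Mean risk-allele frequency implied by an average allele count.

    A diploid individual carries 2 * n_loci allele copies across the loci,
    so the mean frequency is the average count divided by that total.
    """
    return mean_risk_alleles / (2 * n_loci)


def expected_risk_alleles(frequencies):
    """Expected number of risk alleles carried, given per-locus frequencies."""
    return 2 * sum(frequencies)


low = implied_mean_frequency(30)   # about 0.33
high = implied_mean_frequency(50)  # about 0.54
```

Under these assumptions, carrying 30 to 50 risk alleles on average corresponds to a mean risk-allele frequency of roughly one third to one half, which is consistent with the abstract's point that the risk alleles are common.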
Affiliation(s)
- T Kessler
- Deutsches Herzzentrum München, Klinik für Herz- und Kreislauferkrankungen, Technische Universität München, Lazarettstr. 36, 80636, München, Deutschland
57
Sanchez-Pulido L, Ponting CP. TM6SF2 and MAC30, new enzyme homologs in sterol metabolism and common metabolic disease. Front Genet 2014; 5:439. [PMID: 25566323 PMCID: PMC4263179 DOI: 10.3389/fgene.2014.00439] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 10/16/2014] [Accepted: 11/27/2014] [Indexed: 12/14/2022] [Open Access]
Abstract
Carriers of the Glu167Lys coding variant in the TM6SF2 gene have recently been identified as being more susceptible to non-alcoholic fatty liver disease (NAFLD), yet exhibit lower levels of circulating lipids and hence are protected against cardiovascular disease. Despite the physiological importance of these observations, the molecular function of TM6SF2 remains unknown, and no sequence similarity with functionally characterized proteins has been identified. In order to trace its evolutionary history and to identify functional domains, we embarked on a computational protein sequence analysis of TM6SF2. We identified a new domain, the EXPERA domain, which is conserved among TM6SF, MAC30/TMEM97 and EBP (D8, D7 sterol isomerase) protein families. EBP mutations are the cause of chondrodysplasia punctata 2 X-linked dominant (CDPX2), also known as Conradi-Hünermann-Happle syndrome, a defective cholesterol biosynthesis disorder. Our analysis of evolutionary conservation among EXPERA domain-containing families and the previously suggested catalytic mechanism for the EBP enzyme, indicate that TM6SF and MAC30/TMEM97 families are both highly likely to possess, as for the EBP family, catalytic activity as sterol isomerases. This unexpected prediction of enzymatic functions for TM6SF and MAC30/TMEM97 is important because it now permits detailed experiments to investigate the function of these key proteins in various human pathologies, from cardiovascular disease to cancer.
Affiliation(s)
- Luis Sanchez-Pulido
- Medical Research Council Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK
- Chris P Ponting
- Medical Research Council Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK
58
Wu L, Schaid DJ, Sicotte H, Wieben ED, Li H, Petersen GM. Case-only exome sequencing and complex disease susceptibility gene discovery: study design considerations. J Med Genet 2014; 52:10-6. [PMID: 25371537 DOI: 10.1136/jmedgenet-2014-102697] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 12/13/2022]
Abstract
Whole exome sequencing (WES) provides an unprecedented opportunity to identify the potential aetiological role of rare functional variants in human complex diseases. Large-scale collaborations have generated germline WES data on patients with a number of diseases, especially cancer, but less often on healthy controls under the same sequencing procedures. These data can be a valuable resource for identifying new disease susceptibility loci if study designs are appropriately applied. This review describes suggested strategies and technical considerations when focusing on case-only study designs that use WES data in complex disease scenarios. These include variant filtering based on frequency and functionality, gene prioritisation, interrogation of different data types and targeted sequencing validation. We propose that if case-only WES designs were applied in an appropriate manner, new susceptibility genes containing rare variants for human complex diseases can be detected.
Affiliation(s)
- Lang Wu
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA; Center for Clinical and Translational Science, Mayo Clinic, Rochester, Minnesota, USA
- Daniel J Schaid
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
- Hugues Sicotte
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
- Eric D Wieben
- Department of Biochemistry and Molecular Biology, Mayo Clinic, Rochester, Minnesota, USA
- Hu Li
- Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Rochester, Minnesota, USA
- Gloria M Petersen
- Department of Health Sciences Research, Mayo Clinic, Rochester, Minnesota, USA
59
Lord J, Lu AJ, Cruchaga C. Identification of rare variants in Alzheimer's disease. Front Genet 2014; 5:369. [PMID: 25389433 PMCID: PMC4211559 DOI: 10.3389/fgene.2014.00369] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 08/31/2014] [Accepted: 10/03/2014] [Indexed: 12/21/2022] [Open Access]
Abstract
Much progress has been made in recent years in identifying genes involved in the risk of developing Alzheimer's disease (AD), the most common form of dementia. Yet despite the identification of over 20 disease associated loci, mainly through genome wide association studies (GWAS), a large proportion of the genetic component of the disorder remains unexplained. Recent evidence from the AD field, as with other complex diseases, suggests a large proportion of this "missing heritability" may be due to rare variants of moderate to large effect size, but the methodologies to detect such variants are still in their infancy. The latest studies in the field have been focused on the identification of coding variation associated with AD risk, through whole-exome or whole-genome sequencing. Such variants are expected to have larger effect sizes than GWAS loci, and are easier to functionally characterize, and develop cellular and animal models for. This review explores the issues involved in detecting rare variant associations in the context of AD, highlighting some successful approaches utilized to date.
Affiliation(s)
- Jenny Lord
- Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Alexander J. Lu
- Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Carlos Cruchaga
- Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
- Hope Center Program on Protein Aggregation and Neurodegeneration, Washington University School of Medicine, St. Louis, MO, USA
60
van Rheenen W, Diekstra FP, van den Berg LH, Veldink JH. Are CHCHD10 mutations indeed associated with familial amyotrophic lateral sclerosis? Brain 2014; 137:e313. [PMID: 25348631 DOI: 10.1093/brain/awu299] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]
Affiliation(s)
- Wouter van Rheenen
- Department of Neurology and Neurosurgery, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, The Netherlands
- Frank P Diekstra
- Department of Neurology and Neurosurgery, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, The Netherlands
- Leonard H van den Berg
- Department of Neurology and Neurosurgery, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, The Netherlands
- Jan H Veldink
- Department of Neurology and Neurosurgery, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, The Netherlands
61
Gupta RM. Digenic inheritance of myocardial infarction risk implicates dysfunctional nitric oxide signaling. Circ Cardiovasc Genet 2014; 7:93-4. [PMID: 24550432 DOI: 10.1161/circgenetics.114.000502] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/01/2023]
Affiliation(s)
- Rajat M Gupta
- Division of Cardiovascular Medicine, Brigham and Women's Hospital, Boston, MA
62
Requena T, Cabrera S, Martín-Sierra C, Price SD, Lysakowski A, Lopez-Escamez JA. Identification of two novel mutations in FAM136A and DTNA genes in autosomal-dominant familial Meniere's disease. Hum Mol Genet 2014; 24:1119-26. [PMID: 25305078 DOI: 10.1093/hmg/ddu524] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Indexed: 12/21/2022] [Open Access]
Abstract
Meniere's disease (MD) is a chronic disorder of the inner ear defined by sensorineural hearing loss, tinnitus and episodic vertigo, and familial MD is observed in 5-15% of sporadic cases. Although its pathophysiology is largely unknown, studies in human temporal bones have found an accumulation of endolymph in the scala media of the cochlea. By whole-exome sequencing, we have identified two novel heterozygous single-nucleotide variants in the FAM136A and DTNA genes, both in a Spanish family with three affected cases in consecutive generations, highly suggestive of autosomal-dominant inheritance. The nonsense mutation in the FAM136A gene leads to a stop codon that disrupts the FAM136A protein product. Sequencing revealed two mRNA transcripts of FAM136A in lymphoblasts from patients, which were confirmed by immunoblotting. Carriers of the FAM136A mutation showed a significant decrease in the expression level of both transcripts in lymphoblastoid cell lines. The missense mutation in the DTNA gene produces a novel splice site which skips exon 21 and leads to a shorter alternative transcript. We also demonstrated by immunohistochemistry that FAM136A and DTNA proteins are expressed in the neurosensorial epithelium of the crista ampullaris of the rat. While FAM136A encodes a mitochondrial protein with unknown function, DTNA encodes a cytoskeleton-interacting membrane protein involved in the formation and stability of synapses with a crucial role in the permeability of the blood-brain barrier. Neither of these genes has previously been described in patients with hearing loss, making FAM136A and DTNA candidate genes for familial MD.
Affiliation(s)
- Teresa Requena
- Otology & Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer/University of Granada/Junta de Andalucía, PTS, Granada 18016, Spain
- Sonia Cabrera
- Otology & Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer/University of Granada/Junta de Andalucía, PTS, Granada 18016, Spain
- Carmen Martín-Sierra
- Otology & Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer/University of Granada/Junta de Andalucía, PTS, Granada 18016, Spain
- Steven D Price
- Department of Anatomy and Cell Biology, University of Illinois, Chicago, IL 60612, USA
- Anna Lysakowski
- Department of Anatomy and Cell Biology, University of Illinois, Chicago, IL 60612, USA
- José A Lopez-Escamez
- Otology & Neurotology Group CTS495, Department of Genomic Medicine, GENYO - Centre for Genomics and Oncological Research - Pfizer/University of Granada/Junta de Andalucía, PTS, Granada 18016, Spain; Department of Otolaryngology, Hospital de Poniente, El Ejido, Almería 04700, Spain
63
Bang SY, Na YJ, Kim K, Joo YB, Park Y, Lee J, Lee SY, Ansari AA, Jung J, Rhee H, Lee JY, Han BG, Ahn SM, Won S, Lee HS, Bae SC. Targeted exon sequencing fails to identify rare coding variants with large effect in rheumatoid arthritis. Arthritis Res Ther 2014; 16:447. [PMID: 25267259 PMCID: PMC4203956 DOI: 10.1186/s13075-014-0447-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 03/22/2014] [Accepted: 08/29/2014] [Indexed: 12/30/2022] [Open Access]
Abstract
Introduction: Although it has been suggested that rare coding variants could explain the substantial missing heritability, very few sequencing studies have been performed in rheumatoid arthritis (RA). We aimed to identify novel functional variants with rare to low frequency using targeted exon sequencing of RA in Korea.
Methods: We analyzed targeted exon sequencing data of 398 genes selected from a multifaceted approach in Korean RA patients (n = 1,217) and controls (n = 717). We conducted a single-marker association test and a gene-based analysis of rare variants. For meta-analysis or enrichment tests, we also used ethnically matched independent samples of Korean genome-wide association studies (GWAS) (n = 4,799) or immunochip data (n = 4,722).
Results: After stringent quality control, we analyzed 10,588 variants of 398 genes from 1,934 Korean RA case controls. We identified 13 nonsynonymous variants with nominal association in single-variant association tests. In a meta-analysis, we did not find any novel variant with genome-wide significance for RA risk. Using a gene-based approach, we identified 17 genes with nominal burden signals. Among them, VSTM1 showed the greatest association with RA (P = 7.80 × 10⁻⁴). In the enrichment test using Korean GWAS, although the significant signal appeared to be driven by total genic variants, we found no evidence for enriched association of coding variants only with RA.
Conclusions: We were unable to identify rare coding variants with large effect to explain the missing heritability for RA in the current targeted resequencing study. Our study raises skepticism about exon sequencing of targeted genes for complex diseases like RA.
64
Boyd SD, Galli SJ, Schrijver I, Zehnder JL, Ashley EA, Merker JD. A Balanced Look at the Implications of Genomic (and Other "Omics") Testing for Disease Diagnosis and Clinical Care. Genes (Basel) 2014; 5:748-66. [PMID: 25257203 PMCID: PMC4198929 DOI: 10.3390/genes5030748] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 05/20/2014] [Revised: 07/20/2014] [Accepted: 08/18/2014] [Indexed: 11/16/2022] [Open Access]
Abstract
The tremendous increase in DNA sequencing capacity arising from the commercialization of "next generation" instruments has opened the door to innumerable routes of investigation in basic and translational medical science. It enables very large data sets to be gathered, whose interpretation and conversion into useful knowledge is only beginning. A challenge for modern healthcare systems and academic medical centers is to apply these new methods for the diagnosis of disease and the management of patient care without unnecessary delay, but also with appropriate evaluation of the quality of data and interpretation, as well as the clinical value of the insights gained. Most critically, the standards applied for evaluating these new laboratory data and ensuring that the results and their significance are clearly communicated to patients and their caregivers should be at least as rigorous as those applied to other kinds of medical tests. Here, we present an overview of conceptual and practical issues to be considered in planning for the integration of genomic methods or, in principle, any other type of "omics" testing into clinical care.
Affiliation(s)
- Scott D Boyd
- Department of Pathology, Stanford University, Stanford, CA 94305, USA.
- Stephen J Galli
- Department of Pathology, Stanford University, Stanford, CA 94305, USA.
- Iris Schrijver
- Department of Pathology, Stanford University, Stanford, CA 94305, USA.
- James L Zehnder
- Department of Pathology, Stanford University, Stanford, CA 94305, USA.
- Euan A Ashley
- Department of Medicine, Stanford University, Stanford, CA 94305, USA.
- Jason D Merker
- Department of Pathology, Stanford University, Stanford, CA 94305, USA.
65
Lin YC, Hsieh AR, Hsiao CL, Wu SJ, Wang HM, Lian IB, Fann CSJ. Identifying rare and common disease associated variants in genomic data using Parkinson's disease as a model. J Biomed Sci 2014; 21:88. [PMID: 25175702 PMCID: PMC4428531 DOI: 10.1186/s12929-014-0088-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 07/25/2014] [Accepted: 08/21/2014] [Indexed: 01/06/2023] [Open Access]
Abstract
BACKGROUND Genome-wide association studies have been successful in identifying common genetic variants for human diseases. However, much of the heritable variation associated with diseases such as Parkinson's disease remains unknown, suggesting that many more risk loci are yet to be identified. Rare variants have become important in disease association studies for explaining missing heritability. Methods for detecting this type of association require prior knowledge of candidate genes and combine variants within the region. These methods may suffer from power loss in situations with many neutral variants or causal variants with opposite effects. RESULTS We propose a method capable of scanning genetic variants to identify the region most likely harbouring a disease gene with rare and/or common causal variants. Our method assigns a score to each individual variant based on our scoring system. It uses aggregate scores to identify the region with disease association. We evaluate performance by simulation based on 1000 Genomes sequencing data and compare with three commonly used methods. We use a Parkinson's disease case-control dataset as a model to demonstrate the application of our method. Our method has better power than CMC and WSS and similar power to SKAT-O with well-controlled type I error under simulation based on 1000 Genomes sequencing data. In real data analysis, we confirm the association of the α-synuclein gene (SNCA) with Parkinson's disease (p = 0.005). We further identify association with hyaluronan synthase 2 (HAS2, p = 0.028) and kringle containing transmembrane protein 1 (KREMEN1, p = 0.006). KREMEN1 is associated with the Wnt signalling pathway, which has been shown to play an important role in neurodegeneration in Parkinson's disease. CONCLUSIONS Our method is time efficient and less sensitive to inclusion of neutral variants and to the direction of effect of causal variants. It can narrow down a genomic region or a chromosome to a disease-associated region. Using Parkinson's disease as a model, our method not only confirms association for a known gene but also identifies two genes previously found by other studies. In spite of many existing methods, we conclude that our method serves as an efficient alternative for exploring genomic data containing both rare and common variants.
Affiliation(s)
- Ying-Chao Lin
  - Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan; Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan; Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.
- Ai-Ru Hsieh
  - Graduate Institute of Biostatistics, China Medical University, Taichung, Taiwan.
- Ching-Lin Hsiao
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.
- Shang-Jung Wu
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.
- Hui-Min Wang
  - Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.
- Ie-Bin Lian
  - Graduate Institute of Statistics and Information Science, National Changhua University of Education, Changhua, Taiwan.
- Cathy S J Fann
  - Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan; Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan; Institute of Public Health, National Yang-Ming University, Taipei, Taiwan.
|
66
|
Li A, Meyre D. Jumping on the Train of Personalized Medicine: A Primer for Non-Geneticist Clinicians: Part 2. Fundamental Concepts in Genetic Epidemiology. Curr Psychiatry Rev 2014; 10:101-117. [PMID: 25598767 PMCID: PMC4287874 DOI: 10.2174/1573400510666140319235334] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 08/20/2013] [Revised: 02/07/2014] [Accepted: 04/18/2014] [Indexed: 12/12/2022]
Abstract
With the decrease in sequencing costs, personalized genome sequencing will eventually become common in medical practice. We therefore write this series of three reviews to help non-geneticist clinicians to jump into the fast-moving field of personalized medicine. In the first article of this series, we reviewed the fundamental concepts in molecular genetics. In this second article, we cover the key concepts and methods in genetic epidemiology including the classification of genetic disorders, study designs and their implementation, genetic marker selection, genotyping and sequencing technologies, gene identification strategies, data analyses and data interpretation. This review will help the reader critically appraise a genetic association study. In the next article, we will discuss the clinical applications of genetic epidemiology in the personalized medicine area.
Affiliation(s)
- Aihua Li
  - Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON L8N 3Z5, Canada
- David Meyre
  - Department of Clinical Epidemiology and Biostatistics, McMaster University, Hamilton, ON L8N 3Z5, Canada
|
67
|
Vinson A, Prongay K, Ferguson B. The value of extended pedigrees for next-generation analysis of complex disease in the rhesus macaque. ILAR J 2014; 54:91-105. [PMID: 24174435 DOI: 10.1093/ilar/ilt041] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 11/12/2022] Open
Abstract
Complex diseases (e.g., cardiovascular disease and type 2 diabetes, among many others) pose the biggest threat to human health worldwide and are among the most challenging to investigate. Susceptibility to complex disease may be caused by multiple genetic variants (GVs) and their interaction, by environmental factors, and by interaction between GVs and environment, and large study cohorts with substantial analytical power are typically required to elucidate these individual contributions. Here, we discuss the advantages of both power and feasibility afforded by the use of extended pedigrees of rhesus macaques (Macaca mulatta) for genetic studies of complex human disease based on next-generation sequence data. We present these advantages in the context of previous research conducted in rhesus macaques for several representative complex diseases. We also describe a single, multigeneration pedigree of Indian-origin rhesus macaques and a sample biobank we have developed for genetic analysis of complex disease, including power of this pedigree to detect causal GVs using either genetic linkage or association methods in a variance decomposition approach. Finally, we summarize findings of significant heritability for a number of quantitative traits that demonstrate that genetic contributions to risk factors for complex disease can be detected and measured in this pedigree. We conclude that the development and application of an extended pedigree to analysis of complex disease traits in the rhesus macaque have shown promising early success and that genome-wide genetic and higher order -omics studies in this pedigree are likely to yield useful insights into the architecture of complex human disease.
|
68
|
Krupp DR, Soldano KL, Garrett ME, Cope H, Ashley-Koch AE, Gregory SG. Missing genetic risk in neural tube defects: can exome sequencing yield an insight? Birth Defects Res A Clin Mol Teratol 2014; 100:642-6. [PMID: 25044326 DOI: 10.1002/bdra.23276] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 03/25/2014] [Revised: 05/30/2014] [Accepted: 05/31/2014] [Indexed: 01/12/2023]
Abstract
BACKGROUND Neural tube defects (NTD) have a strong genetic component, with up to 70% of variance in human prevalence determined by heritable factors. Although the identification of causal DNA variants by sequencing candidate genes from functionally relevant pathways and model organisms has provided some success, alternative approaches are demanded. METHODS Next generation sequencing platforms are facilitating the production of massive amounts of sequencing data, primarily from the protein coding regions of the genome, at a faster rate and cheaper cost than has previously been possible. These platforms are permitting the identification of variants (de novo, rare, and common) that are drivers of NTD etiology, and the cost of the approach allows for the screening of increased numbers of affected and unaffected individuals from NTD families and in simplex cases. CONCLUSION The next generation sequencing platforms represent a powerful tool in the armory of the genetics researcher to identify the causal genetic basis of NTDs.
Affiliation(s)
- Deidre R Krupp
  - Duke Molecular Physiology Institute, DUMC, 300 North Duke Street, Durham, NC, 27701
|
69
|
Jiang Y, Conneely KN, Epstein MP. Flexible and robust methods for rare-variant testing of quantitative traits in trios and nuclear families. Genet Epidemiol 2014; 38:542-51. [PMID: 25044337 DOI: 10.1002/gepi.21839] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 03/24/2014] [Revised: 05/21/2014] [Accepted: 05/29/2014] [Indexed: 11/07/2022]
Abstract
Most rare-variant association tests for complex traits are applicable only to population-based or case-control resequencing studies. There are fewer rare-variant association tests for family-based resequencing studies, which is unfortunate because pedigrees possess many attractive characteristics for such analyses. Family-based studies can be more powerful than their population-based counterparts due to increased genetic load and further enable the implementation of rare-variant association tests that, by design, are robust to confounding due to population stratification. With this in mind, we propose a rare-variant association test for quantitative traits in families; this test integrates the QTDT approach of Abecasis et al. into the kernel-based SNP association test KMFAM of Schifano et al. The resulting within-family test enjoys the many benefits of the kernel framework for rare-variant association testing, including rapid evaluation of P-values and preservation of power when a region harbors rare causal variation that acts in different directions on phenotype. Additionally, by design, this within-family test is robust to confounding due to population stratification. Although within-family association tests are generally less powerful than their counterparts that use all genetic information, we show that we can recover much of this power (while still ensuring robustness to population stratification) using a straightforward screening procedure. Our method accommodates covariates and allows for missing parental genotype data, and we have written software implementing the approach in R for public use.
Affiliation(s)
- Yunxuan Jiang
  - Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, United States of America
|
70
|
Abstract
The use of genetically isolated populations can empower next-generation association studies. In this review, we discuss the advantages of this approach and review study design and analytical considerations of genetic association studies focusing on isolates. We cite successful examples of using population isolates in association studies and outline potential ways forward.
|
71
|
Lee S, Abecasis G, Boehnke M, Lin X. Rare-variant association analysis: study designs and statistical tests. Am J Hum Genet 2014; 95:5-23. [PMID: 24995866 DOI: 10.1016/j.ajhg.2014.06.009] [Citation(s) in RCA: 658] [Impact Index Per Article: 65.8] [Received: 01/02/2014] [Indexed: 12/30/2022] Open
Abstract
Despite the extensive discovery of trait- and disease-associated common variants, much of the genetic contribution to complex traits remains unexplained. Rare variants can explain additional disease risk or trait variability. An increasing number of studies are underway to identify trait- and disease-associated rare variants. In this review, we provide an overview of statistical issues in rare-variant association studies with a focus on study designs and statistical tests. We present the design and analysis pipeline of rare-variant studies and review cost-effective sequencing designs and genotyping platforms. We compare various gene- or region-based association tests, including burden tests, variance-component tests, and combined omnibus tests, in terms of their assumptions and performance. Also discussed are the related topics of meta-analysis, population-stratification adjustment, genotype imputation, follow-up studies, and heritability due to rare variants. We provide guidelines for analysis and discuss some of the challenges inherent in these studies and future research directions.
|
72
|
Park DJ, Nguyen-Dumont T, Kang S, Verspoor K, Pope BJ. Annokey: an annotation tool based on key term search of the NCBI Entrez Gene database. Source Code for Biology and Medicine 2014. [PMCID: PMC4106183 DOI: 10.1186/1751-0473-9-15] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/24/2022]
Abstract
Background The NCBI Entrez Gene and PubMed databases contain a wealth of high-quality information about genes for many different organisms. The NCBI Entrez online web-search interface is convenient for simple manual search for a small number of genes but impractical for the kinds of outputs seen in typical genomics projects. Results We have developed an efficient open source tool implemented in Python called Annokey, which annotates gene lists with the results of a keyword search of the NCBI Entrez Gene database and linked Pubmed article information. The user steers the search by specifying a ranked list of keywords (including multi-word phrases and regular expressions) that are correlated with their topic of interest. Rank information of matched terms allows the user to guide further investigation. We applied Annokey to the entire human Entrez Gene database using the key-term “DNA repair” and assessed its performance in identifying the 176 members of a published “gold standard” list of genes established to be involved in this pathway. For this test case we observed a sensitivity and specificity of 97% and 96%, respectively. Conclusions Annokey facilitates the identification of genes related to an area of interest, a task which can be onerous if performed manually on a large number of genes. Annokey provides a way to capitalize on the high quality information provided by the Entrez Gene database allowing both scalability and compatibility with automated analysis pipelines, thus offering the potential to significantly enhance research productivity.
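The ranked key-term search described in this abstract can be illustrated with a minimal Python sketch. This is not Annokey's code or interface: the function names, the term list, and the toy gene summaries are hypothetical, and the sketch assumes that a term's rank is simply its position in the user-supplied list.

```python
import re
from typing import Dict, List, Optional, Tuple


def best_rank(text: str, ranked_terms: List[str]) -> Optional[Tuple[int, str]]:
    """Return (rank, term) for the highest-ranked term found in text, else None.

    Each term is treated as a regular expression, so plain words,
    multi-word phrases, and patterns are handled the same way.
    Rank 1 is the first (most relevant) term in the list.
    """
    for rank, pattern in enumerate(ranked_terms, start=1):
        if re.search(pattern, text, flags=re.IGNORECASE):
            return rank, pattern
    return None


def annotate(genes: Dict[str, str], ranked_terms: List[str]) -> List[Tuple[str, int, str]]:
    """Annotate each gene summary with its best-ranked matching term.

    Genes with no match are dropped; results are sorted by rank, then name.
    """
    hits = []
    for gene, summary in genes.items():
        match = best_rank(summary, ranked_terms)
        if match is not None:
            hits.append((gene, match[0], match[1]))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Hypothetical summaries, invented for illustration only.
    toy_genes = {
        "GENE_A": "Involved in mismatch recognition.",
        "GENE_B": "Participates in DNA repair and mismatch correction.",
        "GENE_C": "Structural protein of the cytoskeleton.",
    }
    terms = ["DNA repair", r"double[- ]strand break", "mismatch"]
    for gene, rank, term in annotate(toy_genes, terms):
        print(gene, rank, term)
```

Reporting only the best-ranked match per gene keeps the output short enough to scan; a fuller tool would presumably also record which database field matched.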
|
73
|
Mosley JD, Van Driest SL, Weeke PE, Delaney JT, Wells QS, Bastarache L, Roden DM, Denny JC. Integrating EMR-linked and in vivo functional genetic data to identify new genotype-phenotype associations. PLoS One 2014; 9:e100322. [PMID: 24949630 PMCID: PMC4065041 DOI: 10.1371/journal.pone.0100322] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/17/2014] [Accepted: 05/25/2014] [Indexed: 12/31/2022] Open
Abstract
The coupling of electronic medical records (EMR) with genetic data has created the potential for implementing reverse genetic approaches in humans, whereby the function of a gene is inferred from the shared pattern of morbidity among homozygotes of a genetic variant. We explored the feasibility of this approach to identify phenotypes associated with low frequency variants using Vanderbilt's EMR-based BioVU resource. We analyzed 1,658 low frequency non-synonymous SNPs (nsSNPs) with a minor allele frequency (MAF)<10% collected on 8,546 subjects. For each nsSNP, we identified diagnoses shared by at least 2 minor allele homozygotes and with an association p<0.05. The diagnoses were reviewed by a clinician to ascertain whether they may share a common mechanistic basis. While a number of biologically compelling clinical patterns of association were observed, the frequency of these associations was identical to that observed using genotype-permuted data sets, indicating that the associations were likely due to chance. To refine our analysis associations, we then restricted the analysis to 711 nsSNPs in genes with phenotypes in the On-line Mendelian Inheritance in Man (OMIM) or knock-out mouse phenotype databases. An initial comparison of the EMR diagnoses to the known in vivo functions of the gene identified 25 candidate nsSNPs, 19 of which had significant genotype-phenotype associations when tested using matched controls. Twelve of the 19 nsSNP associations were confirmed by a detailed record review. Four of 12 nsSNP-phenotype associations were successfully replicated in an independent data set: thrombosis (F5,rs6031), seizures/convulsions (GPR98,rs13157270), macular degeneration (CNGB3,rs3735972), and GI bleeding (HGFAC,rs16844401). These analyses demonstrate the feasibility and challenges of using reverse genetics approaches to identify novel gene-phenotype associations in human subjects using low frequency variants. As increasing amounts of rare variant data are generated from modern genotyping and sequence platforms, model organism data may be an important tool to enable discovery.
Affiliation(s)
- Jonathan D. Mosley
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Sara L. Van Driest
  - Department of Pediatrics, Vanderbilt University, Nashville, Tennessee, United States of America
- Peter E. Weeke
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Jessica T. Delaney
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Quinn S. Wells
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Lisa Bastarache
  - Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, United States of America
- Dan M. Roden
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
- Josh C. Denny
  - Department of Medicine, Vanderbilt University, Nashville, Tennessee, United States of America
  - Biomedical Informatics, Vanderbilt University, Nashville, Tennessee, United States of America
|
74
|
Li MJ, Yan B, Sham PC, Wang J. Exploring the function of genetic variants in the non-coding genomic regions: approaches for identifying human regulatory variants affecting gene expression. Brief Bioinform 2014; 16:393-412. [PMID: 24916300 DOI: 10.1093/bib/bbu018] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 01/22/2014] [Accepted: 04/23/2014] [Indexed: 12/13/2022] Open
Abstract
Understanding the genetic basis of human traits/diseases and the underlying mechanisms of how these traits/diseases are affected by genetic variations is critical for public health. Current genome-wide functional genomics data uncovered a large number of functional elements in the noncoding regions of human genome, providing new opportunities to study regulatory variants (RVs). RVs play important roles in transcription factor bindings, chromatin states and epigenetic modifications. Here, we systematically review an array of methods currently used to map RVs as well as the computational approaches in annotating and interpreting their regulatory effects, with emphasis on regulatory single-nucleotide polymorphism. We also briefly introduce experimental methods to validate these functional RVs.
|
75
|
|
76
|
Managing incidental findings in exome sequencing for research. Methods Mol Biol 2014; 1168:207-25. [PMID: 24870138 DOI: 10.1007/978-1-4939-0847-9_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2023]
Abstract
Exome sequencing for research has become available for broadly based genomic studies as well as smaller targeted investigations. New exome research projects being considered will intentionally process a large amount of common and rare DNA variation for the purpose of finding specific links between genotype and phenotype. However, the risks of uncovering a clinically relevant incidental finding are not uniform across projects but are highly dependent on the question being asked and exactly how it is intended to be answered. Factors that influence the possibility of revealing a clinically relevant incidental DNA variation include the following: the overall design of the study and the number of participants involved, the mode of inheritance of the phenotype including whether the phenotype is likely to have a monogenic or a complex inheritance, whether the study is assessing a known list of genes or not, and whether the causative DNA variation is likely to be rare or common. Importantly, differing bioinformatics DNA variant filtering strategies strongly influence the odds of discovering an incidental finding. This chapter provides a framework for understanding and assessing the likelihood of discovering clinically relevant, incidental DNA variations that are not directly related to the question being addressed in a particular exome research project. It also outlines DNA variant filtering and functional informatics approaches that can investigate specific genomic questions while minimizing the risks of uncovering an incidental finding.
|
77
|
Lohmueller KE. The impact of population demography and selection on the genetic architecture of complex traits. PLoS Genet 2014; 10:e1004379. [PMID: 24875776 PMCID: PMC4038606 DOI: 10.1371/journal.pgen.1004379] [Citation(s) in RCA: 98] [Impact Index Per Article: 9.8] [Received: 06/21/2013] [Accepted: 03/28/2014] [Indexed: 02/06/2023] Open
Abstract
Population genetic studies have found evidence for dramatic population growth in recent human history. It is unclear how this recent population growth, combined with the effects of negative natural selection, has affected patterns of deleterious variation, as well as the number, frequency, and effect sizes of mutations that contribute risk to complex traits. Because researchers are performing exome sequencing studies aimed at uncovering the role of low-frequency variants in the risk of complex traits, this topic is of critical importance. Here I use simulations under population genetic models where a proportion of the heritability of the trait is accounted for by mutations in a subset of the exome. I show that recent population growth increases the proportion of nonsynonymous variants segregating in the population, but does not affect the genetic load relative to a population that did not expand. Under a model where a mutation's effect on a trait is correlated with its effect on fitness, rare variants explain a greater portion of the additive genetic variance of the trait in a population that has recently expanded than in a population that did not recently expand. Further, when using a single-marker test, for a given false-positive rate and sample size, recent population growth decreases the expected number of significant associations with the trait relative to the number detected in a population that did not expand. However, in a model where there is no correlation between a mutation's effect on fitness and the effect on the trait, common variants account for much of the additive genetic variance, regardless of demography. Moreover, here demography does not affect the number of significant associations detected. These findings suggest recent population history may be an important factor influencing the power of association tests and in accounting for the missing heritability of certain complex traits.
Affiliation(s)
- Kirk E Lohmueller
  - Department of Ecology and Evolutionary Biology, Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
|
78
|
Fritsche LG, Fariss RN, Stambolian D, Abecasis GR, Curcio CA, Swaroop A. Age-related macular degeneration: genetics and biology coming together. Annu Rev Genomics Hum Genet 2014; 15:151-71. [PMID: 24773320 DOI: 10.1146/annurev-genom-090413-025610] [Citation(s) in RCA: 340] [Impact Index Per Article: 34.0] [Indexed: 12/17/2022]
Abstract
Genetic and genomic studies have enhanced our understanding of complex neurodegenerative diseases that exert a devastating impact on individuals and society. One such disease, age-related macular degeneration (AMD), is a major cause of progressive and debilitating visual impairment. Since the pioneering discovery in 2005 of complement factor H (CFH) as a major AMD susceptibility gene, extensive investigations have confirmed 19 additional genetic risk loci, and more are anticipated. In addition to common variants identified by now-conventional genome-wide association studies, targeted genomic sequencing and exome-chip analyses are uncovering rare variant alleles of high impact. Here, we provide a critical review of the ongoing genetic studies and of common and rare risk variants at a total of 20 susceptibility loci, which together explain 40-60% of the disease heritability but provide limited power for diagnostic testing of disease risk. Identification of these susceptibility loci has begun to untangle the complex biological pathways underlying AMD pathophysiology, pointing to new testable paradigms for treatment.
Affiliation(s)
- Lars G Fritsche
  - Center for Statistical Genetics, Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan 48109
|
79
|
Pulit SL, Leusink M, Menelaou A, de Bakker PIW. Association claims in the sequencing era. Genes (Basel) 2014; 5:196-213. [PMID: 24705293 PMCID: PMC3978519 DOI: 10.3390/genes5010196] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/26/2013] [Revised: 02/24/2014] [Accepted: 02/24/2014] [Indexed: 12/13/2022] Open
Abstract
Since the completion of the Human Genome Project, the field of human genetics has been in great flux, largely due to technological advances in studying DNA sequence variation. Although community-wide adoption of statistical standards was key to the success of genome-wide association studies, similar standards have not yet been globally applied to the processing and interpretation of sequencing data. It has proven particularly challenging to pinpoint unequivocally disease variants in sequencing studies of polygenic traits. Here, we comment on a number of factors that may contribute to irreproducible claims of association in scientific literature and discuss possible steps that we can take towards cultural change.
Affiliation(s)
- Sara L Pulit
  - Department of Medical Genetics, Institute for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands.
- Maarten Leusink
  - Department of Medical Genetics, Institute for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands.
- Androniki Menelaou
  - Department of Medical Genetics, Institute for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands.
- Paul I W de Bakker
  - Department of Medical Genetics, Institute for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands.
|
80
|
Ratnapriya R, Chew EY. Age-related macular degeneration-clinical review and genetics update. Clin Genet 2014; 84:160-6. [PMID: 23713713 PMCID: PMC3732788 DOI: 10.1111/cge.12206] [Citation(s) in RCA: 109] [Impact Index Per Article: 10.9] [Received: 04/19/2013] [Revised: 05/23/2013] [Accepted: 05/23/2013] [Indexed: 12/19/2022]
Abstract
Age-related macular degeneration (AMD) is the leading cause of central vision impairment in persons over the age of 50 years in developed countries. Both genetic and non-genetic (environmental) factors play major roles in AMD etiology, and multiple gene variants and lifestyle factors such as smoking have been associated with the disease. While dissecting the basic etiology of the disease remains a major challenge, current genetic knowledge has provided opportunities for improved risk assessment, molecular diagnosis and clinical testing of genetic variants in AMD treatment and management. This review addresses the potential of translating the wealth of genetic findings for improved risk prediction and therapeutic intervention in AMD patients. Finally, we discuss the recent advancement in genetics and genomics and the future prospective of personalized medicine in AMD patients.
Affiliation(s)
- R Ratnapriya
  - Neurobiology-Neurodegeneration and Repair Laboratory, National Institutes of Health, Bethesda, MD, USA
|
81
|
Li B, Liu DJ, Leal SM. Identifying rare variants associated with complex traits via sequencing. Curr Protoc Hum Genet 2014; Chapter 1:Unit 1.26. [PMID: 23853079 DOI: 10.1002/0471142905.hg0126s78] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 12/12/2022]
Abstract
Although genome-wide association studies have been successful in detecting associations with common variants, there is currently an increasing interest in identifying low-frequency and rare variants associated with complex traits. Next-generation sequencing technologies make it feasible to survey the full spectrum of genetic variation in coding regions or the entire genome. The association analysis for rare variants is challenging, and traditional methods are ineffective, however, due to the low frequency of rare variants, coupled with allelic heterogeneity. Recently a battery of new statistical methods has been proposed for identifying rare variants associated with complex traits. These methods test for associations by aggregating multiple rare variants across a gene or a genomic region or among a group of variants in the genome. In this unit, we describe key concepts for rare variant association for complex traits, survey some of the recent methods, discuss their statistical power under various scenarios, and provide practical guidance on analyzing next-generation sequencing data for identifying rare variants associated with complex traits.
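The aggregation idea mentioned in this abstract can be sketched as a simple collapsing-style burden test. The Python below is only an illustration under stated assumptions, not the authors' procedure: the function names, the 1% allele-frequency cutoff, and the genotype layout (rows are individuals, columns are variants, entries are minor-allele counts) are hypothetical choices. Each individual is reduced to a carrier/non-carrier flag for rare variants in the region, and carrier counts in cases and controls are compared with a one-sided Fisher exact test.

```python
from math import comb
from typing import List, Sequence


def carrier_flags(genotypes: Sequence[Sequence[int]],
                  mafs: Sequence[float],
                  maf_cutoff: float = 0.01) -> List[int]:
    """Collapse a region into one flag per individual.

    A flag is 1 if the individual carries at least one minor allele at any
    variant whose minor allele frequency is below maf_cutoff, else 0.
    """
    rare_columns = [j for j, maf in enumerate(mafs) if maf < maf_cutoff]
    return [int(any(row[j] > 0 for j in rare_columns)) for row in genotypes]


def fisher_one_sided(case_carriers: int, case_noncarriers: int,
                     control_carriers: int, control_noncarriers: int) -> float:
    """One-sided Fisher exact p-value for an excess of carriers among cases.

    Sums hypergeometric probabilities for carrier counts in cases that are
    at least as large as the observed count, with table margins fixed.
    """
    n_cases = case_carriers + case_noncarriers
    n_carriers = case_carriers + control_carriers
    total = n_cases + control_carriers + control_noncarriers
    upper = min(n_cases, n_carriers)
    tail = sum(
        comb(n_carriers, k) * comb(total - n_carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return tail / comb(total, n_cases)


def burden_p_value(case_genotypes, control_genotypes, mafs, maf_cutoff=0.01) -> float:
    """Collapse cases and controls separately, then test the 2x2 carrier table."""
    cases = carrier_flags(case_genotypes, mafs, maf_cutoff)
    controls = carrier_flags(control_genotypes, mafs, maf_cutoff)
    a, c = sum(cases), sum(controls)
    return fisher_one_sided(a, len(cases) - a, c, len(controls) - c)
```

Collapsing to a single flag is the simplest form of aggregation. It loses power when causal variants act in opposite directions, which is the situation that variance-component tests are designed to handle.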
Affiliation(s)
- Bingshan Li
  - Department of Molecular Physiology and Biophysics, Center for Human Genetics Research, Vanderbilt University, Nashville, Tennessee, USA
|
82
|
Samuels ME, Hasselmann C, Deal CL, Deladoey J, Vliet GV. Whole-exome sequencing: opportunities in pediatric endocrinology. Per Med 2014; 11:63-78. [PMID: 29751389 DOI: 10.2217/pme.13.96] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/03/2023]
Abstract
Pediatric endocrinology services see a wide variety of patients with diverse clinical symptoms, including disorders of growth, metabolism, bone and sexual development. Molecular diagnosis plays an important role in this branch of medicine. Traditional PCR-based Sanger sequencing is a mainstay format for molecular testing in pediatric cases despite its relatively high cost, but the large number of gene defects associated with the various endocrine disorders renders gene-by-gene testing increasingly unattractive. Using new high-throughput sequencing technologies, whole genomes, whole exomes or candidate-gene panels (targeted gene sequencing) can now be cost-effectively sequenced for endocrine patients. Based on our own recent experiences with exome sequencing in a research context, we describe the general clinical ascertainment of relevant pediatric endocrine patients, compare different formats for next-generation sequencing and provide examples. Our view is that protocols involving next-generation sequencing should now be considered as an appropriate component of routine clinical diagnosis for relevant patients.
Affiliation(s)
- Mark E Samuels
  - Endocrinology Service, Department of Pediatrics, Université de Montréal & Centre de Recherche du CHU Ste-Justine, Montreal, QC, Canada; Department of Medicine, Centre de Recherche du CHU Ste-Justine, Montreal, QC, Canada.
- Caroline Hasselmann
  - Endocrinology Service, Department of Pediatrics, Université de Montréal & Centre de Recherche du CHU Ste-Justine, Montreal, QC, Canada
- Cheri L Deal
  - Endocrinology Service, Department of Pediatrics, Université de Montréal & Centre de Recherche du CHU Ste-Justine, Montreal, QC, Canada
- Johnny Deladoey
  - Endocrinology Service, Department of Pediatrics, Université de Montréal & Centre de Recherche du CHU Ste-Justine, Montreal, QC, Canada
- Guy Van Vliet
  - Endocrinology Service, Department of Pediatrics, Université de Montréal & Centre de Recherche du CHU Ste-Justine, Montreal, QC, Canada
|
83
|
Shcherbakova NV, Meshkov AN, Boytsov SA. Exome sequencing and the diagnostics of complex disease predisposition in preventive medicine. КАРДИОВАСКУЛЯРНАЯ ТЕРАПИЯ И ПРОФИЛАКТИКА 2013. [DOI: 10.15829/1728-8800-2013-6-24-28] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/01/2022] Open
Abstract
The further development of preventive medicine in the 21st century may be impossible without the assessment of individual patients’ genetic data. At present, genetic methods are the gold standard in the diagnostics of monogenic diseases. Recently, genetic factors linked to complex (multifactorial) diseases, such as coronary heart disease and arterial hypertension, have been actively explored. This review focuses on the possible identification of new genetic factors of complex disease heritability, using the exome sequencing approach.
Affiliation(s)
- A. N. Meshkov
- State Research Centre for Preventive Medicine, Moscow
- S. A. Boytsov
- State Research Centre for Preventive Medicine, Moscow
|
84
|
Berglund EC, Lindqvist CM, Hayat S, Övernäs E, Henriksson N, Nordlund J, Wahlberg P, Forestier E, Lönnerholm G, Syvänen AC. Accurate detection of subclonal single nucleotide variants in whole genome amplified and pooled cancer samples using HaloPlex target enrichment. BMC Genomics 2013; 14:856. [PMID: 24314227 PMCID: PMC4046713 DOI: 10.1186/1471-2164-14-856] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 05/08/2013] [Accepted: 11/25/2013] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND Target enrichment and resequencing is a widely used approach for identification of cancer genes and genetic variants associated with diseases. Although cost effective compared to whole genome sequencing, analysis of many samples constitutes a significant cost, which could be reduced by pooling samples before capture. Another limitation to the number of cancer samples that can be analyzed is often the amount of available tumor DNA. We evaluated the performance of whole genome amplified DNA and the power to detect subclonal somatic single nucleotide variants in non-indexed pools of cancer samples using the HaloPlex technology for target enrichment and next generation sequencing. RESULTS We captured a set of 1528 putative somatic single nucleotide variants and germline SNPs, which were identified by whole genome sequencing, with the HaloPlex technology and sequenced to a depth of 792-1752. We found that the allele fractions of the analyzed variants are well preserved during whole genome amplification and that capture specificity or variant calling is not affected. We detected a large majority of the known single nucleotide variants present uniquely in one sample with allele fractions as low as 0.1 in non-indexed pools of up to ten samples. We also identified and experimentally validated six novel variants in the samples included in the pools. CONCLUSION Our work demonstrates that whole genome amplified DNA can be used for target enrichment as well as genomic DNA and that accurate variant detection is possible in non-indexed pools of cancer samples. These findings show that analysis of a large number of samples is feasible at low cost, even when only small amounts of DNA are available, and thereby significantly increases the chances of identifying recurrent mutations in cancer samples.
Affiliation(s)
- Eva C Berglund
- Department of Medical Sciences, Molecular Medicine and Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
|
85
|
Lohmueller KE, Sparsø T, Li Q, Andersson E, Korneliussen T, Albrechtsen A, Banasik K, Grarup N, Hallgrimsdottir I, Kiil K, Kilpeläinen TO, Krarup NT, Pers TH, Sanchez G, Hu Y, Degiorgio M, Jørgensen T, Sandbæk A, Lauritzen T, Brunak S, Kristiansen K, Li Y, Hansen T, Wang J, Nielsen R, Pedersen O. Whole-exome sequencing of 2,000 Danish individuals and the role of rare coding variants in type 2 diabetes. Am J Hum Genet 2013; 93:1072-86. [PMID: 24290377 DOI: 10.1016/j.ajhg.2013.11.005] [Citation(s) in RCA: 116] [Impact Index Per Article: 10.5] [Received: 06/07/2013] [Revised: 10/16/2013] [Accepted: 11/04/2013] [Indexed: 12/15/2022] Open
Abstract
It has been hypothesized that, in aggregate, rare variants in coding regions of genes explain a substantial fraction of the heritability of common diseases. We sequenced the exomes of 1,000 Danish cases with common forms of type 2 diabetes (including body mass index > 27.5 kg/m² and hypertension) and 1,000 healthy controls to an average depth of 56×. Our simulations suggest that our study had the statistical power to detect at least one causal gene (a gene containing causal mutations) if the heritability of these common diseases was explained by rare variants in the coding regions of a limited number of genes. We applied a series of gene-based tests to detect such susceptibility genes. However, no gene showed a significant association with disease risk after we corrected for the number of genes analyzed. Thus, we could reject a model for the genetic architecture of type 2 diabetes where rare nonsynonymous variants clustered in a modest number of genes (fewer than 20) are responsible for the majority of disease risk.
Affiliation(s)
- Kirk E Lohmueller
- Department of Integrative Biology, University of California, Berkeley, Berkeley, CA 94720, USA
|
86
|
Varshney GK, Burgess SM. Mutagenesis and phenotyping resources in zebrafish for studying development and human disease. Brief Funct Genomics 2013; 13:82-94. [PMID: 24162064 DOI: 10.1093/bfgp/elt042] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 01/09/2023] Open
Abstract
The zebrafish (Danio rerio) is an important model organism for studying development and human disease. The zebrafish has an excellent reference genome and the functions of hundreds of genes have been tested using both forward and reverse genetic approaches. Recent years have seen an increasing number of large-scale mutagenesis projects and the number of mutants or gene knockouts in zebrafish has increased rapidly, including for the first time conditional knockout technologies. In addition, targeted mutagenesis techniques such as zinc finger nucleases, transcription activator-like effector nucleases and clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated (Cas) systems have all been shown to effectively target zebrafish genes, and the first germline homologous recombination has been reported, further expanding the utility and power of zebrafish genetics. Given this explosion of mutagenesis resources, it is now possible to perform systematic, high-throughput phenotype analysis of all zebrafish gene knockouts.
Affiliation(s)
- Gaurav Kumar Varshney
- Developmental Genomics Section, Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA.
|
87
|
Ratnapriya R, Swaroop A. Genetic architecture of retinal and macular degenerative diseases: the promise and challenges of next-generation sequencing. Genome Med 2013; 5:84. [PMID: 24112618 PMCID: PMC4066589 DOI: 10.1186/gm488] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 12/14/2022] Open
Abstract
Inherited retinal degenerative diseases (RDDs) display wide variation in their mode of inheritance, underlying genetic defects, age of onset, and phenotypic severity. Molecular mechanisms have not been delineated for many retinal diseases, and treatment options are limited. In most instances, genotype-phenotype correlations have not been elucidated because of extensive clinical and genetic heterogeneity. Next-generation sequencing (NGS) methods, including exome, genome, transcriptome and epigenome sequencing, provide novel avenues towards achieving comprehensive understanding of the genetic architecture of RDDs. Whole-exome sequencing (WES) has already revealed several new RDD genes, whereas RNA-Seq and ChIP-Seq analyses are expected to uncover novel aspects of gene regulation and biological networks that are involved in retinal development, aging and disease. In this review, we focus on the genetic characterization of retinal and macular degeneration using NGS technology and discuss the basic framework for further investigations. We also examine the challenges of NGS application in clinical diagnosis and management.
Affiliation(s)
- Rinki Ratnapriya
- Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, Bethesda, MD 20892, USA
- Anand Swaroop
- Neurobiology-Neurodegeneration and Repair Laboratory, National Eye Institute, National Institutes of Health, Bethesda, MD 20892, USA
|
88
|
Wolock S, Yates A, Petrill SA, Bohland JW, Blair C, Li N, Machiraju R, Huang K, Bartlett CW. Gene × smoking interactions on human brain gene expression: finding common mechanisms in adolescents and adults. J Child Psychol Psychiatry 2013; 54:1109-19. [PMID: 23909413 PMCID: PMC3809890 DOI: 10.1111/jcpp.12119] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Accepted: 06/04/2013] [Indexed: 12/25/2022]
Abstract
BACKGROUND Numerous studies have examined gene × environment interactions (G × E) in cognitive and behavioral domains. However, these studies have been limited in that they have not been able to directly assess differential patterns of gene expression in the human brain. Here, we assessed G × E interactions using two publicly available datasets to assess if DNA variation is associated with post-mortem brain gene expression changes based on smoking behavior, a biobehavioral construct that is part of a complex system of genetic and environmental influences. METHODS We conducted an expression quantitative trait locus (eQTL) study on two independent human brain gene expression datasets assessing G × E for selected psychiatric genes and smoking status. We employed linear regression to model the significance of the Gene × Smoking interaction term, followed by meta-analysis across datasets. RESULTS Overall, we observed that the effect of DNA variation on gene expression is moderated by smoking status. Expression of 16 genes was significantly associated with single nucleotide polymorphisms that demonstrated G × E effects. The strongest finding (p = 1.9 × 10⁻¹¹) was neurexin 3-alpha (NRXN3), a synaptic cell-cell adhesion molecule involved in maintenance of neural connections (such as the maintenance of smoking behavior). Other significant G × E associations include four glutamate genes. CONCLUSIONS This is one of the first studies to demonstrate G × E effects within the human brain. In particular, this study implicated NRXN3 in the maintenance of smoking. The effect of smoking on NRXN3 expression and downstream behavior is different based upon SNP genotype, indicating that DNA profiles based on SNPs could be useful in understanding the effects of smoking behaviors. These results suggest that better measurement of psychiatric conditions and the environment in post-mortem brain studies may yield an important avenue for understanding the biological mechanisms of G × E interactions in psychiatry.
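The interaction term mentioned in the METHODS can be made concrete with a small sketch. It assumes a binary smoking indicator and the model expression = b0 + b1·genotype + b2·smoking + b3·genotype·smoking; under that assumption the least-squares estimate of b3 equals the genotype slope among smokers minus the slope among non-smokers. The function names and toy data are hypothetical, and the significance testing and meta-analysis described in the abstract are not reproduced here.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def interaction_coefficient(genotypes, smoker, expression):
    """Estimate b3 as the difference in genotype slopes between strata.

    Valid for a binary smoker indicator (0/1) when the model also contains
    the genotype and smoking main effects.
    """
    groups = {0: ([], []), 1: ([], [])}
    for g, s, y in zip(genotypes, smoker, expression):
        groups[s][0].append(g)
        groups[s][1].append(y)
    return ols_slope(*groups[1]) - ols_slope(*groups[0])


# Toy data: expression rises by 0.5 per allele in non-smokers, 1.5 in smokers.
genotypes = [0, 1, 2, 0, 1, 2]
smoker = [0, 0, 0, 1, 1, 1]
expression = [1.0, 1.5, 2.0, 2.0, 3.5, 5.0]
b3 = interaction_coefficient(genotypes, smoker, expression)
```

A non-zero b3 means the effect of genotype on expression differs by smoking status, which is the sense in which the abstract says the effect is "moderated".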
Affiliation(s)
- Samuel Wolock
- Battelle Center for Mathematical Medicine, Nationwide Children’s Hospital, Columbus, OH, USA
- Andrew Yates
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, USA
- Jason W. Bohland
- Department of Health Sciences, Boston University, Boston, MA, USA
- Clancy Blair
- Department of Applied Psychology, New York University, New York, NY, USA
- Ning Li
- Battelle Center for Mathematical Medicine, Nationwide Children’s Hospital, Columbus, OH, USA
- Raghu Machiraju
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, USA; Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA
- Kun Huang
- Department of Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, USA; Department of Computer Science and Engineering, The Ohio State University, Columbus, OH, USA; The CCC Biomedical Informatics Shared Resource, The Ohio State University, Columbus, OH, USA
- Christopher W. Bartlett
- Battelle Center for Mathematical Medicine, Nationwide Children’s Hospital, Columbus, OH, USA; Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
|
89
|
Matullo G, Di Gaetano C, Guarrera S. Next generation sequencing and rare genetic variants: from human population studies to medical genetics. Environmental and Molecular Mutagenesis 2013; 54:518-532. [PMID: 23922201 DOI: 10.1002/em.21799] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 02/20/2013] [Revised: 05/31/2013] [Accepted: 06/09/2013] [Indexed: 06/02/2023]
Abstract
The allelic frequency spectrum emerging from several Next Generation Sequencing (NGS) projects is revealing important details about evolutionary and demographic forces that shaped the human genome. Herein, we discuss some of the achievements of the use of low-frequency and rare variants from NGS studies. The majority of variants that affect protein-coding regions are recent and rare. Often, the novel rare variants are enriched for deleterious alleles and are population-specific, making them suitable for the study of disease susceptibility. To investigate this kind of variation and its effects in association studies, very large sample sizes will be necessary to achieve sufficient statistical power. Moreover, as these variants are typically population-specific, the replication of disease associations across populations could be very difficult due to population stratification. Therefore, the design of experiments focusing on the identification of rare variants and their effects should be carefully planned. Although several successes have already been achieved through NGS for genetic epidemiology, pharmacogenetic and clinical purposes, with improvements of the sequencing technology and decreased costs, further advances are expected in the near future.
Affiliation(s)
- Giuseppe Matullo
- Dipartimento di Scienze Mediche, Università di Torino, Torino, Italy.
|
90
|
Li MJ, Wang LY, Xia Z, Sham PC, Wang J. GWAS3D: Detecting human regulatory variants by integrative analysis of genome-wide associations, chromosome interactions and histone modifications. Nucleic Acids Res 2013; 41:W150-8. [PMID: 23723249 PMCID: PMC3692118 DOI: 10.1093/nar/gkt456] [Citation(s) in RCA: 79] [Impact Index Per Article: 7.2] [Received: 02/14/2013] [Revised: 04/15/2013] [Accepted: 05/06/2013] [Indexed: 12/29/2022] Open
Abstract
Interpreting the genetic variants located in regulatory regions, such as enhancers and promoters, is an indispensable step in understanding the molecular mechanisms of complex traits. Recent studies show that genetic variants detected by genome-wide association study (GWAS) are significantly enriched in the regulatory regions. Therefore, detecting, annotating and prioritizing genetic variants affecting gene regulation is critical to our understanding of genotype-phenotype relationships. Here, we developed a web server, GWAS3D, to systematically analyze the genetic variants that could affect regulatory elements, by integrating annotations from cell type-specific chromatin states, epigenetic modifications, sequence motifs and cross-species conservation. The regulatory elements are inferred from the genome-wide chromosome interaction data, chromatin marks in 16 different cell types and 73 regulatory factor motifs from the Encyclopedia of DNA Elements project. Furthermore, we used these functional elements, as well as risk haplotype, binding affinity, conservation and P-values reported from the original GWAS to reprioritize the genetic variants. Using studies from low-density lipoprotein cholesterol, we demonstrated that our reprioritizing approach was effective and cell type specific. In conclusion, GWAS3D provides a comprehensive annotation and visualization tool to help users interpret their results. The web server is freely available at http://jjwanglab.org/gwas3d.
Affiliation(s)
- Mulin Jun Li
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, Guangdong 518057, China, Department of Anaesthesiology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Department of Psychiatry LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China and State Key Laboratory in Cognitive and Brain Sciences, The University of Hong Kong, Hong Kong SAR, China
- Lily Yan Wang
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, Guangdong 518057, China, Department of Anaesthesiology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Department of Psychiatry LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China and State Key Laboratory in Cognitive and Brain Sciences, The University of Hong Kong, Hong Kong SAR, China
- Zhengyuan Xia
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, Guangdong 518057, China, Department of Anaesthesiology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Department of Psychiatry LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China and State Key Laboratory in Cognitive and Brain Sciences, The University of Hong Kong, Hong Kong SAR, China
- Pak Chung Sham
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, Guangdong 518057, China, Department of Anaesthesiology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Department of Psychiatry LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China and State Key Laboratory in Cognitive and Brain Sciences, The University of Hong Kong, Hong Kong SAR, China
- Junwen Wang
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Shenzhen Institute of Research and Innovation, The University of Hong Kong, Shenzhen, Guangdong 518057, China, Department of Anaesthesiology, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Centre for Genomic Sciences, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China, Department of Psychiatry LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China and State Key Laboratory in Cognitive and Brain Sciences, The University of Hong Kong, Hong Kong SAR, China
|
91
|
Fang H, Hou B, Wang Q, Yang Y. Rare variants analysis by risk-based variable-threshold method. Comput Biol Chem 2013; 46:32-8. [PMID: 23764529 DOI: 10.1016/j.compbiolchem.2013.04.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/20/2012] [Revised: 04/03/2013] [Accepted: 04/10/2013] [Indexed: 11/17/2022]
Abstract
Genome-wide association studies, as a powerful approach for detecting common variants associated with diseases, have revealed many disease-associated loci. However, the traditional association analysis methods do not have enough power for detecting the effects of rare variants with limited sample size. As a solution to this problem, pooling rare variants by their functions into a composite variant provides an alternative way of identifying susceptibility genes. In this paper, we propose a new pooling method to test the variant-disease association and to identify the functional rare variants related to the disease. Variants with smaller and larger risk measures, defined as the ratio of allele frequencies between cases and controls, are pooled and a chi-square test of the resultant pooled table is calculated. We vary the threshold of pooling over all possible values and use the maximal chi-square as the test statistic. The maximal chi-square is in fact the global maximum over all possible poolings. Our approach is similar to the existing variable-threshold method, but we threshold on the risk measure instead of allele frequencies of controls. Simulation results show that our method performs better in both association testing and variant selection.
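The thresholding idea in this abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it pools only variants whose risk measure is at or above each candidate threshold, adds a 0.5 pseudo-count to avoid division by zero, and uses an uncorrected Pearson chi-square on a 2×2 table of pooled alleles versus remaining chromosomes. All function and variable names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def max_chi_square(case_counts, control_counts, n_case_alleles, n_control_alleles):
    """Maximal chi-square over all risk-measure thresholds (simplified sketch).

    case_counts / control_counts: rare-allele counts per variant.
    n_case_alleles / n_control_alleles: total chromosomes in each group.
    """
    # Risk measure: ratio of case to control allele frequency per variant.
    # The 0.5 pseudo-count is an assumption of this sketch.
    risks = [
        ((ca + 0.5) / n_case_alleles) / ((co + 0.5) / n_control_alleles)
        for ca, co in zip(case_counts, control_counts)
    ]
    best = 0.0
    for threshold in sorted(set(risks)):
        # Pool every variant whose risk measure reaches the threshold.
        pooled_case = sum(ca for ca, r in zip(case_counts, risks) if r >= threshold)
        pooled_ctrl = sum(co for co, r in zip(control_counts, risks) if r >= threshold)
        stat = chi_square_2x2(
            pooled_case, n_case_alleles - pooled_case,
            pooled_ctrl, n_control_alleles - pooled_ctrl,
        )
        best = max(best, stat)
    return best


# Toy example: three variants, 200 chromosomes per group.
best_stat = max_chi_square([5, 0, 3], [1, 4, 3], 200, 200)
```

Because the statistic is maximized over thresholds, it no longer follows a standard chi-square distribution; a permutation of case/control labels is one common way to obtain a p-value for a maximized statistic of this kind.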
Affiliation(s)
- Hongyan Fang
- Department of Statistics and Finance, University of Science and Technology of China, Hefei, Anhui 230026, China
|
92
|
Auer PL, Wang G, Leal SM. Testing for rare variant associations in the presence of missing data. Genet Epidemiol 2013; 37:529-38. [PMID: 23757187 DOI: 10.1002/gepi.21736] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 01/15/2013] [Revised: 04/01/2013] [Accepted: 04/17/2013] [Indexed: 11/07/2022]
Abstract
For studies of genetically complex diseases, many association methods have been developed to analyze rare variants. When variant calls are missing, naïve implementation of rare variant association (RVA) methods may lead to inflated type I error rates as well as a reduction in power. To overcome these problems, we developed extensions for four commonly used RVA tests. Data from the National Heart Lung and Blood Institute-Exome Sequencing Project were used to demonstrate that missing variant calls can lead to increased false-positive rates and that the extended RVA methods control type I error without reducing power. We suggest a combined strategy of data filtering based on variant and sample level missing genotypes along with implementation of these extended RVA tests.
Affiliation(s)
- Paul L Auer
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA
|
93
|
Pers TH, Dworzyński P, Thomas CE, Lage K, Brunak S. MetaRanker 2.0: a web server for prioritization of genetic variation data. Nucleic Acids Res 2013; 41:W104-8. [PMID: 23703204 PMCID: PMC3692047 DOI: 10.1093/nar/gkt387] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Indexed: 01/06/2023] Open
Abstract
MetaRanker 2.0 is a web server for prioritization of common and rare frequency genetic variation data. Based on heterogeneous data sets including genetic association data, protein–protein interactions, large-scale text-mining data, copy number variation data and gene expression experiments, MetaRanker 2.0 prioritizes the protein-coding part of the human genome to shortlist candidate genes for targeted follow-up studies. MetaRanker 2.0 is made freely available at www.cbs.dtu.dk/services/MetaRanker-2.0.
Affiliation(s)
- Tune H Pers
- Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark
|
94
|
Wagner MJ. Rare-variant genome-wide association studies: a new frontier in genetic analysis of complex traits. Pharmacogenomics 2013; 14:413-24. [DOI: 10.2217/pgs.13.36] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 12/12/2022] Open
Abstract
Genome-wide association studies have, in the last few years, identified thousands of common genetic variants associated with common complex traits and diseases, implicating many genes not previously known to be involved in the biology of those traits. However, these variants have so far explained little of the population variance in trait values or disease susceptibility. As large-scale genome sequencing efforts have revealed the extent of genetic variation at the low end of the frequency range in human populations, the effects of rare variants have been proposed as an explanation of the ‘missing genetic variance.’ Improved technologies for genotyping rare variants, including inexpensive whole-genome and whole-exome sequencing and rare-variant genotyping chips, coupled with novel analytical methods, are making genome-wide scans for the effects of rare variants possible, and seem likely to usher in a new era in the genetic analysis of complex traits.
Affiliation(s)
- Michael J Wagner
- Institute for Pharmacogenomics & Individualized Therapy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7361, USA
|
95
|
Kaname T, Yanagi K, Naritomi K. A commentary on The diagnostic utility of exome sequencing in Joubert syndrome and related disorders. J Hum Genet 2012. [DOI: 10.1038/jhg.2012.138] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/09/2022]
|
96
|
Sunyaev SR. Inferring causality and functional significance of human coding DNA variants. Hum Mol Genet 2012; 21:R10-7. [PMID: 22990389 DOI: 10.1093/hmg/dds385] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 12/22/2022] Open
Abstract
Sequencing technology enables the complete characterization of human genetic variation. Statistical genetics studies identify numerous loci linked to or associated with phenotypes of direct medical interest. The major remaining challenge is to characterize functionally significant alleles that are causally implicated in the genetic basis of human traits. Here, I review three sources of evidence for the functional significance of human DNA variants in protein-coding genes. These include (i) statistical genetics considerations such as co-segregation with the phenotype, allele frequency in unaffected controls and recurrence; (ii) in vitro functional assays and model organism experiments; and (iii) computational methods for predicting the functional effect of amino acid substitutions. In spite of many successes of recent studies, functional characterization of human allelic variants remains problematic.
Affiliation(s)
- Shamil R Sunyaev
- Genetics Division, Brigham and Women's Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA.
|