201
Abstract
The Collaborative Cross (CC) is a multiparent panel of recombinant inbred (RI) mouse strains derived from eight founder laboratory strains. RI panels are popular because of their long-term genetic stability, which enhances reproducibility and integration of data collected across time and conditions. Characterization of their genomes can be a community effort, reducing the burden on individual users. Here we present the genomes of the CC strains using two complementary approaches as a resource to improve power and interpretation of genetic experiments. Our study also provides a cautionary tale regarding the limitations imposed by such basic biological processes as mutation and selection. A distinct advantage of inbred panels is that genotyping only needs to be performed on the panel, not on each individual mouse. The initial CC genome data were haplotype reconstructions based on dense genotyping of the most recent common ancestors (MRCAs) of each strain followed by imputation from the genome sequence of the corresponding founder inbred strain. The MRCA resource captured segregating regions in strains that were not fully inbred, but it had limited resolution in the transition regions between founder haplotypes, and there was uncertainty about founder assignment in regions of limited diversity. Here we report the whole genome sequence of 69 CC strains generated by paired-end short reads at 30× coverage of a single male per strain. Sequencing leads to a substantial improvement in the fine structure and completeness of the genomes of the CC. Both MRCAs and sequenced samples show a significant reduction in the genome-wide haplotype frequencies from two wild-derived strains, CAST/EiJ and PWK/PhJ. In addition, analysis of the evolution of the patterns of heterozygosity indicates that selection against three wild-derived founder strains played a significant role in shaping the genomes of the CC. 
The sequencing resource provides the first description of tens of thousands of new genetic variants introduced by mutation and drift in the CC genomes. We estimate that new SNP mutations are accumulating in each CC strain at a rate of 2.4 ± 0.4 per gigabase per generation. The fixation of new mutations by genetic drift has introduced thousands of new variants into the CC strains. The majority of these mutations are novel compared to currently sequenced laboratory stocks and wild mice, and some are predicted to alter gene function. Approximately one-third of the CC inbred strains have acquired large deletions (>10 kb) many of which overlap known coding genes and functional elements. The sequence of these mice is a critical resource to CC users, increases threefold the number of mouse inbred strain genomes available publicly, and provides insight into the effect of mutation and drift on common resources.
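To give a sense of scale for the reported rate of 2.4 ± 0.4 new SNPs per gigabase per generation, the following back-of-envelope sketch multiplies that rate by a genome size and a number of generations. The genome size (about 2.7 Gb) and the generation count (20) are illustrative assumptions that do not come from the abstract, and the calculation ignores ploidy, callable-genome fraction, and the fact that not every new mutation is fixed by drift.

```python
def expected_new_snps(rate_per_gb_per_gen: float,
                      genome_size_gb: float,
                      generations: int) -> float:
    """Expected number of new SNPs under a constant per-gigabase rate."""
    return rate_per_gb_per_gen * genome_size_gb * generations


RATE = 2.4          # per Gb per generation, as reported in the abstract
RATE_ERROR = 0.4    # reported uncertainty on the rate
GENOME_GB = 2.7     # ASSUMPTION: approximate mouse genome size, not from the abstract
GENERATIONS = 20    # ASSUMPTION: hypothetical number of inbreeding generations

central = expected_new_snps(RATE, GENOME_GB, GENERATIONS)
low = expected_new_snps(RATE - RATE_ERROR, GENOME_GB, GENERATIONS)
high = expected_new_snps(RATE + RATE_ERROR, GENOME_GB, GENERATIONS)

print(f"Expected new SNPs: {central:.1f} (range {low:.1f} to {high:.1f})")
```

Changing `GENERATIONS` or `GENOME_GB` rescales the estimate linearly, so the sketch is only as good as those two assumed inputs.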
202
White SJ, Laros JF, Bakker E, Cambon‐Thomsen A, Eden M, Leonard S, Lochmüller H, Matthijs G, Mattocks C, Patton S, Payne K, Scheffer H, Souche E, Thomassen E, Thompson R, Traeger‐Synodinos J, Vooren S, Janssen B, den Dunnen JT. Critical points for an accurate human genome analysis. Hum Mutat 2017; 38:912-921. [DOI: 10.1002/humu.23238] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 09/12/2016] [Revised: 04/13/2017] [Accepted: 04/23/2017] [Indexed: 12/16/2022]
Affiliation(s)
- Stefan J. White
- Department of Human Genetics, Leiden University Medical Center, The Netherlands
- Jeroen F.J. Laros
- Department of Human Genetics, Leiden University Medical Center, The Netherlands
- Clinical Genetics, Leiden University Medical Center, The Netherlands
- GenomeScan, Leiden, The Netherlands
- Egbert Bakker
- Clinical Genetics, Leiden University Medical Center, The Netherlands
- Anne Cambon‐Thomsen
- Epidemiology and Public Health Analyses, Inserm and Université Toulouse III Paul Sabatier, Toulouse, UMR 1027, France
- Martin Eden
- Manchester Centre for Health Economics, University of Manchester, Manchester, UK
- Samantha Leonard
- Epidemiology and Public Health Analyses, Inserm and Université Toulouse III Paul Sabatier, Toulouse, UMR 1027, France
- Hanns Lochmüller
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
- Simon Patton
- Central Manchester University Hospitals Foundation Trust, EMQN, Manchester, UK
- Katherine Payne
- Manchester Centre for Health Economics, University of Manchester, Manchester, UK
- Ellen Thomassen
- Department of Human Genetics, Leiden University Medical Center, The Netherlands
- Rachel Thompson
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
- Johan T. den Dunnen
- Department of Human Genetics, Leiden University Medical Center, The Netherlands
- Clinical Genetics, Leiden University Medical Center, The Netherlands
203
Abstract
Monoallelic expression not due to cis-regulatory sequence polymorphism poses an intriguing problem in epigenetics because it requires the unequal treatment of two segments of DNA that are present in the same nucleus and that can indeed have absolutely identical sequences. Here, I focus on a few recent developments in the field of monoallelic expression that are of particular interest and raise interesting questions for future work. One development is regarding analyses of imprinted genes, in which recent work suggests the possibility that intriguing networks of imprinted genes exist and are important for genetic and physiological studies. Another issue that has been raised in recent years by a number of publications is the question of how skewed allelic expression should be for it to be designated as monoallelic expression and, further, what methods are appropriate or inappropriate for analyzing genomic data to examine allele-specific expression. Perhaps the most exciting recent development in mammalian monoallelic expression is a clever and carefully executed analysis of genetic diversity of autosomal genes subject to random monoallelic expression (RMAE), which provides compelling evidence for distinct evolutionary forces acting on random monoallelically expressed genes.
Affiliation(s)
- Andrew Chess
- Department of Genetics and Genomic Sciences, Department of Developmental and Regenerative Biology, Fishberg Department of Neuroscience, and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029-6574
204
A unique haplotype of RCCX copy number variation: from the clinics of congenital adrenal hyperplasia to evolutionary genetics. Eur J Hum Genet 2017; 25:702-710. [PMID: 28401898 DOI: 10.1038/ejhg.2017.38] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/10/2016] [Revised: 02/08/2017] [Accepted: 02/14/2017] [Indexed: 01/26/2023]
Abstract
There is a difficulty in the molecular diagnosis of congenital adrenal hyperplasia (CAH) due to the c.955C>T (p.(Q319*), formerly Q318X, rs7755898) variant of the CYP21A2 gene. Therefore, a systematic assessment of the genetic and evolutionary relationships between c.955C>T, CYP21A2 haplotypes and the RCCX copy number variation (CNV) structures, which harbor CYP21A2, was performed. In total, 389 unrelated Hungarian individuals with European ancestry (164 healthy subjects, 125 patients with non-functioning adrenal incidentaloma and 100 patients with classical CAH) as well as 34 adrenocortical tumor specimens were studied using a set of experimental and bioinformatic methods. A unique, moderately frequent (2%) haplotypic RCCX CNV structure with three repeated segments, abbreviated to LBSASB, harboring a CYP21A2 with a c.955C>T variant in the 3'-segment, and a second CYP21A2 with a specific c.*12C>T (rs150697472) variant in the middle segment occurred in all c.955C>T carriers with normal steroid levels. The second CYP21A2 was free of CAH-causing mutations and produced mRNA in the adrenal gland, confirming its functionality and ability to rescue the carriers from CAH. Neither LBSASB nor c.*12C>T occurred in classical CAH patients. However, CAH-causing CYP21A2 haplotypes with c.955C>T could be derived from the 3'-segment of LBSASB after the loss of functional CYP21A2 from the middle segment. The c.*12C>T indicated a functional CYP21A2 and could distinguish between non-pathogenic and pathogenic genomic contexts of the c.955C>T variant in the studied European population. Therefore, c.*12C>T may be suitable as a marker to avoid this genetic confound and improve the diagnosis of CAH.
205
DeBoever C, Li H, Jakubosky D, Benaglio P, Reyna J, Olson KM, Huang H, Biggs W, Sandoval E, D'Antonio M, Jepsen K, Matsui H, Arias A, Ren B, Nariai N, Smith EN, D'Antonio-Chronowska A, Farley EK, Frazer KA. Large-Scale Profiling Reveals the Influence of Genetic Variation on Gene Expression in Human Induced Pluripotent Stem Cells. Cell Stem Cell 2017; 20:533-546.e7. [PMID: 28388430 PMCID: PMC5444918 DOI: 10.1016/j.stem.2017.03.009] [Citation(s) in RCA: 111] [Impact Index Per Article: 15.9] [Received: 05/07/2016] [Revised: 12/27/2016] [Accepted: 03/15/2017] [Indexed: 12/18/2022]
Abstract
In this study, we used whole-genome sequencing and gene expression profiling of 215 human induced pluripotent stem cell (iPSC) lines from different donors to identify genetic variants associated with RNA expression for 5,746 genes. We were able to predict causal variants for these expression quantitative trait loci (eQTLs) that disrupt transcription factor binding and validated a subset of them experimentally. We also identified copy-number variant (CNV) eQTLs, including some that appear to affect gene expression by altering the copy number of intergenic regulatory regions. In addition, we were able to identify effects on gene expression of rare genic CNVs and regulatory single-nucleotide variants and found that reactivation of gene expression on the X chromosome depends on gene chromosomal position. Our work highlights the value of iPSCs for genetic association analyses and provides a unique resource for investigating the genetic regulation of gene expression in pluripotent cells.
Affiliation(s)
- Christopher DeBoever
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093-0419, USA
- He Li
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
- David Jakubosky
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA 92093-0419, USA; Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Paola Benaglio
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Joaquin Reyna
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Katrina M Olson
- Division of Cardiology, Department of Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA; Division of Biological Sciences, Section of Molecular Biology, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Hui Huang
- Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA 92093-0419, USA; Ludwig Institute for Cancer Research, La Jolla, CA 92093, USA
- Matteo D'Antonio
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Kristen Jepsen
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Hiroko Matsui
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Angelo Arias
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Bing Ren
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA; Ludwig Institute for Cancer Research, La Jolla, CA 92093, USA; Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Naoki Nariai
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Erin N Smith
- Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA
- Emma K Farley
- Division of Cardiology, Department of Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA; Division of Biological Sciences, Section of Molecular Biology, University of California, San Diego, La Jolla, CA 92093-0419, USA.
- Kelly A Frazer
- Institute for Genomic Medicine, University of California, San Diego, La Jolla, CA 92093-0419, USA; Department of Pediatrics and Rady Children's Hospital, University of California, San Diego, La Jolla, CA 92093-0419, USA.
206
Human Y chromosome copy number variation in the next generation sequencing era and beyond. Hum Genet 2017; 136:591-603. [PMID: 28378101 PMCID: PMC5418319 DOI: 10.1007/s00439-017-1788-5] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 02/14/2017] [Accepted: 03/25/2017] [Indexed: 11/16/2022]
Abstract
The human Y chromosome provides a fertile ground for structural rearrangements owing to its haploidy and high content of repeated sequences. The methodologies used for copy number variation (CNV) studies have developed over the years. Low-throughput techniques based on direct observation of rearrangements were developed early on, and are still used, often to complement array-based or sequencing approaches, which have limited power in regions with high repeat content and specifically in the presence of long, identical repeats, such as those found in human sex chromosomes. Some specific rearrangements have been investigated for decades; because of their effects on fertility, or their outstanding evolutionary features, the interest in these has not diminished. However, following the flourishing of large-scale genomics, several studies have investigated CNVs across the whole chromosome. These studies sometimes employ data generated within large genomic projects such as the DDD study or the 1000 Genomes Project, and often survey large samples of healthy individuals without any prior selection. Novel technologies based on sequencing long molecules, and combinations of technologies, promise to stimulate the study of Y-CNVs in the immediate future.
207
Collins RL, Brand H, Redin CE, Hanscom C, Antolik C, Stone MR, Glessner JT, Mason T, Pregno G, Dorrani N, Mandrile G, Giachino D, Perrin D, Walsh C, Cipicchio M, Costello M, Stortchevoi A, An JY, Currall BB, Seabra CM, Ragavendran A, Margolin L, Martinez-Agosto JA, Lucente D, Levy B, Sanders SJ, Wapner RJ, Quintero-Rivera F, Kloosterman W, Talkowski ME. Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome. Genome Biol 2017; 18:36. [PMID: 28260531 PMCID: PMC5338099 DOI: 10.1186/s13059-017-1158-6] [Citation(s) in RCA: 111] [Impact Index Per Article: 15.9] [Received: 09/27/2016] [Accepted: 01/20/2017] [Indexed: 12/13/2022]
Abstract
Background: Structural variation (SV) influences genome organization and contributes to human disease. However, the complete mutational spectrum of SV has not been routinely captured in disease association studies.
Results: We sequenced 689 participants with autism spectrum disorder (ASD) and other developmental abnormalities to construct a genome-wide map of large SV. Using long-insert jumping libraries at 105X mean physical coverage and linked-read whole-genome sequencing from 10X Genomics, we document seven major SV classes at ~5 kb SV resolution. Our results encompass 11,735 distinct large SV sites, 38.1% of which are novel and 16.8% of which are balanced or complex. We characterize 16 recurrent subclasses of complex SV (cxSV), revealing that: (1) cxSV are larger and rarer than canonical SV; (2) each genome harbors 14 large cxSV on average; (3) 84.4% of large cxSVs involve inversion; and (4) most large cxSV (93.8%) have not been delineated in previous studies. Rare SVs are more likely to disrupt coding and regulatory non-coding loci, particularly when truncating constrained and disease-associated genes. We also identify multiple cases of catastrophic chromosomal rearrangements known as chromoanagenesis, including somatic chromoanasynthesis, and extreme balanced germline chromothripsis events involving up to 65 breakpoints and 60.6 Mb across four chromosomes, further defining rare categories of extreme cxSV.
Conclusions: These data provide a foundational map of large SV in the morbid human genome and demonstrate a previously underappreciated abundance and diversity of cxSV that should be considered in genomic studies of human disease.
Affiliation(s)
- Ryan L Collins
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Bioinformatics and Integrative Genomics, Division of Medical Sciences, Harvard Medical School, Boston, MA, 02115, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Harrison Brand
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Claire E Redin
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Carrie Hanscom
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Caroline Antolik
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Matthew R Stone
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Joseph T Glessner
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Tamara Mason
- Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Giulia Pregno
- Medical Genetics Unit, Department of Clinical and Biological Sciences, University of Torino, Orbassano, Italy
- Naghmeh Dorrani
- Department of Pathology & Laboratory Medicine and UCLA Clinical Genomics Center, David Geffen School of Medicine, University of California Los Angeles, UCLA, Los Angeles, CA, 90095, USA
- Giorgia Mandrile
- Medical Genetics Unit, Department of Clinical and Biological Sciences, University of Torino, Orbassano, Italy
- Daniela Giachino
- Medical Genetics Unit, Department of Clinical and Biological Sciences, University of Torino, Orbassano, Italy
- Danielle Perrin
- Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Cole Walsh
- Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Michelle Cipicchio
- Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Maura Costello
- Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Alexei Stortchevoi
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Joon-Yong An
- Department of Psychiatry, University of California San Francisco, San Francisco, CA, 94103, USA
- Benjamin B Currall
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Catarina M Seabra
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA.,GABBA Program, University of Porto, Porto, 4099-002, Portugal
- Ashok Ragavendran
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Lauren Margolin
- Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
- Julian A Martinez-Agosto
- Department of Pathology & Laboratory Medicine and UCLA Clinical Genomics Center, David Geffen School of Medicine, University of California Los Angeles, UCLA, Los Angeles, CA, 90095, USA
- Diane Lucente
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA
- Brynn Levy
- Department of Pathology, Columbia University, New York, NY, 10032, USA
- Stephan J Sanders
- Department of Psychiatry, University of California San Francisco, San Francisco, CA, 94103, USA
- Ronald J Wapner
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Columbia University Medical Center, New York, NY, 10032, USA
- Fabiola Quintero-Rivera
- Department of Pathology & Laboratory Medicine and UCLA Clinical Genomics Center, David Geffen School of Medicine, University of California Los Angeles, UCLA, Los Angeles, CA, 90095, USA
- Wigard Kloosterman
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, 3584CG, The Netherlands
- Michael E Talkowski
- Molecular Neurogenetics Unit and Psychiatric and Neurodevelopmental Genetics Unit, Center for Genomic Medicine, and Department of Neurology, Massachusetts General Hospital, Boston, MA, 02114, USA.,Program in Bioinformatics and Integrative Genomics, Division of Medical Sciences, Harvard Medical School, Boston, MA, 02115, USA.,Program in Population and Medical Genetics and Genomics Platform, The Broad Institute of M.I.T. and Harvard, Cambridge, MA, 02142, USA
208
Abstract
Deciphering the genetic basis of human disease requires a comprehensive knowledge of genetic variants irrespective of their class or frequency. Although an impressive number of human genetic variants have been catalogued, a large fraction of the genetic difference that distinguishes two human genomes is still not understood at the base-pair level. This is because the emphasis has been on single-nucleotide variation as opposed to less tractable and more complex genetic variants, including indels and structural variants. The latter, we propose, will have a large impact on human phenotypes but require a more systematic assessment of genomes at deeper coverage and alternate sequencing and mapping technologies.
209
Hu XS, Yeh FC, Hu Y, Deng LT, Ennos RA, Chen X. High mutation rates explain low population genetic divergence at copy-number-variable loci in Homo sapiens. Sci Rep 2017; 7:43178. [PMID: 28225073 PMCID: PMC5320550 DOI: 10.1038/srep43178] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/10/2016] [Accepted: 01/19/2017] [Indexed: 11/09/2022]
Abstract
Copy-number-variable (CNV) loci differ from single nucleotide polymorphic (SNP) sites in size, mutation rate, and mechanisms of maintenance in natural populations. It is therefore hypothesized that population genetic divergence at CNV loci will differ from that found at SNP sites. Here, we test this hypothesis by analysing 856 CNV loci from the genomes of 1184 healthy individuals from 11 HapMap populations with a wide range of ancestry. The results show that population genetic divergence at the CNV loci is generally more than three times lower than at genome-wide SNP sites. Populations generally exhibit very small genetic divergence (Gst = 0.05 ± 0.049). The smallest divergence is among African populations (Gst = 0.0081 ± 0.0025), with increased divergence among non-African populations (Gst = 0.0217 ± 0.0109) and then among African and non-African populations (Gst = 0.0324 ± 0.0064). Genetic diversity is high in African populations (~0.13), low in Asian populations (~0.11), and intermediate in the remaining 11 populations. Few significant linkage disequilibria (LDs) occur between the genome-wide CNV loci. Patterns of gametic and zygotic LDs indicate the absence of epistasis among CNV loci. Mutation rate is about twice as large as the migration rate in the non-African populations, suggesting that the high mutation rates play dominant roles in producing the low population genetic divergence at CNV loci.
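The Gst values quoted above measure how much of the total genetic diversity lies between populations rather than within them. As a minimal sketch, the textbook form of Nei's Gst is (Ht - Hs) / Ht, where Hs is the mean expected heterozygosity within subpopulations and Ht is the expected heterozygosity computed from the averaged allele frequencies. This is the standard definition, not necessarily the estimator used in the study; the function names, the equal weighting of populations, and the toy frequencies are assumptions for illustration.

```python
from typing import Sequence


def expected_heterozygosity(freqs: Sequence[float]) -> float:
    """Expected heterozygosity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def nei_gst(subpop_freqs: Sequence[Sequence[float]]) -> float:
    """Nei's Gst for one locus, weighting all subpopulations equally (assumption)."""
    n_pops = len(subpop_freqs)
    n_alleles = len(subpop_freqs[0])
    # Hs: average within-subpopulation expected heterozygosity.
    hs = sum(expected_heterozygosity(f) for f in subpop_freqs) / n_pops
    # Ht: expected heterozygosity of the pooled (mean) allele frequencies.
    mean_freqs = [sum(f[i] for f in subpop_freqs) / n_pops for i in range(n_alleles)]
    ht = expected_heterozygosity(mean_freqs)
    # A monomorphic locus has no diversity to partition.
    return 0.0 if ht == 0 else (ht - hs) / ht


# Toy example: two populations with mildly different frequencies at a biallelic locus.
toy = [[0.6, 0.4], [0.4, 0.6]]
print(f"Gst = {nei_gst(toy):.3f}")
```

Identical frequencies across populations give Gst = 0, and populations fixed for different alleles give Gst = 1, which is why values such as 0.0081 to 0.05 indicate very little divergence.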
Affiliation(s)
- Xin-Sheng Hu
- Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong 510642, China.,College of Forestry and Landscape Architecture, South China Agricultural University, Guangdong 510642, China
- Francis C Yeh
- Department of Renewable Resources, 751 General Service Building, University of Alberta, Edmonton, AB T6G 2H1, Canada
- Yang Hu
- Department of Computing Science, University of Alberta, Edmonton, AB T6G 2S4, Canada
- Li-Ting Deng
- Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong 510642, China.,College of Forestry and Landscape Architecture, South China Agricultural University, Guangdong 510642, China
- Richard A Ennos
- Institute of Evolutionary Biology, Ashworth Laboratories, School of Biological Sciences, University of Edinburgh, Edinburgh EH 9 3JT, United Kingdom
- Xiaoyang Chen
- Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong 510642, China.,College of Forestry and Landscape Architecture, South China Agricultural University, Guangdong 510642, China
210
Kirk IK, Weinhold N, Belling K, Skakkebæk NE, Jensen TS, Leffers H, Juul A, Brunak S. Chromosome-wise Protein Interaction Patterns and Their Impact on Functional Implications of Large-Scale Genomic Aberrations. Cell Syst 2017; 4:357-364.e3. [PMID: 28215527 DOI: 10.1016/j.cels.2017.01.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/07/2016] [Revised: 10/23/2016] [Accepted: 01/05/2017] [Indexed: 10/20/2022]
Abstract
Gene copy-number changes influence phenotypes through gene-dosage alteration and subsequent changes of protein complex stoichiometry. Human trisomies where gene copy numbers are increased uniformly over entire chromosomes provide generic cases for studying these relationships. In most trisomies, gene and protein level alterations have fatal consequences. We used genome-wide protein-protein interaction data to identify chromosome-specific patterns of protein interactions. We found that some chromosomes encode proteins that interact infrequently with each other, chromosome 21 in particular. We combined the protein interaction data with transcriptome data from human brain tissue to investigate how this pattern of global interactions may affect cellular function. We identified highly connected proteins that also had coordinated gene expression. These proteins were associated with important neurological functions affecting the characteristic phenotypes for Down syndrome and have previously been validated in mouse knockout experiments. Our approach is general and applicable to other gene-dosage changes, such as arm-level amplifications in cancer.
Affiliation(s)
- Isa Kristina Kirk
- Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, 2800 Lyngby, Denmark; Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
- Nils Weinhold
- Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, 2800 Lyngby, Denmark; Department of Radiation Oncology, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
- Kirstine Belling
- Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, 2800 Lyngby, Denmark; Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark
- Niels Erik Skakkebæk
- Department of Growth and Reproduction, Rigshospitalet and University of Copenhagen, 2100 Copenhagen, Denmark
| | - Thomas Skøt Jensen
- Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, 2800 Lyngby, Denmark
| | - Henrik Leffers
- Department of Growth and Reproduction, Rigshospitalet and University of Copenhagen, 2100 Copenhagen, Denmark
| | - Anders Juul
- Department of Growth and Reproduction, Rigshospitalet and University of Copenhagen, 2100 Copenhagen, Denmark
| | - Søren Brunak
- Department of Systems Biology, Center for Biological Sequence Analysis, Technical University of Denmark, 2800 Lyngby, Denmark; Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, 2200 Copenhagen, Denmark.
| |
|
211
|
Pagès M, Beccaria K, Boddaert N, Saffroy R, Besnard A, Castel D, Fina F, Barets D, Barret E, Lacroix L, Bielle F, Andreiuolo F, Tauziède-Espariat A, Figarella-Branger D, Puget S, Grill J, Chrétien F, Varlet P. Co-occurrence of histone H3 K27M and BRAF V600E mutations in paediatric midline grade I ganglioglioma. Brain Pathol 2017; 28:103-111. [PMID: 27984673 DOI: 10.1111/bpa.12473] [Citation(s) in RCA: 70] [Impact Index Per Article: 10.0] [Received: 10/18/2016] [Accepted: 12/01/2016] [Indexed: 12/31/2022]
Abstract
Ganglioglioma (GG) is a grade I tumor characterized by alterations in the MAPK pathway, including BRAF V600E mutation. Recently, diffuse midline glioma with an H3 K27M mutation was added to the WHO 2016 classification as a new grade IV entity. As co-occurrence of H3 K27M and BRAF V600E mutations has been reported in midline tumors and anaplastic GG, we searched for BRAF V600E and H3 K27M mutations in a series of 54 paediatric midline grade I GG (midline GG) to determine the frequency of double mutations and its relevance for prognosis. Twenty-seven patients (50%) possessed the BRAF V600E mutation. The frequency of the co-occurrence of H3F3A/BRAF mutations at diagnosis was 9.3%. No H3 K27M mutation was detected in the absence of the BRAF V600E mutation. Double-immunostaining revealed that BRAF V600E and H3 K27M mutant proteins were present in both the glial and neuronal components. Immunopositivity for the BRAF V600E mutant protein correlated with BRAF mutation status as detected by massARRAY or digital droplet PCR. The median follow-up of patients with double mutation was 4 years. One patient died of progressive disease 8 years after diagnosis, whereas the four other patients were all alive with stable disease at the last clinical follow-up (at 9 months, 1 year and 7 years) without adjuvant therapy. We demonstrate in this first series of midline GGs that the H3 K27M mutation can occur in association with the BRAF V600E mutation in grade I glioneuronal tumors. Despite the presence of H3 K27M mutations, these cases should not be graded and treated as grade IV tumors because they have a better spontaneous outcome than classic diffuse midline H3 K27M-mutant glioma. These data suggest that H3 K27M cannot be considered a specific hallmark of grade IV diffuse gliomas and highlight the importance of integrated histomolecular diagnosis in paediatric brain tumors.
Affiliation(s)
- Mélanie Pagès
- Department of Neuropathology, Sainte-Anne Hospital, Paris, France; Paris V Descartes University, Paris, France; Institut National de la Santé et de la Recherche Médicale, INSERM Unit 1000 "Neuroimaging & Psychiatry," Université Paris Sud, Orsay
| | - Kevin Beccaria
- Department of Paediatric Neurosurgery, Necker Enfants Malades Hospital, Paris, France
| | - Nathalie Boddaert
- Department of Paediatric Neuroradiology, Necker Enfants Malades Hospital, Paris, France
| | - Raphaël Saffroy
- Department of Biochemistry, Paul Brousse Hospital, Paris, France
| | - Aurore Besnard
- Department of Neuropathology, Sainte-Anne Hospital, Paris, France
| | - David Castel
- UMR8203 "Vectorologie et Thérapeutiques Anticancéreuses," CNRS, Gustave Roussy, Univ. Paris-Sud, Université Paris-Saclay, Villejuif, 94805, France; Département de Cancérologie de l'Enfant et de l'Adolescent, Gustave Roussy, Univ. Paris-Sud, Université Paris-Saclay, Villejuif, 94805, France
| | - Frédéric Fina
- Service de transfert d'Oncologie Biologique, LBM APHM Marseille, France
| | - Doriane Barets
- APHM, Hôpital de la Timone, Service d'Anatomie Pathologique et de Neuropathologie, Marseille, France
| | - Emilie Barret
- UMR8203 "Vectorologie et Thérapeutiques Anticancéreuses," CNRS, Gustave Roussy, Univ. Paris-Sud, Université Paris-Saclay, Villejuif, 94805, France; Département de Cancérologie de l'Enfant et de l'Adolescent, Gustave Roussy, Univ. Paris-Sud, Université Paris-Saclay, Villejuif, 94805, France
| | - Ludovic Lacroix
- Departement de Biologie et Pathologie Médicales, Gustave Roussy, Univ. Paris-Sud, Université Paris-Saclay, Villejuif, 94805, France
| | - Franck Bielle
- Department of Neuropathology, Laboratoire Escourolle, Hôpitaux Universitaires Pitié Salpêtrière Charles Foix, AP-HP, Paris, France
| | | | | | - Dominique Figarella-Branger
- APHM, Hôpital de la Timone, Service d'Anatomie Pathologique et de Neuropathologie, Marseille, France; Aix-Marseille Université, Inserm, CRO2 UMR_S 911, Marseille, France
| | - Stéphanie Puget
- Department of Paediatric Neurosurgery, Necker Enfants Malades Hospital, Paris, France
| | - Jacques Grill
- UMR8203 "Vectorologie et Thérapeutiques Anticancéreuses," CNRS, Gustave Roussy, Univ. Paris-Sud, Université Paris-Saclay, Villejuif, 94805, France; Département de Cancérologie de l'Enfant et de l'Adolescent, Gustave Roussy, Univ. Paris-Sud, Université Paris-Saclay, Villejuif, 94805, France
| | - Fabrice Chrétien
- Department of Neuropathology, Sainte-Anne Hospital, Paris, France; Paris V Descartes University, Paris, France; Infection & Epidemiology Department, Human Histopathology and Animal Models Unit, Institut Pasteur, Paris, France
| | - Pascale Varlet
- Department of Neuropathology, Sainte-Anne Hospital, Paris, France; Paris V Descartes University, Paris, France; Institut National de la Santé et de la Recherche Médicale, INSERM Unit 1000 "Neuroimaging & Psychiatry," Université Paris Sud, Orsay
| |
|
212
|
Luo Y, de Lange KM, Jostins L, Moutsianas L, Randall J, Kennedy NA, Lamb CA, McCarthy S, Ahmad T, Edwards C, Serra EG, Hart A, Hawkey C, Mansfield JC, Mowat C, Newman WG, Nichols S, Pollard M, Satsangi J, Simmons A, Tremelling M, Uhlig H, Wilson DC, Lee JC, Prescott NJ, Lees CW, Mathew CG, Parkes M, Barrett JC, Anderson CA. Exploring the genetic architecture of inflammatory bowel disease by whole-genome sequencing identifies association at ADCY7. Nat Genet 2017; 49:186-192. [PMID: 28067910 PMCID: PMC5289625 DOI: 10.1038/ng.3761] [Citation(s) in RCA: 120] [Impact Index Per Article: 17.1] [Received: 06/09/2016] [Accepted: 12/07/2016] [Indexed: 02/06/2023]
Abstract
To further resolve the genetic architecture of the inflammatory bowel diseases ulcerative colitis and Crohn's disease, we sequenced the whole genomes of 4,280 patients at low coverage and compared them to 3,652 previously sequenced population controls across 73.5 million variants. We then imputed from these sequences into new and existing genome-wide association study cohorts and tested for association at ∼12 million variants in a total of 16,432 cases and 18,843 controls. We discovered a 0.6% frequency missense variant in ADCY7 that doubles the risk of ulcerative colitis. Despite good statistical power, we did not identify any other new low-frequency risk variants and found that such variants explained little heritability. We detected a burden of very rare, damaging missense variants in known Crohn's disease risk genes, suggesting that more comprehensive sequencing studies will continue to improve understanding of the biology of complex diseases.
Affiliation(s)
- Yang Luo
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Division of Genetics and Rheumatology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA, USA
| | | | - Luke Jostins
- Wellcome Trust Centre for Human Genetics, University of Oxford, Headington, UK
- Christ Church, University of Oxford, St Aldates, UK
| | - Loukas Moutsianas
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
| | - Joshua Randall
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
| | - Nicholas A. Kennedy
- Precision Medicine Exeter, University of Exeter, Exeter, UK
- IBD Pharmacogenetics, Royal Devon and Exeter Foundation Trust, Exeter, UK
| | | | - Shane McCarthy
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
| | - Tariq Ahmad
- Precision Medicine Exeter, University of Exeter, Exeter, UK
- IBD Pharmacogenetics, Royal Devon and Exeter Foundation Trust, Exeter, UK
| | - Cathryn Edwards
- Department of Gastroenterology, Torbay Hospital, Torbay, Devon, UK
| | | | - Ailsa Hart
- Department of Medicine, St Mark's Hospital, Harrow, Middlesex, UK
| | - Chris Hawkey
- Nottingham Digestive Diseases Centre, Queens Medical Centre, Nottingham, UK
| | - John C. Mansfield
- Institute of Human Genetics, Newcastle University, Newcastle upon Tyne, UK
| | - Craig Mowat
- Department of Medicine, Ninewells Hospital and Medical School, Dundee, UK
| | - William G. Newman
- Genetic Medicine, Manchester Academic Health Science Centre, Manchester, UK
- The Manchester Centre for Genomic Medicine, University of Manchester, Manchester, UK
| | - Sam Nichols
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
| | - Martin Pollard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
| | - Jack Satsangi
- Gastrointestinal Unit, Western General Hospital, University of Edinburgh, Edinburgh, UK
| | - Alison Simmons
- Translational Gastroenterology Unit, John Radcliffe Hospital, University of Oxford, Oxford OX3 9DS, UK
- Human Immunology Unit, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DS, UK
| | - Mark Tremelling
- Gastroenterology & General Medicine, Norfolk and Norwich University Hospital, Norwich, UK
| | - Holm Uhlig
- Translational Gastroenterology Unit and the Department of Paediatrics, University of Oxford, Oxford, United Kingdom
| | - David C. Wilson
- Paediatric Gastroenterology and Nutrition, Royal Hospital for Sick Children, Edinburgh, UK
- Child Life and Health, University of Edinburgh, Edinburgh, Scotland, UK
| | - James C. Lee
- Inflammatory Bowel Disease Research Group, Addenbrooke's Hospital, Cambridge, UK
| | - Natalie J. Prescott
- Department of Medical and Molecular Genetics, Faculty of Life Science and Medicine, King's College London, Guy's Hospital, London, UK
| | - Charlie W. Lees
- Gastrointestinal Unit, Western General Hospital, University of Edinburgh, Edinburgh, UK
| | - Christopher G. Mathew
- Department of Medical and Molecular Genetics, Faculty of Life Science and Medicine, King's College London, Guy's Hospital, London, UK
- Sydney Brenner Institute for Molecular Bioscience, Faculty of Health Sciences, University of Witwatersrand, South Africa
| | - Miles Parkes
- Inflammatory Bowel Disease Research Group, Addenbrooke's Hospital, Cambridge, UK
| | - Jeffrey C. Barrett
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
| | - Carl A. Anderson
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
| |
|
213
|
Improvements and impacts of GRCh38 human reference on high throughput sequencing data analysis. Genomics 2017; 109:83-90. [PMID: 28131802 DOI: 10.1016/j.ygeno.2017.01.005] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 09/08/2016] [Revised: 01/16/2017] [Accepted: 01/24/2017] [Indexed: 12/31/2022]
Abstract
Analysis of high-throughput sequencing data starts with alignment against a reference genome, which is the foundation for all re-sequencing data analyses. Each new release of the human reference genome has improved on its predecessor in accuracy and completeness. It is presumed that the latest release of the human reference genome, GRCh38, will contribute more to high-throughput sequencing data analysis by providing more accuracy, but the amount of improvement has not yet been quantified. We conducted a study to compare genomic analysis results between the GRCh38 reference and its predecessor, GRCh37. Through analyses of alignment, single nucleotide polymorphisms, small insertions/deletions, copy number and structural variants, we show that GRCh38 offers overall more accurate analysis of human sequencing data. More importantly, GRCh38 produced fewer false positive structural variants. In conclusion, GRCh38 not only improves on GRCh37 as a genome assembly but also yields more reliable genomic analysis results.
|
214
|
Abstract
Copy number variation (CNV), where a segment of DNA differs in copy number between different individuals, is an extensive and often underappreciated source of genetic variation within species. However, reliably determining copy number of a particular DNA sequence for a large number of samples can be challenging. Here, I describe and review the paralogue ratio test (PRT) in detail. PRT was developed to robustly type the CNV of the beta-defensin locus using small amounts of genomic DNA in a high-throughput manner, and has been applied successfully at many other loci. I discuss the strategies for designing successful PRT assays using both manual and bioinformatics methods, how to optimize experimental conditions, and approaches for analyzing the data. I discuss strengths and weaknesses of the approach, and how to troubleshoot results, as well as the range of problems to which PRT can be a potential solution.
|
215
|
Ji T, Chen J. Statistical models for DNA copy number variation detection using read-depth data from next generation sequencing experiments. AUST NZ J STAT 2016. [DOI: 10.1111/anzs.12175] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/24/2023]
Affiliation(s)
- Tieming Ji
- Department of Statistics; University of Missouri at Columbia; Columbia MO 65211 USA
| | - Jie Chen
- Department of Biostatistics and Epidemiology; Medical College of Georgia, Augusta University; Augusta GA 30912 USA
| |
|
216
|
Comprehensive population-based genome sequencing provides insight into hematopoietic regulatory mechanisms. Proc Natl Acad Sci U S A 2016; 114:E327-E336. [PMID: 28031487 DOI: 10.1073/pnas.1619052114] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 12/30/2022]
Abstract
Genetic variants affecting hematopoiesis can influence commonly measured blood cell traits. To identify factors that affect hematopoiesis, we performed association studies for blood cell traits in the population-based Estonian Biobank using high-coverage whole-genome sequencing (WGS) in 2,284 samples and SNP genotyping in an additional 14,904 samples. Using up to 7,134 samples with available phenotype data, our analyses identified 17 associations across 14 blood cell traits. Integration of WGS-based fine-mapping and complementary epigenomic datasets provided evidence for causal mechanisms at several loci, including at a previously undiscovered basophil count-associated locus near the master hematopoietic transcription factor CEBPA. The fine-mapped variant at this basophil count association near CEBPA overlapped an enhancer active in common myeloid progenitors and influenced its activity. In situ perturbation of this enhancer by CRISPR/Cas9 mutagenesis in hematopoietic stem and progenitor cells demonstrated that it is necessary for and specifically regulates CEBPA expression during basophil differentiation. We additionally identified basophil count-associated variation at another more pleiotropic myeloid enhancer near GATA2, highlighting regulatory mechanisms for ordered expression of master hematopoietic regulators during lineage specification. Our study illustrates how population-based genetic studies can provide key insights into poorly understood cell differentiation processes of considerable physiologic relevance.
|
217
|
Dennis MY, Eichler EE. Human adaptation and evolution by segmental duplication. Curr Opin Genet Dev 2016; 41:44-52. [PMID: 27584858 PMCID: PMC5161654 DOI: 10.1016/j.gde.2016.08.001] [Citation(s) in RCA: 114] [Impact Index Per Article: 14.3] [Received: 05/09/2016] [Revised: 07/02/2016] [Accepted: 08/02/2016] [Indexed: 12/29/2022]
Abstract
Duplications are the primary force by which new gene functions arise and provide a substrate for large-scale structural variation. Analysis of thousands of genomes shows that humans and great apes have more genetic differences in content and structure over recent segmental duplications than any other euchromatic region. Novel human-specific duplicated genes, ARHGAP11B and SRGAP2C, have recently been described with a potential role in neocortical expansion and increased neuronal spine density. Large segmental duplications and the structural variants they promote are also frequently stratified between human populations with a subset being subjected to positive selection. The impact of recent duplications on human evolution and adaptation is only beginning to be realized as new technologies enhance their discovery and accurate genotyping.
Affiliation(s)
- Megan Y Dennis
- Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA 95616, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.
| |
|
218
|
Homann OR, Misura K, Lamas E, Sandrock RW, Nelson P, McDonough SI, DeLisi LE. Whole-genome sequencing in multiplex families with psychoses reveals mutations in the SHANK2 and SMARCA1 genes segregating with illness. Mol Psychiatry 2016; 21:1690-1695. [PMID: 27001614 PMCID: PMC5033653 DOI: 10.1038/mp.2016.24] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Received: 09/12/2015] [Revised: 01/16/2016] [Accepted: 01/20/2016] [Indexed: 12/30/2022]
Abstract
A current focus in psychiatric genetics is detection of multiple common risk alleles through very large genome-wide association study analyses. Yet families do exist, albeit rare, that have multiple affected members who are presumed to have a similar inherited cause to their illnesses. We hypothesized that within some of these families there may be rare highly penetrant mutations that segregate with illness. In this exploratory study, the genomes of 90 individuals across nine families were sequenced. Each family included a minimum of three available relatives affected with a psychotic illness and three available unaffected relatives. Twenty-six variants were identified that are private to a family, alter protein sequence, and are transmitted to all sequenced affected individuals within the family. In one family, seven siblings with schizophrenia spectrum disorders each carry a novel private missense variant within the SHANK2 gene. This variant lies within the consensus SH3 protein-binding motif by which SHANK2 may interact with post-synaptic glutamate receptors. In another family, four affected siblings and their unaffected mother each carry a novel private missense variant in the SMARCA1 gene on the X chromosome. Both variants represent candidates that may be causal for psychotic disorders when considered in the context of their transmission pattern and known gene and disease biology.
Affiliation(s)
| | | | | | | | - Paul Nelson
- The BVARI Foundation, VA Boston Healthcare System
| | | | - Lynn E DeLisi
- The BVARI Foundation, VA Boston Healthcare System; VA Boston Healthcare System, Boston and Brockton, MA; Department of Psychiatry, Harvard Medical School. Corresponding author address: Building 2, Rm 204, 940 Belmont Avenue, Brockton, Massachusetts, 02301 USA. Phone: 774-826-3155
| |
|
219
|
Ganna A, Genovese G, Howrigan DP, Byrnes A, Kurki M, Zekavat SM, Whelan CW, Kals M, Nivard MG, Bloemendal A, Bloom JM, Goldstein JI, Poterba T, Seed C, Handsaker RE, Natarajan P, Mägi R, Gage D, Robinson EB, Metspalu A, Salomaa V, Suvisaari J, Purcell SM, Sklar P, Kathiresan S, Daly MJ, McCarroll SA, Sullivan PF, Palotie A, Esko T, Hultman C, Neale BM. Ultra-rare disruptive and damaging mutations influence educational attainment in the general population. Nat Neurosci 2016; 19:1563-1565. [PMID: 27694993 PMCID: PMC5127781 DOI: 10.1038/nn.4404] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 06/12/2016] [Accepted: 09/07/2016] [Indexed: 12/14/2022]
Abstract
Disruptive, damaging ultra-rare variants in highly constrained genes are enriched in individuals with neurodevelopmental disorders. In the general population, this class of variants was associated with a decrease in years of education (YOE). This effect was stronger among highly brain-expressed genes and explained more YOE variance than pathogenic copy number variation but less than common variants. Disruptive, damaging ultra-rare variants in highly constrained genes influence the determinants of YOE in the general population.
Affiliation(s)
- Andrea Ganna
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 171 77, Sweden
| | - Giulio Genovese
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Daniel P. Howrigan
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Andrea Byrnes
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Mitja Kurki
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki FI-00014, Finland
| | - Seyedeh M. Zekavat
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Center for Human Genetic Research and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Christopher W. Whelan
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Mart Kals
- Estonian Genome Center, University of Tartu, Tartu 51010, Estonia
- Institute of Mathematics and Statistics, University of Tartu, Tartu 50409, Estonia
| | - Michel G. Nivard
- Department of Biological Psychology, VU University Amsterdam, Amsterdam 1081 HV, The Netherlands
| | - Alex Bloemendal
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Jonathan M. Bloom
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Jacqueline I. Goldstein
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Timothy Poterba
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Cotton Seed
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Robert E. Handsaker
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Pradeep Natarajan
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Center for Human Genetic Research and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Reedik Mägi
- Estonian Genome Center, University of Tartu, Tartu 51010, Estonia
| | - Diane Gage
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Elise B. Robinson
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Andres Metspalu
- Estonian Genome Center, University of Tartu, Tartu 51010, Estonia
| | - Veikko Salomaa
- Department of Health, THL-National Institute for Health and Welfare, Helsinki FI-00271, Finland
| | - Jaana Suvisaari
- Department of Health, THL-National Institute for Health and Welfare, Helsinki FI-00271, Finland
| | - Shaun M. Purcell
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
| | - Pamela Sklar
- Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
| | - Sekar Kathiresan
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Center for Human Genetic Research and Cardiovascular Research Center, Massachusetts General Hospital, Boston, MA, 02114, USA
| | - Mark J. Daly
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Steven A. McCarroll
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Patrick F. Sullivan
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 171 77, Sweden
- Departments of Genetics and Psychiatry, University of North Carolina, Chapel Hill, North Carolina 27599-7264, USA
| | - Aarno Palotie
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki FI-00014, Finland
| | - Tõnu Esko
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Estonian Genome Center, University of Tartu, Tartu 51010, Estonia
| | - Christina Hultman
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm 171 77, Sweden
| | - Benjamin M. Neale
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston 02114, MA, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| |
|
220
|
Huddleston J, Chaisson MJP, Steinberg KM, Warren W, Hoekzema K, Gordon D, Graves-Lindsay TA, Munson KM, Kronenberg ZN, Vives L, Peluso P, Boitano M, Chin CS, Korlach J, Wilson RK, Eichler EE. Discovery and genotyping of structural variation from long-read haploid genome sequence data. Genome Res 2016; 27:677-685. [PMID: 27895111 PMCID: PMC5411763 DOI: 10.1101/gr.214007.116] [Citation(s) in RCA: 226] [Impact Index Per Article: 28.3] [Received: 08/01/2016] [Accepted: 11/15/2016] [Indexed: 01/07/2023]
Abstract
To more fully understand the spectrum of human genetic variation, we generated deep single-molecule, real-time (SMRT) sequencing data from two haploid human genomes. By using an assembly-based approach (SMRT-SV), we systematically assessed each genome independently for structural variants (SVs) and indels, resolving the sequence structure of 461,553 genetic variants from 2 bp to 28 kbp in length. We find that >89% of these variants have been missed as part of analysis of the 1000 Genomes Project even after adjusting for more common variants (MAF > 1%). We estimate that this theoretical human diploid differs by as much as ∼16 Mbp with respect to the human reference, with long-read sequencing data providing a fivefold increase in sensitivity for genetic variants ranging in size from 7 bp to 1 kbp compared with short-read sequence data. Although a large fraction of genetic variants were not detected by short-read approaches, once the alternate allele is sequence-resolved, we show that 61% of SVs can be genotyped in short-read sequence data sets with high accuracy. Uncoupling discovery from genotyping thus allows for the majority of this missed common variation to be genotyped in the human population. Interestingly, when we repeat SV detection on a pseudodiploid genome constructed in silico by merging the two haploids, we find that ∼59% of the heterozygous SVs are no longer detected by SMRT-SV. These results indicate that haploid resolution of long-read sequencing data will significantly increase sensitivity of SV detection.
Affiliation(s)
- John Huddleston
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Mark J P Chaisson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Karyn Meltz Steinberg
  - McDonnell Genome Institute, Department of Medicine, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA
- Wes Warren
  - McDonnell Genome Institute, Department of Medicine, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- David Gordon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Tina A Graves-Lindsay
  - McDonnell Genome Institute, Department of Medicine, Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63108, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Zev N Kronenberg
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Laura Vives
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Paul Peluso
  - Pacific Biosciences of California, Incorporated, Menlo Park, California 94025, USA
- Matthew Boitano
  - Pacific Biosciences of California, Incorporated, Menlo Park, California 94025, USA
- Chen-Shin Chin
  - Pacific Biosciences of California, Incorporated, Menlo Park, California 94025, USA
- Jonas Korlach
  - Pacific Biosciences of California, Incorporated, Menlo Park, California 94025, USA
- Richard K Wilson
  - Department of Pathology, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
|
221
|
Zmienko A, Samelak-Czajka A, Kozlowski P, Szymanska M, Figlerowicz M. Arabidopsis thaliana population analysis reveals high plasticity of the genomic region spanning MSH2, AT3G18530 and AT3G18535 genes and provides evidence for NAHR-driven recurrent CNV events occurring in this location. BMC Genomics 2016; 17:893. [PMID: 27825302 PMCID: PMC5101643 DOI: 10.1186/s12864-016-3221-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 06/10/2016] [Accepted: 10/27/2016] [Indexed: 12/28/2022]
Abstract
Background Intraspecies copy number variations (CNVs), defined as unbalanced structural variations of specific genomic loci, ≥1 kb in size, are present in the genomes of animals and plants. A growing number of examples indicate that CNVs may have functional significance and contribute to phenotypic diversity. In the model plant Arabidopsis thaliana at least several hundred protein-coding genes might display CNV; however, locus-specific genotyping studies in this plant have not been conducted. Results We analyzed the natural CNVs in the region overlapping MSH2 gene that encodes the DNA mismatch repair protein, and AT3G18530 and AT3G18535 genes that encode poorly characterized proteins. By applying multiplex ligation-dependent probe amplification and droplet digital PCR we genotyped those genes in 189 A. thaliana accessions. We found that AT3G18530 and AT3G18535 were duplicated (2–14 times) in 20 and deleted in 101 accessions. MSH2 was duplicated in 12 accessions (up to 12–14 copies) but never deleted. In all but one case, the MSH2 duplications were associated with those of AT3G18530 and AT3G18535. Considering the structure of the CNVs, we distinguished 5 genotypes for this region, determined their frequency and geographical distribution. We defined the CNV breakpoints in 35 accessions with AT3G18530 and AT3G18535 deletions and tandem duplications and showed that they were reciprocal events, resulting from non-allelic homologous recombination between 99%-identical sequences flanking these genes. The widespread geographical distribution of the deletions supported by the SNP and linkage disequilibrium analyses of the genomic sequence confirmed the recurrent nature of this CNV. Conclusions We characterized in detail for the first time the complex multiallelic CNV in Arabidopsis genome. The region encoding MSH2, AT3G18530 and AT3G18535 genes shows enormous variation of copy numbers among natural ecotypes, being a remarkable example of high Arabidopsis genome plasticity. We provided the molecular insight into the mechanism underlying the recurrent nature of AT3G18530-AT3G18535 duplications/deletions. We also performed the first direct comparison of the two leading experimental methods, suitable for assessing the DNA copy number status. Our comprehensive case study provides foundation information for further analyses of CNV evolution in Arabidopsis and other plants, and their possible use in plant breeding.
Affiliation(s)
- Agnieszka Zmienko
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965, Poznan, Poland
- Anna Samelak-Czajka
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965, Poznan, Poland
- Piotr Kozlowski
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704, Poznan, Poland
- Maja Szymanska
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704, Poznan, Poland
- Marek Figlerowicz
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704, Poznan, Poland
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965, Poznan, Poland
|
222
|
Gilks WP, Pennell TM, Flis I, Webster MT, Morrow EH. Whole genome resequencing of a laboratory-adapted Drosophila melanogaster population sample. F1000Res 2016; 5:2644. [PMID: 27928499 PMCID: PMC5115224 DOI: 10.12688/f1000research.9912.3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Accepted: 12/21/2016] [Indexed: 12/30/2022]
Abstract
As part of a study into the molecular genetics of sexually dimorphic complex traits, we used high-throughput sequencing to obtain data on genomic variation in an outbred laboratory-adapted fruit fly (Drosophila melanogaster) population. We successfully resequenced the whole genome of 220 hemiclonal females that were heterozygous for the same Berkeley reference line genome (BDGP6/dm6), and a unique haplotype from the outbred base population (LHM). The use of a static and known genetic background enabled us to obtain sequences from whole-genome phased haplotypes. We used a BWA-Picard-GATK pipeline for mapping sequence reads to the dm6 reference genome assembly, at a median depth of coverage of 31X, and have made the resulting data publicly available in the NCBI Short Read Archive (Accession number SRP058502). We used Haplotype Caller to discover and genotype 1,726,931 small genomic variants (SNPs and indels, <200bp). Additionally we detected and genotyped 167 large structural variants (1-100Kb in size) using GenomeStrip/2.0. Sequence and genotype data are publicly available at the corresponding NCBI databases: Short Read Archive, dbSNP and dbVar (BioProject PRJNA282591). We have also released the unfiltered genotype data, and the code and logs for data processing and summary statistics (https://zenodo.org/communities/sussex_drosophila_sequencing/).
Affiliation(s)
- William P Gilks
  - Evolution, Behaviour and Environment Group, School of Life Sciences, John Maynard Smith Building, University of Sussex, Falmer, UK
- Tanya M Pennell
  - Evolution, Behaviour and Environment Group, School of Life Sciences, John Maynard Smith Building, University of Sussex, Falmer, UK
- Ilona Flis
  - Evolution, Behaviour and Environment Group, School of Life Sciences, John Maynard Smith Building, University of Sussex, Falmer, UK
- Matthew T Webster
  - Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
- Edward H Morrow
  - Evolution, Behaviour and Environment Group, School of Life Sciences, John Maynard Smith Building, University of Sussex, Falmer, UK
|
223
|
Keel BN, Keele JW, Snelling WM. Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds. Anim Genet 2016; 48:141-150. [DOI: 10.1111/age.12519] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Accepted: 08/27/2016] [Indexed: 12/19/2022]
Affiliation(s)
- B. N. Keel
  - USDA; ARS; U.S. Meat Animal Research Center; Clay Center NE 68933 USA
- J. W. Keele
  - USDA; ARS; U.S. Meat Animal Research Center; Clay Center NE 68933 USA
- W. M. Snelling
  - USDA; ARS; U.S. Meat Animal Research Center; Clay Center NE 68933 USA
|
224
|
Branham K, Matsui H, Biswas P, Guru AA, Hicks M, Suk JJ, Li H, Jakubosky D, Long T, Telenti A, Nariai N, Heckenlively JR, Frazer KA, Sieving PA, Ayyagari R. Establishing the involvement of the novel gene AGBL5 in retinitis pigmentosa by whole genome sequencing. Physiol Genomics 2016; 48:922-927. [PMID: 27764769 DOI: 10.1152/physiolgenomics.00101.2016] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 09/06/2016] [Accepted: 10/06/2016] [Indexed: 02/06/2023]
Abstract
While more than 250 genes are known to cause inherited retinal degenerations (IRD), nearly 40-50% of families have the genetic basis for their disease unknown. In this study we sought to identify the underlying cause of IRD in a family by whole genome sequence (WGS) analysis. Clinical characterization including standard ophthalmic examination, fundus photography, visual field testing, electroretinography, and review of medical and family history was performed. WGS was performed on affected and unaffected family members using Illumina HiSeq X10. Sequence reads were aligned to hg19 using BWA-MEM and variant calling was performed with Genome Analysis Toolkit. The called variants were annotated with SnpEff v4.11, PolyPhen v2.2.2, and CADD v1.3. Copy number variations were called using Genome STRiP (svtoolkit 2.00.1611) and SpeedSeq software. Variants were filtered to detect rare potentially deleterious variants segregating with disease. Candidate variants were validated by dideoxy sequencing. Clinical evaluation revealed typical adolescent-onset recessive retinitis pigmentosa (arRP) in affected members. WGS identified about 4 million variants in each individual. Two rare and potentially deleterious compound heterozygous variants p.Arg281Cys and p.Arg487* were identified in the gene ATP/GTP binding protein like 5 (AGBL5) as likely causal variants. No additional variants in IRD genes that segregated with disease were identified. Mutation analysis confirmed the segregation of these variants with the IRD in the pedigree. Homology models indicated destabilization of AGBL5 due to the p.Arg281Cys change. Our findings establish the involvement of mutations in AGBL5 in RP and validate the WGS variant filtering pipeline we designed.
Affiliation(s)
- Kari Branham
  - Kellogg Eye Center, University of Michigan, Ann Arbor, Michigan
- Hiroko Matsui
  - Institute for Genomic Medicine, University of California San Diego, La Jolla, California
- Pooja Biswas
  - Shiley Eye Institute, University of California San Diego, La Jolla, California
- Aditya A Guru
  - Shiley Eye Institute, University of California San Diego, La Jolla, California
- John J Suk
  - Shiley Eye Institute, University of California San Diego, La Jolla, California
- He Li
  - Institute for Genomic Medicine, University of California San Diego, La Jolla, California
- David Jakubosky
  - Institute for Genomic Medicine, University of California San Diego, La Jolla, California
- Tao Long
  - Human Longevity Incorporated, San Diego, California
- Naoki Nariai
  - Institute for Genomic Medicine, University of California San Diego, La Jolla, California
- Kelly A Frazer
  - Institute for Genomic Medicine, University of California San Diego, La Jolla, California
  - Department of Pediatrics and Rady Children's Hospital, Division of Genome Information Sciences, University of California, San Diego, La Jolla, California
- Paul A Sieving
  - National Eye Institute, National Institutes of Health, Bethesda, Maryland
- Radha Ayyagari
  - Shiley Eye Institute, University of California San Diego, La Jolla, California
|
225
|
Sapkota Y, Narasimhan A, Kumaran M, Sehrawat BS, Damaraju S. A Genome-Wide Association Study to Identify Potential Germline Copy Number Variants for Sporadic Breast Cancer Susceptibility. Cytogenet Genome Res 2016; 149:156-164. [PMID: 27668787 DOI: 10.1159/000448558] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Accepted: 05/23/2016] [Indexed: 11/19/2022]
Abstract
Breast cancer (BC) predisposition in populations arises from both genetic and nongenetic risk factors. Structural variations such as copy number variations (CNVs) are heritable determinants for disease susceptibility. The primary objectives of this study are (1) to identify CNVs associated with sporadic BC using a genome-wide association study (GWAS) design; (2) to utilize 2 distinct CNV calling algorithms to identify concordant CNVs as a strategy to reduce false positive associations in the hypothesis-generating GWAS discovery phase, and (3) to identify potential candidate CNVs for follow-up replication studies. We used Affymetrix SNP Array 6.0 data profiled on Caucasian subjects (422 cases/348 controls) to call CNVs using algorithms implemented in Nexus Copy Number and Partek Genomics Suite software. Nexus algorithm identified CNVs associated with BC (731 autosomal CNVs with >5% frequency in the total sample and Q < 0.05). Thirteen CNVs were identified when Partek algorithm-called CNVs were overlapped with Nexus-identified CNVs; these CNVs showed concordances for frequency, effect size, and direction. Coding genes present within BC-associated CNVs were known to play a role in disease etiology and prognosis. Long noncoding RNAs identified within CNVs showed tissue-specific expression, indicating potential functional relevance of the findings. The identified candidate CNVs warrant independent replication.
Affiliation(s)
- Yadav Sapkota
  - Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tenn., USA
|
226
|
Steenwyk JL, Soghigian JS, Perfect JR, Gibbons JG. Copy number variation contributes to cryptic genetic variation in outbreak lineages of Cryptococcus gattii from the North American Pacific Northwest. BMC Genomics 2016; 17:700. [PMID: 27590805 PMCID: PMC5009542 DOI: 10.1186/s12864-016-3044-0] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 06/08/2016] [Accepted: 08/24/2016] [Indexed: 12/13/2022]
Abstract
Background Copy number variants (CNVs) are a class of structural variants (SVs) and are defined as fragments of DNA that are present at variable copy number in comparison with a reference genome. Recent advances in bioinformatics methodologies and sequencing technologies have enabled the high-resolution quantification of genome-wide CNVs. In pathogenic fungi SVs have been shown to alter gene expression, influence host specificity, and drive fungicide resistance, but little attention has focused specifically on CNVs. Using publicly available sequencing data, we identified 90 isolates across 212 Cryptococcus gattii genomes that belong to the VGII subgroups responsible for the recent deadly outbreaks in the North American Pacific Northwest. We generated CNV profiles for each sample to investigate the prevalence and function of CNV in C. gattii. Results We identified eight genetic clusters among publicly available Illumina whole genome sequence data from 212 C. gattii isolates through population structure analysis. Three clusters represent the VGIIa, VGIIb, and VGIIc subgroups from the North American Pacific Northwest. CNV was bioinformatically predicted and affected ~300–400 Kilobases (Kb) of the C. gattii VGII subgroup genomes. Sixty-seven loci, encompassing 58 genes, showed highly divergent patterns of copy number variation between VGII subgroups. Analysis of PFam domains within divergent CN variable genes revealed enrichment of protein domains associated with transport, cell wall organization and external encapsulating structure. Conclusions CNVs may contribute to pathological and phenotypic differences observed between the C. gattii VGIIa, VGIIb, and VGIIc subpopulations. Genes overlapping with population differentiated CNVs were enriched for several virulence related functional terms. These results uncover novel candidate genes to examine the genetic and functional underpinnings of C. gattii pathogenicity. 
Affiliation(s)
- Jacob L Steenwyk
  - Biology Department, Clark University, 950 Main Street, Worcester, MA, USA
  - Current address: Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
- John S Soghigian
  - Biology Department, Clark University, 950 Main Street, Worcester, MA, USA
  - Current address: Department of Environmental Sciences, The Connecticut Agricultural Experiment Station, New Haven, CT, USA
- John R Perfect
  - Division of Infectious Diseases, Department of Medicine, Duke University Medical Center, Durham, NC, USA
- John G Gibbons
  - Biology Department, Clark University, 950 Main Street, Worcester, MA, USA
|
227
|
Morgan AP, Holt JM, McMullan RC, Bell TA, Clayshulte AMF, Didion JP, Yadgary L, Thybert D, Odom DT, Flicek P, McMillan L, de Villena FPM. The Evolutionary Fates of a Large Segmental Duplication in Mouse. Genetics 2016; 204:267-85. [PMID: 27371833 PMCID: PMC5012392 DOI: 10.1534/genetics.116.191007] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 04/29/2016] [Accepted: 06/27/2016] [Indexed: 01/21/2023]
Abstract
Gene duplication and loss are major sources of genetic polymorphism in populations, and are important forces shaping the evolution of genome content and organization. We have reconstructed the origin and history of a 127-kbp segmental duplication, R2d, in the house mouse (Mus musculus). R2d contains a single protein-coding gene, Cwc22. De novo assembly of both the ancestral (R2d1) and the derived (R2d2) copies reveals that they have been subject to nonallelic gene conversion events spanning tens of kilobases. R2d2 is also a hotspot for structural variation: its diploid copy number ranges from zero in the mouse reference genome to >80 in wild mice sampled from around the globe. Hemizygosity for high copy-number alleles of R2d2 is associated in cis with meiotic drive; suppression of meiotic crossovers; and copy-number instability, with a mutation rate in excess of 1 per 100 transmissions in some laboratory populations. Our results provide a striking example of allelic diversity generated by duplication and demonstrate the value of de novo assembly in a phylogenetic context for understanding the mutational processes affecting duplicate genes.
Affiliation(s)
- Andrew P Morgan
  - Department of Genetics and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599
- J Matthew Holt
  - Department of Computer Science, University of North Carolina, Chapel Hill, North Carolina 27599
- Rachel C McMullan
  - Department of Genetics and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599
- Timothy A Bell
  - Department of Genetics and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599
- Amelia M-F Clayshulte
  - Department of Genetics and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599
- John P Didion
  - Department of Genetics and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599
- Liran Yadgary
  - Department of Genetics and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599
- David Thybert
  - European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge, CB10 1SD, United Kingdom
- Duncan T Odom
  - Cancer Research United Kingdom Cambridge Institute, University of Cambridge, CB2 0RE, United Kingdom
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Cambridge, CB10 1SA, United Kingdom
- Paul Flicek
  - European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Cambridge, CB10 1SD, United Kingdom
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Cambridge, CB10 1SA, United Kingdom
- Leonard McMillan
  - Department of Computer Science, University of North Carolina, Chapel Hill, North Carolina 27599
- Fernando Pardo-Manuel de Villena
  - Department of Genetics and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina 27599
|
228
|
Patterns of genic intolerance of rare copy number variation in 59,898 human exomes. Nat Genet 2016; 48:1107-11. [PMID: 27533299 PMCID: PMC5042837 DOI: 10.1038/ng.3638] [Citation(s) in RCA: 124] [Impact Index Per Article: 15.5] [Received: 03/17/2016] [Accepted: 07/12/2016] [Indexed: 12/14/2022]
Abstract
Copy number variation (CNV) affecting protein-coding genes contributes substantially to human diversity and disease. Here we characterized the rates and properties of rare genic CNVs (<0.5% frequency) in exome sequencing data from nearly 60,000 individuals in the Exome Aggregation Consortium (ExAC) database. On average, individuals possessed 0.81 deleted and 1.75 duplicated genes, and most (70%) carried at least one rare genic CNV. For every gene, we empirically estimated an index of relative intolerance to CNVs that demonstrated moderate correlation with measures of genic constraint based on single-nucleotide variation (SNV) and was independently correlated with measures of evolutionary conservation. For individuals with schizophrenia, genes affected by CNVs were more intolerant than in controls. The ExAC CNV data constitute a critical component of an integrated database spanning the spectrum of human genetic variation, aiding in the interpretation of personal genomes as well as population-based disease studies. These data are freely available for download and visualization online.
|
229
|
Oetjens MT, Shen F, Emery SB, Zou Z, Kidd JM. Y-Chromosome Structural Diversity in the Bonobo and Chimpanzee Lineages. Genome Biol Evol 2016; 8:2231-40. [PMID: 27358426 PMCID: PMC4987114 DOI: 10.1093/gbe/evw150] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 12/23/2022]
Abstract
The male-specific regions of primate Y-chromosomes (MSY) are enriched for multi-copy genes highly expressed in the testis. These genes are located in large repetitive sequences arranged as palindromes, inverted repeats, and tandem repeats, termed amplicons. In humans, these genes have critical roles in male fertility and are essential for the production of sperm. The structure of human and chimpanzee amplicon sequences shows a remarkable difference relative to the remainder of the genome, a difference that may be the result of intense selective pressure on male fertility. Four subspecies of common chimpanzees have undergone extended periods of isolation and appear to be in the early process of subspeciation. A recent study found that amplicons enriched for testis-expressed genes on the primate X-chromosome were the target of hard selective sweeps, and male-fertility genes on the Y-chromosome may also be the targets of selection. However, little is understood about Y-chromosome amplicon diversity within and across chimpanzee populations. Here, we analyze nine common chimpanzee (representing three subspecies: Pan troglodytes schweinfurthii, Pan troglodytes ellioti, and Pan troglodytes verus) and two bonobo (Pan paniscus) male whole-genome sequences to assess Y ampliconic copy-number diversity across the Pan genus. We observe that the copy number of Y chromosome amplicons is variable among chimpanzees and bonobos, and identify several lineage-specific patterns, including variable copy number of azoospermia candidates RBMY and DAZ. We detect recurrent switchpoints of copy-number change along the ampliconic tracts across chimpanzee populations, which may be the result of localized genome instability or selective forces.
Affiliation(s)
- Feichen Shen
  - Department of Human Genetics, University of Michigan Medical School
- Sarah B Emery
  - Department of Human Genetics, University of Michigan Medical School
- Zhengting Zou
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School
- Jeffrey M Kidd
  - Department of Human Genetics, University of Michigan Medical School
  - Department of Computational Medicine and Bioinformatics, University of Michigan Medical School
|
230
|
Hu XS, Hu Y, Chen X. Testing neutrality at copy-number-variable loci under the finite-allele and finite-site models. Theor Popul Biol 2016; 112:1-13. [PMID: 27423854 DOI: 10.1016/j.tpb.2016.07.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/23/2016] [Revised: 07/05/2016] [Accepted: 07/06/2016] [Indexed: 02/01/2023]
Abstract
Copy-number variation (CNV) is an important form of DNA structural variation because a certain proportion of genomes in many eukaryotic species can contribute to such variations. Owing to the differences between CNVs and single nucleotide polymorphisms (SNPs) in size, mutation rate and maintaining mechanism, it is more realistic to characterize CNV evolution under the finite-allele and finite-site models. Here, we propose a method to test multiple CNVs neutrality under the finite-allele and finite-site models and the assumption of mutation-drift process. The statistical property of the method is evaluated through Monte Carlo simulations under the effects of the sample size, the scaled mutation rates, the number of CNVs, the population demographic change, and selection. Different from Tajima's D test, a bootstrap or a permutation approach is suggested to conduct a neutrality test. Application of this method is illustrated using the diploid CNV genotypes measured in discrete copy numbers in 11 HapMap phase III populations. The results show that the mutation-drift process can explain the variation of genome-wide CNVs among 1184 individuals (856 CNVs, ∼0.02Mb on average in size), irrespective of the historical demographic changes. Patterns from allele-frequency-spectrum analysis also support the hypothesis of neutral CNVs. Our results suggest that most human chromosomal changes in healthy individuals via unbalanced rearrangements of the segments with certain sizes are neutral.
Affiliation(s)
- Xin-Sheng Hu
  - Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong 510642, China
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, South China Agricultural University, Guangdong 510642, China
  - Department of Plant Sciences, University of Oxford, South Parks Road, Oxford, OX13RB, United Kingdom
- Yang Hu
  - Department of Computing Science, University of Alberta, Edmonton, AB T6G 2S4, Canada
- Xiaoyang Chen
  - Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, South China Agricultural University, Guangdong 510642, China
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, South China Agricultural University, Guangdong 510642, China
|
231
|
Polley S, Cipriani V, Khan JC, Shahid H, Moore AT, Yates JRW, Hollox EJ. Analysis of copy number variation at DMBT1 and age-related macular degeneration. BMC Med Genet 2016; 17:44. [PMID: 27416785 PMCID: PMC4946147 DOI: 10.1186/s12881-016-0311-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/19/2015] [Accepted: 07/07/2016] [Indexed: 12/22/2022]
Abstract
BACKGROUND DMBT1 is a gene that shows extensive copy number variation (CNV) that alters the number of bacteria-binding domains in the protein and has been shown to activate the complement pathway. It lies next to the ARMS2/HTRA1 genes in a region of chromosome 10q26, where single nucleotide variants have been strongly associated with age-related macular degeneration (AMD), the commonest cause of blindness in Western populations. Complement activation is thought to be a key factor in the pathogenesis of this condition. We sought to investigate whether DMBT1 CNV plays any role in the susceptibility to AMD. METHODS We analysed long-range linkage disequilibrium of DMBT1 CNV1 and CNV2 with flanking single nucleotide polymorphisms (SNPs) using our previously published CNV and HapMap Phase 3 SNP data in the CEPH Europeans from Utah (CEU). We then typed a large cohort of 860 AMD patients and 419 examined age-matched controls for copy number at DMBT1 CNV1 and CNV2 and combined these data with copy numbers from a further 480 unexamined controls. RESULTS We found weak linkage disequilibrium between DMBT1 CNV1 and CNV2 with the SNPs rs1474526 and rs714816 in the HTRA1/ARMS2 region. By directly analysing copy number variation, we found no evidence of association of CNV1 or CNV2 with AMD. CONCLUSIONS We have shown that copy number variation at DMBT1 does not affect risk of developing age-related macular degeneration and can therefore be ruled out from future studies investigating the association of structural variation at 10q26 with AMD.
Affiliation(s)
- Shamik Polley
  - Department of Genetics, University of Leicester, Leicester, UK
- Valentina Cipriani
  - UCL Institute of Ophthalmology, University College London, London, UK
  - UCL Genetics Institute, University College London, London, UK
  - Moorfields Eye Hospital, London, UK
- Jane C Khan
  - Department of Medical Genetics, University of Cambridge, Cambridge, UK
  - Centre for Ophthalmology and Visual Science, Lions Eye Institute, University of Western Australia, Perth, Australia
  - Department of Ophthalmology, Royal Perth Hospital, Perth, Australia
- Humma Shahid
  - Department of Medical Genetics, University of Cambridge, Cambridge, UK
  - Department of Ophthalmology, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Anthony T Moore
  - UCL Institute of Ophthalmology, University College London, London, UK
  - Moorfields Eye Hospital, London, UK
  - Department of Ophthalmology UCSF Medical School, San Francisco, USA
- John R W Yates
  - UCL Institute of Ophthalmology, University College London, London, UK
  - Department of Medical Genetics, University of Cambridge, Cambridge, UK
- Edward J Hollox
  - Department of Genetics, University of Leicester, Leicester, UK
232
Milligan G, Shimpukade B, Ulven T, Hudson BD. Complex Pharmacology of Free Fatty Acid Receptors. Chem Rev 2016; 117:67-110. [PMID: 27299848 DOI: 10.1021/acs.chemrev.6b00056] [Citation(s) in RCA: 185] [Impact Index Per Article: 23.1] [Indexed: 12/12/2022]
Abstract
G protein-coupled receptors (GPCRs) are historically the most successful family of drug targets. In recent times it has become clear that the pharmacology of these receptors is far more complex than previously imagined. Understanding of the pharmacological regulation of GPCRs now extends beyond simple competitive agonism or antagonism by ligands interacting with the orthosteric binding site of the receptor to incorporate concepts of allosteric agonism, allosteric modulation, signaling bias, constitutive activity, and inverse agonism. Herein, we consider how evolving concepts of GPCR pharmacology have shaped understanding of the complex pharmacology of receptors that recognize and are activated by nonesterified or "free" fatty acids (FFAs). The FFA family of receptors is a recently deorphanized set of GPCRs, the members of which are now receiving substantial interest as novel targets for the treatment of metabolic and inflammatory diseases. Further understanding of the complex pharmacology of these receptors will be critical to unlocking their ultimate therapeutic potential.
Affiliation(s)
- Graeme Milligan
  - Centre for Translational Pharmacology, Institute of Molecular, Cell and Systems Biology, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, Scotland, United Kingdom
- Bharat Shimpukade
  - Department of Physics, Chemistry and Pharmacy, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark
- Trond Ulven
  - Department of Physics, Chemistry and Pharmacy, University of Southern Denmark, Campusvej 55, DK-5230 Odense M, Denmark
- Brian D Hudson
  - Centre for Translational Pharmacology, Institute of Molecular, Cell and Systems Biology, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, Scotland, United Kingdom
233
Zhao X, Emery SB, Myers B, Kidd JM, Mills RE. Resolving complex structural genomic rearrangements using a randomized approach. Genome Biol 2016; 17:126. [PMID: 27287201 PMCID: PMC4901421 DOI: 10.1186/s13059-016-0993-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/20/2016] [Accepted: 05/25/2016] [Indexed: 12/27/2022]
Abstract
Complex chromosomal rearrangements are structural genomic alterations involving multiple instances of deletions, duplications, inversions, or translocations that co-occur either on the same chromosome or represent different overlapping events on homologous chromosomes. We present SVelter, an algorithm that identifies regions of the genome suspected to harbor a complex event and then resolves the structure by iteratively rearranging the local genome structure, in a randomized fashion, with each structure scored against characteristics of the observed sequencing data. SVelter is able to accurately reconstruct complex chromosomal rearrangements when compared to well-characterized genomes that have been deeply sequenced with both short and long reads.
Affiliation(s)
- Xuefang Zhao
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Sarah B Emery
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Bridget Myers
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Jeffrey M Kidd
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Ryan E Mills
  - Department of Computational Medicine & Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
234
Poznik GD, Xue Y, Mendez FL, Willems TF, Massaia A, Wilson Sayres MA, Ayub Q, McCarthy SA, Narechania A, Kashin S, Chen Y, Banerjee R, Rodriguez-Flores JL, Cerezo M, Shao H, Gymrek M, Malhotra A, Louzada S, Desalle R, Ritchie GRS, Cerveira E, Fitzgerald TW, Garrison E, Marcketta A, Mittelman D, Romanovitch M, Zhang C, Zheng-Bradley X, Abecasis GR, McCarroll SA, Flicek P, Underhill PA, Coin L, Zerbino DR, Yang F, Lee C, Clarke L, Auton A, Erlich Y, Handsaker RE, Bustamante CD, Tyler-Smith C. Punctuated bursts in human male demography inferred from 1,244 worldwide Y-chromosome sequences. Nat Genet 2016; 48:593-9. [PMID: 27111036 PMCID: PMC4884158 DOI: 10.1038/ng.3559] [Citation(s) in RCA: 194] [Impact Index Per Article: 24.3] [Received: 11/08/2015] [Accepted: 04/01/2016] [Indexed: 12/21/2022]
Abstract
We report the sequences of 1,244 human Y chromosomes randomly ascertained from 26 worldwide populations by the 1000 Genomes Project. We discovered more than 65,000 variants, including single-nucleotide variants, multiple-nucleotide variants, insertions and deletions, short tandem repeats, and copy number variants. Of these, copy number variants contribute the greatest predicted functional impact. We constructed a calibrated phylogenetic tree on the basis of binary single-nucleotide variants and projected the more complex variants onto it, estimating the number of mutations for each class. Our phylogeny shows bursts of extreme expansion in male numbers that have occurred independently among each of the five continental superpopulations examined, at times of known migrations and technological innovations.
Affiliation(s)
- G David Poznik
  - Program in Biomedical Informatics, Stanford University, Stanford, California, USA
  - Department of Genetics, Stanford University, Stanford, California, USA
- Yali Xue
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Fernando L Mendez
  - Department of Genetics, Stanford University, Stanford, California, USA
- Thomas F Willems
  - Computational and Systems Biology Program, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
  - New York Genome Center, New York, New York, USA
- Andrea Massaia
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Melissa A Wilson Sayres
  - School of Life Sciences, Arizona State University, Tempe, Arizona, USA
  - Center for Evolution and Medicine, Biodesign Institute, Arizona State University, Tempe, Arizona, USA
- Qasim Ayub
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Shane A McCarthy
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Apurva Narechania
  - Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, New York, USA
- Seva Kashin
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Yuan Chen
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Ruby Banerjee
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Maria Cerezo
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Haojing Shao
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia
- Melissa Gymrek
  - New York Genome Center, New York, New York, USA
  - Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Ankit Malhotra
  - Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA
- Sandra Louzada
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Rob Desalle
  - Sackler Institute for Comparative Genomics, American Museum of Natural History, New York, New York, USA
- Graham R S Ritchie
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Eliza Cerveira
  - Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA
- Erik Garrison
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Anthony Marcketta
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, New York, USA
- David Mittelman
  - Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, Virginia, USA
  - Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia, USA
- Chengsheng Zhang
  - Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA
- Xiangqun Zheng-Bradley
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Gonçalo R Abecasis
  - Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, Michigan, USA
- Steven A McCarroll
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Peter A Underhill
  - Department of Genetics, Stanford University, Stanford, California, USA
- Lachlan Coin
  - Institute for Molecular Bioscience, University of Queensland, St Lucia, Queensland, Australia
- Daniel R Zerbino
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Fengtang Yang
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Charles Lee
  - Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA
  - Department of Life Sciences, Ewha Womans University, Seoul, Republic of Korea
- Laura Clarke
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Adam Auton
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, New York, USA
- Yaniv Erlich
  - New York Genome Center, New York, New York, USA
  - Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, New York, USA
  - Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, USA
- Robert E Handsaker
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
- Carlos D Bustamante
  - Department of Genetics, Stanford University, Stanford, California, USA
  - Department of Biomedical Data Science, Stanford University, Stanford, California, USA
- Chris Tyler-Smith
  - Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, UK
235
Abstract
Tandem gene duplication is an important mutational process in evolutionary adaptation and human disease. Hypothetically, two tandem gene copies should produce twice the output of a single gene, but this expectation has not been rigorously investigated. Here, we show that tandem duplication often results in more than double the gene activity. A naturally occurring tandem duplication of the Alcohol dehydrogenase (Adh) gene exhibits 2.6-fold greater expression than the single-copy gene in transgenic Drosophila. This tandem duplication also exhibits greater activity than two copies of the gene in trans, demonstrating that it is the tandem arrangement and not copy number that is the cause of overactivity. We also show that tandem duplication of an unrelated synthetic reporter gene is overactive (2.3- to 5.1-fold) at all sites in the genome that we tested, suggesting that overactivity could be a general property of tandem gene duplicates. Overactivity occurs at the level of RNA transcription, and therefore tandem duplicate overactivity appears to be a previously unidentified form of position effect. The increment of surplus gene expression observed is comparable to many regulatory mutations fixed in nature and, if typical of other genomes, would shape the fate of tandem duplicates in evolution.
236
Yong RY, Mustaffa SB, Wasan PS, Sheng L, Marshall CR, Scherer SW, Teo YY, Yap EP. Complex Copy Number Variation of AMY1 does not Associate with Obesity in two East Asian Cohorts. Hum Mutat 2016; 37:669-78. [DOI: 10.1002/humu.22996] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 07/28/2015] [Accepted: 03/08/2016] [Indexed: 11/12/2022]
Affiliation(s)
- Rita Y.Y. Yong
  - Defence Medical and Environmental Research Institute; DSO National Laboratories; Singapore
  - Saw Swee Hock School of Public Health; National University of Singapore; Singapore
- Su'Aidah B. Mustaffa
  - Defence Medical and Environmental Research Institute; DSO National Laboratories; Singapore
  - Lee Kong Chian School of Medicine; Nanyang Technological University; Singapore
- Pavandip S. Wasan
  - Defence Medical and Environmental Research Institute; DSO National Laboratories; Singapore
  - Saw Swee Hock School of Public Health; National University of Singapore; Singapore
- Liang Sheng
  - Unit of Biostatistics; Yong Loo Lin School of Medicine; National University of Singapore; Singapore
- Christian R. Marshall
  - The Centre for Applied Genomics; Genetics and Genome Biology; The Hospital for Sick Children; Toronto ON Canada
- Stephen W. Scherer
  - The Centre for Applied Genomics; Genetics and Genome Biology; The Hospital for Sick Children; Toronto ON Canada
  - Department of Molecular Genetics and McLaughlin Centre; University of Toronto; Toronto ON Canada
- Yik-Ying Teo
  - Saw Swee Hock School of Public Health; National University of Singapore; Singapore
  - Department of Statistics and Applied Probability; Faculty of Science; National University of Singapore; Singapore
- Eric P.H. Yap
  - Defence Medical and Environmental Research Institute; DSO National Laboratories; Singapore
  - Saw Swee Hock School of Public Health; National University of Singapore; Singapore
  - Lee Kong Chian School of Medicine; Nanyang Technological University; Singapore
237
Khrunin AV, Filippova IN, Aliev AM, Tupitsina TV, Slominsky PA, Limborska SA. GSTM1 copy number variation in the context of single nucleotide polymorphisms in the human GSTM cluster. Mol Cytogenet 2016; 9:30. [PMID: 27099630 PMCID: PMC4837583 DOI: 10.1186/s13039-016-0241-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 03/01/2016] [Accepted: 04/01/2016] [Indexed: 02/06/2023]
Abstract
Background: GSTM1 gene deletion is one of the most known copy number polymorphisms in human genome. It is most likely caused by homologous recombination between the repeats flanking the gene. However, taking into account that the deletion has no crucial effects on human well-being, and the ability of other GSTMs to compensate for the lack of GSTM1, a role for additional factors affecting GSTM1 deletion can be proposed. Our goal was to explore the relationships between GSTM1 deletion polymorphism and single nucleotide polymorphisms (SNPs) in the region of the GSTM cluster that includes GSTM2, GSTM3, GSTM4, and GSTM5 in addition to GSTM1. Results: Real-time polymerase chain reaction was used to quantify the number of GSTM1 copies. Fourteen SNPs from the region were tested and their allelic patterns were compared in groups of Russian individuals subdivided according to their GSTM1 deletion genotypes. Linkage disequilibrium-based haplotype analysis showed substantial differences of haplotype frequencies between the groups, especially between individuals with homozygous GSTM1 −/− and +/+ genotypes. Exploration of the results of phasing of GSTM1 and SNP genotypes revealed unequal segregation of GSTM1 + and − alleles at different haplotypes. Conclusions: The observed differences in haplotype patterns suggest the potential role of genetic context in GSTM1 deletion frequency (appearance) and in the determination of the deletion-related effects.
Affiliation(s)
- Andrey V Khrunin
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Kurchatov sq. 2, Moscow, 123182 Russia
- Irina N Filippova
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Kurchatov sq. 2, Moscow, 123182 Russia
- Aydar M Aliev
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Kurchatov sq. 2, Moscow, 123182 Russia
- Tat'yana V Tupitsina
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Kurchatov sq. 2, Moscow, 123182 Russia
- Petr A Slominsky
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Kurchatov sq. 2, Moscow, 123182 Russia
- Svetlana A Limborska
  - Department of Molecular Bases of Human Genetics, Institute of Molecular Genetics of Russian Academy of Sciences, Kurchatov sq. 2, Moscow, 123182 Russia
238
Bickhart DM, Xu L, Hutchison JL, Cole JB, Null DJ, Schroeder SG, Song J, Garcia JF, Sonstegard TS, Van Tassell CP, Schnabel RD, Taylor JF, Lewin HA, Liu GE. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res 2016; 23:253-62. [PMID: 27085184 PMCID: PMC4909312 DOI: 10.1093/dnares/dsw013] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.8] [Received: 11/05/2015] [Accepted: 02/29/2016] [Indexed: 11/14/2022]
Abstract
The diversity and population genetics of copy number variation (CNV) in domesticated animals are not well understood. In this study, we analysed 75 genomes of major taurine and indicine cattle breeds (including Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, and Romagnola), sequenced to 11-fold coverage to identify 1,853 non-redundant CNV regions. Supported by high validation rates in array comparative genomic hybridization (CGH) and qPCR experiments, these CNV regions accounted for 3.1% (87.5 Mb) of the cattle reference genome, representing a significant increase over previous estimates of the area of the genome that is copy number variable (∼2%). Further population genetics and evolutionary genomics analyses based on these CNVs revealed the population structures of the cattle taurine and indicine breeds and uncovered potential diversely selected CNVs near important functional genes, including AOX1, ASZ1, GAT, GLYAT, and KRTAP9-1. Additionally, 121 CNV gene regions were found to be either breed specific or differentially variable across breeds, such as RICTOR in dairy breeds and PNPLA3 in beef breeds. In contrast, clusters of the PRP and PAG genes were found to be duplicated in all sequenced animals, suggesting that subfunctionalization, neofunctionalization, or overdominance play roles in diversifying those fertility-related genes. These CNV results provide a new glimpse into the diverse selection histories of cattle breeds and a basis for correlating structural variation with complex traits in the future.
Affiliation(s)
- Derek M Bickhart
  - USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705, USA
- Lingyang Xu
  - USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705, USA
  - Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA
- Jana L Hutchison
  - USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705, USA
- John B Cole
  - USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705, USA
- Daniel J Null
  - USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705, USA
- Steven G Schroeder
  - USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705, USA
- Jiuzhou Song
  - Department of Animal and Avian Sciences, University of Maryland, College Park, MD 20742, USA
- Tad S Sonstegard
  - USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705, USA
- Robert D Schnabel
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
  - Informatics Institute, University of Missouri, Columbia, MO, USA
- Jeremy F Taylor
  - Division of Animal Sciences, University of Missouri, Columbia, MO 65211, USA
- Harris A Lewin
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- George E Liu
  - USDA-ARS, Animal Genomics and Improvement Laboratory, Beltsville, MD 20705, USA
239
Quilez J, Guilmatre A, Garg P, Highnam G, Gymrek M, Erlich Y, Joshi RS, Mittelman D, Sharp AJ. Polymorphic tandem repeats within gene promoters act as modifiers of gene expression and DNA methylation in humans. Nucleic Acids Res 2016; 44:3750-62. [PMID: 27060133 PMCID: PMC4857002 DOI: 10.1093/nar/gkw219] [Citation(s) in RCA: 92] [Impact Index Per Article: 11.5] [Received: 11/16/2015] [Accepted: 03/22/2016] [Indexed: 01/23/2023]
Abstract
Despite representing an important source of genetic variation, tandem repeats (TRs) remain poorly studied due to technical difficulties. We hypothesized that TRs can operate as expression (eQTLs) and methylation (mQTLs) quantitative trait loci. To test this we analyzed the effect of variation at 4849 promoter-associated TRs, genotyped in 120 individuals, on neighboring gene expression and DNA methylation. Polymorphic promoter TRs were associated with increased variance in local gene expression and DNA methylation, suggesting functional consequences related to TR variation. We identified >100 TRs associated with expression/methylation levels of adjacent genes. These potential eQTL/mQTL TRs were enriched for overlaps with transcription factor binding and DNaseI hypersensitivity sites, providing a rationale for their effects. Moreover, we showed that most TR variants are poorly tagged by nearby single nucleotide polymorphisms (SNPs) markers, indicating that many functional TR variants are not effectively assayed by SNP-based approaches. Our study assigns biological significance to TR variations in the human genome, and suggests that a significant fraction of TR variations exert functional effects via alterations of local gene expression or epigenetics. We conclude that targeted studies that focus on genotyping TR variants are required to fully ascertain functional variation in the genome.
Affiliation(s)
- Javier Quilez
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Audrey Guilmatre
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Paras Garg
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Gareth Highnam
  - Virginia Bioinformatics Institute and Department of Biological Sciences, Virginia Tech, Blacksburg, VA 24061, USA
- Melissa Gymrek
  - Harvard-MIT Division of Health Sciences and Technology, MIT, Cambridge, MA 02139, USA
  - Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - New York Genome Center, New York, NY 10038, USA
- Yaniv Erlich
  - Harvard-MIT Division of Health Sciences and Technology, MIT, Cambridge, MA 02139, USA
  - Department of Computer Science, Fu Foundation School of Engineering, Columbia University, New York, NY 10027, USA
- Ricky S Joshi
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- David Mittelman
  - Virginia Bioinformatics Institute and Department of Biological Sciences, Virginia Tech, Blacksburg, VA 24061, USA
- Andrew J Sharp
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
240
Daly AF, Yuan B, Fina F, Caberg JH, Trivellin G, Rostomyan L, de Herder WW, Naves LA, Metzger D, Cuny T, Rabl W, Shah N, Jaffrain-Rea ML, Zatelli MC, Faucz FR, Castermans E, Nanni-Metellus I, Lodish M, Muhammad A, Palmeira L, Potorac I, Mantovani G, Neggers SJ, Klein M, Barlier A, Liu P, Ouafik L, Bours V, Lupski JR, Stratakis CA, Beckers A. Somatic mosaicism underlies X-linked acrogigantism syndrome in sporadic male subjects. Endocr Relat Cancer 2016; 23:221-33. [PMID: 26935837 PMCID: PMC4877443 DOI: 10.1530/erc-16-0082] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 03/01/2016] [Accepted: 03/01/2016] [Indexed: 12/15/2022]
Abstract
Somatic mosaicism has been implicated as a causative mechanism in a number of genetic and genomic disorders. X-linked acrogigantism (XLAG) syndrome is a recently characterized genomic form of pediatric gigantism due to aggressive pituitary tumors that is caused by submicroscopic chromosome Xq26.3 duplications that include GPR101. We studied XLAG syndrome patients (n = 18) to determine if somatic mosaicism contributed to the genomic pathophysiology. Eighteen subjects with XLAG syndrome caused by Xq26.3 duplications were identified using high-definition array comparative genomic hybridization (HD-aCGH). We noted that males with XLAG had a decreased log2 ratio (LR) compared with expected values, suggesting potential mosaicism, whereas females showed no such decrease. Compared with familial male XLAG cases, sporadic males had more marked evidence for mosaicism, with levels of Xq26.3 duplication between 16.1 and 53.8%. These characteristics were replicated using a novel, personalized breakpoint junction-specific quantification droplet digital polymerase chain reaction (ddPCR) technique. Using a separate ddPCR technique, we studied the feasibility of identifying XLAG syndrome cases in a distinct patient population of 64 unrelated subjects with acromegaly/gigantism, and identified one female gigantism patient who had an increased copy number variation (CNV) threshold for GPR101 and who was subsequently diagnosed as having XLAG syndrome on HD-aCGH. Employing a combination of HD-aCGH and novel ddPCR approaches, we have demonstrated, for the first time, that XLAG syndrome can be caused by variable degrees of somatic mosaicism for duplications at chromosome Xq26.3. Somatic mosaicism was shown to occur in sporadic males but not in females with XLAG syndrome, although the clinical characteristics of the disease were similarly severe in both sexes.
Affiliation(s)
- Adrian F Daly
  - Department of Endocrinology, Centre Hospitalier Universitaire de Liege, University of Liege, Liege, Belgium
- Bo Yuan
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
- Frederic Fina
  - Assistance Publique Hôpitaux de Marseille (AP-HM), Hôpital Nord, Service de Transfert d'Oncologie Biologique, Marseille, France
  - Laboratoire de Biologie Médicale, and Aix-Marseille Université, Inserm, CRO2 UMR_S 911, Marseille, France
- Jean-Hubert Caberg
  - Department of Human Genetics, Centre Hospitalier Universitaire de Liege, University of Liege, Liege, Belgium
- Giampaolo Trivellin
  - Section on Endocrinology and Genetics, Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), National Institutes of Health (NIH), Bethesda, Maryland, USA
- Liliya Rostomyan
  - Department of Endocrinology, Centre Hospitalier Universitaire de Liege, University of Liege, Liege, Belgium
- Wouter W de Herder
  - Section of Endocrinology, Department of Medicine, Erasmus University Medical Center Rotterdam and Pituitary Center Rotterdam, Rotterdam, The Netherlands
- Luciana A Naves
  - Department of Endocrinology, University of Brasilia, Brasilia, Brazil
- Daniel Metzger
  - Endocrinology and Diabetes Unit, BC Children's Hospital, Vancouver, British Columbia, Canada
- Thomas Cuny
  - Department of Endocrinology, University Hospital, Nancy, France
- Wolfgang Rabl
  - Kinderklinik, Technische Universität München, Munich, Germany
- Nalini Shah
  - Department of Endocrinology, KEM Hospital, Mumbai, India
- Marie-Lise Jaffrain-Rea
  - Department of Biotechnological and Applied Clinical Sciences, University of L'Aquila, L'Aquila and Neuromed Institute, IRCCS, Pozzilli, Italy
- Maria Chiara Zatelli
  - Section of Endocrinology, Department of Medical Sciences, University of Ferrara, Ferrara, Italy
- Fabio R Faucz
  - Section on Endocrinology and Genetics, Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), National Institutes of Health (NIH), Bethesda, Maryland, USA
- Emilie Castermans
  - Department of Human Genetics, Centre Hospitalier Universitaire de Liege, University of Liege, Liege, Belgium
- Isabelle Nanni-Metellus
  - Assistance Publique Hôpitaux de Marseille (AP-HM), Hôpital Nord, Service de Transfert d'Oncologie Biologique, Marseille, France
- Maya Lodish
  - Section on Endocrinology and Genetics, Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), National Institutes of Health (NIH), Bethesda, Maryland, USA
- Ammar Muhammad
  - Section of Endocrinology, Department of Medicine, Erasmus University Medical Center Rotterdam and Pituitary Center Rotterdam, Rotterdam, The Netherlands
- Leonor Palmeira
  - Department of Endocrinology, Centre Hospitalier Universitaire de Liege, University of Liege, Liege, Belgium
- Iulia Potorac
  - Department of Endocrinology, Centre Hospitalier Universitaire de Liege, University of Liege, Liege, Belgium
  - Department of Human Genetics, Centre Hospitalier Universitaire de Liege, University of Liege, Liege, Belgium
- Giovanna Mantovani
  - Endocrinology and Diabetology Unit, Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Department of Clinical Sciences and Community Health, University of Milan, Milan, Italy
- Sebastian J Neggers
  - Section of Endocrinology, Department of Medicine, Erasmus University Medical Center Rotterdam and Pituitary Center Rotterdam, Rotterdam, The Netherlands
- Marc Klein
  - Department of Endocrinology, University Hospital, Nancy, France
- Anne Barlier
  - Laboratory of Molecular Biology, APHM, Hopital la Conception, Aix Marseille Universite, Marseilles, France
  - CRNSCRN2M-UMR 7286, Marseille, France
- Pengfei Liu
  - Assistance Publique Hôpitaux de Marseille (AP-HM), Hôpital Nord, Service de Transfert d'Oncologie Biologique, Marseille, France
- L'Houcine Ouafik
  - Laboratoire de Biologie Médicale, and Aix-Marseille Université, Inserm, CRO2 UMR_S 911, Marseille, France
- Vincent Bours
  - Department of Human Genetics, Centre Hospitalier Universitaire de Liege, University of Liege, Liege, Belgium
- James R Lupski
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas, USA
- Constantine A Stratakis
  - Section on Endocrinology and Genetics, Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), National Institutes of Health (NIH), Bethesda, Maryland, USA
- Albert Beckers
  - Department of Endocrinology, Centre Hospitalier Universitaire de Liege, University of Liege, Liege, Belgium
241
Hollox EJ, Wain LV. Recurrent mutation at the classical haptoglobin structural polymorphism. Nat Genet 2016; 48:347-8. [DOI: 10.1038/ng.3534] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
242
Xu L, Hou Y, Bickhart DM, Zhou Y, Hay EHA, Song J, Sonstegard TS, Van Tassell CP, Liu GE. Population-genetic properties of differentiated copy number variations in cattle. Sci Rep 2016; 6:23161. [PMID: 27005566 PMCID: PMC4804293 DOI: 10.1038/srep23161] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 06/26/2015] [Accepted: 02/25/2016] [Indexed: 01/24/2023] [Open Access]
Abstract
While single nucleotide polymorphism (SNP) is typically the variant of choice for population genetics, copy number variation (CNV), which comprises insertion, deletion and duplication of genomic sequence, is an informative type of genetic variation. CNVs have been shown to be both common in mammals and important for understanding the relationship between genotype and phenotype. However, CNV differentiation, selection and its population genetic properties are not well understood across diverse populations. We performed a population genetics survey based on CNVs derived from the BovineHD SNP array data of eight distinct cattle breeds. We generated high resolution results that show geographical patterns of variations and genome-wide admixture proportions within and among breeds. Similar to the previous SNP-based studies, our CNV-based results displayed a strong correlation of population structure and geographical location. By conducting three pairwise comparisons among European taurine, African taurine, and indicine groups, we further identified 78 unique CNV regions that were highly differentiated, some of which might be due to selection. These CNV regions overlapped with genes involved in traits related to parasite resistance, immunity response, body size, fertility, and milk production. Our results characterize CNV diversity among cattle populations and provide a list of lineage-differentiated CNVs.
Affiliation(s)
- Lingyang Xu
  - Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
  - Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland 20742, USA
- Yali Hou
  - Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
- Derek M Bickhart
  - Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Yang Zhou
  - Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
  - College of Animal Science and Technology, Northwest A&F University, Shaanxi Key Laboratory of Agricultural Molecular Biology, Yangling, Shaanxi, 712100, China
- El Hamidi Abdel Hay
  - Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Jiuzhou Song
  - Department of Animal and Avian Sciences, University of Maryland, College Park, Maryland 20742, USA
- Tad S Sonstegard
  - Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- Curtis P Van Tassell
  - Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA
- George E Liu
  - Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, Maryland 20705, USA

243
Firtina C, Alkan C. On genomic repeats and reproducibility. Bioinformatics 2016; 32:2243-7. [PMID: 27153582 DOI: 10.1093/bioinformatics/btw139] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 10/26/2015] [Accepted: 03/07/2016] [Indexed: 12/30/2022]
Abstract
RESULTS: Here, we present a comprehensive analysis on the reproducibility of computational characterization of genomic variants using high throughput sequencing data. We reanalyzed the same datasets twice, using the same tools with the same parameters, where we only altered the order of reads in the input (i.e. FASTQ file). Reshuffling caused the reads from repetitive regions to be mapped to different locations in the second alignment, and we observed similar results when we only applied a scatter/gather approach for read mapping, without prior shuffling. Our results show that some of the most common variation discovery algorithms do not handle the ambiguous read mappings accurately when random locations are selected. In addition, we also observed that even when the exact same alignment is used, the GATK HaplotypeCaller generates slightly different call sets, which we pinpoint to the variant filtration step. We conclude that algorithms at each step of genomic variation discovery and characterization need to treat ambiguous mappings in a deterministic fashion to ensure full replication of results.
AVAILABILITY AND IMPLEMENTATION: Code, scripts and the generated VCF files are available at DOI:10.5281/zenodo.32611.
CONTACT: calkan@cs.bilkent.edu.tr
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Can Firtina
  - Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
- Can Alkan
  - Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey

244
Sekar A, Bialas AR, de Rivera H, Davis A, Hammond TR, Kamitaki N, Tooley K, Presumey J, Baum M, Van Doren V, Genovese G, Rose SA, Handsaker RE, Daly MJ, Carroll MC, Stevens B, McCarroll SA. Schizophrenia risk from complex variation of complement component 4. Nature 2016; 530:177-83. [PMID: 26814963 PMCID: PMC4752392 DOI: 10.1038/nature16549] [Citation(s) in RCA: 1554] [Impact Index Per Article: 194.3] [Received: 05/19/2015] [Accepted: 12/18/2015] [Indexed: 02/07/2023]
Abstract
Schizophrenia is a heritable brain illness with unknown pathogenic mechanisms. Schizophrenia's strongest genetic association at a population level involves variation in the major histocompatibility complex (MHC) locus, but the genes and molecular mechanisms accounting for this have been challenging to identify. Here we show that this association arises in part from many structurally diverse alleles of the complement component 4 (C4) genes. We found that these alleles generated widely varying levels of C4A and C4B expression in the brain, with each common C4 allele associating with schizophrenia in proportion to its tendency to generate greater expression of C4A. Human C4 protein localized to neuronal synapses, dendrites, axons, and cell bodies. In mice, C4 mediated synapse elimination during postnatal development. These results implicate excessive complement activity in the development of schizophrenia and may help explain the reduced numbers of synapses in the brains of individuals with schizophrenia.
Affiliation(s)
- Aswin Sekar
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - MD-PhD Program, Harvard Medical School, Boston, Massachusetts 02115, USA
- Allison R Bialas
  - Department of Neurology, F.M. Kirby Neurobiology Center, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Program in Cellular and Molecular Medicine, Boston Children's Hospital, Boston, Massachusetts 02115, USA
- Heather de Rivera
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Avery Davis
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Timothy R Hammond
  - Department of Neurology, F.M. Kirby Neurobiology Center, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
- Nolan Kamitaki
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Katherine Tooley
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Jessy Presumey
  - Program in Cellular and Molecular Medicine, Boston Children's Hospital, Boston, Massachusetts 02115, USA
- Matthew Baum
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - MD-PhD Program, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Department of Neurology, F.M. Kirby Neurobiology Center, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
- Vanessa Van Doren
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
- Giulio Genovese
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Samuel A Rose
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Robert E Handsaker
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Mark J Daly
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Analytical and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Michael C Carroll
  - Program in Cellular and Molecular Medicine, Boston Children's Hospital, Boston, Massachusetts 02115, USA
- Beth Stevens
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
  - Department of Neurology, F.M. Kirby Neurobiology Center, Boston Children's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
- Steven A McCarroll
  - Department of Genetics, Harvard Medical School, Boston, Massachusetts 02115, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA

245
Zhang Y, Tang ET, Du Z. Detection of MET Gene Copy Number in Cancer Samples Using the Droplet Digital PCR Method. PLoS One 2016; 11:e0146784. [PMID: 26765781 PMCID: PMC4713204 DOI: 10.1371/journal.pone.0146784] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 09/18/2015] [Accepted: 12/22/2015] [Indexed: 11/18/2022] [Open Access]
Abstract
PURPOSE: The analysis of MET gene copy number (CN) has been considered to be a potential biomarker to predict the response to MET-targeted therapies in various cancers. However, the current standard methods to determine MET CN are SNP 6.0 in the genomic DNA of cancer cell lines and fluorescence in situ hybridization (FISH) in tumor models, respectively, which are costly and require advanced technical skills and result in relatively subjective judgments. Therefore, we employed a novel method, droplet digital PCR (ddPCR), to determine the MET gene copy number with high accuracy and precision.
METHODS: The genomic DNA of cancer cell lines or tumor models was tested and compared with the MET gene CN and MET/CEN-7 ratio determined by SNP 6.0 and FISH, respectively.
RESULTS: In cell lines, the linear association of the MET CN detected by ddPCR and SNP 6.0 is strong (Pearson correlation = 0.867). In tumor models, the MET CN detected by ddPCR was significantly different between the MET gene amplification and non-amplification groups according to FISH (mean: 15.4 vs 2.1; P = 0.044). Given that MET gene amplification is defined as MET CN >5.5 by ddPCR, the concordance rate between ddPCR and FISH was 98.0%, and Cohen's kappa coefficient was 0.760 (95% CI, 0.498-1.000; P <0.001).
CONCLUSIONS: The results demonstrated that the ddPCR method has the potential to quantify the MET gene copy number with high precision and accuracy as compared with the results from SNP 6.0 and FISH in cancer cell lines and tumor samples, respectively.
Affiliation(s)
- Yanni Zhang
  - Amgen Biopharmaceutical Research & Development (Shanghai) Co., Ltd, Shanghai, China
- En-Tzu Tang
  - Amgen Biopharmaceutical Research & Development (Shanghai) Co., Ltd, Shanghai, China
- Zhiqiang Du
  - Amgen Biopharmaceutical Research & Development (Shanghai) Co., Ltd, Shanghai, China

246
Kronenberg ZN, Osborne EJ, Cone KR, Kennedy BJ, Domyan ET, Shapiro MD, Elde NC, Yandell M. Wham: Identifying Structural Variants of Biological Consequence. PLoS Comput Biol 2015; 11:e1004572. [PMID: 26625158 PMCID: PMC4666669 DOI: 10.1371/journal.pcbi.1004572] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 05/11/2015] [Accepted: 09/30/2015] [Indexed: 11/22/2022] [Open Access]
Abstract
Existing methods for identifying structural variants (SVs) from short read datasets are inaccurate. This complicates disease-gene identification and efforts to understand the consequences of genetic variation. In response, we have created Wham (Whole-genome Alignment Metrics) to provide a single, integrated framework for both structural variant calling and association testing, thereby bypassing many of the difficulties that currently frustrate attempts to employ SVs in association testing. Here we describe Wham, benchmark it against three other widely used SV identification tools (Lumpy, Delly and SoftSearch), and demonstrate Wham’s ability to identify and associate SVs with phenotypes using data from humans, domestic pigeons, and vaccinia virus. Wham and all associated software are covered under the MIT License and can be freely downloaded from GitHub (https://github.com/zeeev/wham), with documentation on a wiki (http://zeeev.github.io/wham/). For community support please post questions to https://www.biostars.org/.
Affiliation(s)
- Zev N. Kronenberg
  - Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
- Edward J. Osborne
  - Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
  - Utah Center for Genetic Discovery, University of Utah, Salt Lake City, Utah, United States of America
- Kelsey R. Cone
  - Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
- Brett J. Kennedy
  - Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
  - Utah Center for Genetic Discovery, University of Utah, Salt Lake City, Utah, United States of America
- Eric T. Domyan
  - Department of Biology, University of Utah, Salt Lake City, Utah, United States of America
- Michael D. Shapiro
  - Department of Biology, University of Utah, Salt Lake City, Utah, United States of America
- Nels C. Elde
  - Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
- Mark Yandell
  - Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah, United States of America
  - Utah Center for Genetic Discovery, University of Utah, Salt Lake City, Utah, United States of America

247
Stadler J, Eder J, Pratscher B, Brandt S, Schneller D, Müllegger R, Vogl C, Trautinger F, Brem G, Burgstaller JP. SNPase-ARMS qPCR: Ultrasensitive Mutation-Based Detection of Cell-Free Tumor DNA in Melanoma Patients. PLoS One 2015; 10:e0142273. [PMID: 26562020 PMCID: PMC4642939 DOI: 10.1371/journal.pone.0142273] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 07/15/2015] [Accepted: 10/20/2015] [Indexed: 12/18/2022] [Open Access]
Abstract
Cell-free circulating tumor DNA in the plasma of cancer patients has become a common point of interest as indicator of therapy options and treatment response in clinical cancer research. Especially patient- and tumor-specific single nucleotide variants that accurately distinguish tumor DNA from wild type DNA are promising targets. The reliable detection and quantification of these single-base DNA variants is technically challenging. Currently, a variety of techniques is applied, with no apparent “gold standard”. Here we present a novel qPCR protocol that meets the conditions of extreme sensitivity and specificity that are required for detection and quantification of tumor DNA. By consecutive application of two polymerases, one of them designed for extreme base-specificity, the method reaches unprecedented sensitivity and specificity. Three qPCR assays were tested with spike-in experiments, specific for point mutations BRAF V600E, PTEN T167A and NRAS Q61L of melanoma cell lines. It was possible to detect down to one copy of tumor DNA per reaction (Poisson distribution), at a background of up to 200 000 wild type DNAs. To prove its clinical applicability, the method was successfully tested on a small cohort of BRAF V600E positive melanoma patients.
Affiliation(s)
- Julia Stadler
  - Biotechnology in Animal Production, Department for Agrobiotechnology, IFA Tulln, Tulln, Lower Austria, Austria
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Vienna, Austria
- Johanna Eder
  - Department of Dermatology and Venereology, Karl Landsteiner University of Health Sciences, St. Poelten, Lower Austria, Austria
  - Karl Landsteiner Institute of Dermatological Research, St. Poelten, Lower Austria, Austria
- Barbara Pratscher
  - Research Group Oncology of the Equine Clinic, Department for Companion Animal and Horses, University of Veterinary Medicine Vienna, Vienna, Austria
- Sabine Brandt
  - Research Group Oncology of the Equine Clinic, Department for Companion Animal and Horses, University of Veterinary Medicine Vienna, Vienna, Austria
- Doris Schneller
  - Biotechnology in Animal Production, Department for Agrobiotechnology, IFA Tulln, Tulln, Lower Austria, Austria
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Vienna, Austria
- Robert Müllegger
  - Department of Dermatology and Venereology, Landesklinikum Wiener Neustadt, Wiener Neustadt, Lower Austria, Austria
- Claus Vogl
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Vienna, Austria
- Franz Trautinger
  - Department of Dermatology and Venereology, Karl Landsteiner University of Health Sciences, St. Poelten, Lower Austria, Austria
  - Karl Landsteiner Institute of Dermatological Research, St. Poelten, Lower Austria, Austria
- Gottfried Brem
  - Biotechnology in Animal Production, Department for Agrobiotechnology, IFA Tulln, Tulln, Lower Austria, Austria
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Vienna, Austria
- Joerg P. Burgstaller
  - Biotechnology in Animal Production, Department for Agrobiotechnology, IFA Tulln, Tulln, Lower Austria, Austria
  - Institute of Animal Breeding and Genetics, University of Veterinary Medicine Vienna, Vienna, Austria

248
Forni D, Martin D, Abujaber R, Sharp AJ, Sironi M, Hollox EJ. Determining multiallelic complex copy number and sequence variation from high coverage exome sequencing data. BMC Genomics 2015; 16:891. [PMID: 26526070 PMCID: PMC4630827 DOI: 10.1186/s12864-015-2123-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/01/2015] [Accepted: 10/22/2015] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND: Copy number variation (CNV) is a major component of genomic variation, yet methods to accurately type genomic CNV lag behind methods that type single nucleotide variation. High-throughput sequencing can contribute to these methods by using sequence read depth, which takes the number of reads that map to a given part of the reference genome as a proxy for copy number of that region, and compares across samples. Furthermore, high-throughput sequencing also provides information on the sequence differences between copies within and between individuals.
METHODS: In this study we use high-coverage phase 3 exome sequences of the 1000 Genomes project to infer diploid copy number of the beta-defensin genomic region, a well-studied CNV that carries several beta-defensin genes involved in the antimicrobial response, signalling, and fertility. We also use these data to call sequence variants, a particular challenge given the multicopy nature of the region.
RESULTS: We confidently call copy number and sequence variation of the beta-defensin genes on 1285 samples from 26 global populations, and validate copy number using Nanostring nCounter and triplex paralogue ratio test data. We use the copy number calls to verify the genomic extent of the CNV and validate sequence calls using analysis of cloned PCR products. We identify novel variation, mostly individually rare, predicted to alter amino-acid sequence in the beta-defensin genes. Such novel variants may alter antimicrobial properties or have off-target receptor interactions, and may contribute to individuality in immunological response and fertility.
CONCLUSIONS: Given that 81% of identified sequence variants were not previously in dbSNP, we show that sequence variation in multiallelic CNVs represents an unappreciated source of genomic diversity.
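The read-depth idea described in this abstract can be illustrated with a minimal sketch. It assumes per-sample read counts for a target region and for a reference region taken to be diploid; the function name, the counts, and the plain ratio normalization are hypothetical and are not the authors' pipeline.

```python
def estimate_copy_number(target_reads, reference_reads,
                         target_length, reference_length,
                         reference_copies=2):
    """Estimate copy number of a target region from read depth.

    Hypothetical helper: depth in the target region is compared with depth
    in a reference region assumed to be present at `reference_copies`.
    """
    if min(target_length, reference_length) <= 0 or reference_reads <= 0:
        raise ValueError("lengths and reference read count must be positive")
    target_depth = target_reads / target_length          # reads per base
    reference_depth = reference_reads / reference_length  # reads per base
    return reference_copies * target_depth / reference_depth


# Made-up counts for illustration only.
raw = estimate_copy_number(target_reads=3000, reference_reads=2000,
                           target_length=10_000, reference_length=20_000)
copy_number = round(raw)  # nearest integer copy number
```

In practice, read-depth callers usually also correct for biases such as GC content and mappability and, as the abstract notes, compare across samples; this sketch omits those steps.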
Affiliation(s)
- Diego Forni
  - Department of Genetics, University of Leicester, Leicester, UK
  - Bioinformatics, Scientific Institute IRCCS E.MEDEA, Bosisio, Parini, Italy
- Diana Martin
  - Department of Genetics, University of Leicester, Leicester, UK
- Razan Abujaber
  - Department of Genetics, University of Leicester, Leicester, UK
- Andrew J Sharp
  - Department of Genetics and Genome Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Manuela Sironi
  - Bioinformatics, Scientific Institute IRCCS E.MEDEA, Bosisio, Parini, Italy
- Edward J Hollox
  - Department of Genetics, University of Leicester, Leicester, UK

249
Fakhro KA, Yousri NA, Rodriguez-Flores JL, Robay A, Staudt MR, Agosto-Perez F, Salit J, Malek JA, Suhre K, Jayyousi A, Zirie M, Stadler D, Mezey JG, Crystal RG. Copy number variations in the genome of the Qatari population. BMC Genomics 2015; 16:834. [PMID: 26490036 PMCID: PMC4618522 DOI: 10.1186/s12864-015-1991-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 04/16/2015] [Accepted: 10/06/2015] [Indexed: 12/25/2022] [Open Access]
Abstract
Background: The populations of the Arabian Peninsula remain the least represented in public genetic databases, both in terms of single nucleotide variants and of larger genomic mutations. We present the first high-resolution copy number variation (CNV) map for a Gulf Arab population, using a hybrid approach that integrates array genotyping intensity data and next-generation sequencing reads to call CNVs in the Qatari population.
Methods: CNVs were detected in 97 unrelated Qatari individuals by running two calling algorithms on each of two primary datasets: high-resolution genotyping (Illumina Omni 2.5M) and high depth whole-genome sequencing (Illumina PE 100bp). The four call-sets were integrated to identify high confidence CNV regions, which were subsequently annotated for putative functional effect and compared to public databases of CNVs in other populations. The availability of genome sequence was leveraged to identify tagging SNPs in high LD with common deletions in this population, enabling their imputation from genotyping experiments in the future.
Results: Genotyping intensities and genome sequencing data from 97 Qataris were analyzed with four different algorithms and integrated to discover 16,660 high confidence CNV regions (CNVRs) in the total population, affecting ~28 Mb in the median Qatari genome. Up to 40% of all CNVs affected genes, including novel CNVs affecting Mendelian disease genes, segregating at different frequencies in the 3 major Qatari subpopulations, including those with Bedouin, Persian/South Asian, and African ancestry. Consistent with high consanguinity levels in the Bedouin subpopulation, we found an increased burden for homozygous deletions in this group. In comparison to known CNVs in the comprehensive Database of Genomic Variants, we found that 5% of all CNVRs in Qataris were completely novel, with an enrichment of CNVs affecting several known chromosomal disorder loci and genes known to regulate sugar metabolism and type 2 diabetes in the Qatari cohort. Finally, we leveraged the availability of genome sequence to find suitable tagging SNPs for common deletions in this population.
Conclusion: We combine four independently generated datasets from 97 individuals to study CNVs for the first time at high-resolution in a Gulf Arab population.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1991-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Khalid A Fakhro
  - Department of Genetic Medicine, Weill Cornell Medical College in Qatar, Doha, Qatar
  - Division of Translational Medicine, Sidra Medical Research Centre, Doha, Qatar
- Noha A Yousri
  - Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Doha, Qatar
  - Computer and Systems Engineering, Alexandria University, Alexandria, Egypt
- Juan L Rodriguez-Flores
  - Department of Genetic Medicine, Weill Cornell Medical College, 1300 York Avenue, Box 164, New York, NY, 10065, USA
- Amal Robay
  - Department of Genetic Medicine, Weill Cornell Medical College in Qatar, Doha, Qatar
- Michelle R Staudt
  - Department of Genetic Medicine, Weill Cornell Medical College, 1300 York Avenue, Box 164, New York, NY, 10065, USA
- Francisco Agosto-Perez
  - Department of Genetic Medicine, Weill Cornell Medical College, 1300 York Avenue, Box 164, New York, NY, 10065, USA
- Jacqueline Salit
  - Department of Genetic Medicine, Weill Cornell Medical College, 1300 York Avenue, Box 164, New York, NY, 10065, USA
- Joel A Malek
  - Department of Genetic Medicine, Weill Cornell Medical College in Qatar, Doha, Qatar
- Karsten Suhre
  - Department of Physiology and Biophysics, Weill Cornell Medical College in Qatar, Doha, Qatar
- Amin Jayyousi
  - Department of Medicine, Hamad Medical Corporation, Doha, Qatar
- Mahmoud Zirie
  - Department of Medicine, Hamad Medical Corporation, Doha, Qatar
- Dora Stadler
  - Department of Medicine, Weill Cornell Medical College in Qatar, Doha, Qatar
- Jason G Mezey
  - Computer and Systems Engineering, Alexandria University, Alexandria, Egypt
  - Department Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA
- Ronald G Crystal
  - Department of Genetic Medicine, Weill Cornell Medical College in Qatar, Doha, Qatar
  - Department of Genetic Medicine, Weill Cornell Medical College, 1300 York Avenue, Box 164, New York, NY, 10065, USA

250
Reinius B, Sandberg R. Random monoallelic expression of autosomal genes: stochastic transcription and allele-level regulation. Nat Rev Genet 2015; 16:653-64. [PMID: 26442639 DOI: 10.1038/nrg3888] [Citation(s) in RCA: 124] [Impact Index Per Article: 13.8] [Indexed: 12/14/2022]
Abstract
Random monoallelic expression (RME) of genes represents a striking example of how stochastic molecular processes can result in cellular heterogeneity. Recent transcriptome-wide studies have revealed both mitotically stable and cell-to-cell dynamic forms of autosomal RME, with the latter presumably resulting from burst-like stochastic transcription. Here, we discuss the distinguishing features of these two forms of RME and revisit literature on their nature, pervasiveness and regulation. Finally, we explore how RME may contribute to phenotypic variation, including the incomplete penetrance and variable expressivity often seen in genetic disease.
Affiliation(s)
- Björn Reinius
  - Ludwig Institute for Cancer Research, Box 240, and the Department of Cell and Molecular Biology, Karolinska Institutet, 171 77 Stockholm, Sweden
- Rickard Sandberg
  - Ludwig Institute for Cancer Research, Box 240, and the Department of Cell and Molecular Biology, Karolinska Institutet, 171 77 Stockholm, Sweden