2
Sudmant PH, Rausch T, Gardner EJ, Handsaker RE, Abyzov A, Huddleston J, Zhang Y, Ye K, Jun G, Hsi-Yang Fritz M, Konkel MK, Malhotra A, Stütz AM, Shi X, Paolo Casale F, Chen J, Hormozdiari F, Dayama G, Chen K, Malig M, Chaisson MJP, Walter K, Meiers S, Kashin S, Garrison E, Auton A, Lam HYK, Jasmine Mu X, Alkan C, Antaki D, Bae T, Cerveira E, Chines P, Chong Z, Clarke L, Dal E, Ding L, Emery S, Fan X, Gujral M, Kahveci F, Kidd JM, Kong Y, Lameijer EW, McCarthy S, Flicek P, Gibbs RA, Marth G, Mason CE, Menelaou A, Muzny DM, Nelson BJ, Noor A, Parrish NF, Pendleton M, Quitadamo A, Raeder B, Schadt EE, Romanovitch M, Schlattl A, Sebra R, Shabalin AA, Untergasser A, Walker JA, Wang M, Yu F, Zhang C, Zhang J, Zheng-Bradley X, Zhou W, Zichner T, Sebat J, Batzer MA, McCarroll SA, Mills RE, Gerstein MB, Bashir A, Stegle O, Devine SE, Lee C, Eichler EE, Korbel JO. An integrated map of structural variation in 2,504 human genomes. Nature 2015; 526:75-81. [PMID: 26432246 PMCID: PMC4617611 DOI: 10.1038/nature15394] [Citation(s) in RCA: 1364] [Impact Index Per Article: 151.6] [Received: 02/19/2015] [Accepted: 08/20/2015] [Indexed: 12/11/2022]
Abstract
Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.
Affiliation(s)
- Peter H. Sudmant
- Department of Genome Sciences, University of Washington, 3720 15th Avenue NE, Seattle, 98195-5065 Washington USA
| | - Tobias Rausch
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
| | - Eugene J. Gardner
- Institute for Genome Sciences, University of Maryland School of Medicine, 801 W Baltimore Street, Baltimore, 21201 Maryland USA
| | - Robert E. Handsaker
- Department of Genetics, Harvard Medical School, 25 Shattuck Street, Boston, 02115 Massachusetts USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, 02142 Massachusetts USA
| | - Alexej Abyzov
- Department of Health Sciences Research, Center for Individualized Medicine, Mayo Clinic, 200 First Street SW, Rochester, 55905 Minnesota USA
| | - John Huddleston
- Department of Genome Sciences, University of Washington, 3720 15th Avenue NE, Seattle, 98195-5065 Washington USA
- Howard Hughes Medical Institute, University of Washington, Seattle, 98195 Washington USA
| | - Yan Zhang
- Program in Computational Biology and Bioinformatics, Yale University, BASS 432 & 437, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
- Department of Molecular Biophysics and Biochemistry, School of Medicine, Yale University, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
| | - Kai Ye
- The Genome Institute, Washington University School of Medicine, 4444 Forest Park Avenue, St Louis, 63108 Missouri USA
- Department of Genetics, Washington University in St Louis, 4444 Forest Park Avenue, St Louis, 63108 Missouri USA
| | - Goo Jun
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, 1415 Washington Heights, Ann Arbor, 48109 Michigan USA
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, 1200 Pressler St., Houston, 77030 Texas USA
| | - Markus Hsi-Yang Fritz
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
| | - Miriam K. Konkel
- Department of Biological Sciences, Louisiana State University, 202 Life Sciences Building, Baton Rouge, 70803 Louisiana USA
| | - Ankit Malhotra
- The Jackson Laboratory for Genomic Medicine, 10 Discovery 263 Farmington Avenue, Farmington, 06030 Connecticut USA
| | - Adrian M. Stütz
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
| | - Xinghua Shi
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd., Charlotte, 28223 North Carolina USA
| | - Francesco Paolo Casale
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge UK
| | - Jieming Chen
- Program in Computational Biology and Bioinformatics, Yale University, BASS 432 & 437, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
- Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, 06520 Connecticut USA
| | - Fereydoun Hormozdiari
- Department of Genome Sciences, University of Washington, 3720 15th Avenue NE, Seattle, 98195-5065 Washington USA
| | - Gargi Dayama
- Department of Computational Medicine & Bioinformatics, University of Michigan, 500 S. State Street, Ann Arbor, 48109 Michigan USA
| | - Ken Chen
- The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, 77030 Texas USA
| | - Maika Malig
- Department of Genome Sciences, University of Washington, 3720 15th Avenue NE, Seattle, 98195-5065 Washington USA
| | - Mark J. P. Chaisson
- Department of Genome Sciences, University of Washington, 3720 15th Avenue NE, Seattle, 98195-5065 Washington USA
| | - Klaudia Walter
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SA Cambridge UK
| | - Sascha Meiers
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
| | - Seva Kashin
- Department of Genetics, Harvard Medical School, 25 Shattuck Street, Boston, 02115 Massachusetts USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, 02142 Massachusetts USA
| | - Erik Garrison
- Department of Biology, Boston College, 355 Higgins Hall, 140 Commonwealth Avenue, Chestnut Hill, 02467 Massachusetts USA
| | - Adam Auton
- Department of Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, 10461 New York USA
| | - Hugo Y. K. Lam
- Bina Technologies, Roche Sequencing, 555 Twin Dolphin Drive, Redwood City, 94065 California USA
| | - Xinmeng Jasmine Mu
- Program in Computational Biology and Bioinformatics, Yale University, BASS 432 & 437, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
- Cancer Program, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, 02142 Massachusetts USA
| | - Can Alkan
- Department of Computer Engineering, Bilkent University, Ankara, 06800 Turkey
| | - Danny Antaki
- University of California San Diego (UCSD), 9500 Gilman Drive, La Jolla, 92093 California USA
| | - Taejeong Bae
- Department of Health Sciences Research, Center for Individualized Medicine, Mayo Clinic, 200 First Street SW, Rochester, 55905 Minnesota USA
| | - Eliza Cerveira
- The Jackson Laboratory for Genomic Medicine, 10 Discovery 263 Farmington Avenue, Farmington, 06030 Connecticut USA
| | - Peter Chines
- National Human Genome Research Institute, National Institutes of Health, Bethesda, 20892 Maryland USA
| | - Zechen Chong
- The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, 77030 Texas USA
| | - Laura Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge UK
| | - Elif Dal
- Department of Computer Engineering, Bilkent University, Ankara, 06800 Turkey
| | - Li Ding
- The Genome Institute, Washington University School of Medicine, 4444 Forest Park Avenue, St Louis, 63108 Missouri USA
- Department of Genetics, Washington University in St Louis, 4444 Forest Park Avenue, St Louis, 63108 Missouri USA
- Department of Medicine, Washington University in St Louis, 4444 Forest Park Avenue, St Louis, 63108 Missouri USA
- Siteman Cancer Center, 660 South Euclid Avenue, St Louis, 63110 Missouri USA
| | - Sarah Emery
- Department of Human Genetics, University of Michigan, 1241 Catherine Street, Ann Arbor, 48109 Michigan USA
| | - Xian Fan
- The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, 77030 Texas USA
| | - Madhusudan Gujral
- University of California San Diego (UCSD), 9500 Gilman Drive, La Jolla, 92093 California USA
| | - Fatma Kahveci
- Department of Computer Engineering, Bilkent University, Ankara, 06800 Turkey
| | - Jeffrey M. Kidd
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, 1415 Washington Heights, Ann Arbor, 48109 Michigan USA
- Department of Human Genetics, University of Michigan, 1241 Catherine Street, Ann Arbor, 48109 Michigan USA
| | - Yu Kong
- Department of Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, 10461 New York USA
| | - Eric-Wubbo Lameijer
- Molecular Epidemiology, Leiden University Medical Center, Leiden, 2300RA The Netherlands
| | - Shane McCarthy
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SA Cambridge UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge UK
| | - Richard A. Gibbs
- Baylor College of Medicine, 1 Baylor Plaza, Houston, 77030 Texas USA
| | - Gabor Marth
- Department of Biology, Boston College, 355 Higgins Hall, 140 Commonwealth Avenue, Chestnut Hill, 02467 Massachusetts USA
| | - Christopher E. Mason
- The Department of Physiology and Biophysics and the HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, 1305 York Avenue, Weill Cornell Medical College, New York, 10065 New York USA
- The Feil Family Brain and Mind Research Institute, 413 East 69th St, Weill Cornell Medical College, New York, 10065 New York USA
| | - Androniki Menelaou
- University of Oxford, 1 South Parks Road, Oxford, OX3 9DS UK
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, 3584 CG The Netherlands
| | - Donna M. Muzny
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York School of Natural Sciences, 1428 Madison Avenue, New York, 10029 New York USA
| | - Bradley J. Nelson
- Department of Genome Sciences, University of Washington, 3720 15th Avenue NE, Seattle, 98195-5065 Washington USA
| | - Amina Noor
- University of California San Diego (UCSD), 9500 Gilman Drive, La Jolla, 92093 California USA
| | - Nicholas F. Parrish
- Institute for Virus Research, Kyoto University, 53 Shogoin Kawahara-cho, Sakyo-ku, 606-8507 Kyoto Japan
| | - Matthew Pendleton
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York School of Natural Sciences, 1428 Madison Avenue, New York, 10029 New York USA
| | - Andrew Quitadamo
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd., Charlotte, 28223 North Carolina USA
| | - Benjamin Raeder
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
| | - Eric E. Schadt
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York School of Natural Sciences, 1428 Madison Avenue, New York, 10029 New York USA
| | - Mallory Romanovitch
- The Jackson Laboratory for Genomic Medicine, 10 Discovery 263 Farmington Avenue, Farmington, 06030 Connecticut USA
| | - Andreas Schlattl
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
| | - Robert Sebra
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York School of Natural Sciences, 1428 Madison Avenue, New York, 10029 New York USA
| | - Andrey A. Shabalin
- Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, 1112 East Clay Street, McGuire Hall, Richmond, 23298-0581 Virginia USA
| | - Andreas Untergasser
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
- Zentrum für Molekulare Biologie, University of Heidelberg, Im Neuenheimer Feld 282, Heidelberg, 69120 Germany
| | - Jerilyn A. Walker
- Department of Biological Sciences, Louisiana State University, 202 Life Sciences Building, Baton Rouge, 70803 Louisiana USA
| | - Min Wang
- Baylor College of Medicine, 1 Baylor Plaza, Houston, 77030 Texas USA
| | - Fuli Yu
- Baylor College of Medicine, 1 Baylor Plaza, Houston, 77030 Texas USA
| | - Chengsheng Zhang
- The Jackson Laboratory for Genomic Medicine, 10 Discovery 263 Farmington Avenue, Farmington, 06030 Connecticut USA
| | - Jing Zhang
- Program in Computational Biology and Bioinformatics, Yale University, BASS 432 & 437, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
- Department of Molecular Biophysics and Biochemistry, School of Medicine, Yale University, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
| | - Xiangqun Zheng-Bradley
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge UK
| | - Wanding Zhou
- The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, 77030 Texas USA
| | - Thomas Zichner
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
| | - Jonathan Sebat
- University of California San Diego (UCSD), 9500 Gilman Drive, La Jolla, 92093 California USA
| | - Mark A. Batzer
- Department of Biological Sciences, Louisiana State University, 202 Life Sciences Building, Baton Rouge, 70803 Louisiana USA
| | - Steven A. McCarroll
- Department of Genetics, Harvard Medical School, 25 Shattuck Street, Boston, 02115 Massachusetts USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, 02142 Massachusetts USA
| | - The 1000 Genomes Project Consortium
- Department of Genome Sciences, University of Washington, 3720 15th Avenue NE, Seattle, 98195-5065 Washington USA
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
- Institute for Genome Sciences, University of Maryland School of Medicine, 801 W Baltimore Street, Baltimore, 21201 Maryland USA
- Department of Genetics, Harvard Medical School, 25 Shattuck Street, Boston, 02115 Massachusetts USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, 02142 Massachusetts USA
- Department of Health Sciences Research, Center for Individualized Medicine, Mayo Clinic, 200 First Street SW, Rochester, 55905 Minnesota USA
- Howard Hughes Medical Institute, University of Washington, Seattle, 98195 Washington USA
- Program in Computational Biology and Bioinformatics, Yale University, BASS 432 & 437, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
- Department of Molecular Biophysics and Biochemistry, School of Medicine, Yale University, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
- The Genome Institute, Washington University School of Medicine, 4444 Forest Park Avenue, St Louis, 63108 Missouri USA
- Department of Genetics, Washington University in St Louis, 4444 Forest Park Avenue, St Louis, 63108 Missouri USA
- Department of Biostatistics and Center for Statistical Genetics, University of Michigan, 1415 Washington Heights, Ann Arbor, 48109 Michigan USA
- Human Genetics Center, School of Public Health, The University of Texas Health Science Center at Houston, 1200 Pressler St., Houston, 77030 Texas USA
- Department of Biological Sciences, Louisiana State University, 202 Life Sciences Building, Baton Rouge, 70803 Louisiana USA
- The Jackson Laboratory for Genomic Medicine, 10 Discovery 263 Farmington Avenue, Farmington, 06030 Connecticut USA
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, 9201 University City Blvd., Charlotte, 28223 North Carolina USA
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge UK
- Integrated Graduate Program in Physical and Engineering Biology, Yale University, New Haven, 06520 Connecticut USA
- Department of Computational Medicine & Bioinformatics, University of Michigan, 500 S. State Street, Ann Arbor, 48109 Michigan USA
- The University of Texas MD Anderson Cancer Center, 1515 Holcombe Boulevard, Houston, 77030 Texas USA
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SA Cambridge UK
- Department of Biology, Boston College, 355 Higgins Hall, 140 Commonwealth Avenue, Chestnut Hill, 02467 Massachusetts USA
- Department of Genetics, Albert Einstein College of Medicine, 1301 Morris Park Avenue, Bronx, 10461 New York USA
- Bina Technologies, Roche Sequencing, 555 Twin Dolphin Drive, Redwood City, 94065 California USA
- Cancer Program, Broad Institute of MIT and Harvard, 415 Main Street, Cambridge, 02142 Massachusetts USA
- Department of Computer Engineering, Bilkent University, Ankara, 06800 Turkey
- University of California San Diego (UCSD), 9500 Gilman Drive, La Jolla, 92093 California USA
- National Human Genome Research Institute, National Institutes of Health, Bethesda, 20892 Maryland USA
- Department of Medicine, Washington University in St Louis, 4444 Forest Park Avenue, St Louis, 63108 Missouri USA
- Siteman Cancer Center, 660 South Euclid Avenue, St Louis, 63110 Missouri USA
- Department of Human Genetics, University of Michigan, 1241 Catherine Street, Ann Arbor, 48109 Michigan USA
- Molecular Epidemiology, Leiden University Medical Center, Leiden, 2300RA The Netherlands
- Baylor College of Medicine, 1 Baylor Plaza, Houston, 77030 Texas USA
- The Department of Physiology and Biophysics and the HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, 1305 York Avenue, Weill Cornell Medical College, New York, 10065 New York USA
- The Feil Family Brain and Mind Research Institute, 413 East 69th St, Weill Cornell Medical College, New York, 10065 New York USA
- University of Oxford, 1 South Parks Road, Oxford, OX3 9DS UK
- Department of Medical Genetics, Center for Molecular Medicine, University Medical Center Utrecht, Utrecht, 3584 CG The Netherlands
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York School of Natural Sciences, 1428 Madison Avenue, New York, 10029 New York USA
- Institute for Virus Research, Kyoto University, 53 Shogoin Kawahara-cho, Sakyo-ku, 606-8507 Kyoto Japan
- Center for Biomarker Research and Precision Medicine, Virginia Commonwealth University, 1112 East Clay Street, McGuire Hall, Richmond, 23298-0581 Virginia USA
- Zentrum für Molekulare Biologie, University of Heidelberg, Im Neuenheimer Feld 282, Heidelberg, 69120 Germany
- Department of Computer Science, Yale University, 51 Prospect Street, New Haven, 06511 Connecticut USA
- Department of Graduate Studies – Life Sciences, Ewha Womans University, Ewhayeodae-gil, Seodaemun-gu, 120-750 Seoul South Korea
| | - Ryan E. Mills
- Department of Computational Medicine & Bioinformatics, University of Michigan, 500 S. State Street, Ann Arbor, 48109 Michigan USA
- Department of Human Genetics, University of Michigan, 1241 Catherine Street, Ann Arbor, 48109 Michigan USA
| | - Mark B. Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, BASS 432 & 437, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
- Department of Molecular Biophysics and Biochemistry, School of Medicine, Yale University, 266 Whitney Avenue, New Haven, 06520 Connecticut USA
- Department of Computer Science, Yale University, 51 Prospect Street, New Haven, 06511 Connecticut USA
| | - Ali Bashir
- Department of Genetics and Genomic Sciences, Icahn School of Medicine, New York School of Natural Sciences, 1428 Madison Avenue, New York, 10029 New York USA
| | - Oliver Stegle
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge UK
| | - Scott E. Devine
- Institute for Genome Sciences, University of Maryland School of Medicine, 801 W Baltimore Street, Baltimore, 21201 Maryland USA
| | - Charles Lee
- The Jackson Laboratory for Genomic Medicine, 10 Discovery 263 Farmington Avenue, Farmington, 06030 Connecticut USA
- Department of Graduate Studies – Life Sciences, Ewha Womans University, Ewhayeodae-gil, Seodaemun-gu, 120-750 Seoul South Korea
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington, 3720 15th Avenue NE, Seattle, 98195-5065 Washington USA
- Howard Hughes Medical Institute, University of Washington, Seattle, 98195 Washington USA
| | - Jan O. Korbel
- European Molecular Biology Laboratory (EMBL), Genome Biology Unit, Meyerhofstrasse 1, Heidelberg, 69117 Germany
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, CB10 1SD Cambridge UK
3
Pendleton M, Sebra R, Pang AWC, Ummat A, Franzen O, Rausch T, Stütz AM, Stedman W, Anantharaman T, Hastie A, Dai H, Fritz MHY, Cao H, Cohain A, Deikus G, Durrett RE, Blanchard SC, Altman R, Chin CS, Guo Y, Paxinos EE, Korbel JO, Darnell RB, McCombie WR, Kwok PY, Mason CE, Schadt EE, Bashir A. Assembly and diploid architecture of an individual human genome via single-molecule technologies. Nat Methods 2015; 12:780-6. [PMID: 26121404 PMCID: PMC4646949 DOI: 10.1038/nmeth.3454] [Citation(s) in RCA: 330] [Impact Index Per Article: 36.7] [Received: 12/05/2014] [Accepted: 05/28/2015] [Indexed: 12/30/2022]
Abstract
We present the first comprehensive analysis of a diploid human genome that combines single-molecule sequencing with single-molecule genome maps. Our hybrid assembly markedly improves upon the contiguity observed from traditional shotgun sequencing approaches, with scaffold N50 values approaching 30 Mb, and we identified complex structural variants (SVs) missed by other high-throughput approaches. Furthermore, by combining Illumina short-read data with long reads, we phased both single-nucleotide variants and SVs, generating haplotypes with over 99% consistency with previous trio-based studies. Our work shows that it is now possible to integrate single-molecule and high-throughput sequence data to generate de novo assembled genomes that approach reference quality.
Affiliation(s)
- Matthew Pendleton
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Robert Sebra
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Ajay Ummat
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Oscar Franzen
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Tobias Rausch
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Adrian M Stütz
- Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Alex Hastie
- BioNano Genomics, San Diego, California, USA
| | - Heng Dai
- BioNano Genomics, San Diego, California, USA
| | - Han Cao
- BioNano Genomics, San Diego, California, USA
| | - Ariella Cohain
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Gintaras Deikus
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Russell E Durrett
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, New York, USA
| | - Scott C Blanchard
- Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York, USA
| | - Roger Altman
- The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, New York, USA
| | - Yan Guo
- Pacific Biosciences, Menlo Park, California, USA
| | - Jan O Korbel
- [1] Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany. [2] European Bioinformatics Institute, European Molecular Biology Laboratory, Hinxton, UK
| | - Robert B Darnell
- [1] Laboratory of Neuro-Oncology, The Rockefeller University, New York, New York, USA. [2] Howard Hughes Medical Institute, New York, New York, USA
| | - W Richard McCombie
- [1] The Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA. [2] The Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, USA
| | - Pui-Yan Kwok
- Institute for Human Genetics, University of California-San Francisco, San Francisco, California, USA
| | - Christopher E Mason
- [1] The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medical College, New York, New York, USA. [2] Department of Medicine, Division of Hematology/Oncology, Weill Cornell Medical College, New York, New York, USA. [3] The Feil Family Brain and Mind Research Institute, Weill Cornell Medical College, New York, New York, USA
| | - Eric E Schadt
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
| | - Ali Bashir
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, USA
6
Green RE, Krause J, Briggs AW, Maricic T, Stenzel U, Kircher M, Patterson N, Li H, Zhai W, Fritz MHY, Hansen NF, Durand EY, Malaspinas AS, Jensen JD, Marques-Bonet T, Alkan C, Prüfer K, Meyer M, Burbano HA, Good JM, Schultz R, Aximu-Petri A, Butthof A, Höber B, Höffner B, Siegemund M, Weihmann A, Nusbaum C, Lander ES, Russ C, Novod N, Affourtit J, Egholm M, Verna C, Rudan P, Brajkovic D, Kucan Ž, Gušic I, Doronichev VB, Golovanova LV, Lalueza-Fox C, de la Rasilla M, Fortea J, Rosas A, Schmitz RW, Johnson PLF, Eichler EE, Falush D, Birney E, Mullikin JC, Slatkin M, Nielsen R, Kelso J, Lachmann M, Reich D, Pääbo S. A draft sequence of the Neandertal genome. Science 2010; 328:710-722. [PMID: 20448178 PMCID: PMC5100745 DOI: 10.1126/science.1188021] [Citation(s) in RCA: 2097] [Impact Index Per Article: 149.8] [Indexed: 12/21/2022]
Abstract
Neandertals, the closest evolutionary relatives of present-day humans, lived in large parts of Europe and western Asia before disappearing 30,000 years ago. We present a draft sequence of the Neandertal genome composed of more than 4 billion nucleotides from three individuals. Comparisons of the Neandertal genome to the genomes of five present-day humans from different parts of the world identify a number of genomic regions that may have been affected by positive selection in ancestral modern humans, including genes involved in metabolism and in cognitive and skeletal development. We show that Neandertals shared more genetic variants with present-day humans in Eurasia than with present-day humans in sub-Saharan Africa, suggesting that gene flow from Neandertals into the ancestors of non-Africans occurred before the divergence of Eurasian groups from each other.
Affiliation(s)
- Richard E. Green
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Johannes Krause
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Adrian W. Briggs
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Tomislav Maricic
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Udo Stenzel
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Martin Kircher
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Nick Patterson
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Heng Li
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Weiwei Zhai
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Markus Hsi-Yang Fritz
- European Molecular Biology Laboratory–European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
| | - Nancy F. Hansen
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Eric Y. Durand
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Anna-Sapfo Malaspinas
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Jeffrey D. Jensen
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, MA 01655, USA
| | - Tomas Marques-Bonet
- Howard Hughes Medical Institute, Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
- Institute of Evolutionary Biology (UPF-CSIC), Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Can Alkan
- Howard Hughes Medical Institute, Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Kay Prüfer
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Matthias Meyer
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Hernán A. Burbano
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Jeffrey M. Good
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
- Division of Biological Sciences, University of Montana, Missoula, MT 59812, USA
| | - Rigo Schultz
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Ayinuer Aximu-Petri
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Anne Butthof
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Barbara Höber
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Barbara Höffner
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Madlen Siegemund
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Antje Weihmann
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Chad Nusbaum
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Eric S. Lander
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Carsten Russ
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Nathaniel Novod
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Christine Verna
- Department of Human Evolution, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Pavao Rudan
- Croatian Academy of Sciences and Arts, Zrinski trg 11, HR-10000 Zagreb, Croatia
| | - Dejana Brajkovic
- Croatian Academy of Sciences and Arts, Institute for Quaternary Paleontology and Geology, Ante Kovacica 5, HR-10000 Zagreb, Croatia
| | - Željko Kucan
- Croatian Academy of Sciences and Arts, Zrinski trg 11, HR-10000 Zagreb, Croatia
| | - Ivan Gušic
- Croatian Academy of Sciences and Arts, Zrinski trg 11, HR-10000 Zagreb, Croatia
| | - Carles Lalueza-Fox
- Institute of Evolutionary Biology (UPF-CSIC), Dr. Aiguader 88, 08003 Barcelona, Spain
| | - Marco de la Rasilla
- Área de Prehistoria Departamento de Historia Universidad de Oviedo, Oviedo, Spain
| | - Javier Fortea
- Área de Prehistoria Departamento de Historia Universidad de Oviedo, Oviedo, Spain
| | - Antonio Rosas
- Departamento de Paleobiología, Museo Nacional de Ciencias Naturales, CSIC, Madrid, Spain
| | - Ralf W. Schmitz
- Der Landschaftsverband Rheinland–Landesmuseum Bonn, Bachstrasse 5-9, D-53115 Bonn, Germany
- Abteilung für Vor- und Frühgeschichtliche Archäologie, Universität Bonn, Germany
| | - Evan E. Eichler
- Howard Hughes Medical Institute, Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Daniel Falush
- Department of Microbiology, University College Cork, Cork, Ireland
| | - Ewan Birney
- European Molecular Biology Laboratory–European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
| | - James C. Mullikin
- Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
| | - Montgomery Slatkin
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Rasmus Nielsen
- Department of Integrative Biology, University of California, Berkeley, CA 94720, USA
| | - Janet Kelso
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - Michael Lachmann
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany
| | - David Reich
- Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
| | - Svante Pääbo
- Department of Evolutionary Genetics, Max-Planck Institute for Evolutionary Anthropology, D-04103 Leipzig, Germany