1
Ihnatovych I, Saddler RA, Sule N, Szigeti K. Translational implications of CHRFAM7A, an elusive human-restricted fusion gene. Mol Psychiatry 2024; 29:1020-1032. [PMID: 38200291 PMCID: PMC11176066 DOI: 10.1038/s41380-023-02389-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 12/08/2023] [Accepted: 12/15/2023] [Indexed: 01/12/2024]
Abstract
Genes restricted to humans may contribute to human-specific traits and provide a different context for diseases. CHRFAM7A is a uniquely human fusion gene and a negative regulator of the α7 nicotinic acetylcholine receptor (α7 nAChR). The α7 nAChR has been a promising target for diseases affecting cognition and higher cortical functions; however, the treatment effect observed in animal models failed to translate into human clinical trials. As CHRFAM7A was not accounted for in preclinical drug screens, it may have contributed to the translational gap. Understanding the complex genetic architecture of the locus, deciphering the functional impact of CHRFAM7A on α7 nAChR neurobiology, and utilizing human-relevant models may offer novel approaches to explore α7 nAChR as a drug target.
Affiliation(s)
- Ivanna Ihnatovych
  - Department of Neurology, State University of New York at Buffalo, 875 Ellicott St., Buffalo, NY, 14203, USA
- Ruth-Ann Saddler
  - Department of Neurology, State University of New York at Buffalo, 875 Ellicott St., Buffalo, NY, 14203, USA
- Norbert Sule
  - Roswell Park Comprehensive Cancer Center, 665 Elm St, Buffalo, NY, 14203, USA
- Kinga Szigeti
  - Department of Neurology, State University of New York at Buffalo, 875 Ellicott St., Buffalo, NY, 14203, USA
2
Cornetti L, Fields PD, Ebert D. Genomic characterization of selfing in the cyclic parthenogen Daphnia magna. J Evol Biol 2021; 34:792-802. [PMID: 33704857 DOI: 10.1111/jeb.13780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2020] [Revised: 02/23/2021] [Accepted: 03/06/2021] [Indexed: 11/29/2022]
Abstract
Inbreeding refers to the fusion of related individuals' gametes, with self-fertilization (selfing) being an extreme form of inbreeding involving gametes produced by the same individual. Selfing is expected to reduce heterozygosity by an average of 50% in one generation; however, little is known about the empirical variation on a genome level surrounding this figure and the factors that affect variation. We selfed genotypes of the cyclic parthenogen Daphnia magna and analysed whole genomes of mothers and selfed offspring, observing the predicted 50% heterozygosity reduction on average. We also saw substantial variation around this value and significant differences among mother-offspring pairs. Crossover analysis confirmed the known trend of recombination occurring more often towards the telomeres. This effect was shown, through simulations, to increase the variance of heterozygosity reduction compared to when a uniform distribution of crossovers was used. Similarly, we simulated inbred line production after several generations of selfing and we observed higher variance in achieved homozygosity when we consider a higher recombination rate towards the telomeres. Our empirical and simulation study highlights that the expected mean values of heterozygosity reduction show remarkable variation, which can help understand, for example, differences among inbred individuals.
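The 50% expectation quoted in this abstract can be illustrated with a small sketch. It is not the authors' simulation: it assumes fully unlinked loci (so it ignores the crossover positioning the study focuses on), and the function names `expected_heterozygosity` and `simulate_selfing` are hypothetical.

```python
import random


def expected_heterozygosity(h0, generations):
    """Expected heterozygosity after repeated selfing: H_t = H_0 * (1/2)**t."""
    return h0 * 0.5 ** generations


def simulate_selfing(n_loci, generations, seed=0):
    """Track the fraction of loci still heterozygous across selfing generations.

    Assumption (not from the abstract): loci are unlinked, so each
    heterozygous locus independently stays heterozygous with probability 1/2.
    Homozygous loci stay homozygous.
    """
    rng = random.Random(seed)
    het = [True] * n_loci  # start with every tracked locus heterozygous
    fractions = [1.0]
    for _ in range(generations):
        het = [h and rng.random() < 0.5 for h in het]
        fractions.append(sum(het) / n_loci)
    return fractions


if __name__ == "__main__":
    for t, observed in enumerate(simulate_selfing(10_000, 5)):
        print(t, round(observed, 4), expected_heterozygosity(1.0, t))
```

With independent loci the observed fraction stays close to the expectation; the abstract's point is that linkage and non-uniform crossover placement widen the spread around that mean, which this simplified model does not capture.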
Affiliation(s)
- Luca Cornetti
  - Department of Environmental Sciences, Zoology, University of Basel, Basel, Switzerland
- Peter D Fields
  - Department of Environmental Sciences, Zoology, University of Basel, Basel, Switzerland
- Dieter Ebert
  - Department of Environmental Sciences, Zoology, University of Basel, Basel, Switzerland
3
Powell DL, García-Olazábal M, Keegan M, Reilly P, Du K, Díaz-Loyo AP, Banerjee S, Blakkan D, Reich D, Andolfatto P, Rosenthal GG, Schartl M, Schumer M. Natural hybridization reveals incompatible alleles that cause melanoma in swordtail fish. Science 2020; 368:731-736. [PMID: 32409469 PMCID: PMC8074799 DOI: 10.1126/science.aba5216] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Received: 12/12/2019] [Accepted: 03/27/2020] [Indexed: 12/21/2022]
Abstract
The establishment of reproductive barriers between populations can fuel the evolution of new species. A genetic framework for this process posits that "incompatible" interactions between genes can evolve that result in reduced survival or reproduction in hybrids. However, progress has been slow in identifying individual genes that underlie hybrid incompatibilities. We used a combination of approaches to map the genes that drive the development of an incompatibility that causes melanoma in swordtail fish hybrids. One of the genes involved in this incompatibility also causes melanoma in hybrids between distantly related species. Moreover, this melanoma reduces survival in the wild, likely because of progressive degradation of the fin. This work identifies genes underlying a vertebrate hybrid incompatibility and provides a glimpse into the action of these genes in natural hybrid populations.
Affiliation(s)
- Daniel L Powell
  - Department of Biology, Stanford University and Howard Hughes Medical Institute, Stanford, CA, USA
  - Centro de Investigaciones Científicas de las Huastecas "Aguazarca", A.C., Calnali, Hidalgo, Mexico
  - Department of Biology, Texas A&M University, College Station, TX, USA
- Mateo García-Olazábal
  - Centro de Investigaciones Científicas de las Huastecas "Aguazarca", A.C., Calnali, Hidalgo, Mexico
  - Department of Biology, Texas A&M University, College Station, TX, USA
- Patrick Reilly
  - Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA
- Kang Du
  - Developmental Biochemistry, Biocenter, University of Würzburg, Würzburg, Bavaria, Germany
- Alejandra P Díaz-Loyo
  - Laboratorio de Ecología de la Conducta, Instituto de Fisiología, Benemérita Universidad Autónoma de Puebla, Puebla, Mexico
- Shreya Banerjee
  - Department of Biology, Stanford University and Howard Hughes Medical Institute, Stanford, CA, USA
- Danielle Blakkan
  - Department of Biology, Stanford University and Howard Hughes Medical Institute, Stanford, CA, USA
- David Reich
  - Department of Genetics, Harvard Medical School, Howard Hughes Medical Institute, and the Broad Institute of Harvard and MIT, Cambridge, MA, USA
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Peter Andolfatto
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Gil G Rosenthal
  - Centro de Investigaciones Científicas de las Huastecas "Aguazarca", A.C., Calnali, Hidalgo, Mexico
  - Department of Biology, Texas A&M University, College Station, TX, USA
- Manfred Schartl
  - Centro de Investigaciones Científicas de las Huastecas "Aguazarca", A.C., Calnali, Hidalgo, Mexico
  - Department of Biology, Texas A&M University, College Station, TX, USA
  - Developmental Biochemistry, Biocenter, University of Würzburg, Würzburg, Bavaria, Germany
  - Hagler Institute for Advanced Study, Texas A&M University, College Station, TX, USA
  - Xiphophorus Genetic Stock Center, Texas State University San Marcos, San Marcos, TX, USA
- Molly Schumer
  - Department of Biology, Stanford University and Howard Hughes Medical Institute, Stanford, CA, USA
4
Karaoğlanoğlu F, Ricketts C, Ebren E, Rasekh ME, Hajirasouliha I, Alkan C. VALOR2: characterization of large-scale structural variants using linked-reads. Genome Biol 2020; 21:72. [PMID: 32192518 PMCID: PMC7083023 DOI: 10.1186/s13059-020-01975-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 12/17/2019] [Accepted: 02/24/2020] [Indexed: 12/31/2022]
Abstract
Most existing methods for structural variant detection focus on discovery and genotyping of deletions, insertions, and mobile elements. Detection of balanced structural variants with no gain or loss of genomic segments, for example, inversions and translocations, is a particularly challenging task. Furthermore, there are very few algorithms to predict the insertion locus of large interspersed segmental duplications and characterize translocations. Here, we propose novel algorithms to characterize large interspersed segmental duplications, inversions, deletions, and translocations using linked-read sequencing data. We redesign our earlier algorithm, VALOR, and implement our new algorithms in a new software package, called VALOR2.
Affiliation(s)
- Fatih Karaoğlanoğlu
  - Department of Computer Engineering, Bilkent University, Ankara, 06800 Turkey
- Camir Ricketts
  - Tri-Institutional Computational Biology & Medicine Program, Cornell University, 1300 York Ave, New York, 10065 NY USA
  - Department of Physiology and Biophysics, Institute for Computational Biomedicine, Weill Cornell Medicine, 1300 York Ave, New York, 10065 NY USA
- Ezgi Ebren
  - Department of Computer Engineering, Bilkent University, Ankara, 06800 Turkey
- Marzieh Eslami Rasekh
  - Graduate Program in Bioinformatics, Boston University, 24 Cummington Mall, Boston, 02215 MA USA
- Iman Hajirasouliha
  - Department of Physiology and Biophysics, Institute for Computational Biomedicine, Weill Cornell Medicine, 1300 York Ave, New York, 10065 NY USA
  - Englander Institute for Precision Medicine, The Meyer Cancer Center, Weill Cornell Medicine, 1300 York Ave, New York, 10065 NY USA
- Can Alkan
  - Department of Computer Engineering, Bilkent University, Ankara, 06800 Turkey
  - Bilkent-Hacettepe Health Sciences and Technologies Program, Bilkent University, Ankara, 06800 Turkey
5
Abstract
Identifying structural variation (SV) is essential for genome interpretation but has been historically difficult due to limitations inherent to available genome technologies. Detection methods that use ensemble algorithms and emerging sequencing technologies have enabled the discovery of thousands of SVs, uncovering information about their ubiquity, relationship to disease and possible effects on biological mechanisms. Given the variability in SV type and size, along with unique detection biases of emerging genomic platforms, multiplatform discovery is necessary to resolve the full spectrum of variation. Here, we review modern approaches for investigating SVs and proffer that, moving forwards, studies integrating biological information with detection will be necessary to comprehensively understand the impact of SV in the human genome.
Affiliation(s)
- Steve S Ho
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
- Alexander E Urban
  - Department of Psychiatry and Behavioral Sciences, Stanford University School of Medicine, Stanford, CA, USA
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Ryan E Mills
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
6
Caspar SM, Dubacher N, Kopps AM, Meienberg J, Henggeler C, Matyas G. Clinical sequencing: From raw data to diagnosis with lifetime value. Clin Genet 2019; 93:508-519. [PMID: 29206278 DOI: 10.1111/cge.13190] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Received: 10/06/2017] [Revised: 11/28/2017] [Accepted: 11/30/2017] [Indexed: 12/22/2022]
Abstract
High-throughput sequencing (HTS) has revolutionized genetics by enabling the detection of sequence variants at hitherto unprecedented large scale. Despite these advances, however, there are still remaining challenges in the complete coverage of targeted regions (genes, exome or genome) as well as in HTS data analysis and interpretation. Moreover, it is easy to get overwhelmed by the plethora of available methods and tools for HTS. Here, we review the step-by-step process from the generation of sequence data to molecular diagnosis of Mendelian diseases. Highlighting advantages and limitations, this review addresses the current state of (1) HTS technologies, considering targeted, whole-exome, and whole-genome sequencing on short- and long-read platforms; (2) read alignment, variant calling and interpretation; as well as (3) regulatory issues related to genetic counseling, reimbursement, and data storage.
Affiliation(s)
- S M Caspar
  - Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- N Dubacher
  - Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- A M Kopps
  - Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- J Meienberg
  - Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- C Henggeler
  - Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
- G Matyas
  - Center for Cardiovascular Genetics and Gene Diagnostics, Foundation for People with Rare Diseases, Schlieren-Zurich, Switzerland
  - Zurich Center for Integrative Human Physiology, University of Zurich, Zurich, Switzerland
7
Darby CA, Fitch JR, Brennan PJ, Kelly BJ, Bir N, Magrini V, Leonard J, Cottrell CE, Gastier-Foster JM, Wilson RK, Mardis ER, White P, Langmead B, Schatz MC. Samovar: Single-Sample Mosaic Single-Nucleotide Variant Calling with Linked Reads. iScience 2019; 18:1-10. [PMID: 31271967 PMCID: PMC6609817 DOI: 10.1016/j.isci.2019.05.037] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/09/2019] [Revised: 05/06/2019] [Accepted: 05/24/2019] [Indexed: 12/25/2022]
Abstract
Linked-read sequencing enables greatly improved haplotype assembly over standard paired-end analysis. The detection of mosaic single-nucleotide variants benefits from haplotype assembly when the model is informed by the mapping between constituent reads and linked reads. Samovar evaluates haplotype-discordant reads identified through linked-read sequencing, thus enabling phasing and mosaic variant detection across the entire genome. Samovar trains a random forest model to score candidate sites using a dataset that considers read quality, phasing, and linked-read characteristics. Samovar calls mosaic single-nucleotide variants (SNVs) within a single sample with accuracy comparable with what previously required trios or matched tumor/normal pairs and outperforms single-sample mosaic variant callers at minor allele frequency 5%-50% with at least 30X coverage. Samovar finds somatic variants in both tumor and normal whole-genome sequencing from 13 pediatric cancer cases that can be corroborated with high recall with whole exome sequencing. Samovar is available open-source at https://github.com/cdarby/samovar under the MIT license.
Affiliation(s)
- Charlotte A Darby
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- James R Fitch
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Patrick J Brennan
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Benjamin J Kelly
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Natalie Bir
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Vincent Magrini
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Jeffrey Leonard
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
  - Department of Neurosurgery, Nationwide Children's Hospital, Columbus, OH, USA
- Catherine E Cottrell
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Julie M Gastier-Foster
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Richard K Wilson
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Elaine R Mardis
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Peter White
  - The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA
- Ben Langmead
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Michael C Schatz
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - Department of Biology, Johns Hopkins University, Baltimore, MD, USA
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
8
Dal E, Alkan C. Evaluation of genome scaffolding tools using pooled clone sequencing. Turk J Biol 2019; 42:471-476. [PMID: 30983868 PMCID: PMC6451843 DOI: 10.3906/biy-1805-42] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022]
Abstract
DNA sequencing technologies hold great promise in generating information that will guide scientists to understand how the genome affects human health and organismal evolution. The process of generating raw genome sequence data becomes cheaper and faster, but more error-prone. Assembly of such data into high-quality finished genome sequences remains challenging. Many genome assembly tools are available, but they differ in terms of their performance and their final output. More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. Here we evaluate the accuracies of several genome scaffolding algorithms using two different types of data generated from the genome of the same human individual: whole genome shotgun sequencing (WGS) and pooled clone sequencing (PCS). We observe that it is possible to obtain better assemblies if PCS data are used, compared to using only WGS data. However, the current scaffolding algorithms are developed only for WGS, and PCS-aware scaffolding algorithms remain an open problem.
Affiliation(s)
- Elif Dal
  - Department of Computer Engineering, Faculty of Engineering, Bilkent University, Ankara, Turkey
- Can Alkan
  - Department of Computer Engineering, Faculty of Engineering, Bilkent University, Ankara, Turkey
9
Ma ZS, Li L, Ye C, Peng M, Zhang YP. Hybrid assembly of ultra-long Nanopore reads augmented with 10x-Genomics contigs: Demonstrated with a human genome. Genomics 2018; 111:1896-1901. [PMID: 30594583 DOI: 10.1016/j.ygeno.2018.12.013] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 10/02/2018] [Revised: 11/17/2018] [Accepted: 12/24/2018] [Indexed: 10/27/2022]
Abstract
The 3rd generation of sequencing (3GS) technologies generate ultra-long reads (up to 1 Mb), which makes it possible to eliminate gaps and effectively resolve repeats in genome assembly. However, the 3GS technologies suffer from high base-level error rates (15%-40%) and high sequencing costs. To address these issues, the hybrid assembly strategy, which utilizes both 3GS reads and inexpensive NGS (next generation sequencing) short reads, was invented. Here, we use 10×-Genomics® technology, which integrates a novel bar-coding strategy with Illumina® NGS with an advantage of revealing long-range sequence information, to replace common NGS short reads for hybrid assembly of long erroneous 3GS reads. We demonstrate the feasibility of integrating the 3GS with 10×-Genomics technologies for a new strategy of hybrid de novo genome assembly by utilizing DBG2OLC and Sparc software packages, previously developed by the authors for regular hybrid assembly. Using a human genome as an example, we show that with only 7× coverage of ultra-long Nanopore® reads, augmented with 10× reads, our approach achieved nearly the same level of quality, compared with non-hybrid assembly with 35× coverage of Nanopore reads. Compared with the assembly with 10×-Genomics reads alone, our assembly is gapless with slightly higher cost. These results suggest that our new hybrid assembly with ultra-long 3GS reads augmented with 10×-Genomics reads offers a low-cost (less than ¼ the cost of the non-hybrid assembly) and computationally lightweight (only took 109 calendar hours with peak memory-usage = 61GB on a dual-CPU office workstation) solution for extending the wide applications of the 3GS technologies.
Affiliation(s)
- Zhanshan Sam Ma
  - Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
  - Kunming College of Life Science, Chinese Academy of Sciences, Kunming, 650223, China
- Lianwei Li
  - Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - Kunming College of Life Science, Chinese Academy of Sciences, Kunming, 650223, China
- Chengxi Ye
  - Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Minsheng Peng
  - Molecular Evolution and Genome Diversity Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - Kunming College of Life Science, Chinese Academy of Sciences, Kunming, 650223, China
  - KIZ/CUHK Joint Laboratory of Bio-resources and Molecular Research in Common Diseases, Kunming 650223, China
- Ya-Ping Zhang
  - Molecular Evolution and Genome Diversity Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
  - Kunming College of Life Science, Chinese Academy of Sciences, Kunming, 650223, China
  - KIZ/CUHK Joint Laboratory of Bio-resources and Molecular Research in Common Diseases, Kunming 650223, China
10
Ott A, Schnable JC, Yeh CT, Wu L, Liu C, Hu HC, Dalgard CL, Sarkar S, Schnable PS. Linked read technology for assembling large complex and polyploid genomes. BMC Genomics 2018; 19:651. [PMID: 30180802 PMCID: PMC6122573 DOI: 10.1186/s12864-018-5040-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 12/22/2017] [Accepted: 08/27/2018] [Indexed: 12/14/2022]
Abstract
BACKGROUND: Short read DNA sequencing technologies have revolutionized genome assembly by providing high accuracy and throughput data at low cost. But it remains challenging to assemble short read data, particularly for large, complex and polyploid genomes. The linked read strategy has the potential to enhance the value of short reads for genome assembly because all reads originating from a single long molecule of DNA share a common barcode. However, the majority of studies to date that have employed linked reads were focused on human haplotype phasing and genome assembly. RESULTS: Here we describe a de novo maize B73 genome assembly generated via linked read technology which contains ~172,000 scaffolds with an N50 of 89 kb that cover 50% of the genome. Based on comparisons to the B73 reference genome, 91% of linked read contigs are accurately assembled. Because it was possible to identify errors with >76% accuracy using machine learning, it may be possible to identify and potentially correct systematic errors. Complex polyploids represent one of the last grand challenges in genome assembly. Linked read technology was able to successfully resolve the two subgenomes of the recent allopolyploid, proso millet (Panicum miliaceum). Our assembly covers ~83% of the 1 Gb genome and consists of 30,819 scaffolds with an N50 of 912 kb. CONCLUSIONS: Our analysis provides a framework for future de novo genome assemblies using linked reads, and we suggest computational strategies that if implemented have the potential to further improve linked read assemblies, particularly for repetitive genomes.
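The N50 figures reported in this abstract follow the usual assembly convention: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembled length. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are illustrative assumptions, not values from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are sorted from longest to shortest and accumulated until the
    running total reaches half of the summed length; the length of the
    scaffold that crosses that threshold is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy scaffold lengths in kb (hypothetical, for illustration only).
    print(n50([10, 20, 30, 40]))
```

Because N50 depends only on the assembled lengths, it says nothing about correctness, which is why the abstract reports accuracy against the reference genome separately.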
Affiliation(s)
- Alina Ott
  - Department of Agronomy, Iowa State University, Ames, IA 50011 USA
  - Present address: Roche Sequencing Solutions, 500 S Rosa Road, Madison, WI 53719 USA
- James C. Schnable
  - Department of Agriculture and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68588 USA
  - Data2Bio LLC, 2079 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
  - Dryland Genetics LLC, 2073 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
- Cheng-Ting Yeh
  - Department of Agronomy, Iowa State University, Ames, IA 50011 USA
  - Data2Bio LLC, 2079 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
- Linjiang Wu
  - Department of Mechanical Engineering, Iowa State University, Ames, IA 50011 USA
- Chao Liu
  - Department of Mechanical Engineering, Iowa State University, Ames, IA 50011 USA
  - Present address: Department of Thermal Engineering, Tsinghua University, Beijing, 100084 China
- Heng-Cheng Hu
  - The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
  - Collaborative Health Initiative Research Program (CHIRP), Uniformed Services University School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
  - Present address: Qiagen Sciences Inc, 6951 Executive Way, Frederick, MD 21703 USA
- Clifton L. Dalgard
  - The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
  - Collaborative Health Initiative Research Program (CHIRP), Uniformed Services University School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
  - Department of Anatomy, Physiology and Genetics, Uniformed Services University School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, MD 20814 USA
- Soumik Sarkar
  - Department of Mechanical Engineering, Iowa State University, Ames, IA 50011 USA
- Patrick S. Schnable
  - Department of Agronomy, Iowa State University, Ames, IA 50011 USA
  - Data2Bio LLC, 2079 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
  - Dryland Genetics LLC, 2073 Roy J Carver Co-Laboratory, 1111 WOI Rd, Ames, IA 50011 USA
11
Demaerel W, Hestand MS, Vergaelen E, Swillen A, López-Sánchez M, Pérez-Jurado LA, McDonald-McGinn DM, Zackai E, Emanuel BS, Morrow BE, Breckpot J, Devriendt K, Vermeesch JR. Nested Inversion Polymorphisms Predispose Chromosome 22q11.2 to Meiotic Rearrangements. Am J Hum Genet 2017; 101:616-622. [PMID: 28965848 DOI: 10.1016/j.ajhg.2017.09.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 02/22/2017] [Accepted: 08/16/2017] [Indexed: 11/17/2022]
Abstract
Inversion polymorphisms between low-copy repeats (LCRs) might predispose chromosomes to meiotic non-allelic homologous recombination (NAHR) events and thus lead to genomic disorders. However, for the 22q11.2 deletion syndrome (22q11.2DS), the most common genomic disorder, no such inversions have been uncovered as of yet. Using fiber-FISH, we demonstrate that parents transmitting the de novo 3 Mb LCR22A-D 22q11.2 deletion, the reciprocal duplication, and the smaller 1.5 Mb LCR22A-B 22q11.2 deletion carry inversions of LCR22B-D or LCR22C-D. Hence, the inversions predispose chromosome 22q11.2 to meiotic rearrangements and increase the individual risk for transmitting rearrangements. Interestingly, the inversions are nested or flanking rather than coinciding with the deletion or duplication sizes. This finding raises the possibility that inversions are a prerequisite not only for 22q11.2 rearrangements but also for all NAHR-mediated genomic disorders.
Affiliation(s)
- Wolfram Demaerel
  - Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
- Matthew S Hestand
  - Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
- Elfi Vergaelen
  - Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
- Ann Swillen
  - Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
- Marcos López-Sánchez
  - Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
  - Institut Hospital del Mar d'Investigacions Mèdiques, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras, Barcelona, Spain
- Luis A Pérez-Jurado
  - Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona, Spain
  - Institut Hospital del Mar d'Investigacions Mèdiques, Barcelona, Spain
  - Centro de Investigación Biomédica en Red de Enfermedades Raras, Barcelona, Spain
- Donna M McDonald-McGinn
  - Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Elaine Zackai
  - Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Beverly S Emanuel
  - Division of Human Genetics, Children's Hospital of Philadelphia, Philadelphia, PA, USA
  - Department of Pediatrics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, USA
- Bernice E Morrow
  - Department of Genetics, Albert Einstein College of Medicine, Bronx, NY, USA
- Jeroen Breckpot
  - Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
- Koenraad Devriendt
  - Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium
- Joris R Vermeesch
  - Department of Human Genetics, Katholieke Universiteit Leuven, Leuven, Belgium