1
Choi Y, Chan AP, Kirkness E, Telenti A, Schork NJ. Comparison of phasing strategies for whole human genomes. PLoS Genet 2018; 14:e1007308. [PMID: 29621242 PMCID: PMC5903673 DOI: 10.1371/journal.pgen.1007308] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Received: 06/07/2017] [Revised: 04/17/2018] [Accepted: 03/13/2018] [Indexed: 12/17/2022]
Abstract
Humans are a diploid species that inherit one set of chromosomes paternally and one homologous set of chromosomes maternally. Unfortunately, most human sequencing initiatives ignore this fact in that they do not directly delineate the nucleotide content of the maternal and paternal copies of the 23 chromosomes individuals possess (i.e., they do not 'phase' the genome) often because of the costs and complexities of doing so. We compared 11 different widely-used approaches to phasing human genomes using the publicly available 'Genome-In-A-Bottle' (GIAB) phased version of the NA12878 genome as a gold standard. The phasing strategies we compared included laboratory-based assays that prepare DNA in unique ways to facilitate phasing as well as purely computational approaches that seek to reconstruct phase information from general sequencing reads and constructs or population-level haplotype frequency information obtained through a reference panel of haplotypes. To assess the performance of the 11 approaches, we used metrics that included, among others, switch error rates, haplotype block lengths, the proportion of fully phase-resolved genes, phasing accuracy and yield between pairs of SNVs. Our comparisons suggest that a hybrid or combined approach that leverages: 1. population-based phasing using the SHAPEIT software suite, 2. either genome-wide sequencing read data or parental genotypes, and 3. a large reference panel of variant and haplotype frequencies, provides a fast and efficient way to produce highly accurate phase-resolved individual human genomes. We found that for population-based approaches, phasing performance is enhanced with the addition of genome-wide read data; e.g., whole genome shotgun and/or RNA sequencing reads. Further, we found that the inclusion of parental genotype data within a population-based phasing strategy can provide as much as a ten-fold reduction in phasing errors. 
We also considered a majority voting scheme for the construction of a consensus haplotype combining multiple predictions for enhanced performance and site coverage. Finally, we also identified DNA sequence signatures associated with the genomic regions harboring phasing switch errors, which included regions of low polymorphism or SNV density.
Affiliation(s)
- Yongwook Choi
  - J. Craig Venter Institute, Rockville, Maryland, United States of America
- Agnes P. Chan
  - J. Craig Venter Institute, Rockville, Maryland, United States of America
- Ewen Kirkness
  - Human Longevity, Inc., San Diego, California, United States of America
- Amalio Telenti
  - J. Craig Venter Institute, La Jolla, California, United States of America
- Nicholas J. Schork
  - J. Craig Venter Institute, La Jolla, California, United States of America
  - University of California San Diego, La Jolla, California, United States of America
  - The Translational Genomics Research Institute (TGen), Phoenix, Arizona, United States of America
2
Moustafa A, Xie C, Kirkness E, Biggs W, Wong E, Turpaz Y, Bloom K, Delwart E, Nelson KE, Venter JC, Telenti A. The blood DNA virome in 8,000 humans. PLoS Pathog 2017; 13:e1006292. [PMID: 28328962 PMCID: PMC5378407 DOI: 10.1371/journal.ppat.1006292] [Citation(s) in RCA: 185] [Impact Index Per Article: 26.4] [Received: 12/08/2016] [Revised: 04/03/2017] [Accepted: 03/14/2017] [Indexed: 02/06/2023]
Abstract
The characterization of the blood virome is important for the safety of blood-derived transfusion products, and for the identification of emerging pathogens. We explored non-human sequence data from whole-genome sequencing of blood from 8,240 individuals, none of whom were ascertained for any infectious disease. Viral sequences were extracted from the pool of sequence reads that did not map to the human reference genome. Analyses sifted through close to 1 Petabyte of sequence data and performed 0.5 trillion similarity searches. With a lower bound for identification of 2 viral genomes/100,000 cells, we mapped sequences to 94 different viruses, including sequences from 19 human DNA viruses, proviruses and RNA viruses (herpesviruses, anelloviruses, papillomaviruses, three polyomaviruses, adenovirus, HIV, HTLV, hepatitis B, hepatitis C, parvovirus B19, and influenza virus) in 42% of the study participants. Of possible relevance to transfusion medicine, we identified Merkel cell polyomavirus in 49 individuals, papillomavirus in blood of 13 individuals, parvovirus B19 in 6 individuals, and the presence of herpesvirus 8 in 3 individuals. The presence of DNA sequences from two RNA viruses was unexpected: Hepatitis C virus is revealing of an integration event, while the influenza virus sequence resulted from immunization with a DNA vaccine. Age, sex and ancestry contributed significantly to the prevalence of infection. The remaining 75 viruses mostly reflect extensive contamination of commercial reagents and from the environment. These technical problems represent a major challenge for the identification of novel human pathogens. Increasing availability of human whole-genome sequences will contribute substantial amounts of data on the composition of the normal and pathogenic human blood virome. Distinguishing contaminants from real human viruses is challenging. Novel sequencing technologies offer insight into the virome in human samples. 
Here, we identify the viral DNA sequences in blood of over 8,000 individuals undergoing whole genome sequencing. This approach serves to identify 94 viruses; however, many are shown to reflect widespread DNA contamination of commercial reagents or of environmental origin. While this represents a significant limitation to reliably identify novel viruses infecting humans, we could confidently detect sequences and quantify abundance of 19 human viruses in 42% of individuals. Ancestry, sex, and age were important determinants of viral prevalence. This large study calls attention on the challenge of interpreting next generation sequencing data for the identification of novel viruses. However, it serves to categorize the abundance of human DNA viruses using an unbiased technique.
Affiliation(s)
- Ahmed Moustafa
  - Human Longevity Inc., San Diego, California, United States of America
- Chao Xie
  - Human Longevity Singapore Pte. Ltd., Singapore
- Ewen Kirkness
  - Human Longevity Inc., San Diego, California, United States of America
- William Biggs
  - Human Longevity Inc., San Diego, California, United States of America
- Emily Wong
  - Human Longevity Inc., San Diego, California, United States of America
- Kenneth Bloom
  - Human Longevity Inc., San Diego, California, United States of America
- Eric Delwart
  - Blood Systems Research Institute, Department of Laboratory Medicine, University of California San Francisco, San Francisco, California, United States of America
- Karen E. Nelson
  - J. Craig Venter Institute, La Jolla, California, United States of America
- J. Craig Venter
  - Human Longevity Inc., San Diego, California, United States of America
  - J. Craig Venter Institute, La Jolla, California, United States of America
  - * E-mail: (JCV); (AT)
- Amalio Telenti
  - Human Longevity Inc., San Diego, California, United States of America
  - J. Craig Venter Institute, La Jolla, California, United States of America
  - * E-mail: (JCV); (AT)
3
Abernathy E, Chen MH, Bera J, Shrivastava S, Kirkness E, Zheng Q, Bellini W, Icenogle J. Analysis of whole genome sequences of 16 strains of rubella virus from the United States, 1961-2009. Virol J 2013; 10:32. [PMID: 23351667 PMCID: PMC3574052 DOI: 10.1186/1743-422x-10-32] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 09/12/2012] [Accepted: 01/16/2013] [Indexed: 11/23/2022]
Abstract
Rubella virus is the causative agent of rubella, a mild rash illness, and a potent teratogenic agent when contracted by a pregnant woman. Global rubella control programs target the reduction and elimination of congenital rubella syndrome. Phylogenetic analysis of partial sequences of rubella viruses has contributed to virus surveillance efforts and played an important role in demonstrating that indigenous rubella viruses have been eliminated in the United States. Sixteen wild-type rubella viruses were chosen for whole genome sequencing. All 16 viruses were collected in the United States from 1961 to 2009 and are from 8 of the 13 known rubella genotypes. Phylogenetic analysis of 30 whole genome sequences produced a maximum likelihood tree giving high bootstrap values for all genotypes except provisional genotype 1a. Comparison of the 16 new complete sequences and 14 previously sequenced wild-type viruses found regions with clusters of variable amino acids. The 5′ 250 nucleotides of the genome are more conserved than any other part of the genome. Genotype specific deletions in the untranslated region between the non-structural and structural open reading frames were observed for genotypes 2B and genotype 1G. No evidence was seen for recombination events among the 30 viruses. The analysis presented here is consistent with previous reports on the genetic characterization of rubella virus genomes. Conserved and variable regions were identified and additional evidence for genotype specific nucleotide deletions in the intergenic region was found. Phylogenetic analysis confirmed genotype groupings originally based on structural protein coding region sequences, which provides support for the WHO nomenclature for genetic characterization of wild-type rubella viruses.
Affiliation(s)
- Emily Abernathy
  - National Center for Immunizations and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA
4
Tran TM, Samal B, Kirkness E, Crompton PD. Systems immunology of human malaria. Trends Parasitol 2012; 28:248-57. [PMID: 22592005 DOI: 10.1016/j.pt.2012.03.006] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Received: 02/07/2012] [Revised: 03/27/2012] [Accepted: 03/27/2012] [Indexed: 12/28/2022]
Abstract
Plasmodium falciparum malaria remains a global public health threat. Optimism that a highly effective malaria vaccine can be developed stems in part from the observation that humans can acquire immunity to malaria through experimental and natural P. falciparum infection. Recent advances in systems immunology could accelerate efforts to unravel the mechanisms of acquired immunity to malaria. Here, we review the tools of systems immunology, their current limitations in the context of human malaria research, and the human 'models' of malaria immunity to which these tools can be applied.
Affiliation(s)
- Tuan M Tran
  - Laboratory of Immunogenetics, National Institute of Allergy and Infectious Diseases, National Institutes of Health, 12441 Parklawn Drive, Rockville, MD 20852, USA
5
Ding ZL, Oskarsson M, Ardalan A, Angleby H, Dahlgren LG, Tepeli C, Kirkness E, Savolainen P, Zhang YP. Origins of domestic dog in southern East Asia is supported by analysis of Y-chromosome DNA. Heredity (Edinb) 2011; 108:507-14. [PMID: 22108628 PMCID: PMC3330686 DOI: 10.1038/hdy.2011.114] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.5] [Indexed: 11/09/2022]
Abstract
Global mitochondrial DNA (mtDNA) data indicates that the dog originates from domestication of wolf in Asia South of Yangtze River (ASY), with minor genetic contributions from dog–wolf hybridisation elsewhere. Archaeological data and autosomal single nucleotide polymorphism data have instead suggested that dogs originate from Europe and/or South West Asia but, because these datasets lack data from ASY, evidence pointing to ASY may have been overlooked. Analyses of additional markers for global datasets, including ASY, are therefore necessary to test if mtDNA phylogeography reflects the actual dog history and not merely stochastic events or selection. Here, we analyse 14 437 bp of Y-chromosome DNA sequence in 151 dogs sampled worldwide. We found 28 haplotypes distributed in five haplogroups. Two haplogroups were universally shared and included three haplotypes carried by 46% of all dogs, but two other haplogroups were primarily restricted to East Asia. Highest genetic diversity and virtually complete phylogenetic coverage was found within ASY. The 151 dogs were estimated to originate from 13–24 wolf founders, but there was no indication of post-domestication dog–wolf hybridisations. Thus, Y-chromosome and mtDNA data give strikingly similar pictures of dog phylogeography, most importantly that roughly 50% of the gene pools are shared universally but only ASY has nearly the full range of genetic diversity, such that the gene pools in all other regions may derive from ASY. This corroborates that ASY was the principal, and possibly sole region of wolf domestication, that a large number of wolves were domesticated, and that subsequent dog–wolf hybridisation contributed modestly to the dog gene pool.
Affiliation(s)
- Z-L Ding
  - Laboratory for Conservation and Utilization of Bio-resource, Yunnan University, Kunming, China
6
Axelrod N, Lin Y, Ng PC, Stockwell TB, Crabtree J, Huang J, Kirkness E, Strausberg RL, Frazier ME, Venter JC, Kravitz S, Levy S. The HuRef Browser: a web resource for individual human genomics. Nucleic Acids Res 2008; 37:D1018-24. [PMID: 19036787 PMCID: PMC2686481 DOI: 10.1093/nar/gkn939] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
The HuRef Genome Browser is a web application for the navigation and analysis of the previously published genome of a human individual, termed HuRef. The browser provides a comparative view between the NCBI human reference sequence and the HuRef assembly, and it enables the navigation of the HuRef genome in the context of HuRef, NCBI and Ensembl annotations. Single nucleotide polymorphisms, indels, inversions, structural and copy-number variations are shown in the context of existing functional annotations on either genome in the comparative view. Demonstrated here are some potential uses of the browser to enable a better understanding of individual human genetic variation. The browser provides full access to the underlying reads with sequence and quality information, the genome assembly and the evidence supporting the identification of DNA polymorphisms. The HuRef Browser is a unique and versatile tool for browsing genome assemblies and studying individual human sequence variation in a diploid context. The browser is available online at http://huref.jcvi.org.
Affiliation(s)
- Nelson Axelrod
  - J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD 20850, USA.
7
Hunter LS, Sidjanin DJ, Hijar MV, Johnson JL, Kirkness E, Acland GM, Aguirre GD. Cloning and characterization of canine PAX6 and evaluation as a candidate gene in a canine model of aniridia. Mol Vis 2007; 13:431-42. [PMID: 17417604 PMCID: PMC2647561] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
Abstract
PURPOSE Mutations in PAX6 cause human aniridia. The small eye (sey) mouse represents an animal model for aniridia. However, no large animal model currently exists. We cloned and characterized canine PAX6, and evaluated PAX6 for causal associations with inherited aniridia in dogs. METHODS Canine PAX6 was cloned from a canine retinal cDNA library using primers designed from human and mouse PAX6 consensus sequences. An RH3000 radiation hybrid panel was used to localize PAX6 within the canine genome. Genomic DNA was extracted from whole blood of dogs with inherited aniridia, and association testing was performed using markers on CFA18. Fourteen PAX6 exons were sequenced and scanned for mutations, and a Southern blot was used to test for large deletions. RESULTS Like the human gene, canine PAX6 has 13 exons and 12 introns, plus an alternatively spliced exon (5a). PAX6 nucleotide and amino acid sequences were highly conserved between dog, human, and mouse. The canine PAX6 cDNA sequence determined in this study spans 2 large gaps present in the current canine genomic sequence. Radiation hybrid mapping placed canine PAX6 on CFA18 in a region with synteny to HSA11p13. Exon-scanning revealed single nucleotide polymorphisms, but no pathological mutations, and Southern blot analysis revealed no differences between normal and affected animals. CONCLUSIONS Canine PAX6 was cloned and characterized, and results provide sequence information for gaps in the current canine genome sequence. Canine PAX6 nucleotide and amino acid sequences, as well as gene organization and map location, were highly homologous with that of the human gene. PAX6 was evaluated in dogs with an inherited form of aniridia, and sequence analysis indicated no pathological mutations in the coding regions or splice sites of aniridia-affected dogs, and Southern blot analysis showed no large deletions.
Affiliation(s)
- Linda S Hunter
  - J.A. Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY, USA
8
Hackett SR, Jung SW, Kirkness E, Cruickshank J, Vikstrom KL, Moïse NS, Gunn TM. Identification and characterization of canine microsatellite markers in cardiac genes. Anim Genet 2007; 38:89-91. [PMID: 17257202 DOI: 10.1111/j.1365-2052.2007.01556.x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- S R Hackett
  - Department of Biomedical Sciences, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA
9
Natanaelsson C, Oskarsson MCR, Angleby H, Lundeberg J, Kirkness E, Savolainen P. Dog Y chromosomal DNA sequence: identification, sequencing and SNP discovery. BMC Genet 2006; 7:45. [PMID: 17026745 PMCID: PMC1630699 DOI: 10.1186/1471-2156-7-45] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Received: 07/10/2006] [Accepted: 10/06/2006] [Indexed: 12/31/2022]
Abstract
BACKGROUND Population genetic studies of dogs have so far mainly been based on analysis of mitochondrial DNA, describing only the history of female dogs. To get a picture of the male history, as well as a second independent marker, there is a need for studies of biallelic Y-chromosome polymorphisms. However, there are no biallelic polymorphisms reported, and only 3200 bp of non-repetitive dog Y-chromosome sequence deposited in GenBank, necessitating the identification of dog Y chromosome sequence and the search for polymorphisms therein. The genome has been only partially sequenced for one male dog, disallowing mapping of the sequence into specific chromosomes. However, by comparing the male genome sequence to the complete female dog genome sequence, candidate Y-chromosome sequence may be identified by exclusion. RESULTS The male dog genome sequence was analysed by Blast search against the human genome to identify sequences with a best match to the human Y chromosome and to the female dog genome to identify those absent in the female genome. Candidate sequences were then tested for male specificity by PCR of five male and five female dogs. 32 sequences from the male genome, with a total length of 24 kbp, were identified as male specific, based on a match to the human Y chromosome, absence in the female dog genome and male specific PCR results. 14437 bp were then sequenced for 10 male dogs originating from Europe, Southwest Asia, Siberia, East Asia, Africa and America. Nine haplotypes were found, which were defined by 14 substitutions. The genetic distance between the haplotypes indicates that they originate from at least five wolf haplotypes. There was no obvious trend in the geographic distribution of the haplotypes. CONCLUSION We have identified 24159 bp of dog Y-chromosome sequence to be used for population genetic studies. 
We sequenced 14437 bp in a worldwide collection of dogs, identifying 14 SNPs for future SNP analyses, and giving a first description of the dog Y-chromosome phylogeny.
Affiliation(s)
- Christian Natanaelsson
  - School of Biotechnology, KTH, Royal Institute of Technology, AlbaNova University Center, 10691 Stockholm, Sweden
- Mattias CR Oskarsson
  - School of Biotechnology, KTH, Royal Institute of Technology, AlbaNova University Center, 10691 Stockholm, Sweden
- Helen Angleby
  - School of Biotechnology, KTH, Royal Institute of Technology, AlbaNova University Center, 10691 Stockholm, Sweden
- Joakim Lundeberg
  - School of Biotechnology, KTH, Royal Institute of Technology, AlbaNova University Center, 10691 Stockholm, Sweden
- Ewen Kirkness
  - The Institute for Genomic Research (TIGR), Rockville, MD 20850, USA
- Peter Savolainen
  - School of Biotechnology, KTH, Royal Institute of Technology, AlbaNova University Center, 10691 Stockholm, Sweden
10
Hunter LS, Sidjanin DJ, Johnson JL, Zangerl B, Galibert F, Andre C, Kirkness E, Talamas E, Acland GM, Aguirre GD. Radiation hybrid mapping of cataract genes in the dog. Mol Vis 2006; 12:588-96. [PMID: 16760895 PMCID: PMC1509099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/10/2023]
Abstract
PURPOSE To facilitate the molecular characterization of naturally occurring cataracts in dogs by providing the radiation hybrid location of 21 cataract-associated genes along with their closely associated polymorphic markers. These can be used for segregation testing of the candidate genes in canine cataract pedigrees. METHODS Twenty-one genes with known mutations causing hereditary cataracts in man and/or mouse were selected and mapped to canine chromosomes using a canine:hamster radiation hybrid RH5000 panel. Each cataract gene ortholog was mapped in relation to over 3,000 markers including microsatellites, ESTs, genes, and BAC clones. The resulting independently determined RH-map locations were compared with the corresponding gene locations from the draft sequence of the canine genome. RESULTS Twenty-one cataract orthologs were mapped to canine chromosomes. The genetic locations and nearest polymorphic markers were determined for 20 of these orthologs. In addition, the resulting cataract gene locations, as determined experimentally by this study, were compared with those determined by the canine genome project. All genes mapped within or near chromosomal locations with previously established homology to the corresponding human gene locations based on canine:human chromosomal synteny. CONCLUSIONS The location of selected cataract gene orthologs in the dog, along with their nearest polymorphic markers, serves as a resource for association and linkage testing in canine pedigrees segregating inherited cataracts. The recent development of canine genomic resources make canine models a practical and valuable resource for the study of human hereditary cataracts. Canine models can serve as large animal models intermediate between mouse and man for both gene discovery and the development of novel cataract therapies.
Affiliation(s)
- Linda S Hunter
  - J. A. Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY, USA.
11
Thomas R, Scott A, Langford CF, Fosmire SP, Jubala CM, Lorentzen TD, Hitte C, Karlsson EK, Kirkness E, Ostrander EA, Galibert F, Lindblad-Toh K, Modiano JF, Breen M. Construction of a 2-Mb resolution BAC microarray for CGH analysis of canine tumors. Genome Res 2006; 15:1831-7. [PMID: 16339382 PMCID: PMC1356122 DOI: 10.1101/gr.3825705] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.6] [Indexed: 11/24/2022]
Abstract
Recognition of the domestic dog as a model for the comparative study of human genetic traits has led to major advances in canine genomics. The pathophysiological similarities shared between many human and dog diseases extend to a range of cancers. Human tumors frequently display recurrent chromosome aberrations, many of which are hallmarks of particular tumor subtypes. Using a range of molecular cytogenetic techniques we have generated evidence indicating that this is also true of canine tumors. Detailed knowledge of these genomic abnormalities has the potential to aid diagnosis, prognosis, and the selection of appropriate therapy in both species. We recently improved the efficiency and resolution of canine cancer cytogenetics studies by developing a small-scale genomic microarray comprising a panel of canine BAC clones representing subgenomic regions of particular interest. We have now extended these studies to generate a comprehensive canine comparative genomic hybridization (CGH) array that comprises 1158 canine BAC clones ordered throughout the genome with an average interval of 2 Mb. Most of the clones (84.3%) have been assigned to a precise cytogenetic location by fluorescence in situ hybridization (FISH), and 98.5% are also directly anchored within the current canine genome assembly, permitting direct translation from cytogenetic aberration to DNA sequence. We are now using this resource routinely for high-throughput array CGH and single-locus probe analysis of a range of canine cancers. Here we provide examples of the varied applications of this resource to tumor cytogenetics, in combination with other molecular cytogenetic techniques.
Affiliation(s)
- Rachael Thomas
  - Department of Molecular Biomedical Sciences, College of Veterinary Medicine, North Carolina State University, Raleigh, North Carolina 27606, USA
12
Lindblad-Toh K, Wade CM, Mikkelsen TS, Karlsson EK, Jaffe DB, Kamal M, Clamp M, Chang JL, Kulbokas EJ, Zody MC, Mauceli E, Xie X, Breen M, Wayne RK, Ostrander EA, Ponting CP, Galibert F, Smith DR, DeJong PJ, Kirkness E, Alvarez P, Biagi T, Brockman W, Butler J, Chin CW, Cook A, Cuff J, Daly MJ, DeCaprio D, Gnerre S, Grabherr M, Kellis M, Kleber M, Bardeleben C, Goodstadt L, Heger A, Hitte C, Kim L, Koepfli KP, Parker HG, Pollinger JP, Searle SMJ, Sutter NB, Thomas R, Webber C, Baldwin J, Abebe A, Abouelleil A, Aftuck L, Ait-Zahra M, Aldredge T, Allen N, An P, Anderson S, Antoine C, Arachchi H, Aslam A, Ayotte L, Bachantsang P, Barry A, Bayul T, Benamara M, Berlin A, Bessette D, Blitshteyn B, Bloom T, Blye J, Boguslavskiy L, Bonnet C, Boukhgalter B, Brown A, Cahill P, Calixte N, Camarata J, Cheshatsang Y, Chu J, Citroen M, Collymore A, Cooke P, Dawoe T, Daza R, Decktor K, DeGray S, Dhargay N, Dooley K, Dooley K, Dorje P, Dorjee K, Dorris L, Duffey N, Dupes A, Egbiremolen O, Elong R, Falk J, Farina A, Faro S, Ferguson D, Ferreira P, Fisher S, FitzGerald M, Foley K, Foley C, Franke A, Friedrich D, Gage D, Garber M, Gearin G, Giannoukos G, Goode T, Goyette A, Graham J, Grandbois E, Gyaltsen K, Hafez N, Hagopian D, Hagos B, Hall J, Healy C, Hegarty R, Honan T, Horn A, Houde N, Hughes L, Hunnicutt L, Husby M, Jester B, Jones C, Kamat A, Kanga B, Kells C, Khazanovich D, Kieu AC, Kisner P, Kumar M, Lance K, Landers T, Lara M, Lee W, Leger JP, Lennon N, Leuper L, LeVine S, Liu J, Liu X, Lokyitsang Y, Lokyitsang T, Lui A, Macdonald J, Major J, Marabella R, Maru K, Matthews C, McDonough S, Mehta T, Meldrim J, Melnikov A, Meneus L, Mihalev A, Mihova T, Miller K, Mittelman R, Mlenga V, Mulrain L, Munson G, Navidi A, Naylor J, Nguyen T, Nguyen N, Nguyen C, Nguyen T, Nicol R, Norbu N, Norbu C, Novod N, Nyima T, Olandt P, O'Neill B, O'Neill K, Osman S, Oyono L, Patti C, Perrin D, Phunkhang P, Pierre F, Priest M, Rachupka A, Raghuraman S, Rameau R, Ray V, Raymond C, Rege F, Rise C, 
Rogers J, Rogov P, Sahalie J, Settipalli S, Sharpe T, Shea T, Sheehan M, Sherpa N, Shi J, Shih D, Sloan J, Smith C, Sparrow T, Stalker J, Stange-Thomann N, Stavropoulos S, Stone C, Stone S, Sykes S, Tchuinga P, Tenzing P, Tesfaye S, Thoulutsang D, Thoulutsang Y, Topham K, Topping I, Tsamla T, Vassiliev H, Venkataraman V, Vo A, Wangchuk T, Wangdi T, Weiand M, Wilkinson J, Wilson A, Yadav S, Yang S, Yang X, Young G, Yu Q, Zainoun J, Zembek L, Zimmer A, Lander ES. Genome sequence, comparative analysis and haplotype structure of the domestic dog. Nature 2005; 438:803-19. [PMID: 16341006 DOI: 10.1038/nature04338] [Citation(s) in RCA: 1677] [Impact Index Per Article: 88.3] [Received: 08/09/2005] [Accepted: 10/11/2005] [Indexed: 12/12/2022]
Abstract
Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.
Affiliation(s)
- Kerstin Lindblad-Toh
- Broad Institute of Harvard and MIT, 320 Charles Street, Cambridge, Massachusetts 02141, USA.
13
Abstract
In mammals, the Y-linked SRY gene is normally responsible for testis induction, yet testis development can occur in the absence of Y-linked genes, including SRY. The canine model of SRY-negative XX sex reversal could lead to the discovery of novel genes in the mammalian sex determination pathway. The autosomal genes causing testis induction in this disorder in dogs, humans, pigs, and horses are presently unknown. In goats, a large deletion is responsible for sex reversal linked to the polled (hornless) phenotype. However, this region has been excluded as being causative of the canine disorder, as have WT1 and DMRT1 in more recent studies. The purpose of this study was to determine whether microsatellite marker alleles near or within five candidate genes (GATA4, FOG2, LHX1, SF1, SOX9) are associated with the affected phenotype in a pedigree of canine SRY-negative XX sex reversal. Primer sequences flanking nucleotide repeats were designed within genomic sequences of canine candidate gene homologues. Fluorescence-labeled polymorphic markers were used to screen a subset of the multigenerational pedigree, and marker alleles were determined by software. Our results indicate that the mutation causing canine SRY-negative XX sex reversal in this pedigree is unlikely to be located in regions containing these candidates.
Affiliation(s)
- K Kothapalli
- J. A. Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA
14
Pujar S, Kothapalli KS, Kirkness E, Van Wormer RH, Meyers-Wallen VN. Exclusion of Lhx9 as a Candidate Gene for Sry-Negative XX Sex Reversal in the American Cocker Spaniel Model. J Hered 2005; 96:452-4. [PMID: 15814894 DOI: 10.1093/jhered/esi058] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022] Open
Abstract
XX sex reversal is known in 17 breeds of dogs. In the American cocker spaniel, it segregates as an autosomal recessive trait, and the affected animals lack the testis determining Sry gene. In the search for an autosomal gene that causes this trait, we considered the possibility of Lhx9, a gene encoding LIM homeobox containing transcription factor 9, as a candidate gene. An American cocker spaniel pedigree showing Sry-negative XX sex reversal phenotype was genotyped with an intronic Lhx9 microsatellite marker. Segregation of the Lhx9 marker in the pedigree indicated that a mutation in canine Lhx9 is not likely to be the cause of Sry-negative XX sex reversal. In addition, using the recently available 7.6X canine genomic sequence, we report the location and genomic organization of canine Lhx9.
Affiliation(s)
- S Pujar
- J. A. Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA
15
Werner P, Raducha MG, Shin D, Ostrander EA, Kirkness E, Patterson DF, Henthorn PS. Assignment of 10 canine genes to the canine linkage and comparative maps. Anim Genet 2004; 35:249-51. [PMID: 15147403 DOI: 10.1111/j.1365-2052.2004.01123.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
Affiliation(s)
- P Werner
- Center for Comparative Medical Genetics and Section of Medical Genetics, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
16
Andelfinger G, Hitte C, Etter L, Guyon R, Bourque G, Tesler G, Pevzner P, Kirkness E, Galibert F, Benson DW. Detailed four-way comparative mapping and gene order analysis of the canine ctvm locus reveals evolutionary chromosome rearrangements. Genomics 2004; 83:1053-62. [PMID: 15177558 DOI: 10.1016/j.ygeno.2003.12.009] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Received: 08/12/2003] [Accepted: 12/17/2003] [Indexed: 11/26/2022]
Abstract
Canine tricuspid valve malformation (CTVM) maps to canine chromosome 9 (CFA9), in a region syntenic with gene-dense human chromosome 17q. To define synteny blocks, we analyzed 148 markers on CFA9 using radiation hybrid mapping and established a four-way comparative map for human, mouse, rat, and dog. We identified a large number of rearrangements, allowing us to reconstruct the evolutionary history of individual synteny blocks and large chromosomal segments. A most parsimonious rearrangement scenario for all four species reveals that human chromosome 17q differs from CFA9 and the syntenic rodent chromosomes through two macroreversals of 9.2 and 23 Mb. Compared to a recovered ancestral gene order, CFA9 has undergone 11 reversals of <3 Mb and 2 reversals of >3 Mb. Interspecies reuse of breakpoints for micro- and macrorearrangements was observed. Gene order and content of the ctvm interval are best extrapolated from murine data, showing that multispecies genome rearrangement scenarios contribute to identifying gene content in canine mapping studies.
Affiliation(s)
- G Andelfinger
- Cardiovascular Genetics, Division of Cardiology, ML 7042, Cincinnati Children's Hospital, 3333 Burnet Avenue, Cincinnati, OH 45229, USA
17
Dermitzakis ET, Kirkness E, Schwarz S, Birney E, Reymond A, Antonarakis SE. Comparison of human chromosome 21 conserved nongenic sequences (CNGs) with the mouse and dog genomes shows that their selective constraint is independent of their genic environment. Genome Res 2004; 14:852-9. [PMID: 15078857 PMCID: PMC479112 DOI: 10.1101/gr.1934904] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.0] [Indexed: 11/24/2022]
Abstract
The analysis of conservation between the human and mouse genomes resulted in the identification of a large number of conserved nongenic sequences (CNGs). The functional significance of this nongenic conservation remains unknown, however. The availability of the sequence of a third mammalian genome, the dog, allows for a large-scale analysis of evolutionary attributes of CNGs in mammals. We have aligned 1638 previously identified CNGs and 976 conserved exons (CODs) from human chromosome 21 (Hsa21) with their orthologous sequences in mouse and dog. Attributes of selective constraint, such as sequence conservation, clustering, and direction of substitutions were compared between CNGs and CODs, showing a clear distinction between the two classes. We subsequently performed a chromosome-wide analysis of CNGs by correlating selective constraint metrics with their position on the chromosome and relative to their distance from genes. We found that CNGs appear to be randomly arranged in intergenic regions, with no bias to be closer or farther from genes. Moreover, conservation and clustering of substitutions of CNGs appear to be completely independent of their distance from genes. These results suggest that the majority of CNGs are not typical of previously described regulatory elements in terms of their location. We propose models for a global role of CNGs in genome function and regulation, through long-distance cis or trans chromosomal interactions.
Affiliation(s)
- Emmanouil T Dermitzakis
- Division of Medical Genetics, University of Geneva Medical School, CH-1211 Geneva, Switzerland.
18
Affiliation(s)
- K S D Kothapalli
- JA Baker Institute for Animal Health, College of Veterinary Medicine, Cornell University, Ithaca, NY 14853, USA
19
Quignon P, Kirkness E, Cadieu E, Touleimat N, Guyon R, Renier C, Hitte C, André C, Fraser C, Galibert F. Comparison of the canine and human olfactory receptor gene repertoires. Genome Biol 2003; 4:R80. [PMID: 14659017 PMCID: PMC329419 DOI: 10.1186/gb-2003-4-12-r80] [Citation(s) in RCA: 80] [Impact Index Per Article: 3.8] [Received: 07/25/2003] [Revised: 10/01/2003] [Accepted: 11/03/2003] [Indexed: 11/25/2022] Open
Abstract
Background: Olfactory receptors (ORs), the first dedicated molecules with which odorants physically interact to arouse an olfactory sensation, constitute the largest gene family in vertebrates, including around 900 genes in human and 1,500 in the mouse. Whereas dogs, like many other mammals, have a much keener olfactory potential than humans, only 21 canine OR genes have been described to date. Results: In this study, 817 novel canine OR sequences were identified, and 640 have been characterized. Of the 661 characterized OR sequences, representing half of the canine repertoire, 18% are predicted to be pseudogenes, compared with 63% in human and 20% in mouse. Phylogenetic analysis of 403 canine OR sequences identified 51 families, and radiation-hybrid mapping of 562 showed that they are distributed on 24 dog chromosomes, in 37 distinct regions. Most of these regions constitute clusters of 2 to 124 closely linked genes. The two largest clusters (124 and 109 OR genes) are located on canine chromosomes 18 and 21. They are orthologous to human clusters located on human chromosomes 11q11-q13 and HSA11p15, containing 174 and 115 ORs respectively. Conclusions: This study shows a strongly conserved genomic distribution of OR genes between dog and human, suggesting that OR genes evolved from a common mammalian ancestral repertoire by successive duplications. In addition, the dog repertoire appears to have expanded relative to that of humans, leading to the emergence of specific canine OR genes.
Affiliation(s)
- Pascale Quignon
- UMR 6061 CNRS Génétique et Développement, Faculté de Médecine, 2 Avenue du Professeur Léon Bernard, 35043 Rennes Cedex, France
- Ewen Kirkness
- The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA
- Edouard Cadieu
- UMR 6061 CNRS Génétique et Développement, Faculté de Médecine, 2 Avenue du Professeur Léon Bernard, 35043 Rennes Cedex, France
- Nizar Touleimat
- UMR 6061 CNRS Génétique et Développement, Faculté de Médecine, 2 Avenue du Professeur Léon Bernard, 35043 Rennes Cedex, France
- Richard Guyon
- UMR 6061 CNRS Génétique et Développement, Faculté de Médecine, 2 Avenue du Professeur Léon Bernard, 35043 Rennes Cedex, France
- Corinne Renier
- UMR 6061 CNRS Génétique et Développement, Faculté de Médecine, 2 Avenue du Professeur Léon Bernard, 35043 Rennes Cedex, France
- Current address: Stanford University School of Medicine, Center for Narcolepsy, 701B Welch Road, Palo Alto, CA 94305-5742, USA
- Christophe Hitte
- UMR 6061 CNRS Génétique et Développement, Faculté de Médecine, 2 Avenue du Professeur Léon Bernard, 35043 Rennes Cedex, France
- Catherine André
- UMR 6061 CNRS Génétique et Développement, Faculté de Médecine, 2 Avenue du Professeur Léon Bernard, 35043 Rennes Cedex, France
- Claire Fraser
- The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA
- Francis Galibert
- UMR 6061 CNRS Génétique et Développement, Faculté de Médecine, 2 Avenue du Professeur Léon Bernard, 35043 Rennes Cedex, France
20
Dermitzakis ET, Reymond A, Scamuffa N, Ucla C, Kirkness E, Rossier C, Antonarakis SE. Evolutionary Discrimination of Mammalian Conserved Non-Genic Sequences (CNGs). Science 2003; 302:1033-5. [PMID: 14526086 DOI: 10.1126/science.1087047] [Citation(s) in RCA: 156] [Impact Index Per Article: 7.4] [Indexed: 11/02/2022]
Abstract
Analysis of the human and mouse genomes identified an abundance of conserved non-genic sequences (CNGs). The significance and evolutionary depth of their conservation remain unanswered. We have quantified levels and patterns of conservation of 191 CNGs of human chromosome 21 in 14 mammalian species. We found that CNGs are significantly more conserved than protein-coding genes and noncoding RNAs (ncRNAs) within the mammalian class from primates to monotremes to marsupials. The pattern of substitutions in CNGs differed from that seen in protein-coding and ncRNA genes and resembled that of protein-binding regions. About 0.3% to 1% of the human genome corresponds to a previously unknown class of extremely constrained CNGs shared among mammals.
Affiliation(s)
- Emmanouil T Dermitzakis
- Division of Medical Genetics and National Center of Competence in Research (NCCR) Frontiers in Genetics, University of Geneva Medical School and University Hospitals, 1211 Geneva, Switzerland.
21
Affiliation(s)
- G Andelfinger
- Cardiovascular Genetics, Division of Cardiology, Cincinnati Children's Hospital, 3333 Burnet Avenue, Cincinnati, OH 45229, USA