51
Sanders AD, Hills M, Porubský D, Guryev V, Falconer E, Lansdorp PM. Characterizing polymorphic inversions in human genomes by single-cell sequencing. Genome Res 2016; 26:1575-1587. [PMID: 27472961 PMCID: PMC5088599 DOI: 10.1101/gr.201160.115] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 10/22/2015] [Accepted: 06/13/2016] [Indexed: 12/23/2022]
Abstract
Identifying genomic features that differ between individuals and cells can help uncover the functional variants that drive phenotypes and disease susceptibilities. For this, single-cell studies are paramount, as it becomes increasingly clear that the contribution of rare but functional cellular subpopulations is important for disease prognosis, management, and progression. Until now, studying these associations has been challenged by our inability to map structural rearrangements accurately and comprehensively. To overcome this, we coupled single-cell sequencing of DNA template strands (Strand-seq) with custom analysis software to rapidly discover, map, and genotype genomic rearrangements at high resolution. This allowed us to explore the distribution and frequency of inversions in a heterogeneous cell population, identify several polymorphic domains in complex regions of the genome, and locate rare alleles in the reference assembly. We then mapped the entire genomic complement of inversions within two unrelated individuals to characterize their distinct inversion profiles and built a nonredundant global reference of structural rearrangements in the human genome. The work described here provides a powerful new framework to study structural variation and genomic heterogeneity in single-cell samples, whether from individuals for population studies or tissue types for biomarker discovery.
Affiliation(s)
- Ashley D Sanders
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Mark Hills
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- David Porubský
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, NL-9713 AV Groningen, The Netherlands
- Victor Guryev
- European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, NL-9713 AV Groningen, The Netherlands
- Ester Falconer
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada
- Peter M Lansdorp
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, V5Z 1L3, Canada; European Research Institute for the Biology of Ageing, University of Groningen, University Medical Centre Groningen, NL-9713 AV Groningen, The Netherlands; Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
52
Searles Quick VB, Davis JM, Olincy A, Sikela JM. DUF1220 copy number is associated with schizophrenia risk and severity: implications for understanding autism and schizophrenia as related diseases. Transl Psychiatry 2015; 5:e697. [PMID: 26670282 PMCID: PMC5068589 DOI: 10.1038/tp.2015.192] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 07/27/2015] [Revised: 09/29/2015] [Accepted: 10/21/2015] [Indexed: 11/30/2022]
Abstract
The copy number of DUF1220, a protein domain implicated in human brain evolution, has been linearly associated with autism severity. Given the possibility that autism and schizophrenia are related disorders, the present study examined DUF1220 copy number variation in schizophrenia severity. There are notable similarities between autism symptoms and schizophrenia negative symptoms, and divergence between autism symptoms and schizophrenia positive symptoms. We therefore also examined DUF1220 copy number in schizophrenia subgroups defined by negative and positive symptom features, versus autistic individuals and controls. In the schizophrenic population (N=609), decreased DUF1220 copy number was linearly associated with increasing positive symptom severity (CON1 P=0.013, HLS1 P=0.0227), an association greatest in adult-onset schizophrenia (CON1 P=0.00155, HLS1 P=0.00361). In schizophrenic males, DUF1220 CON1 subtype copy number increase was associated with increased negative symptom severity (P=0.0327), a finding similar to that seen in autistic populations. Subgroup analyses demonstrated that schizophrenic individuals with predominantly positive symptoms exhibited reduced CON1 copy number compared with both controls (P=0.0237) and schizophrenic individuals with predominantly negative symptoms (P=0.0068). These findings support the view that (1) autism and schizophrenia exhibit both opposing and partially overlapping phenotypes and may represent a disease continuum, (2) variation in DUF1220 copy number contributes to schizophrenia disease risk and to the severity of both disorders, and (3) schizophrenia and autism may be, in part, a harmful by-product of the rapid and extreme evolutionary increase in DUF1220 copy number in the human species.
Affiliation(s)
- V B Searles Quick
- Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Genomics and Medical Scientist Training Programs, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- J M Davis
- Department of Psychiatry, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- A Olincy
- Department of Psychiatry, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- J M Sikela
- Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Genomics and Medical Scientist Training Programs, University of Colorado Anschutz Medical Campus, Aurora, CO, USA; Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Genomics and Medical Scientist Training Programs, University of Colorado Anschutz Medical Campus, 12801 E. 17th Avenue, Aurora, CO 80045, USA
53
Zimmer F, Montgomery SH. Phylogenetic Analysis Supports a Link between DUF1220 Domain Number and Primate Brain Expansion. Genome Biol Evol 2015; 7:2083-8. [PMID: 26112965 PMCID: PMC4558844 DOI: 10.1093/gbe/evv122] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Indexed: 11/18/2022]
Abstract
The expansion of DUF1220 domain copy number during human evolution is a dramatic example of rapid and repeated domain duplication. Although patterns of expression, homology, and disease associations suggest a role in cortical development, this hypothesis has not been robustly tested using phylogenetic methods. Here, we estimate DUF1220 domain counts across 12 primate genomes using a nucleotide Hidden Markov Model. We then test a series of hypotheses designed to examine the potential evolutionary significance of DUF1220 copy number expansion. Our results suggest a robust association with brain size, and more specifically neocortex volume. In contradiction to previous hypotheses, we find a strong association with postnatal brain development but not with prenatal brain development. Our results provide further evidence of a conserved association between specific loci and brain size across primates, suggesting that human brain evolution may have occurred through a continuation of existing processes.
Affiliation(s)
- Fabian Zimmer
- Department of Genetics, Evolution & Environment, University College London, United Kingdom
- Stephen H Montgomery
- Department of Genetics, Evolution & Environment, University College London, United Kingdom
54
Andries V, Vandepoele K, Staes K, Berx G, Bogaert P, Van Isterdael G, Ginneberge D, Parthoens E, Vandenbussche J, Gevaert K, van Roy F. NBPF1, a tumor suppressor candidate in neuroblastoma, exerts growth inhibitory effects by inducing a G1 cell cycle arrest. BMC Cancer 2015; 15:391. [PMID: 25958384 PMCID: PMC4440459 DOI: 10.1186/s12885-015-1408-5] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 07/02/2014] [Accepted: 04/29/2015] [Indexed: 12/11/2022]
Abstract
BACKGROUND: NBPF1 (Neuroblastoma Breakpoint Family, member 1) was originally identified in a neuroblastoma patient on the basis of its disruption by a chromosomal translocation t(1;17)(p36.2;q11.2). Considering this genetic defect and the frequent genomic alterations of the NBPF1 locus in several cancer types, we hypothesized that NBPF1 is a tumor suppressor. Decreased expression of NBPF1 in neuroblastoma cell lines with loss of 1p36 heterozygosity and the marked decrease of anchorage-independent clonal growth of DLD1 colorectal carcinoma cells with induced NBPF1 expression further suggest that NBPF1 functions as a tumor suppressor. However, little is known about the mechanisms involved.
METHODS: Expression of NBPF was analyzed in human skin and human cervix by immunohistochemistry. The effects of NBPF1 on the cell cycle were evaluated by flow cytometry. We investigated by real-time quantitative RT-PCR the expression profile of a panel of genes important in cell cycle regulation. Protein levels of CDKN1A-encoded p21(CIP1/WAF1) were determined by western blotting and the importance of p53 was shown by immunofluorescence and by a loss-of-function approach. LC-MS/MS analysis was used to investigate the proteome of DLD1 colon cancer cells with induced NBPF1 expression. Possible biological interactions between the differentially regulated proteins were investigated with the Ingenuity Pathway Analysis tool.
RESULTS: We show that NBPF is expressed in the non-proliferative suprabasal layers of squamous stratified epithelia of human skin and cervix. Forced expression of NBPF1 in HEK293T cells resulted in a G1 cell cycle arrest that was accompanied by upregulation of the cyclin-dependent kinase inhibitor p21(CIP1/WAF1) in a p53-dependent manner. Additionally, forced expression of NBPF1 in two p53-mutant neuroblastoma cell lines also resulted in a G1 cell cycle arrest and CDKN1A upregulation. However, CDKN1A upregulation by NBPF1 was not observed in the DLD1 cells, which demonstrates that NBPF1 exerts cell-specific effects. In addition, proteome analysis of NBPF1-overexpressing DLD1 cells identified 32 differentially expressed proteins, of which several are implicated in carcinogenesis.
CONCLUSIONS: We demonstrated that NBPF1 exerts different tumor suppressive effects, depending on the cell line analyzed, and provide new clues into the molecular mechanism of the enigmatic NBPF proteins.
Affiliation(s)
- Vanessa Andries
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Biomedical Molecular Biology, Ghent University, Technologiepark 927, B-9052, Ghent, Zwijnaarde, Belgium.
- Karl Vandepoele
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Biomedical Molecular Biology, Ghent University, Technologiepark 927, B-9052, Ghent, Zwijnaarde, Belgium; Laboratory for Molecular Diagnostics - Hematology, Ghent University Hospital, Ghent, Belgium.
- Geert Berx
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Biomedical Molecular Biology, Ghent University, Technologiepark 927, B-9052, Ghent, Zwijnaarde, Belgium.
- Pieter Bogaert
- Inflammation Research Center, VIB, Ghent, Belgium; BARC Global Central Laboratory, Ghent, Zwijnaarde, Belgium.
- Gert Van Isterdael
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Internal Medicine, Ghent University, Ghent, Belgium.
- Eef Parthoens
- Department of Biomedical Molecular Biology, Ghent University, Technologiepark 927, B-9052, Ghent, Zwijnaarde, Belgium; BioImaging Core, VIB, Ghent, Belgium.
- Jonathan Vandenbussche
- Department of Medical Protein Research, VIB, Ghent, Belgium; Department of Biochemistry, Ghent University, Ghent, Belgium.
- Kris Gevaert
- Department of Medical Protein Research, VIB, Ghent, Belgium; Department of Biochemistry, Ghent University, Ghent, Belgium.
- Frans van Roy
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Biomedical Molecular Biology, Ghent University, Technologiepark 927, B-9052, Ghent, Zwijnaarde, Belgium.
55
Davis JM, Searles Quick VB, Sikela JM. Replicated linear association between DUF1220 copy number and severity of social impairment in autism. Hum Genet 2015; 134:569-75. [PMID: 25758905 DOI: 10.1007/s00439-015-1537-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 12/29/2014] [Accepted: 02/27/2015] [Indexed: 11/25/2022]
Abstract
Sequences encoding DUF1220 protein domains exhibit an exceptional human-specific increase in copy number and have been associated with several phenotypes related to brain size. Autism is a highly heritable and heterogeneous condition characterized behaviorally by social and communicative impairments, and increased repetitive and stereotyped behavior. Given the accelerated brain growth pattern observed in many individuals with autism, and the association between DUF1220 subtype CON1 copy number and brain size, we previously investigated associations between CON1 copy number and autism-related symptoms. We determined that CON1 copy number increase is associated with increasing severity of all three behavioral features of autism. The present study sought to replicate these findings in an independent population (N = 166). Our results demonstrate a replication of the linear relationship between CON1 copy number and the severity of social impairment in individuals with autism as measured by Autism Diagnostic Interview-Revised Social Diagnostic Score, such that with each additional copy of CON1 Social Diagnostic Score increased 0.24 points (SE = 0.11, p = 0.036). We also identified an analogous trend between CON1 copy number and Communicative Diagnostic Score, but did not replicate the relationship between CON1 copy number and Repetitive Behavior Diagnostic Score. Interestingly, these associations appear to be most pronounced in multiplex children. These results, representing the first replication of a gene dosage relationship with the severity of a primary symptom of autism, lend further support to the possibility that the same protein domain family implicated in the evolutionary expansion of the human brain may also be involved in autism severity.
Affiliation(s)
- J M Davis
- Department of Biochemistry and Molecular Genetics and Human Medical Genetics and Genomics, Medical Scientist Training and Neuroscience Programs, University of Colorado School of Medicine, Anschutz Medical Campus, Aurora, CO, 80045, USA
56
Romanel A, Lago S, Prandi D, Sboner A, Demichelis F. ASEQ: fast allele-specific studies from next-generation sequencing data. BMC Med Genomics 2015; 8:9. [PMID: 25889339 PMCID: PMC4363342 DOI: 10.1186/s12920-015-0084-2] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 11/17/2014] [Accepted: 02/12/2015] [Indexed: 11/17/2022]
Abstract
Background: Single base level information from next-generation sequencing (NGS) allows for the quantitative assessment of biological phenomena such as mosaicism or allele-specific features in healthy and diseased cells. Such studies often present with computationally challenging burdens that hinder genome-wide investigations across large datasets that are now becoming available through the 1,000 Genomes Project and The Cancer Genome Atlas (TCGA) initiatives.
Results: We present ASEQ, a tool to perform gene-level allele-specific expression (ASE) analysis from paired genomic and transcriptomic NGS data without requiring paternal and maternal genome data. ASEQ offers an easy-to-use set of modes that, transparently to the user, take full advantage of a built-in fast computational engine. We report its performance on a set of 20 individuals from the 1,000 Genomes Project and show its detection power on imprinted genes. Next, we demonstrate a high level of concordance in ASE calls when comparing it to the AlleleSeq and MBASED tools. Finally, using a prostate cancer dataset, we report a higher fraction of ASE genes with respect to healthy individuals and show allele-specific events nominated by ASEQ in genes that are implicated in the disease.
Conclusions: ASEQ can be used to rapidly and reliably screen large NGS datasets for the identification of allele-specific features. It can be integrated in any NGS pipeline and runs on computer systems with multiple CPUs, CPUs with multiple cores or across clusters of machines.
Affiliation(s)
- Alessandro Romanel
- Centre for Integrative Biology (CIBIO), University of Trento, Trento, Italy.
- Sara Lago
- Centre for Integrative Biology (CIBIO), University of Trento, Trento, Italy.
- Davide Prandi
- Centre for Integrative Biology (CIBIO), University of Trento, Trento, Italy.
- Andrea Sboner
- Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, USA; Institute for Computational Biomedicine, Weill Cornell Medical College, New York, USA; Institute for Precision Medicine, Weill Cornell Medical College & New York Presbyterian Hospital, New York, USA.
- Francesca Demichelis
- Centre for Integrative Biology (CIBIO), University of Trento, Trento, Italy; Institute for Computational Biomedicine, Weill Cornell Medical College, New York, USA; Institute for Precision Medicine, Weill Cornell Medical College & New York Presbyterian Hospital, New York, USA.
57
Keeney JG, O'Bleness MS, Anderson N, Davis JM, Arevalo N, Busquet N, Chick W, Rozman J, Hölter SM, Garrett L, Horsch M, Beckers J, Wurst W, Klingenspor M, Restrepo D, de Angelis MH, Sikela JM. Generation of mice lacking DUF1220 protein domains: effects on fecundity and hyperactivity. Mamm Genome 2015; 26:33-42. [PMID: 25308000 PMCID: PMC4305498 DOI: 10.1007/s00335-014-9545-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/13/2014] [Accepted: 09/01/2014] [Indexed: 12/30/2022]
Abstract
Sequences encoding DUF1220 protein domains show the most extreme human lineage-specific copy number increase of any coding region in the genome and have been linked to human brain evolution. In addition, DUF1220 copy number (dosage) has been implicated in influencing brain size within the human species, both in normal populations and in individuals associated with brain size pathologies (1q21-associated microcephaly and macrocephaly). More recently, increasing dosage of a subtype of DUF1220 has been linked with increasing severity of the primary symptoms of autism. Despite these intriguing associations, a function for these domains has not been described. As a first step in addressing this question, we have developed the first transgenic model of DUF1220 function by removing the single DUF1220 domain (the ancestral form) encoded in the mouse genome. In a hypothesis-generating exercise, these mice were evaluated by 197 different phenotype measurements. While resulting DUF1220-minus (KO) mice show no obvious anatomical peculiarities, they exhibit a significantly reduced fecundity (χ² = 19.1, df = 2, p = 7.0 × 10⁻⁵). Further extensive phenotypic analyses suggest hyperactivity (p < 0.05) of DUF1220 mice and changes in gene expression levels of brain associated with distinct neurological functions and disease. Other changes that met statistical significance include an increase in plasma glucose concentration (as measured by area under the curve, AUC 0-30 and AUC 30-120) in male mutants, fasting glucose levels, reduced sodium levels in male mutants, increased levels of the liver functional indicator ALAT/GPT in males, levels of alkaline phosphatase (also an indicator of liver function), mean R and SR amplitude by electrocardiography, elevated IgG3 levels, a reduced ratio of CD4:CD8 cells, and a reduced frequency of T cells; though it should be noted that many of these differences are quite small and require further examination. The linking of DUF1220 loss to a hyperactive phenotype is consistent with separate findings in which DUF1220 overexpression results in a down-regulation of mitochondrial function, and potentially suggests a role in developmental metabolism. Finally, the substantially reduced fecundity we observe associated with KO mice argues that the ancestral DUF1220 domain provides an important biological function that is critical to survivability and reproductive success.
Affiliation(s)
- J G Keeney
- Department of Biochemistry and Molecular Genetics and Human Medical Genetics and Neuroscience Programs, University of Colorado School of Medicine, Anschutz Medical Campus, Aurora, CO, 80045, USA
58
Davis JM, Searles VB, Anderson N, Keeney J, Raznahan A, Horwood LJ, Fergusson DM, Kennedy MA, Giedd J, Sikela JM. DUF1220 copy number is linearly associated with increased cognitive function as measured by total IQ and mathematical aptitude scores. Hum Genet 2015; 134:67-75. [PMID: 25287832 PMCID: PMC5898241 DOI: 10.1007/s00439-014-1489-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 07/17/2014] [Accepted: 09/22/2014] [Indexed: 10/24/2022]
Abstract
DUF1220 protein domains exhibit the greatest human lineage-specific copy number expansion of any protein-coding sequence in the genome, and variation in DUF1220 copy number has been linked to both brain size in humans and brain evolution among primates. Given these findings, we examined associations between DUF1220 subtypes CON1 and CON2 and cognitive aptitude. We identified a linear association between CON2 copy number and cognitive function in two independent populations of European descent. In North American males, an increase in CON2 copy number corresponded with an increase in WISC IQ (R² = 0.13, p = 0.02), which may be driven by males aged 6-11 (R² = 0.42, p = 0.003). We utilized ddPCR in a subset as a confirmatory measurement. This group had 26-33 copies of CON2 with a mean of 29, and each copy increase of CON2 was associated with a 3.3-point increase in WISC IQ (R² = 0.22, p = 0.045). In individuals from New Zealand, an increase in CON2 copy number was associated with an increase in math aptitude ability (R² = 0.10, p = 0.018). These were not confounded by brain size. To our knowledge, this is the first study to report a replicated association between copy number of a gene coding sequence and cognitive aptitude. Remarkably, dosage variations involving DUF1220 sequences have now been linked to human brain expansion, autism severity and cognitive aptitude, suggesting that such processes may be genetically and mechanistically inter-related. The findings presented here warrant expanded investigations in larger, well-characterized cohorts.
Affiliation(s)
- Jonathon M Davis
- Department of Biochemistry and Molecular Genetics and Human Medical Genetics, Medical Scientist Training and Neuroscience Programs, University of Colorado School of Medicine, Anschutz Medical Campus, RC1-S, L18-10125, 12801 East 17th Ave, Mailstop 8101, P.O. Box 6511, Aurora, CO, 80045, USA
59
Mayrhofer M, Kultima HG, Birgisson H, Sundström M, Mathot L, Edlund K, Viklund B, Sjöblom T, Botling J, Micke P, Påhlman L, Glimelius B, Isaksson A. 1p36 deletion is a marker for tumour dissemination in microsatellite stable stage II-III colon cancer. BMC Cancer 2014; 14:872. [PMID: 25420937 PMCID: PMC4251789 DOI: 10.1186/1471-2407-14-872] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 07/29/2014] [Accepted: 11/13/2014] [Indexed: 12/15/2022]
Abstract
Background: The clinical behaviour of colon cancer is heterogeneous. Five-year overall survival is 50-65% with all stages included. Recurring somatic chromosomal alterations have been identified and some have shown potential as markers for dissemination of the tumour, which is responsible for most colon cancer deaths. We investigated 115 selected stage II-IV primary colon cancers for associations between chromosomal alterations and tumour dissemination.
Methods: Follow-up was at least 5 years for stage II-III patients without distant recurrence. Affymetrix SNP 6.0 microarrays and allele-specific copy number analysis were used to identify chromosomal alterations. Fisher’s exact test was used to associate alterations with tumour dissemination, detected at diagnosis (stage IV) or later as recurrent disease (stage II-III).
Results: Loss of 1p36.11-21 was associated with tumour dissemination in microsatellite stable tumours of stage II-IV (odds ratio = 5.5). It was enriched to a similar extent in tumours with distant recurrence within stage II and stage III subgroups, and may therefore be used as a prognostic marker at diagnosis. Loss of 1p36.11-21 relative to average copy number of the genome showed similar prognostic value compared to absolute loss of copies. Therefore, the use of relative loss as a prognostic marker would benefit more patients by applying also to hyperploid cancer genomes. The association with tumour dissemination was supported by independent data from The Cancer Genome Atlas.
Conclusion: Deletions on 1p36 may be used to guide adjuvant treatment decisions in microsatellite stable colon cancer of stages II and III.
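The Methods above name Fisher's exact test and the Results report an odds ratio, so a minimal sketch of how both quantities are computed from a 2x2 table may help readers of this entry. It is not the authors' pipeline: the function names are hypothetical, and the counts in the usage lines are invented toy values, not data from the study.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Sample odds ratio (a*d)/(b*c) for the 2x2 table [[a, b], [c, d]].

    Raises ZeroDivisionError when b or c is zero.
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical toy counts, NOT data from the study:
# rows = alteration present / absent, columns = disseminated / not disseminated.
table = (3, 1, 1, 3)
print(odds_ratio(*table))
print(fisher_exact_two_sided(*table))
```

For real analyses an established statistics library is usually preferable; the sketch only makes the arithmetic behind the reported quantities explicit.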
Affiliation(s)
- Anders Isaksson
- Science for Life Laboratory, Department of Medical Sciences, Uppsala University, Box 3056, Uppsala 750 03, Sweden.
60
Abstract
BACKGROUND: Many aspects of autoimmune disease are not well understood, including the specificities of autoimmune targets, and patterns of co-morbidity and cross-heritability across diseases. Prior work has provided evidence that somatic mutation caused by gene conversion and deletion at segmentally duplicated loci is relevant to several diseases. Simple tandem repeat (STR) sequence is highly mutable, both somatically and in the germ-line, and somatic STR mutations are observed under inflammation.
RESULTS: Protein-coding genes spanning STRs having markers of mutability, including germ-line variability, high total length, repeat count and/or repeat similarity, are evaluated in the context of autoimmunity. For the initiation of autoimmune disease, antigens whose autoantibodies are the first observed in a disease, termed primary autoantigens, are informative. Three primary autoantigens, thyroid peroxidase (TPO), phogrin (PTPRN2) and filaggrin (FLG), include STRs that are among the eleven longest STRs spanned by protein-coding genes. This association of primary autoantigens with long STR sequence is highly significant (p < 3.0 × 10⁻⁷). Long STRs occur within twenty genes that are associated with sixteen common autoimmune diseases and atherosclerosis. The repeat within the TTC34 gene is an outlier in terms of length and a link with systemic lupus erythematosus is proposed.
CONCLUSIONS: The results support the hypothesis that many autoimmune diseases are triggered by immune responses to proteins whose DNA sequence mutates somatically in a coherent, consistent fashion. Other autoimmune diseases may be caused by coherent somatic mutations in immune cells. The coherent somatic mutation hypothesis has the potential to be a comprehensive explanation for the initiation of many autoimmune diseases.
Affiliation(s)
- Kenneth Andrew Ross
- Department of Computer Science, Columbia University, New York, New York, United States of America
61
DUF1220 protein domains drive proliferation in human neural stem cells and are associated with increased cortical volume in anthropoid primates. Brain Struct Funct 2014; 220:3053-60. [PMID: 24957859 DOI: 10.1007/s00429-014-0814-9] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Received: 11/20/2013] [Accepted: 05/27/2014] [Indexed: 10/25/2022]
Abstract
Genome sequences encoding DUF1220 protein domains show a burst in copy number among anthropoid species and especially humans, where they have undergone the greatest human lineage-specific copy number expansion of any protein coding sequence in the genome. While DUF1220 copy number shows a dosage-related association with brain size in both normal populations and in 1q21.1-associated microcephaly and macrocephaly, a function for these domains has not yet been described. Here we provide multiple lines of evidence supporting the view that DUF1220 domains function as drivers of neural stem cell proliferation among anthropoid species including humans. First, we show that brain MRI data from 131 individuals across 7 anthropoid species shows a strong correlation between DUF1220 copy number and multiple brain size-related measures. Using in situ hybridization analyses of human fetal brain, we also show that DUF1220 domains are expressed in the ventricular zone and primarily during human cortical neurogenesis, and are therefore expressed at the right time and place to be affecting cortical brain development. Finally, we demonstrate that in vitro expression of DUF1220 sequences in neural stem cells strongly promotes proliferation. Taken together, these data provide the strongest evidence so far reported implicating DUF1220 dosage in anthropoid and human brain expansion through mechanisms involving increasing neural stem cell proliferation.
62
Keeney JG, Dumas L, Sikela JM. The case for DUF1220 domain dosage as a primary contributor to anthropoid brain expansion. Front Hum Neurosci 2014; 8:427. [PMID: 25009482 PMCID: PMC4067907 DOI: 10.3389/fnhum.2014.00427] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.0] [Received: 12/31/2013] [Accepted: 05/28/2014] [Indexed: 12/14/2022]
Abstract
Here we present the hypothesis that increasing copy number (dosage) of sequences encoding DUF1220 protein domains is a major contributor to the evolutionary increase in brain size, neuron number, and cognitive capacity that is associated with the primate order. We further propose that this relationship is restricted to the anthropoid sub-order of primates, with DUF1220 copy number markedly increasing in monkeys, further in apes, and most extremely in humans where the greatest number of copies (~272 haploid copies) is found. We show that this increase closely parallels the increase in brain size and neuron number that has occurred among anthropoid primate species. We also provide evidence linking DUF1220 copy number to brain size within the human species, both in normal populations and in individuals associated with brain size pathologies (1q21-associated microcephaly and macrocephaly). While we believe these and other findings presented here strongly suggest increase in DUF1220 copy number is a key contributor to anthropoid brain expansion, the data currently available rely largely on correlative measures that, though considerable, do not yet provide direct evidence for a causal connection. Nevertheless, we believe the evidence presented is sufficient to provide the basis for a testable model which proposes that DUF1220 protein domain dosage increase is a main contributor to the increase in brain size and neuron number found among the anthropoid primate species and that is at its most extreme in human.
Affiliation(s)
- Jonathon G Keeney
- Department of Biochemistry and Molecular Genetics and Human Medical Genetics and Neuroscience Programs, University of Colorado School of Medicine, Anschutz Medical Campus, Aurora, CO, USA
- Laura Dumas
- Department of Biochemistry and Molecular Genetics and Human Medical Genetics and Neuroscience Programs, University of Colorado School of Medicine, Anschutz Medical Campus, Aurora, CO, USA
- James M Sikela
- Department of Biochemistry and Molecular Genetics and Human Medical Genetics and Neuroscience Programs, University of Colorado School of Medicine, Anschutz Medical Campus, Aurora, CO, USA
|
63
|
Li W, Li K, Zhao L, Zou H. Bioinformatics analysis reveals disturbance mechanism of MAPK signaling pathway and cell cycle in Glioblastoma multiforme. Gene 2014; 547:346-50. [PMID: 24967941 DOI: 10.1016/j.gene.2014.06.042] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 09/24/2013] [Revised: 06/09/2014] [Accepted: 06/21/2014] [Indexed: 01/09/2023]
Abstract
BACKGROUND & OBJECTIVES: To analyze the reversal gene pairs and identify featured reversal genes related to the mitogen-activated protein kinase (MAPK) signaling pathway and cell cycle in Glioblastoma multiforme (GBM) to reveal its pathogenetic mechanism. METHODS: We downloaded the gene expression profile GSE4290 from the Gene Expression Omnibus database, including 81 gene chips of GBM and 23 gene chips of controls. The t test was used to analyze the DEGs (differentially expressed genes) between 23 normal and 81 GBM samples. Then some perturbed pathways, including the MAPK and cell cycle signaling pathways, were extracted from the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway database. Cancer genes were obtained from the Cancer Gene Census database. The reversal gene pairs between DEGs and cancer genes were further analyzed in the MAPK and cell cycle signaling pathways. RESULTS: A total of 8523 DEGs were obtained, including 4090 up-regulated and 4433 down-regulated genes. Among them, ras-related protein rab-13 (RAB13), neuroblastoma breakpoint family member 10 (NBPF10) and disks large homologue 4 (DLG4) were found to be involved in GBM for the first time. We obtained the MAPK and cell cycle signaling pathways from the KEGG database. By analyzing the perturbing mechanism in these two pathways, we identified several reversal gene pairs, including NRAS (neuroblastoma RAS) and CDK2 (cyclin-dependent kinase 2), CCND1 (cyclin D1) and FGFR (fibroblast growth factor receptor). Further analysis showed that NRAS and CDK2 were positively related with GBM. However, FGFR2 and CCND1 were negatively related with GBM. INTERPRETATION & CONCLUSIONS: These findings suggest that the newly identified DEGs and featured reversal gene pairs participating in the MAPK and cell cycle signaling pathways may provide a new therapeutic line of approach to GBM.
Affiliation(s)
- Wusheng Li
- Department of Oncology, Shengjing Hospital of China Medical University, Shenyang 110023, China
- Kai Li
- Department of Oncology, Shengjing Hospital of China Medical University, Shenyang 110023, China
- Li Zhao
- Department of Oncology, Shengjing Hospital of China Medical University, Shenyang 110023, China
- Huawei Zou
- Department of Oncology, Shengjing Hospital of China Medical University, Shenyang 110023, China
|
64
|
Zhang Q, Su B. Evolutionary origin and human-specific expansion of a cancer/testis antigen gene family. Mol Biol Evol 2014; 31:2365-75. [PMID: 24916032 DOI: 10.1093/molbev/msu188] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 12/15/2022]
Abstract
Cancer/testis (CT) antigens are encoded by germline genes and are aberrantly expressed in a number of human cancers. Interestingly, CT antigens are frequently involved in gene families that are highly expressed in germ cells. Here, we presented an evolutionary analysis of the CTAGE (cutaneous T-cell-lymphoma-associated antigen) gene family to delineate its molecular history and functional significance during primate evolution. Comparisons among human, chimpanzee, gorilla, orangutan, macaque, marmoset, and other mammals show a rapid and primate specific expansion of CTAGE family, which starts with an ancestral retroposition in the haplorhini ancestor. Subsequent DNA-based duplications lead to the prosperity of single-exon CTAGE copies in catarrhines, especially in humans. Positive selection was identified on the single-exon copies in comparison with functional constraint on the multiexon copies. Further sequence analysis suggests that the newly derived CTAGE genes may obtain regulatory elements from long terminal repeats. Our result indicates the dynamic evolution of primate genomes, and the recent expansion of this CT antigen family in humans may confer advantageous phenotypic traits during early human evolution.
Affiliation(s)
- Qu Zhang
- Department of Human Evolutionary Biology, Graduate School of Arts and Sciences, Harvard University
- Bing Su
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
|
65
|
Tsakogiannis D, Kyriakopoulou Z, Darmis F, Ruether I, Dimitriou T, Orfanoudakis G, Panotopoulou E, Markoulatos P. Prevalence of HPV16 E1-1374^63nt variants in Greek women. J Med Virol 2014; 86:778-84. [DOI: 10.1002/jmv.23896] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Accepted: 01/14/2014] [Indexed: 11/07/2022]
Affiliation(s)
- D. Tsakogiannis
- Department of Biochemistry & Biotechnology, Microbiology-Virology Laboratory; School of Health Sciences; University of Thessaly; Larissa Greece
- Z. Kyriakopoulou
- Department of Biochemistry & Biotechnology, Microbiology-Virology Laboratory; School of Health Sciences; University of Thessaly; Larissa Greece
- F. Darmis
- Department of Biochemistry & Biotechnology, Microbiology-Virology Laboratory; School of Health Sciences; University of Thessaly; Larissa Greece
- I.G.A. Ruether
- Department of Biochemistry & Biotechnology, Microbiology-Virology Laboratory; School of Health Sciences; University of Thessaly; Larissa Greece
- T.G. Dimitriou
- Department of Biochemistry & Biotechnology, Microbiology-Virology Laboratory; School of Health Sciences; University of Thessaly; Larissa Greece
- G. Orfanoudakis
- Oncoprotein Group; University of Strasbourg; CNRS FRE 3211, The Biotechnology School of Strasbourg, ESBS, University of Strasbourg; Illkirch France
- E. Panotopoulou
- Papanicolaou Research Centre of Oncology and Experimental Surgery; Anticancer Oncology Hospital of Athens “St Savvas”; Athens Greece
- P. Markoulatos
- Department of Biochemistry & Biotechnology, Microbiology-Virology Laboratory; School of Health Sciences; University of Thessaly; Larissa Greece
|
66
|
Li W, Freudenberg J, Miramontes P. Diminishing return for increased Mappability with longer sequencing reads: implications of the k-mer distributions in the human genome. BMC Bioinformatics 2014; 15:2. [PMID: 24386976 PMCID: PMC3927684 DOI: 10.1186/1471-2105-15-2] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 08/25/2013] [Accepted: 12/17/2013] [Indexed: 11/10/2022]
Abstract
Background: The amount of non-unique sequence (non-singletons) in a genome directly affects the difficulty of read alignment to a reference assembly for high-throughput sequencing data. Although a longer read is more likely to be uniquely mapped to the reference genome, a quantitative analysis of the influence of read lengths on mappability has been lacking. To address this question, we evaluate the k-mer distribution of the human reference genome. The k-mer frequency is determined for k ranging from 20 bp to 1000 bp. Results: We observe that the proportion of non-singleton k-mers decreases slowly with increasing k, and can be fitted by piecewise power-law functions with different exponents at different ranges of k. A slower decay at greater values for k indicates more limited gains in mappability for read lengths between 200 bp and 1000 bp. The frequency distributions of k-mers exhibit long tails with a power-law-like trend, and rank frequency plots exhibit a concave Zipf’s curve. The most frequent 1000-mers comprise 172 regions, which include four large stretches on chromosomes 1 and X, containing genes of biomedical relevance. Comparison with other databases indicates that the 172 regions can be broadly classified into two types: those containing LINE transposable elements and those containing segmental duplications. Conclusion: Read mappability as measured by the proportion of singletons increases steadily up to the length scale around 200 bp. When read length increases above 200 bp, smaller gains in mappability are expected. Moreover, the proportion of non-singletons decreases with read length much more slowly than linearly. Even a read length of 1000 bp would not allow the unique alignment of reads for many coding regions of human genes. A mix of techniques will be needed for efficiently producing high-quality data that cover the complete human genome.
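The singleton measure described in this abstract can be illustrated on a toy sequence. The sketch below is not the authors' pipeline: the function name `non_singleton_fraction` is hypothetical, it counts k-mer positions (rather than distinct k-mers) on a single strand, and it ignores reverse complements, all of which are simplifying assumptions.

```python
from collections import Counter


def non_singleton_fraction(sequence: str, k: int) -> float:
    """Return the fraction of k-mer positions whose k-mer occurs more than once.

    Assumptions: single strand only, no reverse complements, no masking of N bases.
    """
    n_positions = len(sequence) - k + 1
    if k <= 0 or n_positions <= 0:
        raise ValueError("k must be between 1 and the sequence length")
    # Count every k-mer observed along the sequence.
    counts = Counter(sequence[i:i + k] for i in range(n_positions))
    # Sum the occurrences of every k-mer that was seen at least twice.
    repeated_positions = sum(c for c in counts.values() if c > 1)
    return repeated_positions / n_positions


if __name__ == "__main__":
    toy = "ACGTACGT"  # hypothetical toy sequence, not real genomic data
    for k in (2, 4, 5):
        print(k, non_singleton_fraction(toy, k))
```

On a real genome the same idea would need a disk-backed or otherwise memory-efficient k-mer counter; the in-memory `Counter` here is only suitable for short sequences.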
Affiliation(s)
- Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, 350 Community Drive, Manhasset, USA
|
67
|
Kozanitis C, Heiberg A, Varghese G, Bafna V. Using Genome Query Language to uncover genetic variation. Bioinformatics 2014; 30:1-8. [PMID: 23751181 PMCID: PMC3866549 DOI: 10.1093/bioinformatics/btt250] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Received: 03/15/2013] [Revised: 04/25/2013] [Accepted: 04/26/2013] [Indexed: 12/27/2022]
Abstract
MOTIVATION: With high-throughput DNA sequencing costs dropping below $1000 for human genomes, data storage, retrieval and analysis are the major bottlenecks in biological studies. To address the large-data challenges, we advocate a clean separation between the evidence collection and the inference in variant calling. We define and implement a Genome Query Language (GQL) that allows for the rapid collection of evidence needed for calling variants. RESULTS: We provide a number of cases to showcase the use of GQL for complex evidence collection, such as the evidence for large structural variations. Specifically, typical GQL queries can be written in 5-10 lines of high-level code and search large datasets (100 GB) in minutes. We also demonstrate its complementarity with other variant calling tools. Popular variant calling tools can achieve one order of magnitude speed-up by using GQL to retrieve evidence. Finally, we show how GQL can be used to query and compare multiple datasets. By separating the evidence and inference for variant calling, GQL frees variant detection tools from data-intensive evidence collection so that they can focus on statistical inference. AVAILABILITY: GQL can be downloaded from http://cseweb.ucsd.edu/~ckozanit/gql.
Affiliation(s)
- Christos Kozanitis
- Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, San Diego, CA 92123 and Microsoft Research, 1065 La Avenida, Mountain View, CA 94043, USA
|
68
|
Dumont BL, Eichler EE. Signals of historical interlocus gene conversion in human segmental duplications. PLoS One 2013; 8:e75949. [PMID: 24124524 PMCID: PMC3790853 DOI: 10.1371/journal.pone.0075949] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 06/11/2013] [Accepted: 08/17/2013] [Indexed: 12/04/2022]
Abstract
Standard methods of DNA sequence analysis assume that sequences evolve independently, yet this assumption may not be appropriate for segmental duplications that exchange variants via interlocus gene conversion (IGC). Here, we use high quality multiple sequence alignments from well-annotated segmental duplications to systematically identify IGC signals in the human reference genome. Our analysis combines two complementary methods: (i) a paralog quartet method that uses DNA sequence simulations to identify a statistical excess of sites consistent with inter-paralog exchange, and (ii) the alignment-based method implemented in the GENECONV program. One-quarter (25.4%) of the paralog families in our analysis harbor clear IGC signals by the quartet approach. Using GENECONV, we identify 1477 gene conversion tracks that cumulatively span 1.54 Mb of the genome. Our analyses confirm the previously reported high rates of IGC in subtelomeric regions and Y-chromosome palindromes, and identify multiple novel IGC hotspots, including the pregnancy specific glycoproteins and the neuroblastoma breakpoint gene families. Although the duplication history of a paralog family is described by a single tree, we show that IGC has introduced incredible site-to-site variation in the evolutionary relationships among paralogs in the human genome. Our findings indicate that IGC has left significant footprints in patterns of sequence diversity across segmental duplications in the human genome, out-pacing the contributions of single base mutation by orders of magnitude. Collectively, the IGC signals we report comprise a catalog that will provide a critical reference for interpreting observed patterns of DNA sequence variation across duplicated genomic regions, including targets of recent adaptive evolution in humans.
Affiliation(s)
- Beth L. Dumont
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Evan E. Eichler
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Howard Hughes Medical Institute, Seattle, Washington, United States of America
|
69
|
Zhou F, Xing Y, Xu X, Yang Y, Zhang J, Ma Z, Wang J. NBPF is a potential DNA-binding transcription factor that is directly regulated by NF-κB. Int J Biochem Cell Biol 2013; 45:2479-90. [PMID: 23939288 DOI: 10.1016/j.biocel.2013.07.022] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 04/08/2013] [Revised: 07/03/2013] [Accepted: 07/29/2013] [Indexed: 02/07/2023]
Abstract
The neuroblastoma breakpoint family (NBPF) has been reported to play potential roles in the development of neuroblastoma and in human evolution. However, the exact regulation and function of this family are still unknown. In this study, the genes of the NBPF family were found to be densely covered by many high-confidence ChIP-Seq peaks of NF-κB. The expression of NBPF genes was thus deduced to be regulated by this transcription factor. The activity of NF-κB in HeLa, HepG2 and ECa109 cells was then manipulated with an NF-κB activator (TNFα) and inhibitors (BAY11-7082 or p65 siRNA), and the expression of NBPF genes in these cells was checked. As a result, it was found that the expression of NBPF genes was regulated by NF-κB in HeLa and HepG2 cells. Therefore, the genes of the NBPF family were identified as new bona fide direct target genes of NF-κB. In addition, NBPF was identified as a nuclear protein by in silico prediction and immunolocalization. Finally, bioinformatics analysis revealed that most NBPF proteins contain classical nuclear localization signals (NLSs) and, in their N-termini, a conserved DNA-binding domain similar to that in the transcription factor STAT3b/DNA complex or STAT-1/DNA complex. Therefore, this study concluded that NBPF is a nuclear protein that contains classical NLSs and a conserved known DNA-binding domain, and that its expression is regulated by another important transcription factor, NF-κB. These findings suggest that NBPF may function as a DNA-binding transcription factor in the nucleus, which provides important new insight into the functions of NBPF genes in human cells.
Affiliation(s)
- Fei Zhou
- The State Key Laboratory of Bioelectronics, Southeast University, Nanjing 210096, China.
|
70
|
Potential role of human-specific genes, human-specific microRNAs and human-specific non-coding regulatory RNAs in the pathogenesis of systemic sclerosis and Sjögren's syndrome. Autoimmun Rev 2013; 12:1046-51. [PMID: 23684698 DOI: 10.1016/j.autrev.2013.04.004] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Received: 04/01/2013] [Accepted: 04/24/2013] [Indexed: 12/20/2022]
Abstract
The etiology and pathogenesis of human autoimmune diseases remain unknown despite intensive investigations. Although remarkable progress has been accomplished through genome-wide association studies in the identification of genetic factors that may predispose to their occurrence or modify their clinical presentation, to date no specific gene abnormalities have been conclusively demonstrated to be responsible for these diseases. The completion of the human and chimpanzee genome sequencing has opened up novel opportunities to examine the possible contribution of human-specific genes and other regulatory elements unique to the human genome, such as microRNAs and non-coding RNAs, towards the pathogenesis of a variety of human disorders. Thus, it is likely that these human-specific genes and non-coding regulatory elements may be involved in the development or the pathogenesis of various disorders that do not occur in non-human primates, including certain autoimmune diseases such as Systemic Sclerosis and Primary Sjögren's Syndrome. Here, we discuss recent evidence supporting the notion that human-specific genes or human-specific microRNA and other non-coding RNA regulatory elements unique to the human genome may participate in the development or in the pathogenesis of Systemic Sclerosis and Primary Sjögren's Syndrome.
|
71
|
Abstract
Given the unprecedented tools that are now available for rapidly comparing genomes, the identification and study of genetic and genomic changes that are unique to our species have accelerated, and we are entering a golden age of human evolutionary genomics. Here we provide an overview of these efforts, highlighting important recent discoveries, examples of the different types of human-specific genomic and genetic changes identified, and salient trends, such as the localization of evolutionary adaptive changes to complex loci that are highly enriched for disease associations. Finally, we discuss the remaining challenges, such as the incomplete nature of current genome sequence assemblies and difficulties in linking human-specific genomic changes to human-specific phenotypic traits.
|
72
|
Giannuzzi G, Siswara P, Malig M, Marques-Bonet T, Mullikin JC, Ventura M, Eichler EE. Evolutionary dynamism of the primate LRRC37 gene family. Genome Res 2012; 23:46-59. [PMID: 23064749 PMCID: PMC3530683 DOI: 10.1101/gr.138842.112] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 12/31/2022]
Abstract
Core duplicons in the human genome represent ancestral duplication modules shared by the majority of intrachromosomal duplication blocks within a given chromosome. These cores are associated with the emergence of novel gene families in the hominoid lineage, but their genomic organization and gene characterization among other primates are largely unknown. Here, we investigate the genomic organization and expression of the core duplicon on chromosome 17 that led to the expansion of LRRC37 during primate evolution. A comparison of the LRRC37 gene family organization in human, orangutan, macaque, marmoset, and lemur genomes shows the presence of both orthologous and species-specific gene copies in all primate lineages. Expression profiling in mouse, macaque, and human tissues reveals that the ancestral expression of LRRC37 was restricted to the testis. In the hominid lineage, the pattern of LRRC37 became increasingly ubiquitous, with significantly higher levels of expression in the cerebellum and thymus, and showed a remarkable diversity of alternative splice forms. Transfection studies in HeLa cells indicate that the human FLAG-tagged recombinant LRRC37 protein is secreted after cleavage of a transmembrane precursor and its overexpression can induce filopodia formation.
Affiliation(s)
- Giuliana Giannuzzi
- Dipartimento di Biologia, Università degli Studi di Bari Aldo Moro, Bari 70126, Italy
|
73
|
DUF1220-domain copy number implicated in human brain-size pathology and evolution. Am J Hum Genet 2012; 91:444-54. [PMID: 22901949 DOI: 10.1016/j.ajhg.2012.07.016] [Citation(s) in RCA: 83] [Impact Index Per Article: 6.9] [Received: 03/19/2012] [Revised: 05/17/2012] [Accepted: 07/25/2012] [Indexed: 02/04/2023]
Abstract
DUF1220 domains show the largest human-lineage-specific increase in copy number of any protein-coding region in the human genome and map primarily to 1q21, where deletions and reciprocal duplications have been associated with microcephaly and macrocephaly, respectively. Given these findings and the high correlation between DUF1220 copy number and brain size across primate lineages (R² = 0.98; p = 1.8 × 10⁻⁶), DUF1220 sequences represent plausible candidates for underlying 1q21-associated brain-size pathologies. To investigate this possibility, we used specialized bioinformatics tools developed for scoring highly duplicated DUF1220 sequences to implement targeted 1q21 array comparative genomic hybridization on individuals (n = 42) with 1q21-associated microcephaly and macrocephaly. We show that of all the 1q21 genes examined (n = 53), DUF1220 copy number shows the strongest association with brain size among individuals with 1q21-associated microcephaly, particularly with respect to the three evolutionarily conserved DUF1220 clades CON1 (p = 0.0079), CON2 (p = 0.0134), and CON3 (p = 0.0116). Interestingly, all 1q21 DUF1220-encoding genes belonging to the NBPF family show significant correlations with frontal-occipital-circumference Z scores in the deletion group. In a similar survey of a nondisease population, we show that DUF1220 copy number exhibits the strongest correlation with brain gray-matter volume (CON1, p = 0.0246; and CON2, p = 0.0334). Notably, only DUF1220 sequences are consistently significant in both disease and nondisease populations. Taken together, these data strongly implicate the loss of DUF1220 copy number in the etiology of 1q21-associated microcephaly and support the view that DUF1220 domains function as general effectors of evolutionary, pathological, and normal variation in brain size.
|
74
|
O’Bleness MS, Dickens CM, Dumas LJ, Kehrer-Sawatzki H, Wyckoff GJ, Sikela JM. Evolutionary history and genome organization of DUF1220 protein domains. G3 (Bethesda) 2012; 2:977-86. [PMID: 22973535 PMCID: PMC3429928 DOI: 10.1534/g3.112.003061] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Received: 05/09/2012] [Accepted: 06/05/2012] [Indexed: 12/04/2022]
Abstract
DUF1220 protein domains exhibit the most extreme human lineage-specific (HLS) copy number increase of any protein coding region in the human genome and have recently been linked to evolutionary and pathological changes in brain size (e.g., 1q21-associated microcephaly). These findings lend support to the view that DUF1220 domain dosage is a key factor in the determination of primate (and human) brain size. Here we analyze 41 animal genomes and present the most complete account to date of the evolutionary history and genome organization of DUF1220 domains and the gene family that encodes them (NBPF). Included among the novel features identified by this analysis is a DUF1220 domain precursor in nonmammalian vertebrates, a unique predicted promoter common to all mammalian NBPF genes, six distinct clades into which DUF1220 sequences can be subdivided, and a previously unknown member of the NBPF gene family (NBPF25). Most importantly, we show that the exceptional HLS increase in DUF1220 copy number (from 102 in our last common ancestor with chimp to 272 in human; an average HLS increase of ~28 copies every million years since the Homo/Pan split) was driven by intragenic domain hyperamplification. This increase primarily involved a 4.7 kb, tandemly repeated three DUF1220 domain unit we have named the HLS DUF1220 triplet, a motif that is a likely candidate to underlie key properties unique to the Homo sapiens brain. Interestingly, all copies of the HLS DUF1220 triplet lie within a human-specific pericentric inversion that also includes the 1q12 C-band, a polymorphic heterochromatin expansion that is unique to the human genome. Both cytogenetic features likely played key roles in the rapid HLS DUF1220 triplet hyperamplification, which is among the most striking genomic changes specific to the human lineage.
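The rate quoted in this abstract can be checked with simple arithmetic. The sketch below uses only the three figures given above (102 ancestral copies, 272 human copies, about 28 copies per million years) and derives the elapsed time they jointly imply. The function name is hypothetical, and the result is an implied value, not a divergence date reported by the authors.

```python
def implied_elapsed_million_years(ancestral_copies: int,
                                  derived_copies: int,
                                  copies_per_million_years: float) -> float:
    """Return the elapsed time (in million years) implied by a copy-number gain and an average rate."""
    if copies_per_million_years <= 0:
        raise ValueError("rate must be positive")
    # Net copies gained along the lineage divided by the average gain per million years.
    return (derived_copies - ancestral_copies) / copies_per_million_years


if __name__ == "__main__":
    gained = 272 - 102  # copy numbers quoted in the abstract
    elapsed = implied_elapsed_million_years(102, 272, 28)
    print(f"copies gained: {gained}")
    print(f"implied elapsed time: {elapsed:.1f} million years")
```

Because the abstract gives the rate as approximate (~28), the implied time is also approximate.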
Affiliation(s)
- Majesta S. O’Bleness
- Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Neuroscience Programs, University of Colorado School of Medicine, Aurora, Colorado 80045
- C. Michael Dickens
- Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Neuroscience Programs, University of Colorado School of Medicine, Aurora, Colorado 80045
- Laura J. Dumas
- Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Neuroscience Programs, University of Colorado School of Medicine, Aurora, Colorado 80045
- Gerald J. Wyckoff
- Division of Molecular Biology and Biochemistry, School of Biological Sciences, University of Missouri, Kansas City, Missouri 64110
- James M. Sikela
- Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Neuroscience Programs, University of Colorado School of Medicine, Aurora, Colorado 80045
|
75
|
Detection of gene expression changes at chromosomal rearrangement breakpoints in evolution. BMC Bioinformatics 2012; 13 Suppl 3:S6. [PMID: 22536904 PMCID: PMC3402925 DOI: 10.1186/1471-2105-13-s3-s6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/28/2022]
Abstract
Background: We study the relation between genome rearrangements, breakpoints and gene expression. Genome rearrangement research has been concerned with the creation of breakpoints and their position in the chromosome, but the functional consequences of individual breakpoints remain virtually unknown, and there are no direct genome-wide studies of breakpoints from this point of view. A question arises of what the biological consequences of breakpoint creation are, rather than just their structural aspects. The question is whether proximity to the site of a breakpoint event changes the activity of a gene. Results: We investigate this by comparing the distribution of distances to the nearest breakpoint of genes that are differentially expressed with the distribution of the same distances for the entire gene complement. We study this in data on whole blood tissue in human versus macaque, and in cerebral cortex tissue in human versus chimpanzee. We find in both data sets that the distribution of distances to the nearest breakpoint of "changed expression genes" differs little from this distance calculated for the rest of the gene complement. In focusing on the changed expression genes closest to the breakpoints, however, we discover that several of these have previously been implicated in the literature as being connected to the evolutionary divergence of humans from other primates. Conclusions: We conjecture that chromosomal rearrangements occasionally interrupt the regulatory configurations of genes close to the breakpoint, leading to changes in expression.
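The comparison described in the Results can be sketched as follows. This is a minimal illustration, assuming genes and breakpoints are given as integer coordinates on a single chromosome; the function names, the toy coordinates, and the use of the median as a summary statistic are all assumptions, not the authors' method.

```python
from bisect import bisect_left
from statistics import median
from typing import Iterable, List, Sequence


def nearest_breakpoint_distance(position: int, sorted_breakpoints: Sequence[int]) -> int:
    """Return the distance from a gene position to the closest breakpoint.

    `sorted_breakpoints` must be sorted in ascending order and non-empty.
    """
    if not sorted_breakpoints:
        raise ValueError("at least one breakpoint is required")
    i = bisect_left(sorted_breakpoints, position)
    candidates = []
    if i < len(sorted_breakpoints):  # nearest breakpoint at or to the right
        candidates.append(sorted_breakpoints[i] - position)
    if i > 0:  # nearest breakpoint to the left
        candidates.append(position - sorted_breakpoints[i - 1])
    return min(candidates)


def distance_distribution(positions: Iterable[int], breakpoints: Iterable[int]) -> List[int]:
    """Return the sorted nearest-breakpoint distances for a set of gene positions."""
    ordered = sorted(breakpoints)
    return sorted(nearest_breakpoint_distance(p, ordered) for p in positions)


if __name__ == "__main__":
    # Hypothetical toy coordinates, not data from the study.
    breakpoints = [100, 500, 900]
    all_genes = [50, 120, 300, 480, 700, 1000]
    changed_genes = [120, 480]
    print("changed:", median(distance_distribution(changed_genes, breakpoints)))
    print("all:    ", median(distance_distribution(all_genes, breakpoints)))
```

A real analysis would handle multiple chromosomes and would compare the two distributions with a formal statistical test rather than a single summary value.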
|
76
|
Stouffs K, Vandermaelen D, Massart A, Menten B, Vergult S, Tournaye H, Lissens W. Array comparative genomic hybridization in male infertility. Hum Reprod 2012; 27:921-9. [DOI: 10.1093/humrep/der440] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 11/14/2022]
|
77
|
Deyell RJ, Attiyeh EF. Advances in the understanding of constitutional and somatic genomic alterations in neuroblastoma. Cancer Genet 2011; 204:113-21. [PMID: 21504710 DOI: 10.1016/j.cancergen.2011.03.001] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Received: 01/28/2011] [Revised: 03/01/2011] [Accepted: 03/03/2011] [Indexed: 01/30/2023]
Abstract
Advances in the field of genomics have led to multiple recent discoveries in the understanding of genetic predisposition and molecular pathogenesis of the childhood cancer neuroblastoma. Neuroblastoma is the most common extracranial solid tumor of childhood and is responsible for 10% of childhood cancer-related mortality. The genetic etiology of rare families with hereditary neuroblastoma is now largely understood, with the majority having activating mutations in the anaplastic lymphoma kinase (ALK) gene. Genome-wide association studies have identified multiple common, low-penetrance genetic polymorphisms that are associated with a predisposition to sporadic neuroblastoma, and these associations are disease phenotype specific. While many of the discoveries related to variations in the host genome that predispose to neuroblastoma are recent, there is a long and robust history of investigation of tumor cell genomics, leading to the identification of multiple biomarkers of tumor aggressiveness. Current patient risk stratification algorithms utilize key genomic features for therapy assignment. Microarray-based tumor DNA and RNA profiling techniques and next-generation sequencing efforts may further refine these risk groups and identify new tractable therapeutic targets. Moving forward, integrative genomics efforts will be needed to discover how the interaction of germline genetic variations influences oncogenesis in neuroblastoma, in both initiation and progression. In this review, we summarize the recent advances in the understanding of germline predisposition and molecular pathogenesis of neuroblastoma.
Affiliation(s)
- Rebecca J Deyell
- Division of Oncology, Children's Hospital of Philadelphia, Department of Pediatrics, University of Pennsylvania School of Medicine, USA
|
78
|
Different transcription activity of HERV-K LTR-containing and LTR-lacking genes of the KIAA1245/NBPF gene subfamily. Genetica 2011; 139:733-41. [PMID: 21544646 DOI: 10.1007/s10709-011-9577-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2010] [Accepted: 04/15/2011] [Indexed: 10/18/2022]
Abstract
Long terminal repeats (LTRs) of human endogenous retroviruses (HERVs) located near or within genes might affect their expression. We used the KIAA1245/NBPF human gene subfamily in an attempt to assess the regulatory potential of HERV LTRs. The subfamily includes five closely related paralogous genes: three of them contain an LTR in the second intron, and two genes lack it. Earlier we reported that the second and third exons of only LTR-containing genes of this subfamily could be detected in mature mRNAs of various cell lines and human tissues. The corresponding parts of mRNA of LTR-lacking genes analyzed in our study were absent from EST libraries, but other fragments of their mRNAs were available in EST databases. For a more unbiased view on the correlation between gene transcription and the intronic LTRs, in the present work we analyzed non-spliced pre-mRNA thus avoiding splicing effects. Based on RT-PCR analysis, we demonstrated that the KIAA1245/NBPF LTR-lacking gene AL592309/NBPF3 was transcriptionally active, but the LTR-containing genes showed significantly higher transcription levels. The data are in agreement with the suggestion that HERV-K LTRs within the second intron of the KIAA1245/NBPF subfamily genes might affect their transcriptional activity. However, it still remains to be investigated whether the revealed effect is due just to the LTR insertion or other factors are responsible for the difference.
|
79
|
Paar V, Gluncic M, Rosandic M, Basar I, Vlahovic I. Intragene Higher Order Repeats in Neuroblastoma BreakPoint Family Genes Distinguish Humans from Chimpanzees. Mol Biol Evol 2011; 28:1877-92. [DOI: 10.1093/molbev/msr009] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
|
80
|
Warden M, Pique-Regi R, Ortega A, Asgharzadeh S. Bioinformatics for copy number variation data. Methods Mol Biol 2011; 719:235-49. [PMID: 21370087 DOI: 10.1007/978-1-61779-027-0_11] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/02/2023]
Abstract
Copy number variation is known to be an important component of structural variation in the human genome. Greater than 1 kb in size, these gains and losses of genetic material are known to confer risk to many human diseases, both Mendelian and complex. Therefore, the technologies used to detect copy number variation have been quickly improving in both throughput and cost. From comparative genomic hybridization to synthetic high-density oligonucleotide arrays to next-generation sequencing methods, algorithms used to estimate copy number are plentiful. Here we describe a practical introduction to the copy number variation technology and available analysis methods, and demonstrate the analysis flow on an example case.
Affiliation(s)
- Melissa Warden
- Department of Pediatrics and Pathology, Keck School of Medicine, Childrens Hospital Los Angeles, University of Southern California, Los Angeles, CA, USA
|
81
|
Abrarova ND, Stoukacheva EA, Pleshkan VV, Vinogradova TV, Sverdlov ED. Functional analysis of the HERV-K LTR residing in the KIAA1245/NBPF subfamily genes. Mol Biol 2010. [DOI: 10.1134/s0026893310040084] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
|
82
|
Wu X, Rauch TA, Zhong X, Bennett WP, Latif F, Krex D, Pfeifer GP. CpG island hypermethylation in human astrocytomas. Cancer Res 2010; 70:2718-27. [PMID: 20233874 DOI: 10.1158/0008-5472.can-09-3631] [Citation(s) in RCA: 107] [Impact Index Per Article: 7.6] [Indexed: 02/05/2023]
Abstract
Astrocytomas are common and lethal human brain tumors. We have analyzed the methylation status of over 28,000 CpG islands and 18,000 promoters in normal human brain and in astrocytomas of various grades using the methylated CpG island recovery assay. We identified 6,000 to 7,000 methylated CpG islands in normal human brain. Approximately 5% of the promoter-associated CpG islands in the normal brain are methylated. Promoter CpG island methylation is inversely correlated whereas intragenic methylation is directly correlated with gene expression levels in brain tissue. In astrocytomas, several hundred CpG islands undergo specific hypermethylation relative to normal brain with 428 methylation peaks common to more than 25% of the tumors. Genes involved in brain development and neuronal differentiation, such as BMP4, POU4F3, GDNF, OTX2, NEFM, CNTN4, OTP, SIM1, FYN, EN1, CHAT, GSX2, NKX6-1, PAX6, RAX, and DLX2, were strongly enriched among genes frequently methylated in tumors. There was an overrepresentation of homeobox genes and 31% of the most commonly methylated genes represent targets of the Polycomb complex. We identified several chromosomal loci in which many (sometimes more than 20) consecutive CpG islands were hypermethylated in tumors. Seven such loci were near homeobox genes, including the HOXC and HOXD clusters, and the BARHL2, DLX1, and PITX2 genes. Two other clusters of hypermethylated islands were at sequences of recent gene duplication events. Our analysis offers mechanistic insights into brain neoplasia suggesting that methylation of the genes involved in neuronal differentiation, in cooperation with other oncogenic events, may shift the balance from regulated differentiation towards gliomagenesis.
Affiliation(s)
- Xiwei Wu
- Department of Cancer Biology, Beckman Research Institute of the City of Hope, Duarte, CA, USA
|
83
|
Abstract
PURPOSE OF REVIEW: DNA copy number variations (CNVs) comprise a recently discovered element of genetic variation that affects a greater cumulative fraction of the genome than single-nucleotide polymorphisms (SNPs). This review discusses current understanding of the characteristics of CNVs in the human genome and explores the emerging discoveries of both constitutional and somatic CNVs in an ever-expanding variety of human cancers. RECENT FINDINGS: The advent of high-resolution SNP arrays has made it possible to identify CNVs. Characterization of widespread constitutional CNVs offers insight into their role in disease susceptibility, whereas somatic CNVs identify regions of the genome involved in disease phenotype. The role of CNVs in cancer has only emerged in the last 2 years, with constitutional CNVs originally being observed in the Li-Fraumeni cancer susceptibility syndrome, and more recently in neuroblastoma. SUMMARY: It is not yet known how common or how functionally relevant CNVs will be to the process of carcinogenesis. Nonetheless, the inherent instability and structural variability that characterize cancer cell genomes make this form of genetic variation particularly intriguing to the study of cancer.
|
84
|
Janoueix-Lerosey I, Schleiermacher G, Delattre O. Molecular pathogenesis of peripheral neuroblastic tumors. Oncogene 2010; 29:1566-79. [PMID: 20101209 DOI: 10.1038/onc.2009.518] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Indexed: 02/07/2023]
Abstract
Neuroblastoma (NB) is an embryonal cancer of the sympathetic nervous system observed in early childhood, characterized by a broad spectrum of clinical behaviors, ranging from spontaneous regression to fatal outcome despite aggressive therapies. NB accounts for 8-10% of pediatric cancers and 15% of the deaths attributable to malignant conditions in children. Interestingly, NB may occur in various contexts, being mostly sporadic but also familial or syndromic. This review focuses on recent advances in the identification of the genes and mechanisms implicated in NB pathogenesis. Although the extensive characterization of the genomic aberrations recurrently observed in sporadic NBs provides important insights into the understanding of the clinical heterogeneity of this neoplasm, analysis of familial and syndromic cases also unravels essential clues on the genetic bases of NB. Recently, the ALK gene emerged as an important NB gene, being implicated both in sporadic and familial cases. The identification of gene expression signatures associated with patient's outcome points out the potential of using gene expression profiling to improve clinical management of patients suffering from NB. Finally, based on recent observations integrating genomic analyses, biological data and clinical information, we discuss possible evolution/progression schemes in NB.
Affiliation(s)
- I Janoueix-Lerosey
- INSERM U830, Laboratoire de Génétique et Biologie des Cancers, Institut Curie, Paris Cedex 05, France.
|
85
|
Vandepoele K, Staes K, Andries V, van Roy F. Chibby interacts with NBPF1 and clusterin, two candidate tumor suppressors linked to neuroblastoma. Exp Cell Res 2010; 316:1225-33. [PMID: 20096688 DOI: 10.1016/j.yexcr.2010.01.019] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 10/20/2009] [Revised: 01/13/2010] [Accepted: 01/14/2010] [Indexed: 11/24/2022]
Abstract
The NBPF genes are members of a gene family that underwent a remarkable increase in their copy number during recent primate evolution. The NBPF proteins contain 5 to 40 copies of a domain known as the NBPF repeat or DUF1220. Very little is known about the function of these domains or about the NBPF proteins. We performed a yeast two-hybrid screening with the aminoterminal domain of NBPF11 and found that Chibby, a documented repressor of Wnt signaling, interacts with multiple NBPF proteins. More specifically, a coiled-coil region in the NBPF proteins interacts with the coiled-coil domain in the carboxyterminal region of Chibby. Nonetheless, this interaction did not influence the repressor function of Chibby in a TOPFLASH reporter assay. Using Chibby as bait in a new yeast two-hybrid screening, we identified clusterin as a binding protein. Chibby and clusterin were co-immunoprecipitated with NBPF1, suggesting the formation of a tri-molecular complex. Although we have not pinpointed the role of these mutual interactions, the possible formation of a macromolecular complex of three candidate tumor suppressor proteins, including the enigmatic NBPF1, points at important functional implications.
Affiliation(s)
- Karl Vandepoele
- Department for Molecular Biomedical Research, VIB, Ghent, Belgium.
|
86
|
Suh I, Filetti S, Vriens MR, Guerrero MA, Tumino S, Wong M, Shen WT, Kebebew E, Duh QY, Clark OH. Distinct loci on chromosome 1q21 and 6q22 predispose to familial nonmedullary thyroid cancer: A SNP array-based linkage analysis of 38 families. Surgery 2009; 146:1073-80. [DOI: 10.1016/j.surg.2009.09.012] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Received: 03/17/2009] [Accepted: 09/17/2009] [Indexed: 11/30/2022]
|
87
|
Dumas L, Sikela JM. DUF1220 domains, cognitive disease, and human brain evolution. Cold Spring Harb Symp Quant Biol 2009; 74:375-82. [PMID: 19850849 DOI: 10.1101/sqb.2009.74.025] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Indexed: 11/25/2022]
Abstract
We have established that human genome sequences encoding a novel protein domain, DUF1220, show a dramatically elevated copy number in the human lineage (>200 copies in humans vs. 1 in mouse/rat) and may be important to human evolutionary adaptation. Copy-number variations (CNVs) in the 1q21.1 region, where most DUF1220 sequences map, have now been implicated in numerous diseases associated with cognitive dysfunction, including autism, autism spectrum disorder, mental retardation, schizophrenia, microcephaly, and macrocephaly. We report here that these disease-related 1q21.1 CNVs either encompass or are directly flanked by DUF1220 sequences and exhibit a dosage-related correlation with human brain size. Microcephaly-producing 1q21.1 CNVs are deletions, whereas macrocephaly-producing 1q21.1 CNVs are duplications. Similarly, 1q21.1 deletions and smaller brain size are linked with schizophrenia, whereas 1q21.1 duplications and larger brain size are associated with autism. Interestingly, these two diseases are thought to be phenotypic opposites. These data suggest a model which proposes that (1) DUF1220 domain copy number may be involved in influencing human brain size and (2) the evolutionary advantage of rapidly increasing DUF1220 copy number in the human lineage has resulted in favoring retention of the high genomic instability of the 1q21.1 region, which, in turn, has precipitated a spectrum of recurrent human brain and developmental disorders.
Affiliation(s)
- L Dumas
- University of Colorado Denver School of Medicine, Aurora, CO 80045, USA
|
88
|
An exon-based comparative variant analysis pipeline to study the scale and role of frameshift and nonsense mutation in the human-chimpanzee divergence. Comp Funct Genomics 2009:406421. [PMID: 19859573 PMCID: PMC2765723 DOI: 10.1155/2009/406421] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 05/31/2009] [Revised: 07/14/2009] [Accepted: 07/18/2009] [Indexed: 11/18/2022]
Abstract
Chimpanzees and humans are closely related but differ in many deadly human diseases and other characteristics in physiology, anatomy, and pathology. In spite of decades of extensive research, crucial questions about the molecular mechanisms behind the differences are yet to be understood. Here I report ExonVar, a novel computational pipeline for Exon-based human-chimpanzee comparative Variant analysis. The objective is to comparatively analyze mutations specifically those that caused the frameshift and nonsense mutations and to assess their scale and potential impacts on human-chimpanzee divergence. Genomewide analysis of human and chimpanzee exons with ExonVar identified a number of species-specific, exon-disrupting mutations in chimpanzees but much fewer in humans. Many were found on genes involved in important biological processes such as T cell lineage development, the pathogenesis of inflammatory diseases, and antigen induced cell death. A “less-is-more” model was previously established to illustrate the role of the gene inactivation and disruptions during human evolution. Here this analysis suggested a different model where the chimpanzee-specific exon-disrupting mutations may act as additional evolutionary force that drove the human-chimpanzee divergence. Finally, the analysis revealed a number of sequencing errors in the chimpanzee and human genome sequences and further illustrated that they could be corrected without resequencing.
|
89
|
Marques-Bonet T, Ryder OA, Eichler EE. Sequencing primate genomes: what have we learned? Annu Rev Genomics Hum Genet 2009; 10:355-86. [PMID: 19630567 DOI: 10.1146/annurev.genom.9.081307.164420] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Indexed: 12/13/2022]
Abstract
We summarize the progress in whole-genome sequencing and analyses of primate genomes. These emerging genome datasets have broadened our understanding of primate genome evolution revealing unexpected and complex patterns of evolutionary change. This includes the characterization of genome structural variation, episodic changes in the repeat landscape, differences in gene expression, new models regarding speciation, and the ephemeral nature of the recombination landscape. The functional characterization of genomic differences important in primate speciation and adaptation remains a significant challenge. Limited access to biological materials, the lack of detailed phenotypic data and the endangered status of many critical primate species have significantly attenuated research into the genetic basis of primate evolution. Next-generation sequencing technologies promise to greatly expand the number of available primate genome sequences; however, such draft genome sequences will likely miss critical genetic differences within complex genomic regions unless dedicated efforts are put forward to understand the full spectrum of genetic variation.
Affiliation(s)
- Tomas Marques-Bonet
- Department of Genome Sciences, University of Washington and the Howard Hughes Medical Institute, Seattle, Washington 98105, USA.
|
90
|
Marques-Bonet T, Girirajan S, Eichler EE. The origins and impact of primate segmental duplications. Trends Genet 2009; 25:443-54. [PMID: 19796838 PMCID: PMC2847396 DOI: 10.1016/j.tig.2009.08.002] [Citation(s) in RCA: 120] [Impact Index Per Article: 8.0] [Received: 07/02/2009] [Revised: 08/07/2009] [Accepted: 08/10/2009] [Indexed: 12/25/2022]
Abstract
Duplicated sequences are substrates for the emergence of new genes and are an important source of genetic instability associated with rare and common diseases. Analyses of primate genomes have shown an increase in the proportion of interspersed segmental duplications (SDs) within the genomes of humans and great apes. This contrasts with other mammalian genomes that seem to have their recently duplicated sequences organized in a tandem configuration. In this review, we focus on the mechanistic origin and impact of this difference with respect to evolution, genetic diversity and primate phenotype. Although many genomes will be sequenced in the future, resolution of this aspect of genomic architecture still requires high quality sequences and detailed analyses.
Affiliation(s)
- Tomas Marques-Bonet
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
|
91
|
Han MV, Demuth JP, McGrath CL, Casola C, Hahn MW. Adaptive evolution of young gene duplicates in mammals. Genome Res 2009; 19:859-67. [PMID: 19411603 DOI: 10.1101/gr.085951.108] [Citation(s) in RCA: 163] [Impact Index Per Article: 10.9] [Indexed: 01/09/2023]
Abstract
Duplicate genes act as a source of genetic material from which new functions arise. They exist in large numbers in every sequenced eukaryotic genome and may be responsible for many differences in phenotypes between species. However, recent work searching for the targets of positive selection in humans has largely ignored duplicated genes due to complications in orthology assignment. Here we find that a high proportion of young gene duplicates in the human, macaque, mouse, and rat genomes have experienced adaptive natural selection. Approximately 10% of all lineage-specific duplicates show evidence for positive selection on their protein sequences, larger than any reported amount of selection among single-copy genes in these lineages using similar methods. We also find that newly duplicated genes that have been transposed to new chromosomal locations are significantly more likely to have undergone positive selection than the ancestral copy. Human-specific duplicates evolving under adaptive natural selection include a surprising number of genes involved in neuronal and cognitive functions. Our results imply that genome scans for selection that ignore duplicated loci are missing a large fraction of all adaptive substitutions. The results are also in agreement with the classical model of evolution by gene duplication, supporting a common role for neofunctionalization in the long-term maintenance of gene duplicates.
Affiliation(s)
- Mira V Han
- School of Informatics, Indiana University, Bloomington, IN 47405, USA
|
92
|
Stahl PD, Wainszelbaum MJ. Human-specific genes may offer a unique window into human cell signaling. Sci Signal 2009; 2:pe59. [PMID: 19797272 DOI: 10.1126/scisignal.289pe59] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.9] [Indexed: 01/04/2023]
Abstract
The identification and characterization of human-specific genes and the cellular processes that the encoded proteins control have the potential to help us understand at the molecular level what makes humans different from other species. The sequencing of the human genome and the genomes of closely related primates has revealed the presence of a small number of human- or human-lineage-specific genes that have no orthologs in lower species. Human-specific and human-lineage-specific genes are likely to function as regulators of cell signaling events, and by fine-tuning pathways, the encoded proteins may contribute to human-specific characteristics and behaviors. In addition, human-specific genes may represent biomarkers for examining human-specific characteristics of various diseases. Investigation of the gene encoding TBC1D3 is one example of a search that may lead to understanding the evolution and the function of human-specific genes, because it is absent in lower species and present in high copy number in the human genome.
Affiliation(s)
- Philip D Stahl
- Department of Cell Biology and Physiology, Washington University School of Medicine, St. Louis, MO 63110, USA.
|
93
|
Copy number variation at 1q21.1 associated with neuroblastoma. Nature 2009; 459:987-91. [PMID: 19536264 DOI: 10.1038/nature08035] [Citation(s) in RCA: 298] [Impact Index Per Article: 19.9] [Received: 12/14/2008] [Accepted: 03/30/2009] [Indexed: 01/08/2023]
Abstract
Common copy number variations (CNVs) represent a significant source of genetic diversity, yet their influence on phenotypic variability, including disease susceptibility, remains poorly understood. To address this problem in human cancer, we performed a genome-wide association study of CNVs in the childhood cancer neuroblastoma, a disease in which single nucleotide polymorphism variations are known to influence susceptibility. We first genotyped 846 Caucasian neuroblastoma patients and 803 healthy Caucasian controls at approximately 550,000 single nucleotide polymorphisms, and performed a CNV-based test for association. We then replicated significant observations in two independent sample sets comprised of a total of 595 cases and 3,357 controls. Here we describe the identification of a common CNV at chromosome 1q21.1 associated with neuroblastoma in the discovery set, which was confirmed in both replication sets. This CNV was validated by quantitative polymerase chain reaction, fluorescent in situ hybridization and analysis of matched tumour specimens, and was shown to be heritable in an independent set of 713 cancer-free parent-offspring trios. We identified a previously unknown transcript within the CNV that showed high sequence similarity to several neuroblastoma breakpoint family (NBPF) genes and represents a new member of this gene family (NBPF23). This transcript was preferentially expressed in fetal brain and fetal sympathetic nervous tissues, and the expression level was strictly correlated with CNV state in neuroblastoma cells. These data demonstrate that inherited copy number variation at 1q21.1 is associated with neuroblastoma and implicate a previously unknown neuroblastoma breakpoint family gene in early tumorigenesis of this childhood cancer.
|
94
|
Abstract
One of the unique insights provided by the growing number of fully sequenced genomes is the pervasiveness of gene duplication and gene loss. Indeed, several metrics now suggest that rates of gene birth and death per gene are only 10-40% lower than nucleotide substitutions per site, and that per nucleotide, the consequent lineage-specific expansion and contraction of gene families may play at least as large a role in adaptation as changes in orthologous sequences. While gene family evolution is pervasive, it may be especially important in our own evolution since it appears that the "revolving door" of gene duplication and loss has undergone multiple accelerations in the lineage leading to humans. In this paper, we review current understanding of gene family evolution including: methods for inferring copy number change, evidence for adaptive expansion and adaptive contraction of gene families, the origins of new families and deaths of previously established ones, and finally we conclude with a perspective on challenges and promising directions for future research.
|
95
|
Pezzolo A, Rossi E, Gimelli S, Parodi F, Negri F, Conte M, Pistorio A, Sementa A, Pistoia V, Zuffardi O, Gambini C. Presence of 1q gain and absence of 7p gain are new predictors of local or metastatic relapse in localized resectable neuroblastoma. Neuro Oncol 2009; 11:192-200. [PMID: 18923191 PMCID: PMC2718991 DOI: 10.1215/15228517-2008-086] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 02/03/2008] [Accepted: 04/29/2008] [Indexed: 11/19/2022]
Abstract
We have addressed the search of novel genetic prognostic markers in a selected cohort of patients with stroma-poor localized resectable neuroblastoma (NB) who underwent relapse or progression (group 1) or complete remission (group 2) over a minimum follow-up of 32 months from diagnosis. Twenty-three Italian patients with localized resectable NB (stages 1 and 2) diagnosed from 1994 through 2005 were studied. All patients received surgical treatment. Chemotherapy was administered only to the three stage 2 patients who had MYCN-amplified tumors. High-resolution array-comparative genomic hybridization (CGH) DNA copy-number analysis technology was used to identify novel prognostic markers. Chromosome 1p36.22p36.32 loss and 1q22qter gain, detected almost exclusively in group 1 patients, were significantly associated with poor event-free survival (EFS) (p = 0.0024 and p = 0.024, respectively). In contrast, patients with 7p11.2p22 gain, who belonged predominantly to group 2, had a significantly better EFS (p = 0.015). The frequency of 17q gain or 3p and 11q losses did not differ significantly in group 1 versus group 2 NBs. The sensitive technique allowed us to define the smallest region of 1p deletion. In conclusion, 1q22qter gain and 7p11.2p22 gain might represent new prognostic markers in localized resectable NB, but the small study size and the retrospective nature of the findings warrant further validation of the results in larger studies.
Affiliation(s)
- Annalisa Pezzolo
- Department of Oncology, IRCCS G. Gaslini Hospital, Genova, Italy.
|
96
|
Vandepoele K, Andries V, van Roy F. The NBPF1 promoter has been recruited from the unrelated EVI5 gene before simian radiation. Mol Biol Evol 2009; 26:1321-32. [PMID: 19282512 DOI: 10.1093/molbev/msp047] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/15/2022]
Abstract
Most new genes arise through the duplication of existing genes. In most cases, the duplication is not limited to the coding sequence but encompasses the regulatory region as well. The NBPF gene family has expanded during recent primate evolution, and it has no known mouse ortholog. One of its members, NBPF1, was found to be disrupted by a constitutional translocation in a neuroblastoma patient. Here, we show that the ancestral NBPF gene copied the regulatory region from an unrelated gene, EVI5, after the split between simians and prosimians but before simian radiation. Phylogenetic analysis points to the possible involvement of positive selection acting on the NBPF1 promoter in the simian lineage. We previously showed decreased NBPF1 expression in certain neuroblastoma cell lines. Here, we show that this expression pattern is mimicked by the EVI5 gene, but partly by different mechanisms. Epigenetic regulation of the EVI5 promoter is common in neuroblastoma cell lines, but it is not for the NBPF promoters. Here, we describe the recent acquisition of the NBPF1 promoter from an unrelated gene, and remarkably, both the donor (EVI5) and acceptor (NBPF1) genes are disrupted by constitutional translocations in patients with neuroblastoma, suggesting a functional link between these genes and the disease.
Affiliation(s)
- Karl Vandepoele
- Department for Molecular Biomedical Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium
|
97
|
Warburton PE, Hasson D, Guillem F, Lescale C, Jin X, Abrusan G. Analysis of the largest tandemly repeated DNA families in the human genome. BMC Genomics 2008; 9:533. [PMID: 18992157 PMCID: PMC2588610 DOI: 10.1186/1471-2164-9-533] [Citation(s) in RCA: 108] [Impact Index Per Article: 6.8] [Received: 07/02/2008] [Accepted: 11/07/2008] [Indexed: 01/26/2023]
Abstract
Background: Tandemly Repeated DNA represents a large portion of the human genome, and accounts for a significant amount of copy number variation. Here we present a genome wide analysis of the largest tandem repeats found in the human genome sequence. Results: Using Tandem Repeats Finder (TRF), tandem repeat arrays greater than 10 kb in total size were identified, and classified into simple sequence e.g. GAATG, classical satellites e.g. alpha satellite DNA, and locus specific VNTR arrays. Analysis of these large sequenced regions revealed that several "simple sequence" arrays actually showed complex domain and/or higher order repeat organization. Using additional methods, we further identified a total of 96 additional arrays with tandem repeat units greater than 2 kb (the detection limit of TRF), 53 of which contained genes or repeated exons. The overall size of an array of tandem 12 kb repeats which spanned a gap on chromosome 8 was found to be 600 kb to 1.7 Mbp in size, representing one of the largest non-centromeric arrays characterized. Several novel megasatellite tandem DNA families were observed that are characterized by repeating patterns of interspersed transposable elements that have expanded presumably by unequal crossing over. One of these families is found on 11 different chromosomes in >25 arrays, and represents one of the largest most widespread megasatellite DNA families. Conclusion: This study represents the most comprehensive genome wide analysis of large tandem repeats in the human genome, and will serve as an important resource towards understanding the organization and copy number variation of these complex DNA families.
Affiliation(s)
- Peter E Warburton
- Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY 10029, USA.
|
98
|
Vandepoele K, Andries V, Van Roy N, Staes K, Vandesompele J, Laureys G, De Smet E, Berx G, Speleman F, van Roy F. A constitutional translocation t(1;17)(p36.2;q11.2) in a neuroblastoma patient disrupts the human NBPF1 and ACCN1 genes. PLoS One 2008; 3:e2207. [PMID: 18493581 PMCID: PMC2386287 DOI: 10.1371/journal.pone.0002207] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Received: 01/28/2008] [Accepted: 04/11/2008] [Indexed: 11/18/2022]
Abstract
The human 1p36 region is deleted in many different types of tumors, and so it probably harbors one or more tumor suppressor genes. In a Belgian neuroblastoma patient, a constitutional balanced translocation t(1;17)(p36.2;q11.2) may have led to the development of the tumor by disrupting or activating a gene. Here, we report the cloning of both translocation breakpoints and the identification of a novel gene that is disrupted by this translocation. This gene, named NBPF1 for Neuroblastoma BreakPoint Family member 1, belongs to a recently described gene family encoding highly similar proteins, the functions of which are unknown. The translocation truncates NBPF1 and gives rise to two chimeric transcripts of NBPF1 sequences fused to sequences derived from chromosome 17. On chromosome 17, the translocation disrupts one of the isoforms of ACCN1, a potential glioma tumor suppressor gene. Expression of the NBPF family in neuroblastoma cell lines is highly variable, but it is decreased in cell lines that have a deletion of chromosome 1p. More importantly, expression profiling of the NBPF1 gene showed that its expression is significantly lower in cell lines with heterozygous NBPF1 loss than in cell lines with a normal 1p chromosome. Meta-analysis of the expression of NBPF and ACCN1 in neuroblastoma tumors indicates a role for the NBPF genes and for ACCN1 in tumor aggressiveness. Additionally, DLD1 cells with inducible NBPF1 expression showed a marked decrease of clonal growth in a soft agar assay. The disruption of both NBPF1 and ACCN1 genes in this neuroblastoma patient indicates that these genes might suppress development of neuroblastoma and possibly other tumor types.
Affiliation(s)
- Karl Vandepoele
- Department for Molecular Biomedical Research, VIB, Ghent, Belgium
- Department of Molecular Biology, Ghent University, Ghent, Belgium
- Vanessa Andries
- Department for Molecular Biomedical Research, VIB, Ghent, Belgium
- Department of Molecular Biology, Ghent University, Ghent, Belgium
- Nadine Van Roy
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Katrien Staes
- Department for Molecular Biomedical Research, VIB, Ghent, Belgium
- Jo Vandesompele
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Geneviève Laureys
- Department of Pediatric Hematology and Oncology, Ghent University Hospital, Ghent, Belgium
- Els De Smet
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Geert Berx
- Department for Molecular Biomedical Research, VIB, Ghent, Belgium
- Department of Molecular Biology, Ghent University, Ghent, Belgium
- Frank Speleman
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Frans van Roy
- Department for Molecular Biomedical Research, VIB, Ghent, Belgium
- Department of Molecular Biology, Ghent University, Ghent, Belgium
|
99
|
Ancestral reconstruction of segmental duplications reveals punctuated cores of human genome evolution. Nat Genet 2007; 39:1361-8. [PMID: 17922013 DOI: 10.1038/ng.2007.9] [Citation(s) in RCA: 143] [Impact Index Per Article: 8.4] [Received: 04/26/2007] [Accepted: 08/07/2007] [Indexed: 01/22/2023]
Abstract
Human segmental duplications are hotspots for nonallelic homologous recombination leading to genomic disorders, copy-number polymorphisms and gene and transcript innovations. The complex structure and history of these regions have precluded a global evolutionary analysis. Combining a modified A-Bruijn graph algorithm with comparative genome sequence data, we identify the origin of 4,692 ancestral duplication loci and use these to cluster 437 complex duplication blocks into 24 distinct groups. The sequence-divergence data between ancestral-derivative pairs and a comparison with the chimpanzee and macaque genome support a 'punctuated' model of evolution. Our analysis reveals that human segmental duplications are frequently organized around 'core' duplicons, which are enriched for transcripts and, in some cases, encode primate-specific genes undergoing positive selection. We hypothesize that the rapid expansion and fixation of some intrachromosomal segmental duplications during great-ape evolution has been due to the selective advantage conferred by these genes and transcripts embedded within these core duplications.
|
100
|
Bosch N, Cáceres M, Cardone MF, Carreras A, Ballana E, Rocchi M, Armengol L, Estivill X. Characterization and evolution of the novel gene family FAM90A in primates originated by multiple duplication and rearrangement events. Hum Mol Genet 2007; 16:2572-82. [PMID: 17684299 DOI: 10.1093/hmg/ddm209] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 01/30/2023]
Abstract
Genomic plasticity of human chromosome 8p23.1 region is highly influenced by two groups of complex segmental duplications (SDs), termed REPD and REPP, that mediate different kinds of rearrangements. Part of the difficulty to explain the wide range of phenotypes associated with 8p23.1 rearrangements is that REPP and REPD are not yet well characterized, probably due to their polymorphic status. Here, we describe a novel primate-specific gene family, named FAM90A (family with sequence similarity 90), found within these SDs. According to the current human reference sequence assembly, the FAM90A family includes 24 members along 8p23.1 region plus a single member on chromosome 12p13.31, showing copy number variation (CNV) between individuals. These genes can be classified into subfamilies I and II, which differ in their upstream and 5'-untranslated region sequences, but both share the same open reading frame and are ubiquitously expressed. Sequence analysis and comparative fluorescence in situ hybridization studies showed that FAM90A subfamily II suffered a big expansion in the hominoid lineage, whereas subfamily I members were likely generated sometime around the divergence of orangutan and African great apes by a fusion process. In addition, the analysis of the Ka/Ks ratios provides evidence of functional constraint of some FAM90A genes in all species. The characterization of the FAM90A gene family contributes to a better understanding of the structural polymorphism of the human 8p23.1 region and constitutes a good example of how SDs, CNVs and rearrangements within themselves can promote the formation of new gene sequences with potential functional consequences.
Affiliation(s)
- Nina Bosch
- Genes and Disease Program, Center for Genomic Regulation (CRG-UPF) and CIBERESP, Barcelona, Catalonia, Spain
|