151. Dougherty ML, Underwood JG, Nelson BJ, Tseng E, Munson KM, Penn O, Nowakowski TJ, Pollen AA, Eichler EE. Transcriptional fates of human-specific segmental duplications in brain. Genome Res 2018; 28:1566-1576. [PMID: 30228200 PMCID: PMC6169893 DOI: 10.1101/gr.237610.118] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 03/26/2018] [Accepted: 08/07/2018] [Indexed: 01/27/2023]
Abstract
Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth-death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits.
Affiliation(s)
- Max L Dougherty
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Jason G Underwood
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
  - Pacific Biosciences (PacBio) of California, Incorporated, Menlo Park, California 94025, USA
- Bradley J Nelson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Elizabeth Tseng
  - Pacific Biosciences (PacBio) of California, Incorporated, Menlo Park, California 94025, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Osnat Penn
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
- Tomasz J Nowakowski
  - Department of Anatomy, University of California, San Francisco, San Francisco, California 94158, USA
  - Department of Psychiatry, University of California, San Francisco, San Francisco, California 94158, USA
- Alex A Pollen
  - Department of Neurology, University of California, San Francisco, San Francisco, California 94158, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
152. Zhong J, Olsson LM, Urbonaviciute V, Yang M, Bäckdahl L, Holmdahl R. Association of NOX2 subunits genetic variants with autoimmune diseases. Free Radic Biol Med 2018. [PMID: 29526808 DOI: 10.1016/j.freeradbiomed.2018.03.005] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 02/07/2023]
Abstract
A single nucleotide polymorphism in Ncf1 has been found to have a major effect on chronic inflammatory autoimmune diseases in the rat, with the surprising observation that a lower reactive oxygen response led to more severe disease. This finding was subsequently reproduced in the mouse, and the effect operates in many different murine diseases through different pathogenic pathways, including models of rheumatoid arthritis, encephalomyelitis, lupus, gout, psoriasis and psoriatic arthritis. The human gene is located in an unstable region with many variable sequence repetitions, which means it has not been included in any genome-wide association screens so far. However, identification of copy number variations and single nucleotide polymorphisms has now clearly shown that major autoimmune diseases are strongly associated with the Ncf1 locus. In systemic lupus erythematosus, the associated Ncf1 polymorphism (leading to an amino acid substitution at position 90) is the strongest locus and is associated with a lower reactive oxidative burst response. In addition, more precise mapping analysis of polymorphisms in other NOX2 genes reveals that these are also associated with autoimmunity. The identified genetic association shows the importance of redox control and that ROS regulate chronic inflammation instead of promoting it. The genetic identification of Ncf1 polymorphisms now opens the way for relevant studies of the regulatory mechanisms involved, effects that will have severe consequences in many different pathogenic pathways and for understanding the origin of autoimmune diseases.
Affiliation(s)
- Jianghong Zhong
  - Medical Inflammation Research, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm 17177, Sweden
- Lina M Olsson
  - Medical Inflammation Research, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm 17177, Sweden
- Vilma Urbonaviciute
  - Medical Inflammation Research, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm 17177, Sweden
- Min Yang
  - Medical Inflammation Research, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm 17177, Sweden
- Liselotte Bäckdahl
  - Medical Inflammation Research, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm 17177, Sweden
- Rikard Holmdahl
  - Medical Inflammation Research, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm 17177, Sweden
153. Numanagić I, Gökkaya AS, Zhang L, Berger B, Alkan C, Hach F. Fast characterization of segmental duplications in genome assemblies. Bioinformatics 2018; 34:i706-i714. [PMID: 30423092 PMCID: PMC6129265 DOI: 10.1093/bioinformatics/bty586] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Indexed: 01/07/2023] Open
Abstract
Motivation: Segmental duplications (SDs), or low-copy repeats, are segments of DNA > 1 Kbp with high sequence identity that are copied to other regions of the genome. SDs are among the most important sources of evolution and a common cause of genomic structural variation, and several are associated with diseases of genomic origin, including schizophrenia and autism. Despite their functional importance, SDs present one of the major hurdles for de novo genome assembly due to the ambiguity they cause in building and traversing both state-of-the-art overlap-layout-consensus and de Bruijn graphs. This causes SD regions to be misassembled, collapsed into a unique representation, or completely missing from assembled reference genomes for various organisms. In turn, this missing or incorrect information limits our ability to fully understand the evolution and the architecture of the genomes. Despite the essential need to accurately characterize SDs in assemblies, there has been only one tool that was developed for this purpose, called Whole-Genome Assembly Comparison (WGAC); its primary goal is SD detection. WGAC comprises several steps that employ different tools and custom scripts, which makes this strategy difficult and time-consuming to use. Thus there is still a need for algorithms to characterize within-assembly SDs quickly, accurately, and in a user-friendly manner.
Results: Here we introduce SEgmental Duplication Evaluation Framework (SEDEF) to rapidly detect SDs through sophisticated filtering strategies based on Jaccard similarity and local chaining. We show that SEDEF accurately detects SDs while maintaining a substantial speedup over WGAC that translates into practical run times of minutes instead of weeks. Notably, our algorithm captures up to 25% 'pairwise error' between segments, whereas previous studies focused on only 10%, allowing us to more deeply track the evolutionary history of the genome.
Availability and implementation: SEDEF is available at https://github.com/vpc-ccg/sedef.
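The Jaccard similarity named in the abstract above can be illustrated with a minimal sketch. This is not SEDEF's implementation, whose filtering and chaining steps are not described in this listing; the function names `kmers` and `jaccard` and the default k-mer length are assumptions chosen only for illustration.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: str, b: str, k: int = 4) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of the k-mer sets of a and b.

    Hypothetical helper for illustration; k = 4 is an arbitrary default.
    """
    set_a, set_b = kmers(a, k), kmers(b, k)
    union = set_a | set_b
    if not union:
        # Both sequences are shorter than k, so no k-mers exist to compare.
        return 0.0
    return len(set_a & set_b) / len(union)
```

For example, `jaccard("ACGT", "ACGA", k=2)` compares the 2-mer sets {AC, CG, GT} and {AC, CG, GA}, which share two of four distinct 2-mers, giving 0.5.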
Affiliation(s)
- Ibrahim Numanagić
  - Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
  - Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Alim S Gökkaya
  - Department of Computer Engineering, Bilkent University, Ankara, Turkey
- Lillian Zhang
  - Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
- Bonnie Berger
  - Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA
  - Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Can Alkan
  - Department of Computer Engineering, Bilkent University, Ankara, Turkey
- Faraz Hach
  - Vancouver Prostate Centre, Vancouver, Canada
  - Department of Urologic Sciences, University of British Columbia, Vancouver, Canada
154. Pendleton AL, Shen F, Taravella AM, Emery S, Veeramah KR, Boyko AR, Kidd JM. Comparison of village dog and wolf genomes highlights the role of the neural crest in dog domestication. BMC Biol 2018; 16:64. [PMID: 29950181 PMCID: PMC6022502 DOI: 10.1186/s12915-018-0535-2] [Citation(s) in RCA: 101] [Impact Index Per Article: 16.8] [Received: 02/27/2018] [Accepted: 05/23/2018] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND: Domesticated from gray wolves between 10 and 40 kya in Eurasia, dogs display a vast array of phenotypes that differ from their ancestors, yet mirror other domesticated animal species, a phenomenon known as the domestication syndrome. Here, we use signatures persisting in dog genomes to identify genes and pathways possibly altered by the selective pressures of domestication.
RESULTS: Whole-genome SNP analyses of 43 globally distributed village dogs and 10 wolves differentiated signatures resulting from domestication rather than breed formation. We identified 246 candidate domestication regions containing 10.8 Mb of genome sequence and 429 genes. The regions share haplotypes with ancient dogs, suggesting that the detected signals are not the result of recent selection. Gene enrichments highlight numerous genes linked to neural crest and central nervous system development as well as neurological function. Read depth analysis suggests that copy number variation played a minor role in dog domestication.
CONCLUSIONS: Our results identify genes that act early in embryogenesis and can confer phenotypes distinguishing domesticated dogs from wolves, such as tameness, smaller jaws, floppy ears, and diminished craniofacial development as the targets of selection during domestication. These differences reflect the phenotypes of the domestication syndrome, which can be explained by alterations in the migration or activity of neural crest cells during development. We propose that initial selection during early dog domestication was for behavior, a trait influenced by genes which act in the neural crest, which secondarily gave rise to the phenotypes of modern dogs.
Affiliation(s)
- Amanda L Pendleton
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Feichen Shen
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Angela M Taravella
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Sarah Emery
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
- Krishna R Veeramah
  - Department of Ecology and Evolution, Stony Brook University, Stony Brook, NY, 11794, USA
- Adam R Boyko
  - Department of Biomedical Sciences, Cornell University, Ithaca, New York, 14853, USA
- Jeffrey M Kidd
  - Department of Human Genetics, University of Michigan, Ann Arbor, MI, 48109, USA
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
155. Human-Specific NOTCH2NL Genes Expand Cortical Neurogenesis through Delta/Notch Regulation. Cell 2018; 173:1370-1384.e16. [PMID: 29856955 PMCID: PMC6092419 DOI: 10.1016/j.cell.2018.03.067] [Citation(s) in RCA: 243] [Impact Index Per Article: 40.5] [Received: 11/13/2017] [Revised: 02/16/2018] [Accepted: 03/26/2018] [Indexed: 12/03/2022]
Abstract
The cerebral cortex underwent rapid expansion and increased complexity during recent hominid evolution. Gene duplications constitute a major evolutionary force, but their impact on human brain development remains unclear. Using tailored RNA sequencing (RNA-seq), we profiled the spatial and temporal expression of hominid-specific duplicated (HS) genes in the human fetal cortex and identified a repertoire of 35 HS genes displaying robust and dynamic patterns during cortical neurogenesis. Among them NOTCH2NL, human-specific paralogs of the NOTCH2 receptor, stood out for their ability to promote cortical progenitor maintenance. NOTCH2NL promote the clonal expansion of human cortical progenitors, ultimately leading to higher neuronal output. At the molecular level, NOTCH2NL function by activating the Notch pathway through inhibition of cis Delta/Notch interactions. Our study uncovers a large repertoire of recently evolved genes active during human corticogenesis and reveals how human-specific NOTCH paralogs may have contributed to the expansion of the human cortex.
- Identification of >35 HS protein-coding genes expressed during human corticogenesis
- NOTCH2NL human-specific paralogs of NOTCH2 expressed in human cortical progenitors
- NOTCH2NL genes expand human cortical progenitors and their neuronal output
- NOTCH2NL promotes Notch signaling through cis-inhibition of Delta/Notch interactions
156. Catacchio CR, Maggiolini FAM, D'Addabbo P, Bitonto M, Capozzi O, Lepore Signorile M, Miroballo M, Archidiacono N, Eichler EE, Ventura M, Antonacci F. Inversion variants in human and primate genomes. Genome Res 2018; 28:910-920. [PMID: 29776991 PMCID: PMC5991517 DOI: 10.1101/gr.234831.118] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 01/19/2018] [Accepted: 04/26/2018] [Indexed: 02/06/2023]
Abstract
For many years, inversions have been proposed to be a direct driving force in speciation since they suppress recombination when heterozygous. Inversions are the most common large-scale differences among humans and great apes. Nevertheless, they represent large events easily distinguishable by classical cytogenetics, whose resolution, however, is limited. Here, we performed a genome-wide comparison between human, great ape, and macaque genomes using the net alignments for the most recent releases of genome assemblies. We identified a total of 156 putative inversions, between 103 kb and 91 Mb, corresponding to 136 human loci. Combining literature, sequence, and experimental analyses, we analyzed 109 of these loci and found 67 regions inverted in one or multiple primates, including 28 newly identified inversions. These events overlap with 81 human genes at their breakpoints, and seven correspond to sites of recurrent rearrangements associated with human disease. This work doubles the number of validated primate inversions larger than 100 kb, beyond what was previously documented. We identified 74 sites of errors, where the sequence has been assembled in the wrong orientation, in the reference genomes analyzed. Our data serve two purposes: First, we generated a map of evolutionary inversions in these genomes representing a resource for interrogating differences among these species at a functional level; second, we provide a list of misassembled regions in these primate genomes, involving over 300 Mb of DNA and 1978 human genes. Accurately annotating these regions in the genome references has immediate applications for evolutionary and biomedical studies on primates.
Affiliation(s)
- Pietro D'Addabbo
  - Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro," Bari 70125, Italy
- Miriana Bitonto
  - Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro," Bari 70125, Italy
- Oronzo Capozzi
  - Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro," Bari 70125, Italy
- Mattia Miroballo
  - Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro," Bari 70125, Italy
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
- Mario Ventura
  - Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro," Bari 70125, Italy
- Francesca Antonacci
  - Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro," Bari 70125, Italy
157. Dynamic Copy Number Evolution of X- and Y-Linked Ampliconic Genes in Human Populations. Genetics 2018; 209:907-920. [PMID: 29769284 PMCID: PMC6028258 DOI: 10.1534/genetics.118.300826] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 02/14/2018] [Accepted: 05/15/2018] [Indexed: 11/18/2022] Open
Abstract
Ampliconic genes are multicopy genes often located on sex chromosomes and enriched for testis-expressed genes. Here, Lucotte et al. developed new bioinformatic approaches to investigate the ampliconic gene copy number and their coding... Ampliconic genes are multicopy, with the majority found on sex chromosomes and enriched for testis-expressed genes. While ampliconic genes have been associated with the emergence of hybrid incompatibilities, we know little about their copy number distribution and their turnover in human populations. Here, we explore the evolution of human X- and Y-linked ampliconic genes by investigating copy number variation (CNV) and coding variation between populations using the Simons Genome Diversity Project. We develop a method to assess CNVs using the read depth on modified X and Y chromosome targets containing only one repetition of each ampliconic gene. Our results reveal extensive standing variation in copy number both within and between human populations for several ampliconic genes. For the Y chromosome, we can infer multiple independent amplifications and losses of these gene copies even within closely related Y haplogroups, that diversified < 50,000 years ago. Moreover, X- and Y-linked ampliconic genes seem to have a faster amplification dynamic than autosomal multicopy genes. Looking at expression data from another study, we also find that X- and Y-linked ampliconic genes with extensive CNV are significantly more expressed than genes with no CNV during meiotic sex chromosome inactivation (for both X and Y) and postmeiotic sex chromosome repression (for the Y chromosome only). While we cannot rule out that the XY-linked ampliconic genes are evolving neutrally, this study gives insights into the distribution of copy number within human populations and demonstrates an extremely fast turnover in copy number of these regions.
158. Bayless AM, Zapotocny RW, Grunwald DJ, Amundson KK, Diers BW, Bent AF. An atypical N-ethylmaleimide sensitive factor enables the viability of nematode-resistant Rhg1 soybeans. Proc Natl Acad Sci U S A 2018; 115:E4512-E4521. [PMID: 29695628 PMCID: PMC5948960 DOI: 10.1073/pnas.1717070115] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 02/07/2023] Open
Abstract
N-ethylmaleimide sensitive factor (NSF) and α-soluble NSF attachment protein (α-SNAP) are essential eukaryotic housekeeping proteins that cooperatively function to sustain vesicular trafficking. The "resistance to Heterodera glycines 1" (Rhg1) locus of soybean (Glycine max) confers resistance to soybean cyst nematode, a highly damaging soybean pest. Rhg1 loci encode repeat copies of atypical α-SNAP proteins that are defective in promoting NSF function and are cytotoxic in certain contexts. Here, we discovered an unusual NSF allele (Rhg1-associated NSF on chromosome 07; NSFRAN07) in Rhg1+ germplasm. NSFRAN07 protein modeling to mammalian NSF/α-SNAP complex structures indicated that at least three of the five NSFRAN07 polymorphisms reside adjacent to the α-SNAP binding interface. NSFRAN07 exhibited stronger in vitro binding with Rhg1 resistance-type α-SNAPs. NSFRAN07 coexpression in planta was more protective against Rhg1 α-SNAP cytotoxicity, relative to WT NSFCh07. Investigation of a previously reported segregation distortion between chromosome 18 Rhg1 and a chromosome 07 interval now known to contain the Glyma.07G195900 NSF gene revealed 100% coinheritance of the NSFRAN07 allele with disease resistance Rhg1 alleles, across 855 soybean accessions and in all examined Rhg1+ progeny from biparental crosses. Additionally, we show that some Rhg1-mediated resistance is associated with depletion of WT α-SNAP abundance via selective loss of WT α-SNAP loci. Hence atypical coevolution of the soybean SNARE-recycling machinery has balanced the acquisition of an otherwise disruptive housekeeping protein, enabling a valuable disease resistance trait. Our findings further indicate that successful engineering of Rhg1-related resistance in plants will require a compatible NSF partner for the resistance-conferring α-SNAP.
Affiliation(s)
- Adam M Bayless
  - Department of Plant Pathology, University of Wisconsin-Madison, Madison, WI 53706
- Ryan W Zapotocny
  - Department of Plant Pathology, University of Wisconsin-Madison, Madison, WI 53706
- Derrick J Grunwald
  - Department of Plant Pathology, University of Wisconsin-Madison, Madison, WI 53706
- Kaela K Amundson
  - Department of Plant Pathology, University of Wisconsin-Madison, Madison, WI 53706
- Brian W Diers
  - Department of Crop Sciences, University of Illinois, Urbana, IL 61801
- Andrew F Bent
  - Department of Plant Pathology, University of Wisconsin-Madison, Madison, WI 53706
159. Recurrent structural variation, clustered sites of selection, and disease risk for the complement factor H (CFH) gene family. Proc Natl Acad Sci U S A 2018; 115:E4433-E4442. [PMID: 29686068 DOI: 10.1073/pnas.1717600115] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 12/21/2022] Open
Abstract
Structural variation and single-nucleotide variation of the complement factor H (CFH) gene family underlie several complex genetic diseases, including age-related macular degeneration (AMD) and atypical hemolytic uremic syndrome (AHUS). To understand its diversity and evolution, we performed high-quality sequencing of this ∼360-kbp locus in six primate lineages, including multiple human haplotypes. Comparative sequence analyses reveal two distinct periods of gene duplication leading to the emergence of four CFH-related (CFHR) gene paralogs (CFHR2 and CFHR4 ∼25-35 Mya and CFHR1 and CFHR3 ∼7-13 Mya). Remarkably, all evolutionary breakpoints share a common ∼4.8-kbp segment corresponding to an ancestral CFHR gene promoter that has expanded independently throughout primate evolution. This segment is recurrently reused and juxtaposed with a donor duplication containing exons 8 and 9 from ancestral CFH, creating four CFHR fusion genes that include lineage-specific members of the gene family. Combined analysis of >5,000 AMD cases and controls identifies a significant burden of a rare missense mutation that clusters at the N terminus of CFH [P = 5.81 × 10⁻⁸, odds ratio (OR) = 9.8 (3.67-Infinity)]. A bipolar clustering pattern of rare nonsynonymous mutations in patients with AMD (P < 10⁻³) and AHUS (P = 0.0079) maps to functional domains that show evidence of positive selection during primate evolution. Our structural variation analysis in >2,400 individuals reveals five recurrent rearrangement breakpoints that show variable frequency among AMD cases and controls. These data suggest a dynamic and recurrent pattern of mutation critical to the emergence of new CFHR genes but also in the predisposition to complex human genetic disease phenotypes.
160. He G, Wang Z, Zou X, Chen X, Liu J, Wang M, Hou Y. Genetic diversity and phylogenetic characteristics of Chinese Tibetan and Yi minority ethnic groups revealed by non-CODIS STR markers. Sci Rep 2018; 8:5895. [PMID: 29651125 PMCID: PMC5897523 DOI: 10.1038/s41598-018-24291-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 10/05/2017] [Accepted: 03/13/2018] [Indexed: 12/16/2022] Open
Abstract
Non-CODIS STRs, with high polymorphism and allele frequency differences among ethnically and geographically different populations, play a crucial role in population genetics, molecular anthropology, and human forensics. In this work, 332 unrelated individuals from Sichuan Province (237 Tibetan individuals and 95 Yi individuals) are first genotyped with 21 non-CODIS autosomal STRs, and phylogenetic relationships with 26 previously investigated populations (9,444 individuals) are subsequently explored. In the Sichuan Tibetan and Yi, the combined power of discrimination (CPD) values are 0.9999999999999999999 and 0.9999999999999999993, and the combined power of exclusion (CPE) values are 0.999997 and 0.999999, respectively. Analysis of molecular variance (AMOVA), principal component analysis (PCA), multidimensional scaling plots (MDS) and phylogenetic analysis demonstrated that Sichuan Tibetan has a close genetic relationship with Tibet Tibetan, and Sichuan Yi has a genetic affinity with the Yunnan Bai group. Furthermore, significant genetic differences exist widely between Chinese minorities (most prominently Tibetan and Kazakh) and Han groups, but no population stratification was observed among Han populations distributed in Northern and Southern China, which instead formed a homogeneous group. These results suggest that these 21 STRs are highly polymorphic and informative in the Sichuan Tibetan and Yi, and are suitable for population genetics and forensic applications.
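As a hedged illustration of how combined values such as the CPD and CPE are conventionally obtained, the sketch below applies the standard product rule for independent loci, combined = 1 − Π(1 − pᵢ). The abstract does not report per-locus values, so none are supplied here; the function name `combined_power` is an assumption, and `Decimal` is used because results this close to 1 exceed ordinary floating-point precision.

```python
from decimal import Decimal, getcontext

# Assumption: 40 significant digits is enough to resolve values such as
# 0.9999999999999999999, which a 64-bit float would round to 1.0.
getcontext().prec = 40


def combined_power(per_locus_values):
    """Combine per-locus powers p_i as 1 - prod(1 - p_i), assuming independent loci.

    Hypothetical helper; the same rule applies to power of discrimination
    (giving CPD) and power of exclusion (giving CPE).
    """
    remaining = Decimal(1)
    for value in per_locus_values:
        # str() keeps the decimal literal exact, e.g. 0.9 becomes Decimal("0.9").
        remaining *= Decimal(1) - Decimal(str(value))
    return Decimal(1) - remaining
```

With three hypothetical loci that each have a power of 0.9, `combined_power([0.9, 0.9, 0.9])` gives 0.999, which shows how quickly the combined value approaches 1 as loci are added.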
Affiliation(s)
- Guanglin He
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, Sichuan, China
- Zheng Wang
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, Sichuan, China
- Xing Zou
  - Department of Forensic Medicine, College of Basic Medicine, Chongqing Medical University, Chongqing, China
- Xu Chen
  - Department of Clinical Laboratory, the First People's Hospital of Liangshan Yi Autonomous Prefecture, Xichang, Sichuan, China
- Jing Liu
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, Sichuan, China
- Mengge Wang
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, Sichuan, China
- Yiping Hou
  - Institute of Forensic Medicine, West China School of Basic Medical Sciences & Forensic Medicine, Sichuan University, Chengdu, Sichuan, China
161. Shebanits K, Andersson-Assarsson JC, Larsson I, Carlsson LMS, Feuk L, Larhammar D. Copy number of pancreatic polypeptide receptor gene NPY4R correlates with body mass index and waist circumference. PLoS One 2018; 13:e0194668. [PMID: 29621259 PMCID: PMC5886410 DOI: 10.1371/journal.pone.0194668] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 11/23/2017] [Accepted: 03/07/2018] [Indexed: 01/14/2023] Open
Abstract
Multiple genetic studies have linked copy number variation (CNV) in different genes to body mass index (BMI) and obesity. A CNV on chromosome 10q11.22 has been associated with body weight. This CNV region spans NPY4R, the gene encoding the pancreatic polypeptide receptor Y4, which has been described as a satiety-stimulating receptor. We have investigated CNV of the NPY4R gene and analysed its relationship to BMI, waist circumference and self-reported dietary intake from 558 individuals (216 men and 342 women) representing a wide BMI range. The copy number for NPY4R ranged from 2 to 8 copies (average 4.6±0.8). Rather than the expected negative correlation, we observed a positive correlation between NPY4R copy number and BMI as well as waist circumference in women (Pearson’s r = 0.267, p = 2.65×10⁻⁷ and r = 0.256, p = 8×10⁻⁷, respectively). Each additional copy of NPY4R correlated with 2.6 kg/m² increase in BMI and 5.67 cm increase in waist circumference (p = 2.8×10⁻⁵ and p = 6.2×10⁻⁵, respectively) for women. For men, there was no statistically significant correlation between CNV and BMI. Our results suggest that NPY4R genetic variation influences body weight in women, but the exact role of this receptor appears to be more complex than previously proposed.
Affiliation(s)
- Ingrid Larsson
  - Dept. of Gastroenterology and Hepatology, Sahlgrenska University Hospital, Gothenburg, Sweden
- Lena M. S. Carlsson
  - Dept. of Molecular and Clinical Medicine, Sahlgrenska Academy at Gothenburg University, Gothenburg, Sweden
- Lars Feuk
  - Dept. of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Dan Larhammar
  - Dept. of Neuroscience, Uppsala University, Uppsala, Sweden
162. Kim J, Weber JA, Jho S, Jang J, Jun J, Cho YS, Kim HM, Kim H, Kim Y, Chung O, Kim CG, Lee H, Kim BC, Han K, Koh I, Chae KS, Lee S, Edwards JS, Bhak J. KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses. Sci Rep 2018; 8:5677. [PMID: 29618732 PMCID: PMC5885007 DOI: 10.1038/s41598-018-23837-x] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 09/22/2017] [Accepted: 03/16/2018] [Indexed: 01/05/2023] Open
Abstract
High-coverage whole-genome sequencing data of a single ethnicity can provide a useful catalogue of population-specific genetic variations, and provides a critical resource that can be used to more accurately identify pathogenic genetic variants. We report a comprehensive analysis of the Korean population, and present the Korean National Standard Reference Variome (KoVariome). As a part of the Korean Personal Genome Project (KPGP), we constructed the KoVariome database using 5.5 terabases of whole genome sequence data from 50 healthy Korean individuals in order to characterize the benign ethnicity-relevant genetic variation present in the Korean population. In total, KoVariome includes 12.7M single-nucleotide variants (SNVs), 1.7M short insertions and deletions (indels), 4K structural variations (SVs), and 3.6K copy number variations (CNVs). Among them, 2.4M (19%) SNVs and 0.4M (24%) indels were identified as novel. We also discovered selective enrichment of 3.8M SNVs and 0.5M indels in Korean individuals, which were used to filter out 1,271 coding-SNVs not originally removed from the 1,000 Genomes Project when prioritizing disease-causing variants. KoVariome health records were used to identify novel disease-causing variants in the Korean population, demonstrating the value of high-quality ethnic variation databases for the accurate interpretation of individual genomes and the precise characterization of genetic variations.
Affiliation(s)
- Jungeun Kim
- Personal Genomics Institute, Genome Research Foundation, Cheongju, 28190, Republic of Korea
| | - Jessica A Weber
- Department of Biology, University of New Mexico, Albuquerque, NM, 87131, USA
| | - Sungwoong Jho
- Personal Genomics Institute, Genome Research Foundation, Cheongju, 28190, Republic of Korea
| | - Jinho Jang
- Department of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
- The Genomics Institute, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - JeHoon Jun
- Personal Genomics Institute, Genome Research Foundation, Cheongju, 28190, Republic of Korea
- Geromics, Ulsan, 44919, Republic of Korea
| | | | - Hak-Min Kim
- Department of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
- The Genomics Institute, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Hyunho Kim
- Geromics, Ulsan, 44919, Republic of Korea
| | - Yumi Kim
- Geromics, Ulsan, 44919, Republic of Korea
| | - OkSung Chung
- Personal Genomics Institute, Genome Research Foundation, Cheongju, 28190, Republic of Korea
- Geromics, Ulsan, 44919, Republic of Korea
| | - Chang Geun Kim
- National Standard Reference Center, Korea Research Institute of Standards and Science, Daejeon, 34113, Republic of Korea
| | - HyeJin Lee
- Personal Genomics Institute, Genome Research Foundation, Cheongju, 28190, Republic of Korea
| | | | - Kyudong Han
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan, 31116, Republic of Korea
| | - InSong Koh
- Department of Physiology, College of Medicine, Hanyang University, Seoul, 04763, Republic of Korea
| | - Kyun Shik Chae
- National Standard Reference Center, Korea Research Institute of Standards and Science, Daejeon, 34113, Republic of Korea
| | - Semin Lee
- Department of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
- The Genomics Institute, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea
| | - Jeremy S Edwards
- Chemistry and Chemical Biology, UNM Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM, 87131, USA.
| | - Jong Bhak
- Personal Genomics Institute, Genome Research Foundation, Cheongju, 28190, Republic of Korea.
- Department of Biomedical Engineering, School of Life Sciences, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea.
- The Genomics Institute, Ulsan National Institute of Science and Technology (UNIST), Ulsan, 44919, Republic of Korea.
- Geromics, Ulsan, 44919, Republic of Korea.
|
163
|
Calmodulin promotes matrix metalloproteinase 9 production and cell migration by inhibiting the ubiquitination and degradation of TBC1D3 oncoprotein in human breast cancer cells. Oncotarget 2018; 8:36383-36398. [PMID: 28422741 PMCID: PMC5482662 DOI: 10.18632/oncotarget.16756] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/14/2016] [Accepted: 03/22/2017] [Indexed: 11/25/2022]
Abstract
The hominoid oncoprotein TBC1D3 enhances growth factor (GF) signaling, and GF signaling, conversely, induces the ubiquitination and subsequent degradation of TBC1D3. However, little is known regarding the regulation of this degradation, and the role of TBC1D3 in the progression of tumors has also not been defined. In the present study, we demonstrated that calmodulin (CaM), a ubiquitous cellular calcium sensor, specifically interacted with TBC1D3 in a Ca²⁺-dependent manner and inhibited GF signaling-induced ubiquitination and degradation of the oncoprotein in both the cytoplasm and nucleus of human breast cancer cells. The CaM-interacting site of TBC1D3 was mapped to amino acids 157–171, which comprise two 1–14 hydrophobic motifs and one lysine residue (K166). Deletion of these motifs was shown to abolish the interaction between TBC1D3 and CaM. Surprisingly, this deletion also rendered GF signaling unable to induce the ubiquitination and subsequent degradation of TBC1D3. In agreement with this, we identified lysine residue 166 within the CaM-interacting motifs of TBC1D3 as the actual site of GF signaling-induced ubiquitination using mutational analysis. Point mutation of this lysine residue had the same effect on TBC1D3 as the deletion mutant, suggesting that CaM inhibits GF signaling-induced degradation of TBC1D3 by occluding its ubiquitination at K166. Notably, we found that TBC1D3 promoted the expression and activation of MMP-9 and the migration of MCF-7 cells. Furthermore, interaction with CaM considerably enhanced this effect of TBC1D3. Taken together, our work reveals a novel model by which CaM promotes cell migration through inhibiting the ubiquitination and degradation of TBC1D3.
|
164
|
Florio M, Heide M, Pinson A, Brandl H, Albert M, Winkler S, Wimberger P, Huttner WB, Hiller M. Evolution and cell-type specificity of human-specific genes preferentially expressed in progenitors of fetal neocortex. eLife 2018; 7:32332. [PMID: 29561261 PMCID: PMC5898914 DOI: 10.7554/elife.32332] [Citation(s) in RCA: 121] [Impact Index Per Article: 20.2] [Received: 09/27/2017] [Accepted: 03/09/2018] [Indexed: 01/21/2023]
Abstract
Understanding the molecular basis that underlies the expansion of the neocortex during primate, and notably human, evolution requires the identification of genes that are particularly active in the neural stem and progenitor cells of the developing neocortex. Here, we have used existing transcriptome datasets to carry out a comprehensive screen for protein-coding genes preferentially expressed in progenitors of fetal human neocortex. We show that 15 human-specific genes exhibit such expression, and many of them evolved distinct neural progenitor cell-type expression profiles and levels compared to their ancestral paralogs. Functional studies on one such gene, NOTCH2NL, demonstrate its ability to promote basal progenitor proliferation in mice. An additional 35 human genes with progenitor-enriched expression are shown to have orthologs only in primates. Our study provides a resource of genes that are promising candidates to exert specific, and novel, roles in neocortical development during primate, and notably human, evolution.
Affiliation(s)
- Marta Florio
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Michael Heide
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Anneline Pinson
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Holger Brandl
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Mareike Albert
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Sylke Winkler
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Pauline Wimberger
- Klinik und Poliklinik für Frauenheilkunde und Geburtshilfe, Universitätsklinikum Carl Gustav Carus, Technische Universität Dresden, Dresden, Germany
| | - Wieland B Huttner
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany
| | - Michael Hiller
- Max Planck Institute of Molecular Cell Biology and Genetics, Dresden, Germany.,Max Planck Institute for the Physics of Complex Systems, Dresden, Germany
|
165
|
Numanagić I, Malikić S, Ford M, Qin X, Toji L, Radovich M, Skaar TC, Pratt VM, Berger B, Scherer S, Sahinalp SC. Allelic decomposition and exact genotyping of highly polymorphic and structurally variant genes. Nat Commun 2018; 9:828. [PMID: 29483503 PMCID: PMC5826927 DOI: 10.1038/s41467-018-03273-1] [Citation(s) in RCA: 56] [Impact Index Per Article: 9.3] [Received: 05/10/2017] [Accepted: 02/01/2018] [Indexed: 12/30/2022]
Abstract
High-throughput sequencing provides the means to determine the allelic decomposition for any gene of interest: the number of copies and the exact sequence content of each copy of a gene. Although many clinically and functionally important genes are highly polymorphic and have undergone structural alterations, no high-throughput sequencing data analysis tool has yet been designed to effectively solve the full allelic decomposition problem. Here we introduce a combinatorial optimization framework that successfully resolves this challenging problem, including for genes with structural alterations. We provide an associated computational tool, Aldy, that performs allelic decomposition of highly polymorphic, multi-copy genes using whole or targeted genome sequencing data. For a large diverse sequencing data set, Aldy identifies multiple rare and novel alleles for several important pharmacogenes, significantly improving upon the accuracy and utility of current genotyping assays. As more data sets become available, we expect Aldy to become an essential component of genotyping toolkits.
Affiliation(s)
- Ibrahim Numanagić
- School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Salem Malikić
- School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
| | - Michael Ford
- School of Computing Science, Simon Fraser University, Burnaby, BC, V5A 1S6, Canada
| | - Xiang Qin
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, 77030, USA
| | - Lorraine Toji
- Coriell Institute for Medical Research, Camden, NJ, 08103, USA
| | - Milan Radovich
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Todd C Skaar
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Victoria M Pratt
- Indiana University School of Medicine, Indianapolis, IN, 46202, USA
| | - Bonnie Berger
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Steve Scherer
- Baylor College of Medicine Human Genome Sequencing Center, Houston, TX, 77030, USA
| | - S Cenk Sahinalp
- Department of Computer Science, Indiana University, Bloomington, IN, 47405, USA.
|
166
|
De novo vs. inherited copy number variations in multiple sclerosis susceptibility. Cell Mol Immunol 2018; 15:812-814. [PMID: 29429997 DOI: 10.1038/cmi.2017.166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2017] [Revised: 12/09/2017] [Accepted: 12/09/2017] [Indexed: 11/08/2022]
|
167
|
Karimi K, Esmailizadeh A, Wu DD, Gondro C. Mapping of genome-wide copy number variations in the Iranian indigenous cattle using a dense SNP data set. Animal Production Science 2018. [DOI: 10.1071/an16384] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/17/2022]
Abstract
The objective of this study was to present the first map of the copy number variations (CNVs) in Iranian indigenous cattle based on a high-density single nucleotide polymorphism (SNP) dataset. A total of 90 individuals were genotyped using the Illumina BovineHD BeadChip containing 777 962 SNPs. The QuantiSNP algorithm was used to perform genome-wide CNV detection across the autosomal genome. After merging the overlapping CNVs, a total of 221 CNV regions were identified encompassing 36.4 Mb or 1.44% of the bovine autosomal genome. The length of the CNV regions ranged from 3.5 to 2252.8 Kb with an average of 163.8 Kb. These regions included 147 loss (66.52%) and 74 gain (33.48%) events containing a total of 637 annotated Ensembl genes. Gene ontology analysis revealed that most of the genes in the CNV regions were involved in environmental responses, disease susceptibility and immune system functions. Furthermore, 543 of these genes corresponded to human orthologous genes, which are involved in a wide range of biological functions. Altogether, 73% of the 221 CNV regions overlapped either completely or partially with those previously reported in other cattle studies. Moreover, novel CNV regions involved several quantitative trait loci (QTL) related to adaptive traits of Iranian indigenous cattle. These results provide a basis for future studies on the association between CNV regions and phenotypic variation in Iranian indigenous cattle.
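The step of merging overlapping CNV calls into CNV regions can be illustrated with a standard interval-merging routine. This is a minimal Python sketch under stated assumptions: the abstract does not give the authors' merging criteria, so the rule used here (calls on the same chromosome are merged when they overlap or touch) and the names `merge_cnv_calls` and `total_length` are hypothetical, as are the example coordinates.

```python
def merge_cnv_calls(calls):
    """Merge overlapping (chrom, start, end) CNV calls into CNV regions.

    Assumption: calls on the same chromosome are merged when the next call
    starts at or before the end of the current region.
    """
    regions = []
    for chrom, start, end in sorted(calls):
        if regions and regions[-1][0] == chrom and start <= regions[-1][2]:
            # Extend the current region to cover this call.
            last_chrom, last_start, last_end = regions[-1]
            regions[-1] = (last_chrom, last_start, max(last_end, end))
        else:
            regions.append((chrom, start, end))
    return regions


def total_length(regions):
    """Total number of bases covered by a list of non-overlapping regions."""
    return sum(end - start for _, start, end in regions)


# Hypothetical calls from several animals (coordinates are made up).
calls = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 400, 500),
    ("chr2", 100, 200),
]
regions = merge_cnv_calls(calls)
print(regions)
print(total_length(regions))
```

Dividing the total merged length by the autosomal genome length gives a coverage figure of the kind quoted in the abstract (36.4 Mb, or 1.44%).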
|
168
|
Freund M, Taylor A, Ng C, Little AR. The NIH NeuroBioBank: creating opportunities for human brain research. Handbook of Clinical Neurology 2018; 150:41-48. [PMID: 29496155 DOI: 10.1016/b978-0-444-63639-3.00004-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 12/13/2022]
Abstract
The National Institutes of Health (NIH) NeuroBioBank is a federally funded research resource for human neurologic diseases and disorders. This chapter will discuss the principles that guided the creation of the NIH NeuroBioBank and the rationale for the resource model selected. In addition, we will describe some performance metrics in the first 2 years and highlight recent advances in biomedical neuroscience that could only have been achieved using postmortem human tissues. The NIH NeuroBioBank was created in order to increase availability of high-quality postmortem human brain tissues to the research community across a broad spectrum of neurologic diseases and disorders, and to achieve economies of scale over previous funding and organizational models. In addition, we aim to increase public awareness about the value of human tissue donation for research by providing web-based information to the public and through active outreach to disease advocacy communities. Studies with human brain tissue have led to a rapid increase in our knowledge of the biologic differences between humans and are bridging the divide between humans and model organisms. Studies of human brain are beginning to give us a glimpse not only into what makes us uniquely human but also into how individual biology may be connected to health and disease.
Affiliation(s)
- Michelle Freund
- National Institute of Mental Health, Rockville, MD, United States
| | - Anna Taylor
- National Institute of Neurological Disorders and Stroke, Rockville, MD, United States
| | - Cathy Ng
- National Institute of Mental Health, Rockville, MD, United States
| | - A Roger Little
- National Institute on Drug Abuse, Rockville, MD, United States.
|
169
|
Xu L, Yang L, Bickhart DM, Li J, Liu GE. Analysis of Population-Genetic Properties of Copy Number Variations. Methods Mol Biol 2018; 1833:179-186. [PMID: 30039373 DOI: 10.1007/978-1-4939-8666-8_14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/02/2023]
Abstract
While single nucleotide polymorphisms (SNPs) are typically the variant of choice for population genetics, copy number variations (CNVs), which comprise insertions, deletions and duplications of genomic sequences, are also an informative type of genetic variation. CNVs have been shown to be both common in mammals and important for understanding the relationship between genotype and phenotype. Moreover, population-specific CNVs are candidate regions under selection and are potentially responsible for diverse phenotypes.
Affiliation(s)
- Lingyang Xu
- Institute of Animal Science, Beijing, China.
| | - Liu Yang
- Institute of Animal Science, Beijing, China
| | - Derek M Bickhart
- Research Microbiologist/Bioinformatician, USDA ARS DFRC, Madison, WI, USA
| | - JunYa Li
- Institute of Animal Science, Beijing, China
| | - George E Liu
- Animal Genomics and Improvement Laboratory, USDA ARS, Beltsville, MD, USA
|
170
|
Levchenko A, Kanapin A, Samsonova A, Gainetdinov RR. Human Accelerated Regions and Other Human-Specific Sequence Variations in the Context of Evolution and Their Relevance for Brain Development. Genome Biol Evol 2018; 10:166-188. [PMID: 29149249 PMCID: PMC5767953 DOI: 10.1093/gbe/evx240] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Accepted: 11/14/2017] [Indexed: 12/24/2022]
Abstract
The review discusses, in the format of a timeline, studies of different types of genetic variants that are present in Homo sapiens but absent in all other primate, mammalian, or vertebrate species tested so far. The main characteristic of these variants is that they are found in regions of high evolutionary conservation. These sequence variations include single nucleotide substitutions (called human accelerated regions), deletions, and segmental duplications. The rationale for finding such variations in the human genome is that they could be responsible for traits specific to our species, of which the human brain is the most remarkable. As became obvious, the vast majority of human-specific single nucleotide substitutions are found in noncoding, likely regulatory regions. A number of genes associated with these human-specific alleles, often through novel enhancer activity, were in fact shown to be implicated in human-specific development of certain brain areas, including the prefrontal cortex. Human-specific deletions may remove regulatory sequences, such as enhancers. Segmental duplications, because of their large size, create new coding sequences, like new functional paralogs. Further functional study of these variants will shed light on the evolution of our species, as well as on the etiology of neurodevelopmental disorders.
Affiliation(s)
- Anastasia Levchenko
- Institute of Translational Biomedicine, Saint Petersburg State University, Russia
| | - Alexander Kanapin
- Institute of Translational Biomedicine, Saint Petersburg State University, Russia
- Department of Oncology, University of Oxford, United Kingdom
| | - Anastasia Samsonova
- Institute of Translational Biomedicine, Saint Petersburg State University, Russia
- Department of Oncology, University of Oxford, United Kingdom
| | - Raul R Gainetdinov
- Institute of Translational Biomedicine, Saint Petersburg State University, Russia
- Skolkovo Institute of Science and Technology, Skolkovo, Moscow, Russia
|
171
|
Serres-Armero A, Povolotskaya IS, Quilez J, Ramirez O, Santpere G, Kuderna LFK, Hernandez-Rodriguez J, Fernandez-Callejo M, Gomez-Sanchez D, Freedman AH, Fan Z, Novembre J, Navarro A, Boyko A, Wayne R, Vilà C, Lorente-Galdos B, Marques-Bonet T. Similar genomic proportions of copy number variation within gray wolves and modern dog breeds inferred from whole genome sequencing. BMC Genomics 2017; 18:977. [PMID: 29258433 PMCID: PMC5735816 DOI: 10.1186/s12864-017-4318-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 04/10/2017] [Accepted: 11/17/2017] [Indexed: 12/30/2022]
Abstract
BACKGROUND: Whole genome re-sequencing data from dogs and wolves are now commonly used to study how natural and artificial selection have shaped the patterns of genetic diversity. Single nucleotide polymorphisms, microsatellites and variants in mitochondrial DNA have been interrogated for links to specific phenotypes or signals of domestication. However, copy number variation (CNV), despite its increasingly recognized importance as a contributor to phenotypic diversity, has not been extensively explored in canids. RESULTS: Here, we develop a new accurate probabilistic framework to create fine-scale genomic maps of segmental duplications (SDs), compare patterns of CNV across groups and investigate their role in the evolution of the domestic dog by using information from 34 canine genomes. Our analyses show that duplicated regions are enriched in genes and hence likely possess functional importance. We identify 86 loci with large CNV differences between dogs and wolves, enriched in genes responsible for sensory perception, immune response, metabolic processes, etc. In striking contrast to the observed loss of nucleotide diversity in domestic dogs following the population bottlenecks that occurred during domestication and breed creation, we find a similar proportion of CNV loci in dogs and wolves, suggesting that other dynamics are acting to particularly select for CNVs with potentially functional impacts. CONCLUSIONS: This work is the first comparison of genome-wide CNV patterns in domestic and wild canids using whole-genome sequencing data, and our findings contribute to the study of the impact of novel kinds of genetic changes on the evolution of the domestic dog.
Affiliation(s)
- Aitor Serres-Armero
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain
| | - Inna S Povolotskaya
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain
| | - Javier Quilez
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain.,CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Oscar Ramirez
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain.,Vetgenomics, 08193, Barcelona, Spain
| | - Gabriel Santpere
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain.,Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA
| | - Lukas F K Kuderna
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain
| | - Jessica Hernandez-Rodriguez
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain
| | - Marcos Fernandez-Callejo
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Daniel Gomez-Sanchez
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain
| | - Adam H Freedman
- UCLA, Department of Ecology and Evolutionary Biology, Los Angeles, CA, 90095, USA
| | - Zhenxin Fan
- Key Laboratory of Bioresources and Ecoenvironment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, 610064, People's Republic of China
| | - John Novembre
- UCLA, Department of Ecology and Evolutionary Biology, Los Angeles, CA, 90095, USA
| | - Arcadi Navarro
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain.,CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain.,Institucio Catalana de Recerca i Estudis Avançats (ICREA), 08010, Barcelona, Catalonia, Spain
| | - Adam Boyko
- Cornell University, Department of Biological Statistics and Computational Biology, New York, NY, 14853, USA
| | - Robert Wayne
- UCLA, Department of Ecology and Evolutionary Biology, Los Angeles, CA, 90095, USA
| | - Carles Vilà
- Estación Biológica de Doñana EBD-CSIC, Department of Integrative Ecology, 41092, Sevilla, Spain
| | - Belen Lorente-Galdos
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain. .,Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA.
| | - Tomas Marques-Bonet
- IBE, Institut de Biologia Evolutiva (Universitat Pompeu Fabra/CSIC), Ciencies Experimentals i de la Salut, 08003, Barcelona, Spain. .,CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain. .,Institucio Catalana de Recerca i Estudis Avançats (ICREA), 08010, Barcelona, Catalonia, Spain.
|
172
|
Heide M, Long KR, Huttner WB. Novel gene function and regulation in neocortex expansion. Curr Opin Cell Biol 2017; 49:22-30. [PMID: 29227861 DOI: 10.1016/j.ceb.2017.11.008] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 11/01/2017] [Revised: 11/18/2017] [Accepted: 11/26/2017] [Indexed: 01/01/2023]
Abstract
The expansion of the neocortex during human evolution is due to changes in our genome that result in increased and prolonged proliferation of neural stem and progenitor cells during neocortex development. Three principal types of such genomic changes can be distinguished: first, novel gene regulation in human; second, novel function in human of genes existing in both human and non-human species; and third, novel, human-specific genes. The latter comprise both increases in the copy number of genes that also exist in non-human species and the emergence of genes giving rise to unique, human-specific gene products. Examples of all these types of changes in the human genome have been identified, with ARHGAP11B constituting a paradigmatic example of a unique, human-specific protein.
Affiliation(s)
- Michael Heide
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, D-01307 Dresden, Germany
| | - Katherine R Long
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, D-01307 Dresden, Germany
| | - Wieland B Huttner
- Max Planck Institute of Molecular Cell Biology and Genetics, Pfotenhauerstr. 108, D-01307 Dresden, Germany.
|
173
|
Zuccherato LW, Schneider S, Tarazona-Santos E, Hardwick RJ, Berg DE, Bogle H, Gouveia MH, Machado LR, Machado M, Rodrigues-Soares F, Soares-Souza GB, Togni DL, Zamudio R, Gilman RH, Duarte D, Hollox EJ, Rodrigues MR. Population genetics of immune-related multilocus copy number variation in Native Americans. J R Soc Interface 2017; 14:rsif.2017.0057. [PMID: 28356540 DOI: 10.1098/rsif.2017.0057] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 01/25/2017] [Accepted: 03/02/2017] [Indexed: 12/22/2022]
Abstract
While multiallelic copy number variation (mCNV) loci are a major component of genomic variation, quantifying the individual copy number of a locus and defining genotypes is challenging. Few methods exist to study how mCNV genetic diversity is apportioned within and between populations (i.e. to define the population genetic structure of mCNV). These inferences are critical in populations with a small effective size, such as Amerindians, that may not fit the Hardy-Weinberg model due to inbreeding, assortative mating, population subdivision, natural selection or a combination of these evolutionary factors. We propose a likelihood-based method that simultaneously infers mCNV allele frequencies and the population structure parameter f, which quantifies the departure of homozygosity from the Hardy-Weinberg expectation. This method is implemented in the freely available software CNVice, which also infers individual genotypes using information from both the population and from trios, if available. We studied the population genetics of five immune-related mCNV loci associated with complex diseases (beta-defensins, CCL3L1/CCL4L1, FCGR3A, FCGR3B and FCGR2C) in 12 traditional Native American populations and found that the population structure parameters inferred for these mCNVs are comparable to but lower than those for single nucleotide polymorphisms studied in the same populations.
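The role of the parameter f can be made concrete with the textbook inbreeding-coefficient model, in which f shifts genotype frequency from heterozygotes to homozygotes relative to Hardy-Weinberg proportions. The Python sketch below assumes that standard formulation; the abstract does not spell out CNVice's exact likelihood, and the function names and allele frequencies here are hypothetical.

```python
from itertools import combinations_with_replacement


def genotype_frequencies(allele_freqs, f):
    """Expected diploid genotype frequencies given allele frequencies and f.

    allele_freqs maps a per-chromosome copy number to its frequency.
    Assumed (textbook) model, not taken from the paper:
      homozygote  (i, i): p_i**2 + f * p_i * (1 - p_i)
      heterozygote (i, j): 2 * p_i * p_j * (1 - f)
    With f = 0 this reduces to Hardy-Weinberg proportions.
    """
    freqs = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        pa, pb = allele_freqs[a], allele_freqs[b]
        if a == b:
            freqs[(a, b)] = pa * pa + f * pa * (1 - pa)
        else:
            freqs[(a, b)] = 2 * pa * pb * (1 - f)
    return freqs


def diploid_copy_number_distribution(allele_freqs, f):
    """Collapse genotype frequencies onto the observable total copy number."""
    dist = {}
    for (a, b), freq in genotype_frequencies(allele_freqs, f).items():
        dist[a + b] = dist.get(a + b, 0.0) + freq
    return dist


# Hypothetical allele frequencies for 0-, 1- and 2-copy alleles.
alleles = {0: 0.2, 1: 0.5, 2: 0.3}
print(diploid_copy_number_distribution(alleles, f=0.0))
print(diploid_copy_number_distribution(alleles, f=0.1))
```

The second function shows why defining mCNV genotypes is difficult: a measured diploid copy number of 2 can arise from genotype (1, 1) or from (0, 2), so allele frequencies and f have to be inferred jointly from the distribution of totals.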
Affiliation(s)
- Luciana W Zuccherato
- Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Silvana Schneider
- Departamento de Estatística, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Eduardo Tarazona-Santos
- Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | | | - Douglas E Berg
- Department of Molecular Microbiology, Washington University in Saint Louis School of Medicine, St Louis, MO, USA.,Department of Medicine, University of California San Diego, CA, USA
| | - Helen Bogle
- Department of Genetics, University of Leicester, Leicester, UK
| | - Mateus H Gouveia
- Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Lee R Machado
- Department of Genetics, University of Leicester, Leicester, UK.,School of Health, University of Northampton, Northampton, UK
| | - Moara Machado
- Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Fernanda Rodrigues-Soares
- Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Giordano B Soares-Souza
- Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Diego L Togni
- Departamento de Estatística, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Roxana Zamudio
- Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Robert H Gilman
- Johns Hopkins School of Public Health, Johns Hopkins University, Baltimore, MD, USA.,Asociación Benéfica PRISMA, Lima, Peru.,Universidade Peruana Cayetano Heredia, Lima, Peru
| | - Denise Duarte
- Departamento de Estatística, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| | - Edward J Hollox
- Department of Genetics, University of Leicester, Leicester, UK
| | - Maíra R Rodrigues
- Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil
| |
|
174
|
Yang L, Xu L, Zhu B, Niu H, Zhang W, Miao J, Shi X, Zhang M, Chen Y, Zhang L, Gao X, Gao H, Li L, Liu GE, Li J. Genome-wide analysis reveals differential selection involved with copy number variation in diverse Chinese Cattle. Sci Rep 2017; 7:14299. [PMID: 29085051 PMCID: PMC5662686 DOI: 10.1038/s41598-017-14768-0] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 04/10/2017] [Accepted: 10/12/2017] [Indexed: 12/20/2022] Open
Abstract
Copy number variations (CNVs) are defined as deletions, insertions, and duplications between two individuals of a species. To investigate the diversity and population-genetic properties of CNVs and their diverse selection patterns, we performed a genome-wide CNV analysis using high density SNP array in Chinese native cattle. In this study, we detected a total of 13,225 CNV events and 3,356 CNV regions (CNVRs), overlapping with 1,522 annotated genes. Among them, approximately 71.43 Mb of novel CNVRs were detected in the Chinese cattle population for the first time, representing the unique genomic resources in cattle. A new V_i statistic was proposed to estimate the region-specific divergence in CNVR for each group based on unbiased estimates of pairwise V_ST. We obtained 12 and 62 candidate CNVRs at the top 1% and top 5% of genome-wide V_i value thresholds for each of four groups (North, Northwest, Southwest and South). Moreover, we identified many lineage-differentiated CNV genes across four groups, which were associated with several important molecular functions and biological processes, including metabolic process, response to stimulus, immune system, and others. Our findings provide some insights into understanding lineage-differentiated CNVs under divergent selection in the Chinese native cattle.
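For orientation, the pairwise V_ST on which the abstract's statistic builds is commonly defined as (V_T - V_S) / V_T, where V_T is the variance of the copy-number signal across both groups pooled and V_S is the size-weighted average of the within-group variances. The sketch below implements only that common definition with plain population variances; it is not the unbiased estimator or the V_i statistic from the paper, and the function names and input values are hypothetical.

```python
def population_variance(values):
    """Population variance (divides by n) of a non-empty sequence."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def pairwise_vst(group_a, group_b):
    """V_ST = (V_T - V_S) / V_T for one CNV region and two groups.

    group_a and group_b hold per-individual copy-number signals
    (for example, log2 intensity ratios from a SNP array).
    """
    pooled = list(group_a) + list(group_b)
    v_total = population_variance(pooled)
    if v_total == 0:
        # No variation at all, so no differentiation can be measured.
        return 0.0
    # Within-group variance, weighted by group size.
    v_within = (
        len(group_a) * population_variance(group_a)
        + len(group_b) * population_variance(group_b)
    ) / len(pooled)
    return (v_total - v_within) / v_total


# Hypothetical signals: two groups fixed for different copy numbers.
fully_differentiated = pairwise_vst([2, 2, 2], [4, 4, 4])
```

Values near 1 indicate that almost all variation lies between the groups, and values near 0 that the groups share the same copy-number distribution.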
Affiliation(s)
- Liu Yang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.,Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, Sichuan, 611130, China
| | - Lingyang Xu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.
| | - Bo Zhu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Hong Niu
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Wengang Zhang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Jian Miao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.,College of Animal Sciences, Fujian Agriculture and Forestry University, Fuzhou, Fujian, 350002, China
| | - Xinping Shi
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.,College of Animal Science and Technology, Agricultural University of Hebei, Baoding, Hebei, 071001, China
| | - Ming Zhang
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, Sichuan, 611130, China
| | - Yan Chen
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Lupei Zhang
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Xue Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Huijiang Gao
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China
| | - Li Li
- Farm Animal Genetic Resources Exploration and Innovation Key Laboratory of Sichuan Province, Sichuan Agricultural University, Chengdu, Sichuan, 611130, China
| | - George E Liu
- Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, Maryland, 20705, USA
| | - Junya Li
- Laboratory of Molecular Biology and Bovine Breeding, Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China.
| |
|
175
|
Affiliation(s)
- Chet C. Sherwood
- Department of Anthropology and Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC 20052
| | - Aida Gómez-Robles
- Department of Anthropology and Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC 20052
- Department of Genetics, Evolution and Environment, University College London, London WC1E 6BT, United Kingdom
| |
|
176
|
Turner TN, Coe BP, Dickel DE, Hoekzema K, Nelson BJ, Zody MC, Kronenberg ZN, Hormozdiari F, Raja A, Pennacchio LA, Darnell RB, Eichler EE. Genomic Patterns of De Novo Mutation in Simplex Autism. Cell 2017; 171:710-722.e12. [PMID: 28965761 PMCID: PMC5679715 DOI: 10.1016/j.cell.2017.08.047] [Citation(s) in RCA: 228] [Impact Index Per Article: 32.6] [Received: 05/30/2017] [Revised: 08/03/2017] [Accepted: 08/25/2017] [Indexed: 12/22/2022]
Abstract
To further our understanding of the genetic etiology of autism, we generated and analyzed genome sequence data from 516 idiopathic autism families (2,064 individuals). This resource includes >59 million single-nucleotide variants (SNVs) and 9,212 private copy number variants (CNVs), of which 133,992 and 88 are de novo mutations (DNMs), respectively. We estimate a mutation rate of ∼1.5 × 10⁻⁸ SNVs per site per generation with a significantly higher mutation rate in repetitive DNA. Comparing probands and unaffected siblings, we observe several DNM trends. Probands carry more gene-disruptive CNVs and SNVs, resulting in severe missense mutations and mapping to predicted fetal brain promoters and embryonic stem cell enhancers. These differences become more pronounced for autism genes (p = 1.8 × 10⁻³, OR = 2.2). Patients are more likely to carry multiple coding and noncoding DNMs in different genes, which are enriched for expression in striatal neurons (p = 3 × 10⁻³), suggesting a path forward for genetically characterizing more complex cases of autism.
Affiliation(s)
- Tychele N Turner
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Bradley P Coe
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Diane E Dickel
- Functional Genomics Department, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Bradley J Nelson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | | | - Zev N Kronenberg
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
| | - Fereydoun Hormozdiari
- Department of Biochemistry and Molecular Medicine, University of California, Davis, Davis, CA 95817, USA
| | - Archana Raja
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - Len A Pennacchio
- Functional Genomics Department, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Robert B Darnell
- New York Genome Center, New York, NY 10013, USA; Laboratory of Molecular Neuro-Oncology, The Rockefeller University, New York, NY 10065, USA; Howard Hughes Medical Institute, The Rockefeller University, New York, NY 10065, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.
| |
|
177
|
Sporny M, Guez-Haddad J, Kreusch A, Shakartzi S, Neznansky A, Cross A, Isupov MN, Qualmann B, Kessels MM, Opatowsky Y. Structural History of Human SRGAP2 Proteins. Mol Biol Evol 2017; 34:1463-1478. [PMID: 28333212 PMCID: PMC5435084 DOI: 10.1093/molbev/msx094] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 12/22/2022] Open
Abstract
In the development of the human brain, human-specific genes are considered to play key roles, conferring its unique advantages and vulnerabilities. At the time of Homo lineage divergence from Australopithecus, SRGAP2C gradually emerged through a process of serial duplications and mutagenesis from ancestral SRGAP2A (3.4–2.4 Ma). Remarkably, ectopic expression of SRGAP2C endows cultured mouse brain cells with human-like characteristics, specifically increased dendritic spine length and density. To understand the molecular mechanisms underlying this change in neuronal morphology, we determined the structure of SRGAP2A and studied the interplay between SRGAP2A and SRGAP2C. We found that: 1) SRGAP2A homo-dimerizes through a large interface that includes an F-BAR domain, a newly identified F-BAR extension (Fx), and RhoGAP-SH3 domains. 2) SRGAP2A has an unusual inverse geometry, enabling associations with lamellipodia and dendritic spine heads in vivo, and scaffolding of membrane protrusions in cell culture. 3) As a result of the initial partial duplication event (∼3.4 Ma), SRGAP2C carries a defective Fx-domain that severely compromises its solubility and membrane-scaffolding ability. Consistently, SRGAP2A:SRGAP2C hetero-dimers form, but are insoluble, inhibiting SRGAP2A activity. 4) Inactivation of SRGAP2A is sensitive to the level of hetero-dimerization with SRGAP2C. 5) The primal form of SRGAP2C (P-SRGAP2C, existing between ∼3.4 and 2.4 Ma) is less effective in hetero-dimerizing with SRGAP2A than the modern SRGAP2C, which carries several substitutions (from ∼2.4 Ma). Thus, the genetic mutagenesis phase contributed to modulation of SRGAP2A’s inhibition of neuronal expansion, by introducing and improving the formation of inactive SRGAP2A:SRGAP2C hetero-dimers, indicating a stepwise involvement of SRGAP2C in human evolutionary history.
Affiliation(s)
- Michael Sporny
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
| | - Julia Guez-Haddad
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
| | - Annett Kreusch
- Institute for Biochemistry I, Jena University Hospital, Friedrich Schiller University Jena, Jena, Germany
| | - Sivan Shakartzi
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
| | - Avi Neznansky
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
| | - Alice Cross
- Department of Biosciences, College of Life and Environmental Sciences, University of Exeter, Exeter, United Kingdom
| | - Michail N Isupov
- Department of Biosciences, College of Life and Environmental Sciences, University of Exeter, Exeter, United Kingdom
| | - Britta Qualmann
- Institute for Biochemistry I, Jena University Hospital, Friedrich Schiller University Jena, Jena, Germany
| | - Michael M Kessels
- Institute for Biochemistry I, Jena University Hospital, Friedrich Schiller University Jena, Jena, Germany
| | - Yarden Opatowsky
- The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
| |
|
178
|
Sohrabi SS, Mohammadabadi M, Wu DD, Esmailizadeh A. Detection of breed-specific copy number variations in domestic chicken genome. Genome 2017; 61:7-14. [PMID: 28961404 DOI: 10.1139/gen-2017-0016] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Indexed: 11/22/2022]
Abstract
Copy number variations (CNVs) are important large-scale variants. They are widespread in the genome and may contribute to phenotypic variation. Detection and characterization of CNVs can provide new insights into the genetic basis of important traits. Here, we perform whole-genome short read sequence analysis to identify CNVs in two indigenous and commercial chicken breeds to evaluate the impact of the identified CNVs on breed-specific traits. After filtration, a total of 12 955 CNVs spanning (on average) about 9.42% of the chicken genome were found that made up 5467 CNV regions (CNVRs). Chicken quantitative trait loci (QTL) datasets and Ensembl gene annotations were used as resources for the estimation of potential phenotypic effects of our CNVRs on breed-specific traits. In total, 34% of our detected CNVRs were also detected in earlier CNV studies. These CNVRs partly overlap several previously reported QTL and gene ontology terms associated with some important traits, including shank length QTL in Creeper-specific CNVRs and body weight and egg production characteristics, as well as muscle and body organ growth, in the Arian commercial breed. Our findings provide new insights into the genomic structure of the chicken genome for an improved understanding of the potential roles of CNVRs in differentiating between breeds or lines.
Affiliation(s)
- Saeed S Sohrabi
- Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, PB 76169-133, Kerman, Iran.,Young Researchers Society, Shahid Bahonar University of Kerman, PB 76169-133, Kerman, Iran
| | - Mohammadreza Mohammadabadi
- Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, PB 76169-133, Kerman, Iran
| | - Dong-Dong Wu
- State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China.,Kunming College of Life Science, University of Chinese Academy of Sciences, Kunming 650204, China
| | - Ali Esmailizadeh
- Department of Animal Science, Faculty of Agriculture, Shahid Bahonar University of Kerman, PB 76169-133, Kerman, Iran.,State Key Laboratory of Genetic Resources and Evolution & Yunnan Laboratory of Molecular Biology of Domestic Animals, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
| |
|
179
|
Abstract
For a subset of genes in our genome a change in gene dosage, by duplication or deletion, causes a phenotypic effect. These dosage-sensitive genes may confer an advantage upon copy number change, but more typically they are associated with disease, including heart disease, cancers and neuropsychiatric disorders. This gene copy number sensitivity creates characteristic evolutionary constraints that can serve as a diagnostic to identify dosage-sensitive genes. Though the link between copy number change and disease is well-established, the mechanism of pathogenicity is usually opaque. We propose that gene expression level may provide a common basis for the pathogenic effects of many copy number variants.
Affiliation(s)
- Alan M Rice
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin 2, Ireland
| | - Aoife McLysaght
- Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin 2, Ireland.
| |
|
180
|
Enhancing our brains: Genomic mechanisms underlying cortical evolution. Semin Cell Dev Biol 2017; 76:23-32. [PMID: 28864345 DOI: 10.1016/j.semcdb.2017.08.045] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/14/2017] [Accepted: 08/24/2017] [Indexed: 12/31/2022]
Abstract
Our most distinguishing higher cognitive functions are controlled by the cerebral cortex. Comparative studies detail abundant anatomical and cellular features unique to the human developing and adult neocortex. Emerging genomic studies have further defined vast differences distinguishing developing human neocortices from those of related primates. These human-specific changes can affect gene function and/or expression, and result from structural variations such as chromosomal deletions and duplications, or from point mutations in coding and noncoding regulatory regions. Here, we review this rapidly growing field, which aims to identify and characterize genetic loci unique to the human cerebral cortex. We catalog known human-specific genomic changes distinct from other primates, including those whose function has been interrogated in animal models. We also discuss how new model systems and technologies, such as single cell RNA sequencing, primate iPSCs, and gene editing, are enabling the field to gain unprecedented resolution into the function of these human-specific changes. Some neurological disorders are thought to present uniquely in humans, thus reinforcing the need to comprehensively understand human-specific gene expression in the developing brain.
|
181
|
Wild HM, Heckemann RA, Studholme C, Hammers A. Gyri of the human parietal lobe: Volumes, spatial extents, automatic labelling, and probabilistic atlases. PLoS One 2017; 12:e0180866. [PMID: 28846692 PMCID: PMC5573296 DOI: 10.1371/journal.pone.0180866] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 08/11/2016] [Accepted: 06/22/2017] [Indexed: 01/16/2023] Open
Abstract
Accurately describing the anatomy of individual brains enables interlaboratory communication of functional and developmental studies and is crucial for possible surgical interventions. The human parietal lobe participates in multimodal sensory integration including language processing and also contains the primary somatosensory area. We describe detailed protocols to subdivide the parietal lobe, analyze morphological and volumetric characteristics, and create probabilistic atlases in MNI152 stereotaxic space. The parietal lobe was manually delineated on 3D T1 MR images of 30 healthy subjects and divided into four regions: supramarginal gyrus (SMG), angular gyrus (AG), superior parietal lobe (supPL) and postcentral gyrus (postCG). There was the expected correlation of male gender with larger brain and intracranial volume. We examined a wide range of anatomical features of the gyri and the sulci separating them. At least a rudimentary primary intermediate sulcus of Jensen (PISJ) separating SMG and AG was identified in nearly all (59/60) hemispheres. Presence of additional gyri in SMG and AG was related to sulcal features and volumetric characteristics. The parietal lobe was slightly (2%) larger on the left, driven by leftward asymmetries of the postCG and SMG. Intersubject variability was highest for SMG and AG, and lowest for postCG. Overall the morphological characteristics tended to be symmetrical, and volumes also tended to covary between hemispheres. This may reflect developmental as well as maturation factors. To assess the accuracy with which the labels can be used to segment newly acquired (unlabelled) T1-weighted brain images, we applied multi-atlas label propagation software (MAPER) in a leave-one-out experiment and compared the resulting automatic labels with the manually prepared ones. The results showed strong agreement (mean Jaccard index 0.69, corresponding to a mean Dice index of 0.82, average mean volume error of 0.6%). 
Stereotaxic probabilistic atlases of each subregion were obtained. They illustrate the physiological brain torque, with structures in the right hemisphere positioned more anteriorly than in the left, and right/left positional differences of up to 10 mm. They also allow an assessment of sulcal variability, e.g. low variability for parietooccipital fissure and cingulate sulcus. Illustrated protocols, individual label sets, probabilistic atlases, and a maximum-probability atlas which takes into account surrounding structures are available for free download under academic licences.
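The correspondence quoted in the abstract between a Jaccard index of 0.69 and a Dice index of 0.82 follows from the identity Dice = 2J / (1 + J), which holds for any single pair of label sets. The sketch below checks that arithmetic; the function names and the toy voxel sets are hypothetical. Since the conversion is nonlinear, a mean of per-subject Dice values need not exactly equal the transformed mean Jaccard, so the match should be read as approximate.

```python
def jaccard_index(a, b):
    """Jaccard overlap |A ∩ B| / |A ∪ B| of two non-empty voxel sets."""
    return len(a & b) / len(a | b)


def dice_index(a, b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) of two non-empty voxel sets."""
    return 2 * len(a & b) / (len(a) + len(b))


def dice_from_jaccard(j):
    """Convert a Jaccard index to the equivalent Dice index."""
    return 2 * j / (1 + j)


def jaccard_from_dice(d):
    """Convert a Dice index to the equivalent Jaccard index."""
    return d / (2 - d)


# The value reported in the abstract: J = 0.69 gives Dice of about 0.82.
reported_dice = round(dice_from_jaccard(0.69), 2)

# Toy voxel sets (hypothetical) showing both routes agree for a single pair.
manual = {1, 2, 3, 4}
automatic = {3, 4, 5}
direct = dice_index(manual, automatic)
converted = dice_from_jaccard(jaccard_index(manual, automatic))
```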
Affiliation(s)
- Heather M. Wild
- Neurodis Foundation, Lyon, France
- Univ Lyon, Université Claude Bernard Lyon 1, Inserm, Stem Cell and Brain Research Institute U1208, Bron, France
| | - Rolf A. Heckemann
- Neurodis Foundation, Lyon, France
- MedTech West at Sahlgrenska University Hospital, University of Gothenburg, Gothenburg, Sweden
| | - Colin Studholme
- Department of Pediatrics, Division of Neonatology, University of Washington, Seattle, Washington, United States of America
| | - Alexander Hammers
- Neurodis Foundation, Lyon, France
- Division of Imaging Sciences and Biomedical Engineering, King’s College London, London, United Kingdom
| |
|
182
|
Astling DP, Heft IE, Jones KL, Sikela JM. High resolution measurement of DUF1220 domain copy number from whole genome sequence data. BMC Genomics 2017; 18:614. [PMID: 28807002 PMCID: PMC5556342 DOI: 10.1186/s12864-017-3976-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 01/09/2017] [Accepted: 07/31/2017] [Indexed: 11/10/2022] Open
Abstract
Background: DUF1220 protein domains found primarily in Neuroblastoma BreakPoint Family (NBPF) genes show the greatest human lineage-specific increase in copy number of any coding region in the genome. There are 302 haploid copies of DUF1220 in hg38 (~160 of which are human-specific) and the majority of these can be divided into 6 different subtypes (referred to as clades). Copy number changes of specific DUF1220 clades have been associated in a dose-dependent manner with brain size variation (both evolutionarily and within the human population), cognitive aptitude, autism severity, and schizophrenia severity. However, no published methods can directly measure copies of DUF1220 with high accuracy and no method can distinguish between domains within a clade. Results: Here we describe a novel method for measuring copies of DUF1220 domains and the NBPF genes in which they are found from whole genome sequence data. We have characterized the effect that various sequencing and alignment parameters and strategies have on the accuracy and precision of the method and defined the parameters that lead to optimal DUF1220 copy number measurement and resolution. We show that copy number estimates obtained using our read depth approach are highly correlated with those generated by ddPCR for three representative DUF1220 clades. By simulation, we demonstrate that our method provides sufficient resolution to analyze DUF1220 copy number variation at three levels: (1) DUF1220 clade copy number within individual genes and groups of genes (gene-specific clade groups), (2) genome-wide DUF1220 clade copies, and (3) gene copy number for DUF1220-encoding genes. Conclusions: To our knowledge, this is the first method to accurately measure copies of all six DUF1220 clades and the first method to provide gene-specific resolution of these clades. This allows one to discriminate among the ~300 haploid human DUF1220 copies to an extent not possible with any other method. The result is a greatly enhanced capability to analyze the role that these sequences play in human variation and disease.
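The general idea behind a read-depth estimate can be sketched as a ratio: mean coverage over the region of interest, divided by mean coverage over regions assumed to be present at a known copy number, scaled by that known copy number. The code below is a generic illustration of that normalization only, not the alignment and parameter strategy the abstract describes; the function names, the diploid reference assumption, and the depth values are all hypothetical.

```python
def mean_depth(per_base_depths):
    """Average read depth over a non-empty list of per-base depths."""
    return sum(per_base_depths) / len(per_base_depths)


def estimate_copy_number(region_depth, reference_depth, reference_copies=2):
    """Scale the depth ratio by the copy number assumed for the reference.

    reference_depth should come from sequence assumed to be present at
    reference_copies per genome (2 for autosomal single-copy sequence).
    """
    if reference_depth <= 0:
        raise ValueError("reference depth must be positive")
    return reference_copies * region_depth / reference_depth


# Hypothetical depths: the target region has twice the baseline coverage.
target = mean_depth([45, 60, 75])
baseline = mean_depth([28, 30, 32])
copies = estimate_copy_number(target, baseline)
```

In practice the estimate depends heavily on how reads from near-identical copies are aligned and counted, which is the part of the problem the abstract says was characterized in detail.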
Affiliation(s)
- David P Astling
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, USA
| | - Ilea E Heft
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, USA
| | - Kenneth L Jones
- Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, USA
| | - James M Sikela
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO, USA.
| |
|
183
|
Sousa AMM, Meyer KA, Santpere G, Gulden FO, Sestan N. Evolution of the Human Nervous System Function, Structure, and Development. Cell 2017; 170:226-247. [PMID: 28708995 DOI: 10.1016/j.cell.2017.06.036] [Citation(s) in RCA: 247] [Impact Index Per Article: 35.3] [Received: 07/08/2016] [Revised: 04/21/2017] [Accepted: 06/22/2017] [Indexed: 12/22/2022]
Abstract
The nervous system, in particular the brain and its cognitive abilities, is among humans' most distinctive and impressive attributes. How the nervous system has changed in the human lineage and how it differs from that of closely related primates is not well understood. Here, we consider recent comparative analyses of extant species that are uncovering new evidence for evolutionary changes in the size and the number of neurons in the human nervous system, as well as the cellular and molecular reorganization of its neural circuits. We also discuss the developmental mechanisms and underlying genetic and molecular changes that generate these structural and functional differences. As relevant new information and tools materialize at an unprecedented pace, the field is now ripe for systematic and functionally relevant studies of the development and evolution of human nervous system specializations.
Affiliation(s)
- André M M Sousa
- Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA
| | - Kyle A Meyer
- Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA
| | - Gabriel Santpere
- Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA
| | - Forrest O Gulden
- Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA
| | - Nenad Sestan
- Department of Neuroscience, Yale School of Medicine, New Haven, CT, USA; Department of Genetics, Yale School of Medicine, New Haven, CT, USA; Department of Psychiatry, Yale School of Medicine, New Haven, CT, USA; Section of Comparative Medicine, Yale School of Medicine, New Haven, CT, USA; Program in Cellular Neuroscience, Neurodegeneration and Repair, Yale School of Medicine, New Haven, CT, USA; Yale Child Study Center, Yale School of Medicine, New Haven, CT, USA; Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT, USA.
| |
|
184
|
Schröder J, Wirawan A, Schmidt B, Papenfuss AT. CLOVE: classification of genomic fusions into structural variation events. BMC Bioinformatics 2017; 18:346. [PMID: 28728542 PMCID: PMC5520322 DOI: 10.1186/s12859-017-1760-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/20/2016] [Accepted: 07/13/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: A precise understanding of structural variants (SVs) in DNA is important in the study of cancer and population diversity. Many methods have been designed to identify SVs from DNA sequencing data. However, the problem remains challenging because existing approaches suffer from low sensitivity, precision, and positional accuracy. Furthermore, many existing tools only identify breakpoints, and so do not collect related breakpoints and classify them as a particular type of SV. Due to the rapidly increasing usage of high throughput sequencing technologies in this area, there is an urgent need for algorithms that can accurately classify complex genomic rearrangements (involving more than one breakpoint or fusion). RESULTS: We present CLOVE, an algorithm for integrating the results of multiple breakpoint or SV callers and classifying the results as a particular SV. CLOVE is based on a graph data structure that is created from the breakpoint information. The algorithm looks for patterns in the graph that are characteristic of more complex rearrangement types. CLOVE is able to integrate the results of multiple callers, producing a consensus call. CONCLUSIONS: We demonstrate using simulated and real data that re-classified SV calls produced by CLOVE improve on the raw call set of existing SV algorithms, particularly in terms of accuracy. CLOVE is freely available from http://www.github.com/PapenfussLab.
Affiliation(s)
- Jan Schröder
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC, 3052, Australia.,Department of Computing and Information Systems, University of Melbourne, Melbourne, VIC, Australia.,Bioinformatics and Cancer Genomics, Peter MacCallum Cancer Centre, East Melbourne, VIC, 3000, Australia.
| | - Adrianto Wirawan
- Institut für Informatik, Johannes Gutenberg Universität Mainz, Mainz, Germany
| | - Bertil Schmidt
- Institut für Informatik, Johannes Gutenberg Universität Mainz, Mainz, Germany
| | - Anthony T Papenfuss
- Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, VIC, 3052, Australia.,Bioinformatics and Cancer Genomics, Peter MacCallum Cancer Centre, East Melbourne, VIC, 3000, Australia.,Department of Medical Biology, University of Melbourne, Melbourne, VIC, 3010, Australia.,Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, VIC, 3010, Australia.,Department of Mathematics and Statistics, University of Melbourne, Melbourne, VIC, 3010, Australia.
| |
|
185
|
Botigué LR, Song S, Scheu A, Gopalan S, Pendleton AL, Oetjens M, Taravella AM, Seregély T, Zeeb-Lanz A, Arbogast RM, Bobo D, Daly K, Unterländer M, Burger J, Kidd JM, Veeramah KR. Ancient European dog genomes reveal continuity since the Early Neolithic. Nat Commun 2017; 8:16082. [PMID: 28719574 PMCID: PMC5520058 DOI: 10.1038/ncomms16082] [Citation(s) in RCA: 116] [Impact Index Per Article: 16.6] [Received: 02/27/2017] [Accepted: 05/25/2017] [Indexed: 12/19/2022] Open
Abstract
Europe has played a major role in dog evolution, harbouring the oldest uncontested Palaeolithic remains and having been the centre of modern dog breed creation. Here we sequence the genomes of an Early and End Neolithic dog from Germany, including a sample associated with an early European farming community. Both dogs demonstrate continuity with each other and predominantly share ancestry with modern European dogs, contradicting a previously suggested Late Neolithic population replacement. We find no genetic evidence to support the recent hypothesis proposing dual origins of dog domestication. By calibrating the mutation rate using our oldest dog, we narrow the timing of dog domestication to 20,000-40,000 years ago. Interestingly, we do not observe the extreme copy number expansion of the AMY2B gene characteristic of modern dogs that has previously been proposed as an adaptation to a starch-rich diet driven by the widespread adoption of agriculture in the Neolithic.
Affiliation(s)
- Laura R Botigué
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York 11794-5245, USA
| | - Shiya Song
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Amelie Scheu
- Palaeogenetics Group, Johannes Gutenberg-University Mainz, 55099 Mainz, Germany.,Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland
| | - Shyamalika Gopalan
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York 11794-5245, USA
| | - Amanda L Pendleton
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Matthew Oetjens
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Angela M Taravella
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Timo Seregély
- Department of Prehistoric Archaeology, Institute of Archaeology, Heritage Sciences and Art History, University of Bamberg, 96045 Bamberg, Germany
| | - Andrea Zeeb-Lanz
- Generaldirektion Kulturelles Erbe Rheinland-Pfalz, Direktion Landesarchäologie, Außenstelle Speyer, 67346 Speyer, Germany
| | | | - Dean Bobo
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York 11794-5245, USA
| | - Kevin Daly
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland
| | - Martina Unterländer
- Palaeogenetics Group, Johannes Gutenberg-University Mainz, 55099 Mainz, Germany
| | - Joachim Burger
- Palaeogenetics Group, Johannes Gutenberg-University Mainz, 55099 Mainz, Germany
| | - Jeffrey M Kidd
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA.,Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109, USA
| | - Krishna R Veeramah
- Department of Ecology and Evolution, Stony Brook University, Stony Brook, New York 11794-5245, USA
| |
|
186
|
Loftus A, Murphy G, Brown H, Montgomery A, Tabak J, Baus J, Carroll M, Green A, Sikka S, Sinha S. Development and validation of InnoQuant® HY, a system for quantitation and quality assessment of total human and male DNA using high copy targets. Forensic Sci Int Genet 2017; 29:205-217. [DOI: 10.1016/j.fsigen.2017.04.009] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 02/02/2017] [Revised: 03/26/2017] [Accepted: 04/14/2017] [Indexed: 11/27/2022]
|
187
|
Steenwyk J, Rokas A. Extensive Copy Number Variation in Fermentation-Related Genes Among Saccharomyces cerevisiae Wine Strains. G3 (Bethesda) 2017; 7:1475-1485. [PMID: 28292787 PMCID: PMC5427499 DOI: 10.1534/g3.117.040105] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 02/02/2017] [Accepted: 03/08/2017] [Indexed: 01/30/2023]
Abstract
Due to the importance of Saccharomyces cerevisiae in wine-making, the genomic variation of wine yeast strains has been extensively studied. One of the major insights stemming from these studies is that wine yeast strains harbor low levels of genetic diversity in the form of single nucleotide polymorphisms (SNPs). Genomic structural variants, such as copy number (CN) variants, are another major type of variation segregating in natural populations. To test whether genetic diversity in CN variation is also low across wine yeast strains, we examined genome-wide levels of CN variation in 132 whole-genome sequences of S. cerevisiae wine strains. We found an average of 97.8 CN variable regions (CNVRs) affecting ∼4% of the genome per strain. Using two different measures of CN diversity, we found that gene families involved in fermentation-related processes such as copper resistance (CUP), flocculation (FLO), and glucose metabolism (HXT), as well as the SNO gene family whose members are expressed before or during the diauxic shift, showed substantial CN diversity across the 132 strains examined. Importantly, these same gene families have been shown, through comparative transcriptomic and functional assays, to be associated with adaptation to the wine fermentation environment. Our results suggest that CN variation is a substantial contributor to the genomic diversity of wine yeast strains, and identify several candidate loci whose levels of CN variation may affect the adaptation and performance of wine yeast strains during fermentation.
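Figures such as "CNVRs affecting ∼4% of the genome" rest on collapsing overlapping CNV calls into regions and summing their lengths. A minimal sketch of that bookkeeping follows; it is a generic interval merge rather than the authors' pipeline, and the function names, the half-open coordinate convention, and the toy genome size are assumptions.

```python
def merge_cnvs(calls):
    """Merge overlapping (chrom, start, end) calls into CNV regions.

    Coordinates are treated as half-open [start, end); this is an assumption.
    """
    regions = []
    for chrom, start, end in sorted(calls):
        last = regions[-1] if regions else None
        if last and last[0] == chrom and start <= last[2]:
            # Overlaps (or touches) the previous region: extend it.
            last[2] = max(last[2], end)
        else:
            regions.append([chrom, start, end])
    return [tuple(r) for r in regions]


def covered_fraction(regions, genome_size):
    """Fraction of the genome spanned by the merged regions."""
    return sum(end - start for _, start, end in regions) / genome_size


toy_calls = [
    ("chrI", 100, 500),
    ("chrI", 400, 900),    # overlaps the first call
    ("chrI", 1500, 1800),
    ("chrII", 100, 200),
]
toy_regions = merge_cnvs(toy_calls)
toy_fraction = covered_fraction(toy_regions, genome_size=12_000)
```

With these toy calls the first two intervals merge into one region, leaving three regions that span 1,200 of 12,000 bases.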
Affiliation(s)
- Jacob Steenwyk
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37235
| | - Antonis Rokas
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee 37235
| |
|
188
|
Selection To Increase Expression, Not Sequence Diversity, Precedes Gene Family Origin and Expansion in Rattlesnake Venom. Genetics 2017; 206:1569-1580. [PMID: 28476866 DOI: 10.1534/genetics.117.202655] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 03/31/2017] [Accepted: 05/02/2017] [Indexed: 11/18/2022] Open
Abstract
Gene duplication is the primary mechanism leading to new genes and phenotypic novelty, but the proximate evolutionary processes underlying gene family origin, maintenance, and expansion are poorly understood. Although sub- and neofunctionalization provide clear long-term advantages, selection does not act with foresight, and unless a redundant gene copy provides an immediate fitness advantage, the copy will most likely be lost. Many models for the evolution of genes immediately following duplication have been proposed, but the robustness and applicability of these models is unclear because of the lack of data at the population level. We used qPCR, protein expression data, genome sequencing, and hybrid enrichment to test three competing models that differ in whether selection favoring the spread of duplicates acts primarily on expression level or sequence diversity for specific toxin-encoding loci in the eastern diamondback rattlesnake (Crotalus adamanteus). We sampled 178 individuals and identified significant inter- and intrapopulation variation in copy number, demonstrated that copy number was significantly and positively correlated with protein expression, and found little to no sequence variation across paralogs in all populations. Collectively, these results demonstrate that selection for increased expression, not sequence diversity, was the proximate evolutionary process underlying gene family origin and expansion, providing data needed to resolve the debate over which evolutionary processes govern the fates of gene copies immediately following duplication.
|
189
|
Lubin IM, Aziz N, Babb LJ, Ballinger D, Bisht H, Church DM, Cordes S, Eilbeck K, Hyland F, Kalman L, Landrum M, Lockhart ER, Maglott D, Marth G, Pfeifer JD, Rehm HL, Roy S, Tezak Z, Truty R, Ullman-Cullere M, Voelkerding KV, Worthey EA, Zaranek AW, Zook JM. Principles and Recommendations for Standardizing the Use of the Next-Generation Sequencing Variant File in Clinical Settings. J Mol Diagn 2017; 19:417-426. [PMID: 28315672 PMCID: PMC5417043 DOI: 10.1016/j.jmoldx.2016.12.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/22/2016] [Revised: 12/05/2016] [Accepted: 12/23/2016] [Indexed: 11/30/2022] Open
Abstract
A national workgroup convened by the Centers for Disease Control and Prevention identified principles and made recommendations for standardizing the description of sequence data contained within the variant file generated during the course of clinical next-generation sequence analysis for diagnosing human heritable conditions. The specifications for variant files were initially developed to be flexible with regard to content representation to support a variety of research applications. This flexibility permits variation with regard to how sequence findings are described and this depends, in part, on the conventions used. For clinical laboratory testing, this poses a problem because these differences can compromise the capability to compare sequence findings among laboratories to confirm results and to query databases to identify clinically relevant variants. To provide for a more consistent representation of sequence findings described within variant files, the workgroup made several recommendations that considered alignment to a common reference sequence, variant caller settings, use of genomic coordinates, and gene and variant naming conventions. These recommendations were considered with regard to the existing variant file specifications presently used in the clinical setting. Adoption of these recommendations is anticipated to reduce the potential for ambiguity in describing sequence findings and facilitate the sharing of genomic data among clinical laboratories and other entities.
Affiliation(s)
- Ira M Lubin
- Division of Laboratory Systems, Centers for Disease Control and Prevention, Atlanta, Georgia.
| | - Nazneen Aziz
- College of American Pathologists, Chicago, Illinois; Kaiser Permanente Research Bank, Oakland, California
| | - Lawrence J Babb
- Partners Healthcare Personalized Medicine, Cambridge, Massachusetts; GeneInsight, a Sunquest Company, Boston, Massachusetts
| | | | - Himani Bisht
- Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland
| | - Deanna M Church
- Personalis, Menlo Park, California; National Center for Biotechnology Information, NIH, Bethesda, Maryland; 10× Genomics, Pleasanton, California
| | | | - Karen Eilbeck
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah
| | | | - Lisa Kalman
- Division of Laboratory Systems, Centers for Disease Control and Prevention, Atlanta, Georgia
| | - Melissa Landrum
- National Center for Biotechnology Information, NIH, Bethesda, Maryland
| | - Edward R Lockhart
- Division of Laboratory Systems, Centers for Disease Control and Prevention, Atlanta, Georgia
| | - Donna Maglott
- National Center for Biotechnology Information, NIH, Bethesda, Maryland
| | - Gabor Marth
- Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, Utah; Boston College, Chestnut Hill, Massachusetts
| | - John D Pfeifer
- Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri
| | - Heidi L Rehm
- Department of Pathology, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts
| | - Somak Roy
- Division of Molecular and Genomic Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania
| | - Zivana Tezak
- Center for Devices and Radiological Health, US Food and Drug Administration, Silver Spring, Maryland
| | - Rebecca Truty
- Complete Genomics, Mountain View, California; Invitae Corporation, San Francisco, California
| | | | - Karl V Voelkerding
- Department of Pathology, University of Utah and the Institute for Clinical and Experimental Pathology, Associated Regional and University Pathologists Laboratories, Salt Lake City, Utah
| | - Elizabeth A Worthey
- Department of Pediatrics, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Alexander W Zaranek
- Personal Genome Project, Harvard Medical School, Boston, Massachusetts; Curoverse, Inc., Somerville, Massachusetts
| | - Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, Maryland
| |
|
190
|
Schneider VA, Graves-Lindsay T, Howe K, Bouk N, Chen HC, Kitts PA, Murphy TD, Pruitt KD, Thibaud-Nissen F, Albracht D, Fulton RS, Kremitzki M, Magrini V, Markovic C, McGrath S, Steinberg KM, Auger K, Chow W, Collins J, Harden G, Hubbard T, Pelan S, Simpson JT, Threadgold G, Torrance J, Wood JM, Clarke L, Koren S, Boitano M, Peluso P, Li H, Chin CS, Phillippy AM, Durbin R, Wilson RK, Flicek P, Eichler EE, Church DM. Evaluation of GRCh38 and de novo haploid genome assemblies demonstrates the enduring quality of the reference assembly. Genome Res 2017; 27:849-864. [PMID: 28396521 PMCID: PMC5411779 DOI: 10.1101/gr.213611.116] [Citation(s) in RCA: 558] [Impact Index Per Article: 79.7] [Received: 07/29/2016] [Accepted: 03/14/2017] [Indexed: 11/24/2022]
Abstract
The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.
Affiliation(s)
- Valerie A Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Tina Graves-Lindsay
- McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
| | - Kerstin Howe
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Nathan Bouk
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Hsiu-Chuan Chen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Paul A Kitts
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
| | - Derek Albracht
- McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
| | - Robert S Fulton
- McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
| | - Milinn Kremitzki
- McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
| | - Vincent Magrini
- McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
| | - Chris Markovic
- McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
| | - Sean McGrath
- McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
| | | | - Kate Auger
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - William Chow
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Joanna Collins
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Glenn Harden
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Timothy Hubbard
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Sarah Pelan
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Jared T Simpson
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Glen Threadgold
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - James Torrance
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Jonathan M Wood
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Laura Clarke
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Sergey Koren
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | | | - Paul Peluso
- Pacific Biosciences, Menlo Park, California 94025, USA
| | - Heng Li
- Broad Institute, Cambridge, Massachusetts 02142, USA
| | | | - Adam M Phillippy
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Richard Durbin
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
| | - Richard K Wilson
- McDonnell Genome Institute at Washington University, St. Louis, Missouri 63018, USA
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington 98195, USA.,Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, USA
| | - Deanna M Church
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
| |
|
191
|
Gao C, Hsu FC, Dimitrov LM, Okut H, Chen YDI, Taylor KD, Rotter JI, Langefeld CD, Bowden DW, Palmer ND. A genome-wide linkage and association analysis of imputed insertions and deletions with cardiometabolic phenotypes in Mexican Americans: The Insulin Resistance Atherosclerosis Family Study. Genet Epidemiol 2017; 41:353-362. [PMID: 28378447 DOI: 10.1002/gepi.22042] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/18/2016] [Revised: 12/23/2016] [Accepted: 02/04/2017] [Indexed: 11/09/2022]
Abstract
Insertions and deletions (INDELs) represent a significant fraction of interindividual variation in the human genome yet their contribution to phenotypes is poorly understood. To confirm the quality of imputed INDELs and investigate their roles in mediating cardiometabolic phenotypes, genome-wide association and linkage analyses were performed for 15 phenotypes with 1,273,952 imputed INDELs in 1,024 Mexican-origin Americans. Imputation quality was validated using whole exome sequencing with an average kappa of 0.93 in common INDELs (minor allele frequencies [MAFs] ≥ 5%). Association analysis revealed one genome-wide significant association signal for the cholesterylester transfer protein gene (CETP) with high-density lipoprotein levels (rs36229491, P = 3.06 × 10⁻¹²); linkage analysis identified two peaks with logarithm of the odds (LOD) > 5 (rs60560566, LOD = 5.36 with insulin sensitivity (SI) and rs5825825, LOD = 5.11 with adiponectin levels). Suggestive overlapping signals between linkage and association were observed: rs59849892 in the WSC domain containing 2 gene (WSCD2) was associated and nominally linked with SI (P = 1.17 × 10⁻⁷, LOD = 1.99). This gene has been implicated in glucose metabolism in human islet cell expression studies. In addition, rs201606363 was linked and nominally associated with low-density lipoprotein (P = 4.73 × 10⁻⁴, LOD = 3.67), apolipoprotein B (P = 1.39 × 10⁻³, LOD = 4.64), and total cholesterol (P = 1.35 × 10⁻², LOD = 3.80) levels. rs201606363 is an intronic variant of the UBE2F-SCLY (where UBE2F is ubiquitin-conjugating enzyme E2F and SCLY is selenocysteine lyase) fusion gene that may regulate cholesterol through selenium metabolism. In conclusion, these results confirm the feasibility of imputing INDELs from array-based single nucleotide polymorphism (SNP) genotypes. Analysis of these variants using association and linkage replicated previously identified SNP signals and identified multiple novel INDEL signals. These results support the inclusion of INDELs into genetic studies to more fully interrogate the spectrum of genetic variation.
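The kappa statistic quoted for imputation quality measures agreement between imputed and sequenced genotypes beyond what chance would produce. The sketch below computes Cohen's kappa for two genotype vectors; the 0/1/2 allele-count coding and the toy data are assumptions, and the study's exact procedure may differ.

```python
from collections import Counter


def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected)."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    # Chance agreement from the marginal genotype frequencies of each source.
    expected = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both sources are constant and identical
    return (observed - expected) / (1 - expected)


# Toy genotypes coded as alternate-allele counts (hypothetical data).
imputed = [0, 0, 1, 1, 2, 2, 0, 1]
sequenced = [0, 0, 1, 1, 2, 2, 0, 2]
kappa = cohen_kappa(imputed, sequenced)
```

Here seven of eight calls agree (0.875 observed agreement), but kappa is lower, about 0.81, because some of that agreement is expected by chance.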
Affiliation(s)
- Chuan Gao
- Molecular Genetics and Genomics Program, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.,Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.,Center for Public Health Genomics, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
| | - Fang-Chi Hsu
- Department of Biostatistical Sciences, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
| | - Latchezar M Dimitrov
- Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
| | - Hayrettin Okut
- Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
| | - Yii-Der I Chen
- Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, United States of America.,Department of Pediatrics, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, United States of America
| | - Kent D Taylor
- Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, United States of America.,Department of Pediatrics, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, United States of America
| | - Jerome I Rotter
- Institute for Translational Genomics and Population Sciences, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, United States of America.,Department of Pediatrics, Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center, Torrance, California, United States of America
| | - Carl D Langefeld
- Center for Public Health Genomics, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.,Division of Public Health Sciences, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
| | - Donald W Bowden
- Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.,Center for Diabetes Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
| | - Nicholette D Palmer
- Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.,Center for Public Health Genomics, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.,Department of Biochemistry, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America.,Center for Diabetes Research, Wake Forest School of Medicine, Winston-Salem, North Carolina, United States of America
| |
|
192
|
Tomaszkiewicz M, Medvedev P, Makova KD. Y and W Chromosome Assemblies: Approaches and Discoveries. Trends Genet 2017; 33:266-282. [DOI: 10.1016/j.tig.2017.01.008] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Received: 10/07/2016] [Revised: 12/05/2016] [Accepted: 01/24/2017] [Indexed: 01/19/2023]
|
193
|
Gao Y, Jiang J, Yang S, Hou Y, Liu GE, Zhang S, Zhang Q, Sun D. CNV discovery for milk composition traits in dairy cattle using whole genome resequencing. BMC Genomics 2017; 18:265. [PMID: 28356085 PMCID: PMC5371188 DOI: 10.1186/s12864-017-3636-3] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 10/08/2016] [Accepted: 03/17/2017] [Indexed: 01/08/2023] Open
Abstract
Background Copy number variations (CNVs) are important and widely distributed in the genome. CNV detection opens a new avenue for exploring genes associated with complex traits in humans, animals and plants. Herein, we present a genome-wide assessment of CNVs that are potentially associated with milk composition traits in dairy cattle. Results In this study, CNVs were detected based on whole genome re-sequencing data of eight Holstein bulls from four half- and/or full-sib families, with extremely high and low estimated breeding values (EBVs) of milk protein percentage and fat percentage. The range of coverage depth per individual was 8.2–11.9×. Using CNVnator, we identified a total of 14,821 CNVs, including 5025 duplications and 9796 deletions. Among them, 487 differential CNV regions (CNVRs) comprising ~8.23 Mb of the cattle genome were observed between the high and low groups. Annotation of these differential CNVRs was performed based on the cattle genome reference assembly (UMD3.1), and a total of 235 functional genes were found within the CNVRs. By Gene Ontology and KEGG pathway analyses, we found that genes were significantly enriched for specific biological functions related to protein and lipid metabolism, insulin/IGF pathway-protein kinase B signaling cascade, prolactin signaling pathway and AMPK signaling pathways. These genes included INS, IGF2, FOXO3, TH, SCD5, GALNT18, GALNT16, ART3, SNCA and WNT7A, implying their potential association with milk protein and fat traits. In addition, 95 CNVRs overlapped with 75 known QTLs that are associated with milk protein and fat traits of dairy cattle (Cattle QTLdb). Conclusions In conclusion, based on NGS of 8 Holstein bulls with extremely high and low EBVs for milk PP and FP, we identified a total of 14,821 CNVs, 487 differential CNVRs between groups, and 10 genes, which were suggested as promising candidate genes for milk protein and fat traits.
Affiliation(s)
- Yahui Gao
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
| | - Jianping Jiang
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
| | - Shaohua Yang
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
| | - Yali Hou
- CAS Key Laboratory of Genomic and Precision Medicine, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, 100101, China
| | - George E Liu
- Animal Genomics and Improvement Laboratory, BARC, USDA-ARS, Beltsville, Md, 20705, USA
| | - Shengli Zhang
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
| | - Qin Zhang
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China
| | - Dongxiao Sun
- Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University, Beijing, 100193, China.
| |
|
194
|
Dougherty ML, Nuttle X, Penn O, Nelson BJ, Huddleston J, Baker C, Harshman L, Duyzend MH, Ventura M, Antonacci F, Sandstrom R, Dennis MY, Eichler EE. The birth of a human-specific neural gene by incomplete duplication and gene fusion. Genome Biol 2017; 18:49. [PMID: 28279197 PMCID: PMC5345166 DOI: 10.1186/s13059-017-1163-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 10/18/2016] [Accepted: 01/27/2017] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND Gene innovation by duplication is a fundamental evolutionary process but is difficult to study in humans due to the large size, high sequence identity, and mosaic nature of segmental duplication blocks. The human-specific gene hydrocephalus-inducing 2, HYDIN2, was generated by a 364 kbp duplication of 79 internal exons of the large ciliary gene HYDIN from chromosome 16q22.2 to chromosome 1q21.1. Because the HYDIN2 locus lacks the ancestral promoter and seven terminal exons of the progenitor gene, we sought to characterize transcription at this locus by coupling reverse transcription polymerase chain reaction and long-read sequencing. RESULTS 5' RACE indicates a transcription start site for HYDIN2 outside of the duplication and we observe fusion transcripts spanning both the 5' and 3' breakpoints. We observe extensive splicing diversity leading to the formation of altered open reading frames (ORFs) that appear to be under relaxed selection. We show that HYDIN2 adopted a new promoter that drives an altered pattern of expression, with highest levels in neural tissues. We estimate that the HYDIN duplication occurred ~3.2 million years ago and find that it is nearly fixed (99.9%) for diploid copy number in contemporary humans. Examination of 73 chromosome 1q21 rearrangement patients reveals that HYDIN2 is deleted or duplicated in most cases. CONCLUSIONS Together, these data support a model of rapid gene innovation by fusion of incomplete segmental duplications, altered tissue expression, and potential subfunctionalization or neofunctionalization of HYDIN2 early in the evolution of the Homo lineage.
Affiliation(s)
- Max L Dougherty
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Xander Nuttle
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Osnat Penn
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Bradley J Nelson
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- John Huddleston
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- Carl Baker
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Lana Harshman
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Michael H Duyzend
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
- Mario Ventura
  - Department of Biology, University of Bari, Bari, 70121, Italy
- Megan Y Dennis
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
  - Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, 95616, CA, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, 3720 15 Ave NE, S413C, Box 355065, Seattle, WA, 98195-5065, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
195
Bekpen C, Künzel S, Xie C, Eaaswarkhanth M, Lin YL, Gokcumen O, Akdis CA, Tautz D. Segmental duplications and evolutionary acquisition of UV damage response in the SPATA31 gene family of primates and humans. BMC Genomics 2017; 18:222. [PMID: 28264649 PMCID: PMC5338094 DOI: 10.1186/s12864-017-3595-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 11/08/2016] [Accepted: 02/20/2017] [Indexed: 12/11/2022] Open
Abstract
Background: Segmental duplications are an abundant source for novel gene functions and evolutionary adaptations. This mechanism of generating novelty was very active during the evolution of primates particularly in the human lineage. Here, we characterize the evolution and function of the SPATA31 gene family (former designation FAM75A), which was previously shown to be among the gene families with the strongest signal of positive selection in hominoids. The mouse homologue for this gene family is a single copy gene expressed during spermatogenesis.
Results: We show that in primates, the SPATA31 gene duplicated into SPATA31A and SPATA31C types and broadened the expression into many tissues. Each type became further segmentally duplicated in the line towards humans with the largest number of full-length copies found for SPATA31A in humans. Copy number estimates of SPATA31A based on digital PCR show an average of 7.5 with a range of 5–11 copies per diploid genome among human individuals. The primate SPATA31 genes also acquired new protein domains that suggest an involvement in UV response and DNA repair. We generated antibodies and show that the protein is re-localized from the nucleolus to the whole nucleus upon UV-irradiation suggesting a UV damage response. We used CRISPR/Cas mediated mutagenesis to knockout copies of the gene in human primary fibroblast cells. We find that cell lines with reduced functional copies as well as naturally occurring low copy number HFF cells show enhanced sensitivity towards UV-irradiation.
Conclusion: The acquisition of new SPATA31 protein functions and its broadening of expression may be related to the evolution of the diurnal life style in primates that required a higher UV tolerance. The increased segmental duplications in hominoids as well as its fast evolution suggest the acquisition of further specific functions particularly in humans.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-017-3595-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Cemalettin Bekpen
  - Max-Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306, Plön, Germany
- Sven Künzel
  - Max-Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306, Plön, Germany
- Chen Xie
  - Max-Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306, Plön, Germany
- Muthukrishnan Eaaswarkhanth
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, 14260-1300, NY, USA
  - Present address: Population Genomics and Genetic Epidemiology Unit, Dasman Diabetes Institute, P.O.Box 1180, Dasman, 15462, Kuwait
- Yen-Lung Lin
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, 14260-1300, NY, USA
- Omer Gokcumen
  - Department of Biological Sciences, State University of New York at Buffalo, Buffalo, 14260-1300, NY, USA
- Cezmi A Akdis
  - Swiss Institute of Allergy and Asthma Research (SIAF), Davos, CH-7270, Switzerland
- Diethard Tautz
  - Max-Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306, Plön, Germany
196
Abstract
Deciphering the genetic basis of human disease requires a comprehensive knowledge of genetic variants irrespective of their class or frequency. Although an impressive number of human genetic variants have been catalogued, a large fraction of the genetic difference that distinguishes two human genomes is still not understood at the base-pair level. This is because the emphasis has been on single-nucleotide variation as opposed to less tractable and more complex genetic variants, including indels and structural variants. The latter, we propose, will have a large impact on human phenotypes but require a more systematic assessment of genomes at deeper coverage and alternate sequencing and mapping technologies.
197
Carpenter D, Mitchell LM, Armour JAL. Copy number variation of human AMY1 is a minor contributor to variation in salivary amylase expression and activity. Hum Genomics 2017; 11:2. [PMID: 28219410 PMCID: PMC5319014 DOI: 10.1186/s40246-017-0097-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 10/14/2016] [Accepted: 02/02/2017] [Indexed: 12/14/2022] Open
Abstract
Background: Salivary amylase in humans is encoded by the copy variable gene AMY1 in the amylase gene cluster on chromosome 1. Although the role of salivary amylase is well established, the consequences of the copy number variation (CNV) at AMY1 on salivary amylase protein production are less well understood. The amylase gene cluster is highly structured with a fundamental difference between odd and even AMY1 copy number haplotypes. In this study, we aimed to explore, in samples from 119 unrelated individuals, not only the effects of AMY1 CNV on salivary amylase protein expression and amylase enzyme activity but also whether there is any evidence for underlying difference between the common haplotypes containing odd numbers of AMY1 and even copy number haplotypes.
Results: AMY1 copy number was significantly correlated with the variation observed in salivary amylase production (11.7% of variance, P < 0.0005) and enzyme activity (13.6% of variance, P < 0.0005) but did not explain the majority of observed variation between individuals. AMY1-odd and AMY1-even haplotypes showed a different relationship between copy number and expression levels, but the difference was not statistically significant (P = 0.052).
Conclusions: Production of salivary amylase is correlated with AMY1 CNV, but the majority of interindividual variation comes from other sources. Long-range haplotype structure may affect expression, but this was not significant in our data.
Electronic supplementary material: The online version of this article (doi:10.1186/s40246-017-0097-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Danielle Carpenter
  - School of Life Sciences, University of Nottingham, Nottingham, NG7 2UH, UK
- Laura M Mitchell
  - School of Life Sciences, University of Nottingham, Nottingham, NG7 2UH, UK
- John A L Armour
  - School of Life Sciences, University of Nottingham, Nottingham, NG7 2UH, UK
198
Dennis MY, Harshman L, Nelson BJ, Penn O, Cantsilieris S, Huddleston J, Antonacci F, Penewit K, Denman L, Raja A, Baker C, Mark K, Malig M, Janke N, Espinoza C, Stessman HAF, Nuttle X, Hoekzema K, Lindsay-Graves TA, Wilson RK, Eichler EE. The evolution and population diversity of human-specific segmental duplications. Nat Ecol Evol 2017; 1:69. [PMID: 28580430 PMCID: PMC5450946 DOI: 10.1038/s41559-016-0069] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Indexed: 12/21/2022]
Abstract
Segmental duplications contribute to human evolution, adaptation and genomic instability but are often poorly characterized. We investigate the evolution, genetic variation and coding potential of human-specific segmental duplications (HSDs). We identify 218 HSDs based on analysis of 322 deeply sequenced archaic and contemporary hominid genomes. We sequence 550 human and nonhuman primate genomic clones to reconstruct the evolution of the largest, most complex regions with protein-coding potential (n=80 genes/33 gene families). We show that HSDs are non-randomly organized, associate preferentially with ancestral ape duplications termed “core duplicons”, and evolved primarily in an interspersed inverted orientation. In addition to Homo sapiens-specific gene expansions (e.g., TCAF1/2), we highlight ten gene families (e.g., ARHGAP11B and SRGAP2C) where copy number never returns to the ancestral state, there is evidence of mRNA splicing, and no common gene-disruptive mutations are observed in the general population. Such duplicates are candidates for the evolution of human-specific adaptive traits.
Affiliation(s)
- Megan Y Dennis
  - Genome Center, MIND Institute, and Department of Biochemistry & Molecular Medicine, University of California, Davis, CA 95616, USA
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Lana Harshman
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Bradley J Nelson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Osnat Penn
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Stuart Cantsilieris
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- John Huddleston
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
- Francesca Antonacci
  - Dipartimento di Biologia, Università degli Studi di Bari "Aldo Moro", Bari 70125, Italy
- Kelsi Penewit
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Laura Denman
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Archana Raja
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
- Carl Baker
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Kenneth Mark
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Maika Malig
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Nicolette Janke
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Claudia Espinoza
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Holly A F Stessman
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Xander Nuttle
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
- Tina A Lindsay-Graves
  - McDonnell Genome Institute at Washington University, Washington University School of Medicine, St. Louis, MO 63108, USA
- Richard K Wilson
  - McDonnell Genome Institute at Washington University, Washington University School of Medicine, St. Louis, MO 63108, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
199
Shwan NAA, Louzada S, Yang F, Armour JAL. Recurrent Rearrangements of Human Amylase Genes Create Multiple Independent CNV Series. Hum Mutat 2017; 38:532-539. [PMID: 28101908 DOI: 10.1002/humu.23182] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 09/12/2016] [Accepted: 01/16/2017] [Indexed: 01/17/2023]
Abstract
The human amylase gene cluster includes the human salivary (AMY1) and pancreatic amylase genes (AMY2A and AMY2B), and is a highly variable and dynamic region of the genome. Copy number variation (CNV) of AMY1 has been implicated in human dietary adaptation, and in population association with obesity, but neither of these findings has been independently replicated. Despite these functional implications, the structural genomic basis of CNV has only been defined in detail very recently. In this work, we use high-resolution analysis of copy number, and analysis of segregation in trios, to define new, independent allelic series of amylase CNVs in sub-Saharan Africans, including a series of higher-order expansions of a unit consisting of one copy each of AMY1, AMY2A, and AMY2B. We use fiber-FISH (fluorescence in situ hybridization) to define unexpected complexity in the accompanying rearrangements. These findings demonstrate recurrent involvement of the amylase gene region in genomic instability, involving at least five independent rearrangements of the pancreatic amylase genes (AMY2A and AMY2B). Structural features shared by fundamentally distinct lineages strongly suggest that the common ancestral state for the human amylase cluster contained more than one, and probably three, copies of AMY1.
Affiliation(s)
- Nzar A A Shwan
  - School of Life Sciences, University of Nottingham, Medical School, Queen's Medical Centre, Nottingham, UK
  - Scientific Research Centre, University of Salahaddin, Erbil, Kurdistan, Iraq
- Sandra Louzada
  - Wellcome Trust Genome Campus, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Fengtang Yang
  - Wellcome Trust Genome Campus, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- John A L Armour
  - School of Life Sciences, University of Nottingham, Medical School, Queen's Medical Centre, Nottingham, UK
200
Ooi DSQ, Tan VMH, Ong SG, Chan YH, Heng CK, Lee YS. Differences in AMY1 Gene Copy Numbers Derived from Blood, Buccal Cells and Saliva Using Quantitative and Droplet Digital PCR Methods: Flagging the Pitfall. PLoS One 2017; 12:e0170767. [PMID: 28125683 PMCID: PMC5268653 DOI: 10.1371/journal.pone.0170767] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/25/2016] [Accepted: 01/10/2017] [Indexed: 11/18/2022] Open
Abstract
Introduction: The human salivary (AMY1) gene, encoding salivary α-amylase, has variable copy number variants (CNVs) in the human genome. We aimed to determine if real-time quantitative polymerase chain reaction (qPCR) and the more recently available Droplet Digital PCR (ddPCR) can provide a precise quantification of the AMY1 gene copy number in blood, buccal cells and saliva samples derived from the same individual.
Methods: Seven participants were recruited and DNA was extracted from the blood, buccal cells and saliva samples provided by each participant. Taqman assay real-time qPCR and ddPCR were conducted to quantify AMY1 gene copy numbers. Statistical analysis was carried out to determine the difference in AMY1 gene copy number between the different biological specimens and different assay methods.
Results: We found significant within-individual difference (p<0.01) in AMY1 gene copy number between different biological samples as determined by qPCR. However, there was no significant within-individual difference in AMY1 gene copy number between different biological samples as determined by ddPCR. We also found that AMY1 gene copy number of blood samples were comparable between qPCR and ddPCR, while there is a significant difference (p<0.01) between AMY1 gene copy numbers measured by qPCR and ddPCR for both buccal swab and saliva samples.
Conclusions: Despite buccal cells and saliva samples being possible sources of DNA, it is pertinent that ddPCR or a single biological sample, preferably blood sample, be used for determining highly polymorphic gene copy numbers like AMY1, due to the large within-individual variability between different biological samples if real time qPCR is employed.
Affiliation(s)
- Delicia Shu Qin Ooi
  - Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
  - Division of Endocrinology and Diabetes, Khoo Teck Puat-National University Children's Medical Institute, National University Hospital, National University Health System, Singapore
- Verena Ming Hui Tan
  - Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
  - Division of Endocrinology and Diabetes, Khoo Teck Puat-National University Children's Medical Institute, National University Hospital, National University Health System, Singapore
  - Singapore Institute for Clinical Sciences, A*STAR, Singapore
- Siong Gim Ong
  - Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
  - Division of Endocrinology and Diabetes, Khoo Teck Puat-National University Children's Medical Institute, National University Hospital, National University Health System, Singapore
- Yiong Huak Chan
  - Biostatistics Unit, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
- Chew Kiat Heng
  - Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
  - Division of Endocrinology and Diabetes, Khoo Teck Puat-National University Children's Medical Institute, National University Hospital, National University Health System, Singapore
- Yung Seng Lee
  - Department of Paediatrics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore
  - Division of Endocrinology and Diabetes, Khoo Teck Puat-National University Children's Medical Institute, National University Hospital, National University Health System, Singapore
  - Singapore Institute for Clinical Sciences, A*STAR, Singapore