1
Hallast P, Ebert P, Loftus M, Yilmaz F, Audano PA, Logsdon GA, Bonder MJ, Zhou W, Höps W, Kim K, Li C, Hoyt SJ, Dishuck PC, Porubsky D, Tsetsos F, Kwon JY, Zhu Q, Munson KM, Hasenfeld P, Harvey WT, Lewis AP, Kordosky J, Hoekzema K, O'Neill RJ, Korbel JO, Tyler-Smith C, Eichler EE, Shi X, Beck CR, Marschall T, Konkel MK, Lee C. Assembly of 43 human Y chromosomes reveals extensive complexity and variation. Nature 2023; 621:355-364. [PMID: 37612510 PMCID: PMC10726138 DOI: 10.1038/s41586-023-06425-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/30/2022] [Accepted: 07/11/2023] [Indexed: 08/25/2023]
Abstract
The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date¹ and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes². Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established¹ boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.
2
Pawar H, Rymbekova A, Cuadros-Espinoza S, Huang X, de Manuel M, van der Valk T, Lobon I, Alvarez-Estape M, Haber M, Dolgova O, Han S, Esteller-Cucala P, Juan D, Ayub Q, Bautista R, Kelley JL, Cornejo OE, Lao O, Andrés AM, Guschanski K, Ssebide B, Cranfield M, Tyler-Smith C, Xue Y, Prado-Martinez J, Marques-Bonet T, Kuhlwilm M. Ghost admixture in eastern gorillas. Nat Ecol Evol 2023; 7:1503-1514. [PMID: 37500909 PMCID: PMC10482688 DOI: 10.1038/s41559-023-02145-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/19/2022] [Accepted: 06/30/2023] [Indexed: 07/29/2023]
Abstract
Archaic admixture has had a substantial impact on human evolution with multiple events across different clades, including from extinct hominins such as Neanderthals and Denisovans into modern humans. In great apes, archaic admixture has been identified in chimpanzees and bonobos but the possibility of such events has not been explored in other species. Here, we address this question using high-coverage whole-genome sequences from all four extant gorilla subspecies, including six newly sequenced eastern gorillas from previously unsampled geographic regions. Using approximate Bayesian computation with neural networks to model the demographic history of gorillas, we find a signature of admixture from an archaic 'ghost' lineage into the common ancestor of eastern gorillas but not western gorillas. We infer that up to 3% of the genome of these individuals is introgressed from an archaic lineage that diverged more than 3 million years ago from the common ancestor of all extant gorillas. This introgression event took place before the split of mountain and eastern lowland gorillas, probably more than 40 thousand years ago and may have influenced perception of bitter taste in eastern gorillas. When comparing the introgression landscapes of gorillas, humans and bonobos, we find a consistent depletion of introgressed fragments on the X chromosome across these species. However, depletion in protein-coding content is not detectable in eastern gorillas, possibly as a consequence of stronger genetic drift in this species.
3
Lohse K, Hayward A, Laetsch DR, Marques V, Vila R, Tyler-Smith C. The genome sequence of the Mazarine Blue, Cyaniris semiargus (Rottemburg, 1775). Wellcome Open Res 2023; 8:181. [PMID: 38779052 PMCID: PMC11109581 DOI: 10.12688/wellcomeopenres.19362.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 04/05/2023] [Indexed: 05/25/2024]
Abstract
We present a genome assembly from an individual male Cyaniris semiargus (the Mazarine Blue; Arthropoda; Insecta; Lepidoptera; Lycaenidae). The genome sequence is 441.5 megabases in span. Most of the assembly is scaffolded into 24 chromosomal pseudomolecules, including the assembled Z sex chromosome. The mitochondrial genome has also been assembled and is 15.4 kilobases in length. Gene annotation of this assembly on Ensembl identified 16,408 protein coding genes.
4
Boyes D, Tyler-Smith C. The genome sequence of the Brown Scallop, Philereme vetulata (Denis and Schiffermüller, 1775). Wellcome Open Res 2023. [DOI: 10.12688/wellcomeopenres.18948.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2023]
Abstract
We present a genome assembly from an individual female Philereme vetulata (the Brown Scallop; Arthropoda; Insecta; Lepidoptera; Geometridae). The genome sequence is 771 megabases in span. Most of the assembly is scaffolded into 68 chromosomal pseudomolecules, including the assembled Z sex chromosome. The mitochondrial genome has also been assembled and is 16.3 kilobases in length. Gene annotation of this assembly on Ensembl has identified 18,096 protein coding genes.
5
Pedrazza L, Martinez-Martinez A, Sánchez-de-Diego C, Valer JA, Pimenta-Lopes C, Sala-Gaston J, Szpak M, Tyler-Smith C, Ventura F, Rosa JL. HERC1 deficiency causes osteopenia through transcriptional program dysregulation during bone remodeling. Cell Death Dis 2023; 14:17. [PMID: 36635269 PMCID: PMC9837143 DOI: 10.1038/s41419-023-05549-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 12/19/2022] [Accepted: 01/03/2023] [Indexed: 01/13/2023]
Abstract
Bone remodeling is a continuous process between bone-forming osteoblasts and bone-resorbing osteoclasts, with any imbalance resulting in metabolic bone disease, including osteopenia. The HERC1 gene encodes an E3 ubiquitin ligase that affects cellular processes by regulating the ubiquitination of target proteins, such as C-RAF. Of interest, an association exists between biallelic pathogenic sequence variants in the HERC1 gene and the neurodevelopmental disorder MDFPMR syndrome (macrocephaly, dysmorphic facies, and psychomotor retardation). Most pathogenic variants cause loss of HERC1 function, and the affected individuals present with features related to altered bone homeostasis. Herc1-knockout mice offer an excellent model in which to study the role of HERC1 in bone remodeling and to understand its role in disease. In this study, we show that HERC1 regulates osteoblastogenesis and osteoclastogenesis, proving that its depletion increases gene expression of osteoblastic markers during the osteogenic differentiation of mesenchymal stem cells. During this process, HERC1 deficiency increases the levels of C-RAF and of phosphorylated ERK and p38. The Herc1-knockout adult mice developed imbalanced bone homeostasis that presented as osteopenia in both sexes of the adult mice. By contrast, only young female knockout mice had osteopenia and increased number of osteoclasts, with the changes associated with reductions in testosterone and dihydrotestosterone levels. Finally, osteocytes isolated from knockout mice showed a higher expression of osteocytic genes and an increase in the Rankl/Opg ratio, indicating a relevant cell-autonomous role of HERC1 when regulating the transcriptional program of bone formation. Overall, these findings present HERC1 as a modulator of bone homeostasis and highlight potential therapeutic targets for individuals affected by pathological HERC1 variants.
6
Aghakhanian F, Hoh BP, Yew CW, Kumar Subbiah V, Xue Y, Tyler-Smith C, Ayub Q, Phipps ME. Sequence analyses of Malaysian Indigenous communities reveal historical admixture between Hoabinhian hunter-gatherers and Neolithic farmers. Sci Rep 2022; 12:13743. [PMID: 35962005 PMCID: PMC9374673 DOI: 10.1038/s41598-022-17884-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 04/08/2022] [Indexed: 11/09/2022]
Abstract
Southeast Asia comprises 11 countries that span mainland Asia across to numerous islands that stretch from the Andaman Sea to the South China Sea and Indian Ocean. This region harbors an impressive diversity of history, culture, religion and biology. Indigenous people of Malaysia display substantial phenotypic, linguistic, and anthropological diversity. Despite this remarkable diversity which has been documented for centuries, the genetic history and structure of indigenous Malaysians remain under-studied. To have a better understanding about the genetic history of these people, especially Malaysian Negritos, we sequenced whole genomes of 15 individuals belonging to five indigenous groups from Peninsular Malaysia and one from North Borneo to high coverage (30X). Our results demonstrate that indigenous populations of Malaysia are genetically close to East Asian populations. We show that present-day Malaysian Negritos can be modeled as an admixture of ancient Hoabinhian hunter-gatherers and Neolithic farmers. We observe gene flow from South Asian populations into the Malaysian indigenous groups, but not into Dusun of North Borneo. Our study proposes that Malaysian indigenous people originated from at least three distinct ancestral populations related to the Hoabinhian hunter-gatherers, Neolithic farmers and Austronesian speakers.
7
Szpak M, Collins SC, Li Y, Liu X, Ayub Q, Fischer MC, Vancollie VE, Lelliott CJ, Xue Y, Yalcin B, Yang H, Tyler-Smith C. A Positively Selected MAGEE2 LoF Allele Is Associated with Sexual Dimorphism in Human Brain Size and Shows Similar Phenotypes in Magee2 Null Mice. Mol Biol Evol 2021; 38:5655-5663. [PMID: 34464968 PMCID: PMC8662591 DOI: 10.1093/molbev/msab243] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
Abstract
A nonsense allele at rs1343879 in human MAGEE2 on chromosome X has previously been reported as a strong candidate for positive selection in East Asia. This premature stop codon causing ∼80% protein truncation is characterized by a striking geographical pattern of high population differentiation: common in Asia and the Americas (up to 84% in the 1000 Genomes Project East Asians) but rare elsewhere. Here, we generated a Magee2 mouse knockout mimicking the human loss-of-function mutation to study its functional consequences. The Magee2 null mice did not exhibit gross abnormalities apart from enlarged brain structures (13% increased total brain area, P = 0.0022) in hemizygous males. The area of the granular retrosplenial cortex responsible for memory, navigation, and spatial information processing was the most severely affected, exhibiting an enlargement of 34% (P = 3.4×10⁻⁶). The brain size in homozygous females showed the opposite trend of reduced brain size, although this did not reach statistical significance. With these insights, we performed human association analyses between brain size measurements and rs1343879 genotypes in 141 Chinese volunteers with brain MRI scans, replicating the sexual dimorphism seen in the knockout mouse model. The derived stop gain allele was significantly associated with a larger volume of gray matter in males (P = 0.00094), and smaller volumes of gray (P = 0.00021) and white (P = 0.0015) matter in females. It is unclear whether or not the observed neuroanatomical phenotypes affect behavior or cognition, but it might have been the driving force underlying the positive selection in humans.
8
Almarri MA, Haber M, Lootah RA, Hallast P, Al Turki S, Martin HC, Xue Y, Tyler-Smith C. The genomic history of the Middle East. Cell 2021; 184:4612-4625.e14. [PMID: 34352227 PMCID: PMC8445022 DOI: 10.1016/j.cell.2021.07.013] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/02/2021] [Revised: 05/17/2021] [Accepted: 07/09/2021] [Indexed: 11/22/2022]
Abstract
The Middle East region is important to understand human evolution and migrations but is underrepresented in genomic studies. Here, we generated 137 high-coverage physically phased genome sequences from eight Middle Eastern populations using linked-read sequencing. We found no genetic traces of early expansions out-of-Africa in present-day populations but found Arabians have elevated Basal Eurasian ancestry that dilutes their Neanderthal ancestry. Population sizes within the region started diverging 15–20 kya, when Levantines expanded while Arabians maintained smaller populations that derived ancestry from local hunter-gatherers. Arabians suffered a population bottleneck around the aridification of Arabia 6 kya, while Levantines had a distinct bottleneck overlapping the 4.2 kya aridification event. We found an association between movement and admixture of populations in the region and the spread of Semitic languages. Finally, we identify variants that show evidence of selection, including polygenic selection. Our results provide detailed insights into the genomic and selective histories of the Middle East. Middle Easterners do not have ancestry from an early out-of-Africa expansion. Basal Eurasian and African ancestry in Arabians deplete their Neanderthal ancestry. Populations experienced bottlenecks overlapping aridification events. Identification of recent single and polygenic signals of selection in Arabia.
9
Fridman H, Yntema HG, Mägi R, Andreson R, Metspalu A, Mezzavila M, Tyler-Smith C, Xue Y, Carmi S, Levy-Lahad E, Gilissen C, Brunner HG. The landscape of autosomal-recessive pathogenic variants in European populations reveals phenotype-specific effects. Am J Hum Genet 2021; 108:608-619. [PMID: 33740458 DOI: 10.1016/j.ajhg.2021.03.004] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 11/27/2020] [Accepted: 03/01/2021] [Indexed: 12/16/2022]
Abstract
The number and distribution of recessive alleles in the population for various diseases are not known at genome-wide-scale. Based on 6,447 exome sequences of healthy, genetically unrelated Europeans of two distinct ancestries, we estimate that every individual is a carrier of at least 2 pathogenic variants in currently known autosomal-recessive (AR) genes and that 0.8%-1% of European couples are at risk of having a child affected with a severe AR genetic disorder. This risk is 16.5-fold higher for first cousins but is significantly more increased for skeletal disorders and intellectual disabilities due to their distinct genetic architecture.
10
Hallast P, Kibena L, Punab M, Arciero E, Rootsi S, Grigorova M, Flores R, Jobling MA, Poolamets O, Pomm K, Korrovits P, Rull K, Xue Y, Tyler-Smith C, Laan M. A common 1.6 Mb Y-chromosomal inversion predisposes to subsequent deletions and severe spermatogenic failure in humans. eLife 2021; 10:65420. [PMID: 33781384 PMCID: PMC8009663 DOI: 10.7554/elife.65420] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/03/2020] [Accepted: 03/15/2021] [Indexed: 12/19/2022]
Abstract
Male infertility is a prevalent condition, affecting 5–10% of men. So far, few genetic factors have been described as contributors to spermatogenic failure. Here, we report the first re-sequencing study of the Y-chromosomal Azoospermia Factor c (AZFc) region, combined with gene dosage analysis of the multicopy DAZ, BPY2, and CDY genes and Y-haplogroup determination. In analysing 2324 Estonian men, we uncovered a novel structural variant as a high-penetrance risk factor for male infertility. The Y lineage R1a1-M458, reported at >20% frequency in several European populations, carries a fixed ~1.6 Mb r2/r3 inversion, destabilizing the AZFc region and predisposing to large recurrent microdeletions. Such complex rearrangements were significantly enriched among severe oligozoospermia cases. The carrier vs non-carrier risk for spermatogenic failure was increased 8.6-fold (p=6.0×10⁻⁴). This finding contributes to improved molecular diagnostics and clinical management of infertility. Carrier identification at young age will facilitate timely counselling and reproductive decision-making.
11
Lall GM, Larmuseau MHD, Wetton JH, Batini C, Hallast P, Huszar TI, Zadik D, Aase S, Baker T, Balaresque P, Bodmer W, Børglum AD, de Knijff P, Dunn H, Harding SE, Løvvik H, Dupuy BM, Pamjav H, Tillmar AO, Tomaszewski M, Tyler-Smith C, Verdugo MP, Winney B, Vohra P, Story J, King TE, Jobling MA. Subdividing Y-chromosome haplogroup R1a1 reveals Norse Viking dispersal lineages in Britain. Eur J Hum Genet 2020; 29:512-523. [PMID: 33139852 PMCID: PMC7940619 DOI: 10.1038/s41431-020-00747-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/23/2019] [Revised: 09/08/2020] [Accepted: 10/07/2020] [Indexed: 12/16/2022]
Abstract
The influence of Viking-Age migrants to the British Isles is obvious in archaeological and place-names evidence, but their demographic impact has been unclear. Autosomal genetic analyses support Norse Viking contributions to parts of Britain, but show no signal corresponding to the Danelaw, the region under Scandinavian administrative control from the ninth to eleventh centuries. Y-chromosome haplogroup R1a1 has been considered as a possible marker for Viking migrations because of its high frequency in peninsular Scandinavia (Norway and Sweden). Here we select ten Y-SNPs to discriminate informatively among hg R1a1 sub-haplogroups in Europe, analyse these in 619 hg R1a1 Y chromosomes including 163 from the British Isles, and also type 23 short-tandem repeats (Y-STRs) to assess internal diversity. We find three specifically Western-European sub-haplogroups, two of which predominate in Norway and Sweden, and are also found in Britain; star-like features in the STR networks of these lineages indicate histories of expansion. We ask whether geographical distributions of hg R1a1 overall, and of the two sub-lineages in particular, correlate with regions of Scandinavian influence within Britain. Neither shows any frequency difference between regions that have higher (≥10%) or lower autosomal contributions from Norway and Sweden, but both are significantly overrepresented in the region corresponding to the Danelaw. These differences between autosomal and Y-chromosomal histories suggest either male-specific contribution, or the influence of patrilocality. Comparison of modern DNA with recently available ancient DNA data supports the interpretation that two sub-lineages of hg R1a1 spread with the Vikings from peninsular Scandinavia.
12
Walsh S, Pagani L, Xue Y, Laayouni H, Tyler-Smith C, Bertranpetit J. Positive selection in admixed populations from Ethiopia. BMC Genet 2020; 21:108. [PMID: 33092534 PMCID: PMC7580818 DOI: 10.1186/s12863-020-00908-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/28/2020] [Accepted: 08/27/2020] [Indexed: 02/06/2023]
Abstract
BACKGROUND: In the process of adaptation of humans to their environment, positive or adaptive selection has played a main role. Positive selection has, however, been under-studied in African populations, despite their diversity and importance for understanding human history. RESULTS: Here, we have used 119 available whole-genome sequences from five Ethiopian populations (Amhara, Oromo, Somali, Wolayta and Gumuz) to investigate the modes and targets of positive selection in this part of the world. The site frequency spectrum-based test SFselect was applied to identify a wide range of events of selection (old and recent), and the haplotype-based statistic integrated haplotype score to detect more recent events, in each case with evaluation of the significance of candidate signals by extensive simulations. Additional insights were provided by considering admixture proportions and functional categories of genes. We identified both individual loci that are likely targets of classic sweeps and groups of genes that may have experienced polygenic adaptation. We found population-specific as well as shared signals of selection, with folate metabolism and the related ultraviolet response and skin pigmentation standing out as a shared pathway, perhaps as a response to the high levels of ultraviolet irradiation, and in addition strong signals in genes such as IFNA, MRC1, immunoglobulins and T-cell receptors which contribute to defend against pathogens. CONCLUSIONS: Signals of positive selection were detected in Ethiopian populations revealing novel adaptations in East Africa, and abundant targets for functional follow-up.
13
Hallast P, Agdzhoyan A, Balanovsky O, Xue Y, Tyler-Smith C. A Southeast Asian origin for present-day non-African human Y chromosomes. Hum Genet 2020; 140:299-307. [PMID: 32666166 PMCID: PMC7864842 DOI: 10.1007/s00439-020-02204-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/02/2020] [Accepted: 07/02/2020] [Indexed: 12/17/2022]
Abstract
The genomes of present-day humans outside Africa originated almost entirely from a single out-migration ~50,000–70,000 years ago, followed by mixture with Neanderthals contributing ~2% to all non-Africans. However, the details of this initial migration remain poorly understood because no ancient DNA analyses are available from this key time period, and interpretation of present-day autosomal data is complicated due to subsequent population movements/reshaping. One locus, however, does retain male-specific information from this early period: the Y chromosome, where a detailed calibrated phylogeny has been constructed. Three present-day Y lineages were carried by the initial migration: the rare haplogroup D, the moderately rare C, and the very common FT lineage which now dominates most non-African populations. Here, we show that phylogenetic analyses of haplogroup C, D and FT sequences, including very rare deep-rooting lineages, together with phylogeographic analyses of ancient and present-day non-African Y chromosomes, all point to East/Southeast Asia as the origin 50,000–55,000 years ago of all known surviving non-African male lineages (apart from recent migrants). This observation contrasts with the expectation of a West Eurasian origin predicted by a simple model of expansion from a source near Africa, and can be interpreted as resulting from extensive genetic drift in the initial population or replacement of early western Y lineages from the east, thus informing and constraining models of the initial expansion.
14
Almarri MA, Bergström A, Prado-Martinez J, Yang F, Fu B, Dunham AS, Chen Y, Hurles ME, Tyler-Smith C, Xue Y. Population Structure, Stratification, and Introgression of Human Structural Variation. Cell 2020; 182:189-199.e15. [PMID: 32531199 PMCID: PMC7369638 DOI: 10.1016/j.cell.2020.05.024] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 01/10/2020] [Revised: 03/04/2020] [Accepted: 05/12/2020] [Indexed: 02/07/2023]
Abstract
Structural variants contribute substantially to genetic diversity and are important evolutionarily and medically, but they are still understudied. Here we present a comprehensive analysis of structural variation in the Human Genome Diversity panel, a high-coverage dataset of 911 samples from 54 diverse worldwide populations. We identify, in total, 126,018 variants, 78% of which were not identified in previous global sequencing projects. Some reach high frequency and are private to continental groups or even individual populations, including regionally restricted runaway duplications and putatively introgressed variants from archaic hominins. By de novo assembly of 25 genomes using linked-read sequencing, we discover 1,643 breakpoint-resolved unique insertions, in aggregate accounting for 1.9 Mb of sequence absent from the GRCh38 reference. Our results illustrate the limitation of a single human reference and the need for high-quality genomes from diverse populations to fully discover and understand human genetic variation.
15
Gurdasani D, Carstensen T, Fatumo S, Chen G, Franklin CS, Prado-Martinez J, Bouman H, Abascal F, Haber M, Tachmazidou I, Mathieson I, Ekoru K, DeGorter MK, Nsubuga RN, Finan C, Wheeler E, Chen L, Cooper DN, Schiffels S, Chen Y, Ritchie GRS, Pollard MO, Fortune MD, Mentzer AJ, Garrison E, Bergström A, Hatzikotoulas K, Adeyemo A, Doumatey A, Elding H, Wain LV, Ehret G, Auer PL, Kooperberg CL, Reiner AP, Franceschini N, Maher D, Montgomery SB, Kadie C, Widmer C, Xue Y, Seeley J, Asiki G, Kamali A, Young EH, Pomilla C, Soranzo N, Zeggini E, Pirie F, Morris AP, Heckerman D, Tyler-Smith C, Motala AA, Rotimi C, Kaleebu P, Barroso I, Sandhu MS. Uganda Genome Resource Enables Insights into Population History and Genomic Discovery in Africa. Cell 2020; 179:984-1002.e36. [PMID: 31675503 DOI: 10.1016/j.cell.2019.10.004] [Citation(s) in RCA: 112] [Impact Index Per Article: 28.0] [Received: 09/13/2018] [Revised: 04/03/2019] [Accepted: 10/02/2019] [Indexed: 12/19/2022]
Abstract
Genomic studies in African populations provide unique opportunities to understand disease etiology, human diversity, and population history. In the largest study of its kind, comprising genome-wide data from 6,400 individuals and whole-genome sequences from 1,978 individuals from rural Uganda, we find evidence of geographically correlated fine-scale population substructure. Historically, the ancestry of modern Ugandans was best represented by a mixture of ancient East African pastoralists. We demonstrate the value of the largest sequence panel from Africa to date as an imputation resource. Examining 34 cardiometabolic traits, we show systematic differences in trait heritability between European and African populations, probably reflecting the differential impact of genes and environment. In a multi-trait pan-African GWAS of up to 14,126 individuals, we identify novel loci associated with anthropometric, hematological, lipid, and glycemic traits. We find that several functionally important signals are driven by Africa-specific variants, highlighting the value of studying diverse populations across the region.
16
Bergström A, McCarthy SA, Hui R, Almarri MA, Ayub Q, Danecek P, Chen Y, Felkel S, Hallast P, Kamm J, Blanché H, Deleuze JF, Cann H, Mallick S, Reich D, Sandhu MS, Skoglund P, Scally A, Xue Y, Durbin R, Tyler-Smith C. Insights into human genetic variation and population history from 929 diverse genomes. Science 2020; 367:eaay5012. [PMID: 32193295 PMCID: PMC7115999 DOI: 10.1126/science.aay5012] [Citation(s) in RCA: 353] [Impact Index Per Article: 88.3] [Received: 06/26/2019] [Accepted: 02/04/2020] [Indexed: 12/17/2022]
Abstract
Genome sequences from diverse human groups are needed to understand the structure of genetic variation in our species and the history of, and relationships between, different populations. We present 929 high-coverage genome sequences from 54 diverse human populations, 26 of which are physically phased using linked-read sequencing. Analyses of these genomes reveal an excess of previously undocumented common genetic variation private to southern Africa, central Africa, Oceania, and the Americas, but an absence of such variants fixed between major geographical regions. We also find deep and gradual population separations within Africa, contrasting population size histories between hunter-gatherer and agriculturalist groups in the past 10,000 years, and a contrast between single Neanderthal but multiple Denisovan source populations contributing to present-day human populations.
17
Shi W, Massaia A, Louzada S, Handsaker J, Chow W, McCarthy S, Collins J, Hallast P, Howe K, Church DM, Yang F, Xue Y, Tyler-Smith C. Birth, expansion, and death of VCY-containing palindromes on the human Y chromosome. Genome Biol 2019; 20:207. [PMID: 31610793 PMCID: PMC6790999 DOI: 10.1186/s13059-019-1816-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/13/2019] [Accepted: 09/04/2019] [Indexed: 12/13/2022]
Abstract
BACKGROUND: Large palindromes (inverted repeats) make up substantial proportions of mammalian sex chromosomes, often contain genes, and have high rates of structural variation arising via ectopic recombination. As a result, they underlie many genomic disorders. Maintenance of the palindromic structure by gene conversion between the arms has been documented, but over longer time periods, palindromes are remarkably labile. Mechanisms of origin and loss of palindromes have, however, received little attention. RESULTS: Here, we use fiber-FISH, 10x Genomics Linked-Read sequencing, and breakpoint PCR sequencing to characterize the structural variation of the P8 palindrome on the human Y chromosome, which contains two copies of the VCY (Variable Charge Y) gene. We find a deletion of almost an entire arm of the palindrome, leading to death of the palindrome, a size increase by recruitment of adjacent sequence, and other complex changes including the formation of an entire new palindrome nearby. Together, these changes are found in ~1% of men, and we can assign likely molecular mechanisms to these mutational events. As a result, healthy men can have 1-4 copies of VCY. CONCLUSIONS: Gross changes, especially duplications, in palindrome structure can be relatively frequent and facilitate the evolution of sex chromosomes in humans, and potentially also in other mammalian species.
|
18
|
Shi W, Louzada S, Grigorova M, Massaia A, Arciero E, Kibena L, Ge XJ, Chen Y, Ayub Q, Poolamets O, Tyler-Smith C, Punab M, Laan M, Yang F, Hallast P, Xue Y. Evolutionary and functional analysis of RBMY1 gene copy number variation on the human Y chromosome. Hum Mol Genet 2019; 28:2785-2798. [PMID: 31108506 PMCID: PMC6687947 DOI: 10.1093/hmg/ddz101] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/26/2019] [Revised: 05/10/2019] [Accepted: 05/11/2019] [Indexed: 01/17/2023]
Abstract
Human RBMY1 genes are located in four variable-sized clusters on the Y chromosome, expressed in male germ cells and possibly associated with sperm motility. We have re-investigated the mutational background and evolutionary history of the RBMY1 copy number distribution in worldwide samples and its relevance to sperm parameters in an Estonian cohort of idiopathic male factor infertility subjects. We estimated approximate RBMY1 copy numbers in 1218 1000 Genomes Project phase 3 males from sequencing read-depth, then chose 14 for validation by multicolour fibre-FISH. These fibre-FISH samples provided accurate calibration standards for the entire panel and led to detailed insights into population variation and mutational mechanisms. RBMY1 copy number worldwide ranged from 3 to 13 with a mode of 8. The two larger proximal clusters were the most variable, and additional duplications, deletions and inversions were detected. Placing the copy number estimates onto the published Y-SNP-based phylogeny of the same samples suggested a minimum of 562 mutational changes, translating to a mutation rate of 2.20 × 10⁻³ (95% CI 1.94 × 10⁻³ to 2.48 × 10⁻³) per father-to-son Y-transmission, higher than many short tandem repeats (Y-STRs), and showed no evidence for selection for increased or decreased copy number, but possible copy number stabilizing selection. An analysis of RBMY1 copy numbers among 376 infertility subjects failed to replicate a previously reported association with sperm motility and showed no significant effect on sperm count and concentration, serum follicle stimulating hormone (FSH), luteinizing hormone (LH) and testosterone levels or testicular and semen volume. These results provide the first in-depth insights into the structural rearrangements underlying RBMY1 copy number variation across diverse human lineages.
|
19
|
Haber M, Jones AL, Connell BA, Asan, Arciero E, Yang H, Thomas MG, Xue Y, Tyler-Smith C. A Rare Deep-Rooting D0 African Y-Chromosomal Haplogroup and Its Implications for the Expansion of Modern Humans Out of Africa. Genetics 2019; 212:1421-1428. [PMID: 31196864 PMCID: PMC6707464 DOI: 10.1534/genetics.119.302368] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 02/25/2019] [Accepted: 06/10/2019] [Indexed: 12/31/2022]
Abstract
Present-day humans outside Africa descend mainly from a single expansion out ∼50,000-70,000 years ago, but many details of this expansion remain unclear, including the history of the male-specific Y chromosome at this time. Here, we reinvestigate a rare deep-rooting African Y-chromosomal lineage by sequencing the whole genomes of three Nigerian men described in 2003 as carrying haplogroup DE* Y chromosomes, and analyzing them in the context of a calibrated worldwide Y-chromosomal phylogeny. We confirm that these three chromosomes do represent a deep-rooting DE lineage, branching close to the DE bifurcation, but place them on the D branch as an outgroup to all other known D chromosomes, and designate the new lineage D0. We consider three models for the expansion of Y lineages out of Africa ∼50,000-100,000 years ago, incorporating migration back to Africa where necessary to explain present-day Y-lineage distributions. Considering both the Y-chromosomal phylogenetic structure incorporating the D0 lineage, and published evidence for modern humans outside Africa, the most favored model involves an origin of the DE lineage within Africa with D0 and E remaining there, and migration out of the three lineages (C, D, and FT) that now form the vast majority of non-African Y chromosomes. The exit took place 50,300-81,000 years ago (latest date for FT lineage expansion outside Africa - earliest date for the D/D0 lineage split inside Africa), and most likely 50,300-59,400 years ago (considering Neanderthal admixture). This work resolves a long-running debate about Y-chromosomal out-of-Africa/back-to-Africa migrations, and provides insights into the out-of-Africa expansion more generally.
|
20
|
Haber M, Doumet-Serhal C, Scheib CL, Xue Y, Mikulski R, Martiniano R, Fischer-Genz B, Schutkowski H, Kivisild T, Tyler-Smith C. A Transient Pulse of Genetic Admixture from the Crusaders in the Near East Identified from Ancient Genome Sequences. Am J Hum Genet 2019; 104:977-984. [PMID: 31006515 PMCID: PMC6506814 DOI: 10.1016/j.ajhg.2019.03.015] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 01/14/2019] [Accepted: 03/18/2019] [Indexed: 11/12/2022]
Abstract
During the medieval period, hundreds of thousands of Europeans migrated to the Near East to take part in the Crusades, and many of them settled in the newly established Christian states along the Eastern Mediterranean coast. Here, we present a genetic snapshot of these events and their aftermath by sequencing the whole genomes of 13 individuals who lived in what is today known as Lebanon between the 3rd and 13th centuries CE. These include nine individuals from the “Crusaders’ pit” in Sidon, a mass burial in South Lebanon identified from the archaeology as the grave of Crusaders killed during a battle in the 13th century CE. We show that all of the Crusaders’ pit individuals were males; some were Western Europeans from diverse origins, some were locals (genetically indistinguishable from present-day Lebanese), and two individuals were a mixture of European and Near Eastern ancestries, providing direct evidence that the Crusaders admixed with the local population. However, these mixtures appear to have had limited genetic consequences since signals of admixture with Europeans are not significant in any Lebanese group today—in particular, Lebanese Christians are today genetically similar to local people who lived during the Roman period which preceded the Crusades by more than four centuries.
|
21
|
Pinotti T, Bergström A, Geppert M, Bawn M, Ohasi D, Shi W, Lacerda DR, Solli A, Norstedt J, Reed K, Dawtry K, González-Andrade F, Paz-Y-Miño C, Revollo S, Cuellar C, Jota MS, Santos JE, Ayub Q, Kivisild T, Sandoval JR, Fujita R, Xue Y, Roewer L, Santos FR, Tyler-Smith C. Y Chromosome Sequences Reveal a Short Beringian Standstill, Rapid Expansion, and Early Population Structure of Native American Founders. Curr Biol 2018; 29:149-157.e3. [PMID: 30581024 DOI: 10.1016/j.cub.2018.11.029] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 05/03/2018] [Revised: 09/03/2018] [Accepted: 11/09/2018] [Indexed: 10/27/2022]
Abstract
The Americas were the last inhabitable continents to be occupied by humans, with a growing multidisciplinary consensus for entry 15-25 thousand years ago (kya) from northeast Asia via the former Beringia land bridge [1-4]. Autosomal DNA analyses have dated the separation of Native American ancestors from the Asian gene pool to 23 kya or later [5, 6] and mtDNA analyses to ∼25 kya [7], followed by isolation ("Beringian Standstill" [8, 9]) for 2.4-9 ky and then a rapid expansion throughout the Americas. Here, we present a calibrated sequence-based analysis of 222 Native American and relevant Eurasian Y chromosomes (24 new) from haplogroups Q and C [10], with four major conclusions. First, we identify three to four independent lineages as autochthonous and likely founders: the major Q-M3 and rarer Q-CTS1780 present throughout the Americas, the very rare C3-MPB373 in South America, and possibly the C3-P39/Z30536 in North America. Second, from the divergence times and Eurasian/American distribution of lineages, we estimate a Beringian Standstill duration of 2.7 ky or 4.6 ky, according to alternative models, and entry south of the ice sheet after 19.5 kya. Third, we describe the star-like expansion of Q-M848 (within Q-M3) starting at 15 kya [11] in the Americas, followed by establishment of substantial spatial structure in South America by 12 kya. Fourth, the deep branches of the Q-CTS1780 lineage present at low frequencies throughout the Americas today [12] may reflect a separate out-of-Beringia dispersal after the melting of the glaciers at the end of the Pleistocene.
|
22
|
Scheib CL, Li H, Desai T, Link V, Kendall C, Dewar G, Griffith PW, Mörseburg A, Johnson JR, Potter A, Kerr SL, Endicott P, Lindo J, Haber M, Xue Y, Tyler-Smith C, Sandhu MS, Lorenz JG, Randall TD, Faltyskova Z, Pagani L, Danecek P, O'Connell TC, Martz P, Boraas AS, Byrd BF, Leventhal A, Cambra R, Williamson R, Lesage L, Holguin B, Ygnacio-De Soto E, Rosas J, Metspalu M, Stock JT, Manica A, Scally A, Wegmann D, Malhi RS, Kivisild T. Ancient human parallel lineages within North America contributed to a coastal expansion. Science 2018; 360:1024-1027. [PMID: 29853687 DOI: 10.1126/science.aar6851] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 12/10/2017] [Accepted: 04/20/2018] [Indexed: 12/12/2022]
Abstract
Little is known regarding the first people to enter the Americas and their genetic legacy. Genomic analysis of the oldest human remains from the Americas showed a direct relationship between a Clovis-related ancestral population and all modern Central and South Americans as well as a deep split separating them from North Americans in Canada. We present 91 ancient human genomes from California and Southwestern Ontario and demonstrate the existence of two distinct ancestries in North America, which possibly split south of the ice sheets. A contribution from both of these ancestral populations is found in all modern Central and South Americans. The proportions of these two ancestries in ancient and modern populations are consistent with a coastal dispersal and multiple admixture events.
|
23
|
Arciero E, Kraaijenbrink T, Asan, Haber M, Mezzavilla M, Ayub Q, Wang W, Pingcuo Z, Yang H, Wang J, Jobling MA, van Driem G, Xue Y, de Knijff P, Tyler-Smith C. Demographic History and Genetic Adaptation in the Himalayan Region Inferred from Genome-Wide SNP Genotypes of 49 Populations. Mol Biol Evol 2018; 35:1916-1933. [PMID: 29796643 PMCID: PMC6063301 DOI: 10.1093/molbev/msy094] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 01/04/2023]
Abstract
We genotyped 738 individuals belonging to 49 populations from Nepal, Bhutan, North India, or Tibet at over 500,000 SNPs, and analyzed the genotypes in the context of available worldwide population data in order to investigate the demographic history of the region and the genetic adaptations to the harsh environment. The Himalayan populations resembled other South and East Asians, but in addition displayed their own specific ancestral component and showed strong population structure and genetic drift. We also found evidence for multiple admixture events involving Himalayan populations and South/East Asians between 200 and 2,000 years ago. In comparisons with available ancient genomes, the Himalayans, like other East and South Asian populations, showed similar genetic affinity to Eurasian hunter-gatherers (a 24,000-year-old Upper Palaeolithic Siberian), and the related Bronze Age Yamnaya. The high-altitude Himalayan populations all shared a specific ancestral component, suggesting that genetic adaptation to life at high altitude originated only once in this region and subsequently spread. Combining four approaches to identifying specific positively selected loci, we confirmed that the strongest signals of high-altitude adaptation were located near the Endothelial PAS domain-containing protein 1 and Egl-9 Family Hypoxia Inducible Factor 1 loci, and discovered eight additional robust signals of high-altitude adaptation, five of which have strong biological functional links to such adaptation. In conclusion, the demographic history of Himalayan populations is complex, with strong local differentiation, reflecting both genetic and cultural factors; these populations also display evidence of multiple genetic adaptations to high-altitude environments.
|
24
|
Bergström A, Tyler-Smith C. Human Genetics: Busy Subway Networks in Remote Oceania? Curr Biol 2018; 28:R549-R551. [PMID: 29738726 DOI: 10.1016/j.cub.2018.03.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Ancient human DNA from the Oceanian islands of Vanuatu reveals a surprisingly complex history of human settlement, featuring almost complete replacement shortly after initial colonisation, followed by mixing and a puzzling disconnect between genetic ancestry and language.
|
25
|
Xue Y, Tyler-Smith C. Past successes and future opportunities for the genetics of the human Y chromosome. Hum Genet 2018; 136:481-483. [PMID: 28456835 PMCID: PMC5418311 DOI: 10.1007/s00439-017-1806-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/02/2022]
|