1
DeCasien AR, Chiou KL, Testard C, Mercer A, Negrón-Del Valle JE, Bauman Surratt SE, González O, Stock MK, Ruiz-Lambides AV, Martínez MI, Antón SC, Walker CS, Sallet J, Wilson MA, Brent LJN, Montague MJ, Sherwood CC, Platt ML, Higham JP, Snyder-Mackler N. Evolutionary and biomedical implications of sex differences in the primate brain transcriptome. Cell Genomics 2024; 4:100589. [PMID: 38942023 PMCID: PMC11293591 DOI: 10.1016/j.xgen.2024.100589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 12/28/2023] [Accepted: 05/31/2024] [Indexed: 06/30/2024]
Abstract
Humans exhibit sex differences in the prevalence of many neurodevelopmental disorders and neurodegenerative diseases. Here, we generated one of the largest multi-brain-region bulk transcriptional datasets for the rhesus macaque and characterized sex-biased gene expression patterns to investigate the translatability of this species for sex-biased neurological conditions. We identify patterns similar to those in humans, which are associated with overlapping regulatory mechanisms, biological processes, and genes implicated in sex-biased human disorders, including autism. We also show that sex-biased genes exhibit greater genetic variance for expression and more tissue-specific expression patterns, which may facilitate rapid evolution of sex-biased genes. Our findings provide insights into the biological mechanisms underlying sex-biased disease and support the rhesus macaque model for the translational study of these conditions.
Affiliation(s)
- Alex R DeCasien
- Department of Anthropology, New York University, New York, NY, USA; New York Consortium in Evolutionary Primatology, New York, NY, USA; Section on Developmental Neurogenomics, National Institute of Mental Health, Bethesda, MD, USA.
- Kenneth L Chiou
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA; School of Life Sciences, Arizona State University, Tempe, AZ, USA; Department of Psychology, University of Washington, Seattle, WA, USA; Nathan Shock Center of Excellence in the Basic Biology of Aging, University of Washington, Seattle, WA, USA.
- Camille Testard
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
- Arianne Mercer
- Department of Psychology, University of Washington, Seattle, WA, USA
- Olga González
- Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA
- Michala K Stock
- Department of Sociology and Anthropology, Metropolitan State University of Denver, Denver, CO, USA
- Melween I Martínez
- Caribbean Primate Research Center, University of Puerto Rico, San Juan, PR, USA
- Susan C Antón
- Department of Anthropology, New York University, New York, NY, USA; New York Consortium in Evolutionary Primatology, New York, NY, USA
- Christopher S Walker
- Department of Molecular Biomedical Sciences, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA
- Jérôme Sallet
- Stem Cell and Brain Research Institute, Université Lyon, Lyon, France
- Melissa A Wilson
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA; School of Life Sciences, Arizona State University, Tempe, AZ, USA; Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ, USA
- Lauren J N Brent
- Centre for Research in Animal Behavior, University of Exeter, Exeter, UK
- Michael J Montague
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA
- Chet C Sherwood
- Department of Anthropology, The George Washington University, Washington, DC, USA
- Michael L Platt
- Department of Neuroscience, University of Pennsylvania, Philadelphia, PA, USA; Department of Psychology, University of Pennsylvania, Philadelphia, PA, USA; Department of Marketing, University of Pennsylvania, Philadelphia, PA, USA
- James P Higham
- Department of Anthropology, New York University, New York, NY, USA; New York Consortium in Evolutionary Primatology, New York, NY, USA.
- Noah Snyder-Mackler
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA; School of Life Sciences, Arizona State University, Tempe, AZ, USA; Department of Psychology, University of Washington, Seattle, WA, USA; Nathan Shock Center of Excellence in the Basic Biology of Aging, University of Washington, Seattle, WA, USA; ASU-Banner Neurodegenerative Disease Research Center, Arizona State University, Tempe, AZ, USA.
2
Webster TH, Vannan A, Pinto BJ, Denbrock G, Morales M, Dolby GA, Fiddes IT, DeNardo DF, Wilson MA. Lack of Dosage Balance and Incomplete Dosage Compensation in the ZZ/ZW Gila Monster (Heloderma suspectum) Revealed by De Novo Genome Assembly. Genome Biol Evol 2024; 16:evae018. [PMID: 38319079 PMCID: PMC10950046 DOI: 10.1093/gbe/evae018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 01/23/2024] [Accepted: 01/23/2024] [Indexed: 02/07/2024]
Abstract
Reptiles exhibit a variety of modes of sex determination, including both temperature-dependent and genetic mechanisms. Among those species with genetic sex determination, sex chromosomes of varying heterogamety (XX/XY and ZZ/ZW) have been observed with different degrees of differentiation. Karyotype studies have demonstrated that Gila monsters (Heloderma suspectum) have ZZ/ZW sex determination and this system is likely homologous to the ZZ/ZW system in the Komodo dragon (Varanus komodoensis), but little else is known about their sex chromosomes. Here, we report the assembly and analysis of the Gila monster genome. We generated a de novo draft genome assembly for a male using 10X Genomics technology. We further generated and analyzed short-read whole genome sequencing and whole transcriptome sequencing data for three males and three females. By comparing female and male genomic data, we identified four putative Z chromosome scaffolds. These putative Z chromosome scaffolds are homologous to Z-linked scaffolds identified in the Komodo dragon. Further, by analyzing RNAseq data, we observed evidence of incomplete dosage compensation between the Gila monster Z chromosome and autosomes and a lack of balance in Z-linked expression between the sexes. In particular, we observe lower expression of the Z in females (ZW) than males (ZZ) on a global basis, though we find evidence suggesting local gene-by-gene compensation. This pattern has been observed in most other ZZ/ZW systems studied to date and may represent a general pattern for female heterogamety in vertebrates.
Affiliation(s)
- Timothy H Webster
- Department of Anthropology, University of Utah, Salt Lake City, UT, USA
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Annika Vannan
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Brendan J Pinto
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- Department of Zoology, Milwaukee Public Museum, Milwaukee, WI, USA
- Grant Denbrock
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Matheo Morales
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Department of Genetics, Yale University, New Haven, CT, USA
- Greer A Dolby
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
- Dale F DeNardo
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Melissa A Wilson
- School of Life Sciences, Arizona State University, Tempe, AZ, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, USA
- Center for Mechanisms of Evolution, Biodesign Institute, Tempe, AZ, USA
3
Wang S, Wang B, Drury V, Drake S, Sun N, Alkhairo H, Arbelaez J, Duhn C, Bal VH, Langley K, Martin J, Hoekstra PJ, Dietrich A, Xing J, Heiman GA, Tischfield JA, Fernandez TV, Owen MJ, O'Donovan MC, Thapar A, State MW, Willsey AJ. Rare X-linked variants carry predominantly male risk in autism, Tourette syndrome, and ADHD. Nat Commun 2023; 14:8077. [PMID: 38057346 PMCID: PMC10700338 DOI: 10.1038/s41467-023-43776-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 11/18/2023] [Indexed: 12/08/2023]
Abstract
Autism spectrum disorder (ASD), Tourette syndrome (TS), and attention-deficit/hyperactivity disorder (ADHD) display strong male sex bias, due to a combination of genetic and biological factors, as well as selective ascertainment. While the hemizygous nature of chromosome X (Chr X) in males has long been postulated as a key point of "male vulnerability", rare genetic variation on this chromosome has not been systematically characterized in large-scale whole exome sequencing studies of "idiopathic" ASD, TS, and ADHD. Here, we take advantage of informative recombinations in simplex ASD families to pinpoint risk-enriched regions on Chr X, within which rare maternally-inherited damaging variants carry substantial risk in males with ASD. We then apply a modified transmission disequilibrium test to 13,052 ASD probands and identify a novel high confidence ASD risk gene at exome-wide significance (MAGEC3). Finally, we observe that rare damaging variants within these risk regions carry similar effect sizes in males with TS or ADHD, further clarifying genetic mechanisms underlying male vulnerability in multiple neurodevelopmental disorders that can be exploited for systematic gene discovery.
Affiliation(s)
- Sheng Wang
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
- Belinda Wang
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
- Vanessa Drury
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
- Sam Drake
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
- Nawei Sun
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
- Hasan Alkhairo
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
- Juan Arbelaez
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
- Clif Duhn
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
- Vanessa H Bal
- Graduate School of Applied and Professional Psychology, Rutgers University, New Brunswick, NJ, USA
- Kate Langley
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
- School of Psychology, Cardiff University School of Medicine, Cardiff, Wales, UK
- Joanna Martin
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
- Pieter J Hoekstra
- University of Groningen, University Medical Center Groningen, Department of Child and Adolescent Psychiatry, Groningen, The Netherlands
- Accare Child Study Center, Groningen, The Netherlands
- Andrea Dietrich
- University of Groningen, University Medical Center Groningen, Department of Child and Adolescent Psychiatry, Groningen, The Netherlands
- Accare Child Study Center, Groningen, The Netherlands
- Jinchuan Xing
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, NJ, USA
- Gary A Heiman
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, NJ, USA
- Jay A Tischfield
- Department of Genetics and the Human Genetics Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, NJ, USA
- Thomas V Fernandez
- Yale Child Study Center and Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- Michael J Owen
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
- Michael C O'Donovan
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
- Anita Thapar
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University School of Medicine, Cardiff, Wales, UK
- Matthew W State
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA
- A Jeremy Willsey
- Department of Psychiatry and Behavioral Sciences, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, 94143, USA.
- Quantitative Biosciences Institute (QBI), University of California, San Francisco, San Francisco, CA, 94143, USA.
4
Cotter DJ, Webster TH, Wilson MA. Genomic and demographic processes differentially influence genetic variation across the human X chromosome. PLoS One 2023; 18:e0287609. [PMID: 37910456 PMCID: PMC10619814 DOI: 10.1371/journal.pone.0287609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 06/08/2023] [Indexed: 11/03/2023]
Abstract
Many forces influence genetic variation across the genome including mutation, recombination, selection, and demography. Increased mutation and recombination both lead to increases in genetic diversity in a region-specific manner, while complex demographic patterns shape patterns of diversity on a more global scale. While these processes act across the entire genome, the X chromosome is particularly interesting because it contains several distinct regions that are subject to different combinations and strengths of these forces: the pseudoautosomal regions (PARs) and the X-transposed region (XTR). The X chromosome thus can serve as a unique model for studying how genetic and demographic forces act in different contexts to shape patterns of observed variation. We therefore sought to explore diversity, divergence, and linkage disequilibrium in each region of the X chromosome using genomic data from 26 human populations. Across populations, we find that both diversity and substitution rate are consistently elevated in PAR1 and the XTR compared to the rest of the X chromosome. In contrast, linkage disequilibrium is lowest in PAR1, consistent with the high recombination rate in this region, and highest in the region of the X chromosome that does not recombine in males. However, linkage disequilibrium in the XTR is intermediate between PAR1 and the autosomes, and much lower than the non-recombining X. Finally, in addition to these global patterns, we also observed variation in ratios of X versus autosomal diversity consistent with population-specific evolutionary history as well. While our results were generally consistent with previous work, two unexpected observations emerged. First, our results suggest that the XTR does not behave like the rest of the recombining X and may need to be evaluated separately in future studies. Second, the different regions of the X chromosome appear to exhibit unique patterns of linked selection across different human populations. 
Together, our results highlight profound regional differences across the X chromosome, simultaneously making it an ideal system for exploring the action of evolutionary forces as well as necessitating its careful consideration and treatment in genomic analyses.
Affiliation(s)
- Daniel J. Cotter
- Department of Genetics, Stanford University, Stanford, CA, United States of America
- Timothy H. Webster
- Department of Anthropology, University of Utah, Salt Lake City, UT, United States of America
- School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
- Melissa A. Wilson
- School of Life Sciences, Arizona State University, Tempe, AZ, United States of America
- Center for Evolution and Medicine, Biodesign Institute, Arizona State University, Tempe, AZ, United States of America
5
Brand CM, Kuang S, Gilbertson EN, McArthur E, Pollard KS, Webster TH, Capra JA. Sequence-based machine learning reveals 3D genome differences between bonobos and chimpanzees. bioRxiv 2023:2023.10.26.564272. [PMID: 37961120 PMCID: PMC10634871 DOI: 10.1101/2023.10.26.564272] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/15/2023]
Abstract
Phenotypic divergence between closely related species, including bonobos and chimpanzees (genus Pan), is largely driven by variation in gene regulation. The 3D structure of the genome mediates gene expression; however, genome folding differences in Pan are not well understood. Here, we apply machine learning to predict genome-wide 3D genome contact maps from DNA sequence for 56 bonobos and chimpanzees, encompassing all five extant lineages. We use a pairwise approach to estimate 3D divergence between individuals from the resulting contact maps in 4,420 1 Mb genomic windows. While most pairs were similar, ∼17% were predicted to be substantially divergent in genome folding. The most dissimilar maps were largely driven by single individuals with rare variants that produce unique 3D genome folding in a region. We also identified 89 genomic windows where bonobo and chimpanzee contact maps substantially diverged, including several windows harboring genes associated with traits implicated in Pan phenotypic divergence. We used in silico mutagenesis to identify 51 3D-modifying variants in these bonobo-chimpanzee divergent windows, finding that 34 or 66.67% induce genome folding changes via CTCF binding motif disruption. Our results reveal 3D genome variation at the population-level and identify genomic regions where changes in 3D folding may contribute to phenotypic differences in our closest living relatives.
Affiliation(s)
- Colin M. Brand
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
- Shuzhen Kuang
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Erin N. Gilbertson
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
- Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
- Evonne McArthur
- Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN
- Katherine S. Pollard
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
- Gladstone Institute of Data Science and Biotechnology, San Francisco, CA
- Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
- Chan Zuckerberg Biohub, San Francisco, CA
- John A. Capra
- Bakar Computational Health Sciences Institute, University of California, San Francisco, CA
- Department of Epidemiology and Biostatistics, University of California, San Francisco, CA
- Biomedical Informatics Graduate Program, University of California San Francisco, San Francisco, CA
6
Pinto BJ, O’Connor B, Schatz MC, Zarate S, Wilson MA. Concerning the eXclusion in human genomics: the choice of sex chromosome representation in the human genome drastically affects the number of identified variants. G3 (Bethesda) 2023; 13:jkad169. [PMID: 37497639 PMCID: PMC10542555 DOI: 10.1093/g3journal/jkad169] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/23/2023] [Revised: 06/28/2023] [Accepted: 07/05/2023] [Indexed: 07/28/2023]
Abstract
Over the past 30 years, a community of scientists has pieced together every base pair of the human reference genome from telomere to telomere. Interestingly, most human genomics studies omit more than 5% of the genome from their analyses. Under "normal" circumstances, omitting any chromosome(s) from an analysis of the human genome would be a cause for concern, with the exception being sex chromosomes. Sex chromosomes in eutherians share an evolutionary origin as an ancestral pair of autosomes. In humans, they share 3 regions of high-sequence identity (∼98-100%), which, along with the unique transmission patterns of the sex chromosomes, introduce technical artifacts in genomic analyses. However, the human X chromosome bears numerous important genes, including more "immune response" genes than any other chromosome, which makes its exclusion irresponsible when sex differences across human diseases are widespread. To better characterize the possible effect of the inclusion/exclusion of the X chromosome on variants called, we conducted a pilot study on the Terra cloud platform to replicate a subset of standard genomic practices using both the CHM13 reference genome and the sex chromosome complement-aware reference genome. We compared the quality of variant calling, expression quantification, and allele-specific expression using these 2 reference genome versions across 50 human samples from the Genotype-Tissue Expression consortium annotated as females. We found that after correction, the whole X chromosome (100%) can generate reliable variant calls, allowing for the inclusion of the whole genome in human genomics analyses as a departure from the status quo of omitting the sex chromosomes from empirical and clinical genomics studies.
Affiliation(s)
- Brendan J Pinto
- School of Life Sciences, Arizona State University, Tempe, AZ 85282, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ 85282, USA
- Department of Zoology, Milwaukee Public Museum, Milwaukee, WI 53233, USA
- Michael C Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Samantha Zarate
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Melissa A Wilson
- School of Life Sciences, Arizona State University, Tempe, AZ 85282, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ 85282, USA
- The Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85282, USA
7
Rhie A, Nurk S, Cechova M, Hoyt SJ, Taylor DJ, Altemose N, Hook PW, Koren S, Rautiainen M, Alexandrov IA, Allen J, Asri M, Bzikadze AV, Chen NC, Chin CS, Diekhans M, Flicek P, Formenti G, Fungtammasan A, Garcia Giron C, Garrison E, Gershman A, Gerton JL, Grady PGS, Guarracino A, Haggerty L, Halabian R, Hansen NF, Harris R, Hartley GA, Harvey WT, Haukness M, Heinz J, Hourlier T, Hubley RM, Hunt SE, Hwang S, Jain M, Kesharwani RK, Lewis AP, Li H, Logsdon GA, Lucas JK, Makalowski W, Markovic C, Martin FJ, Mc Cartney AM, McCoy RC, McDaniel J, McNulty BM, Medvedev P, Mikheenko A, Munson KM, Murphy TD, Olsen HE, Olson ND, Paulin LF, Porubsky D, Potapova T, Ryabov F, Salzberg SL, Sauria MEG, Sedlazeck FJ, Shafin K, Shepelev VA, Shumate A, Storer JM, Surapaneni L, Taravella Oill AM, Thibaud-Nissen F, Timp W, Tomaszkiewicz M, Vollger MR, Walenz BP, Watwood AC, Weissensteiner MH, Wenger AM, Wilson MA, Zarate S, Zhu Y, Zook JM, Eichler EE, O'Neill RJ, Schatz MC, Miga KH, Makova KD, Phillippy AM. The complete sequence of a human Y chromosome. Nature 2023; 621:344-354. [PMID: 37612512 PMCID: PMC10752217 DOI: 10.1038/s41586-023-06457-y] [Citation(s) in RCA: 87] [Impact Index Per Article: 87.0] [Received: 12/02/2022] [Accepted: 07/19/2023] [Indexed: 08/25/2023]
Abstract
The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.
Affiliation(s)
- Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Oxford Nanopore Technologies Inc., Oxford, UK
- Monika Cechova
- Faculty of Informatics, Masaryk University, Brno, Czech Republic
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- Savannah J Hoyt
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Dylan J Taylor
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Nicolas Altemose
- Department of Molecular and Cell Biology, University of California, Berkeley, CA, USA
- Paul W Hook
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Ivan A Alexandrov
- Federal Research Center of Biotechnology of the Russian Academy of Sciences, Moscow, Russia
- Center for Algorithmic Biotechnology, Saint Petersburg State University, St Petersburg, Russia
- Department of Anatomy and Anthropology and Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv-Yafo, Israel
- Jamie Allen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Mobin Asri
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Andrey V Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, CA, USA
- Nae-Chyun Chen
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Chen-Shan Chin
- GeneDX Holdings Corp, Stamford, CT, USA
- Foundation of Biological Data Science, Belmont, CA, USA
- Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Department of Genetics, University of Cambridge, Cambridge, UK
- Carlos Garcia Giron
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Erik Garrison
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Ariel Gershman
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Jennifer L Gerton
- Stowers Institute for Medical Research, Kansas City, MO, USA
- University of Kansas Medical Center, Kansas City, MO, USA
- Patrick G S Grady
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Genomics Research Centre, Human Technopole, Milan, Italy
- Leanne Haggerty
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Reza Halabian
- Institute of Bioinformatics, Faculty of Medicine, University of Münster, Münster, Germany
- Nancy F Hansen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Robert Harris
- Department of Biology, Pennsylvania State University, University Park, PA, USA
- Gabrielle A Hartley
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Marina Haukness
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
- Jakob Heinz
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Stephen Hwang
- XDBio Program, Johns Hopkins University, Baltimore, MD, USA
- Miten Jain
- Department of Bioengineering, Department of Physics, Northeastern University, Boston, MA, USA
- Rupesh K Kesharwani
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Julian K Lucas
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Wojciech Makalowski
- Institute of Bioinformatics, Faculty of Medicine, University of Münster, Münster, Germany
| | - Christopher Markovic
- Genome Technology Access Center at the McDonnell Genome Institute, Washington University, St. Louis, MO, USA
| | - Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Ann M Mc Cartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
| | - Jennifer McDaniel
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Brandy M McNulty
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Paul Medvedev
- Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA, USA
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, USA
- Center for Computational Biology and Bioinformatics, Pennsylvania State University, University Park, PA, USA
| | - Alla Mikheenko
- Center for Algorithmic Biotechnology, Saint Petersburg State University, St Petersburg, Russia
- UCL Queen Square Institute of Neurology, UCL, London, UK
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Hugh E Olsen
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Nathan D Olson
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Luis F Paulin
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Tamara Potapova
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Fedor Ryabov
- Masters Program in National Research University Higher School of Economics, Moscow, Russia
| | - Steven L Salzberg
- Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD, USA
| | | | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
- Department of Computer Science, Rice University, Houston, TX, USA
| | | | | | - Alaina Shumate
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | | | - Likhitha Surapaneni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Angela M Taravella Oill
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Winston Timp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Marta Tomaszkiewicz
- Department of Biology, Pennsylvania State University, University Park, PA, USA
- Department of Biomedical Engineering, Pennsylvania State University, State College, PA, USA
| | - Mitchell R Vollger
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Brian P Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Allison C Watwood
- Department of Biology, Pennsylvania State University, University Park, PA, USA
| | | | | | - Melissa A Wilson
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA
| | - Samantha Zarate
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Yiming Zhu
- Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston, TX, USA
| | - Justin M Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Investigator, Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Rachel J O'Neill
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Department of Genetics and Genome Sciences, UConn Health, Farmington, CT, USA
| | - Michael C Schatz
- Department of Biology, Johns Hopkins University, Baltimore, MD, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Karen H Miga
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
| | - Kateryna D Makova
- Department of Biology, Pennsylvania State University, University Park, PA, USA
| | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
|
8
|
Pinto BJ, Gamble T, Smith CH, Wilson MA. A lizard is never late: Squamate genomics as a recent catalyst for understanding sex chromosome and microchromosome evolution. J Hered 2023; 114:445-458. [PMID: 37018459 PMCID: PMC10445521 DOI: 10.1093/jhered/esad023] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/20/2023] [Accepted: 04/03/2023] [Indexed: 04/07/2023] Open
Abstract
In 2011, the first high-quality genome assembly of a squamate reptile (lizard or snake) was published for the green anole. Dozens of genome assemblies were subsequently published over the next decade, yet these assemblies were largely inadequate for answering fundamental questions regarding genome evolution in squamates due to their lack of contiguity or annotation. As the "genomics age" was beginning to hit its stride in many organismal study systems, progress in squamates was largely stagnant following the publication of the green anole genome. In fact, zero high-quality (chromosome-level) squamate genomes were published between the years 2012 and 2017. However, since 2018, an exponential increase in high-quality genome assemblies has materialized with 24 additional high-quality genomes published for species across the squamate tree of life. As the field of squamate genomics is rapidly evolving, we provide a systematic review from an evolutionary genomics perspective. We collated a near-complete list of publicly available squamate genome assemblies from more than half a dozen international and third-party repositories and systematically evaluated them with regard to their overall quality, phylogenetic breadth, and usefulness for continuing to provide accurate and efficient insights into genome evolution across squamate reptiles. This review both highlights and catalogs the currently available genomic resources in squamates and their ability to address broader questions in vertebrates, specifically sex chromosome and microchromosome evolution, while addressing why squamates may have received less historical focus and what has caused their progress in genomics to lag behind peer taxa.
Affiliation(s)
- Brendan J Pinto
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, United States
- Department of Zoology, Milwaukee Public Museum, Milwaukee, WI, United States
| | - Tony Gamble
- Department of Zoology, Milwaukee Public Museum, Milwaukee, WI, United States
- Department of Biological Sciences, Marquette University, Milwaukee, WI, United States
- Bell Museum of Natural History, University of Minnesota, St Paul, MN, United States
| | - Chase H Smith
- Department of Integrative Biology, University of Texas at Austin, Austin, TX, United States
| | - Melissa A Wilson
- School of Life Sciences, Arizona State University, Tempe, AZ, United States
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, United States
- Center for Mechanisms of Evolution, Biodesign Institute, Tempe, AZ, United States
|
9
|
Sun L, Wang Z, Lu T, Manolio TA, Paterson AD. eXclusionarY: 10 years later, where are the sex chromosomes in GWASs? Am J Hum Genet 2023; 110:903-912. [PMID: 37267899 DOI: 10.1016/j.ajhg.2023.04.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 06/04/2023] Open
Abstract
10 years ago, a detailed analysis showed that only 33% of genome-wide association study (GWAS) results included the X chromosome. Multiple recommendations were made to combat such exclusion. Here, we re-surveyed the research landscape to determine whether these earlier recommendations had been translated. Unfortunately, among the genome-wide summary statistics reported in 2021 in the NHGRI-EBI GWAS Catalog, only 25% provided results for the X chromosome and 3% for the Y chromosome, suggesting that the exclusion phenomenon not only persists but has also expanded into an exclusionary problem. Normalizing by physical length of the chromosome, the average number of studies published through November 2022 with genome-wide-significant findings on the X chromosome is ∼1 study/Mb. By contrast, it ranges from ∼6 to ∼16 studies/Mb for chromosomes 4 and 19, respectively. Compared with the autosomal growth rate of ∼0.086 studies/Mb/year over the last decade, studies of the X chromosome grew at less than one-seventh that rate, only ∼0.012 studies/Mb/year. Among the studies that reported significant associations on the X chromosome, we noted extreme heterogeneities in data analysis and reporting of results, suggesting the need for clear guidelines. Unsurprisingly, among the 430 scores sampled from the PolyGenic Score Catalog, 0% contained weights for sex chromosomal SNPs. To overcome the dearth of sex chromosome analyses, we provide five sets of recommendations and future directions. Finally, until the sex chromosomes are included in a whole-genome study, instead of GWASs, we propose such studies would more properly be referred to as "AWASs," meaning "autosome-wide scans."
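The growth-rate comparison in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the two rates quoted above; the helper `studies_per_mb` is a hypothetical illustration of the length normalization the abstract describes, not the authors' code.

```python
# Growth rates quoted in the abstract (studies per Mb per year).
AUTOSOMAL_RATE = 0.086
X_RATE = 0.012


def studies_per_mb(n_studies: float, length_mb: float) -> float:
    """Normalize a study count by chromosome length in megabases.

    Hypothetical helper: the abstract reports normalized values only,
    so any inputs passed here are the caller's own assumptions.
    """
    if length_mb <= 0:
        raise ValueError("chromosome length must be positive")
    return n_studies / length_mb


# Ratio of X-chromosome growth to autosomal growth.
ratio = X_RATE / AUTOSOMAL_RATE
print(f"X / autosome growth ratio: {ratio:.3f}")
print(f"Below one-seventh: {ratio < 1 / 7}")
```

The ratio falls just under one-seventh, which matches the abstract's wording.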
Affiliation(s)
- Lei Sun
- Department of Statistical Sciences, Faculty of Arts and Science, University of Toronto, Toronto, ON, Canada; Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada.
| | - Zhong Wang
- Department of Statistics and Data Science, Faculty of Science, National University of Singapore, Singapore
| | - Tianyuan Lu
- Department of Statistical Sciences, Faculty of Arts and Science, University of Toronto, Toronto, ON, Canada; Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, QC, Canada
| | - Teri A Manolio
- Division of Genomic Medicine, National Human Genome Research Institute, NIH, Bethesda, MD, USA
| | - Andrew D Paterson
- Division of Biostatistics, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada; Division of Epidemiology, Dalla Lana School of Public Health, University of Toronto, Toronto, ON, Canada; Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada.
|
10
|
Khramtsova EA, Wilson MA, Martin J, Winham SJ, He KY, Davis LK, Stranger BE. Quality control and analytic best practices for testing genetic models of sex differences in large populations. Cell 2023; 186:2044-2061. [PMID: 37172561 PMCID: PMC10266536 DOI: 10.1016/j.cell.2023.04.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 05/26/2022] [Revised: 01/31/2023] [Accepted: 04/07/2023] [Indexed: 05/15/2023]
Abstract
Phenotypic sex-based differences exist for many complex traits. In other cases, phenotypes may be similar, but underlying biology may vary. Thus, sex-aware genetic analyses are becoming increasingly important for understanding the mechanisms driving these differences. To this end, we provide a guide outlining the current best practices for testing various models of sex-dependent genetic effects in complex traits and disease conditions, noting that this is an evolving field. Insights from sex-aware analyses will not only teach us about the biology of complex traits but also aid in achieving the goals of precision medicine and health equity for all.
Affiliation(s)
- Ekaterina A Khramtsova
- Population Analytics and Insights, Data Science Analytics & Insights, Janssen R&D, Lower Gwynedd Township, PA, USA.
| | - Melissa A Wilson
- School of Life Sciences, Center for Evolution and Medicine, Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85282, USA
| | - Joanna Martin
- Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, UK
| | - Stacey J Winham
- Department of Quantitative Health Sciences, Division of Computational Biology, Mayo Clinic, Rochester, MN, USA
| | - Karen Y He
- Population Analytics and Insights, Data Science Analytics & Insights, Janssen R&D, Lower Gwynedd Township, PA, USA
| | - Lea K Davis
- Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA
| | - Barbara E Stranger
- Center for Genetic Medicine, Department of Pharmacology, Northwestern University, Chicago, IL, USA.
|
11
|
Lewis MA, Schulte J, Matthews L, Vaden KI, Steves CJ, Williams FMK, Schulte BA, Dubno JR, Steel KP. Accurate phenotypic classification and exome sequencing allow identification of novel genes and variants associated with adult-onset hearing loss. medRxiv 2023:2023.04.27.23289040. [PMID: 37163093 PMCID: PMC10168485 DOI: 10.1101/2023.04.27.23289040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Adult-onset progressive hearing loss is a common, complex disease with a strong genetic component. Although to date over 150 genes have been identified as contributing to human hearing loss, many more remain to be discovered, as does most of the underlying genetic diversity. Many different variants have been found to underlie adult-onset hearing loss, but they tend to be rare variants with a high impact upon the gene product. It is likely that combinations of more common, lower impact variants also play a role in the prevalence of the disease. Here we present our exome study of hearing loss in a cohort of 532 older adult volunteers with extensive phenotypic data, including 99 older adults with normal hearing, an important control set. Firstly, we carried out an outlier analysis to identify genes with a high variant load in older adults with hearing loss compared to those with normal hearing. Secondly, we used audiometric threshold data to identify individual variants which appear to contribute to different threshold values. We followed up these analyses in a second cohort. Using these approaches, we identified genes and variants linked to better hearing as well as those linked to worse hearing. These analyses identified some known deafness genes, demonstrating proof of principle of our approach. However, most of the candidate genes are novel associations with hearing loss. While the results support the suggestion that genes responsible for severe deafness may also be involved in milder hearing loss, they also suggest that there are many more genes involved in hearing which remain to be identified. Our candidate gene lists may provide useful starting points for improved diagnosis and drug development.
Affiliation(s)
- Morag A Lewis
- Wolfson Centre for Age-Related Diseases, King's College London, SE1 1UL, UK
- The Medical University of South Carolina, SC, USA
| | | | | | | | - Claire J Steves
- Department of Twin Research and Genetic Epidemiology, King's College London, School of Life Course and Population Sciences, London, UK
| | - Frances M K Williams
- Department of Twin Research and Genetic Epidemiology, King's College London, School of Life Course and Population Sciences, London, UK
| | | | - Judy R Dubno
- The Medical University of South Carolina, SC, USA
| | - Karen P Steel
- Wolfson Centre for Age-Related Diseases, King's College London, SE1 1UL, UK
- The Medical University of South Carolina, SC, USA
|
12
|
Nix JL, Schettini GP, Biase FH. Sexing of cattle embryos using RNA-sequencing data or polymerase chain reaction based on a complete sequence of cattle chromosome Y. Front Genet 2023; 14:1038291. [PMID: 37077537 PMCID: PMC10106624 DOI: 10.3389/fgene.2023.1038291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Accepted: 03/24/2023] [Indexed: 04/05/2023] Open
Abstract
When necessary, RNA-sequencing data or polymerase chain reaction (PCR) assays can be used to determine the presence of the chromosome Y (ChrY) in samples. This information allows for biological variation due to sexual dimorphism to be studied. A prime example is when researchers conduct RNA-sequencing of single embryos, or conceptuses, prior to the development of gonads. A recent publication of a complete sequence of the ChrY has removed limitations for the development of these procedures in cattle, otherwise imposed by the absence of a ChrY in the reference genome. Using the sequence of the cattle ChrY and transcriptome data, we conducted a systematic search for genes in the ChrY that are exclusively expressed in male tissues. The genes ENSBIXG00000029763, ENSBIXG00000029774, ENSBIXG00000029788, and ENSBIXG00000029892 were consistently expressed across male tissues and lowly expressed or absent in female samples. We observed that the cumulative values of counts per million were 2688-fold greater in males than the equivalent values in female samples. Thus, we deemed these genes suitable for the sexing of samples using RNA-sequencing data. We successfully used this set of genes to infer the sex of 22 cattle blastocysts (8 females and 14 males). Additionally, the completed sequence of the cattle ChrY has segments in the male-specific region that are not repeated. We designed a pair of oligonucleotides that targets one of these non-repeated regions in the male-specific sequence of the ChrY. Using this pair of oligonucleotides, in a multiplexed PCR assay with oligonucleotides that anneal to an autosome chromosome, we accurately identified the sex of cattle blastocysts. We developed efficient procedures for the sexing of samples in cattle using either transcriptome data or their DNA. The procedures using RNA-sequencing will greatly benefit researchers who work with samples limited in cell numbers which are only sufficient to produce transcriptome data. 
The oligonucleotides used for the accurate sexing of samples using PCR are transferable to other cattle tissue samples.
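The RNA-sequencing approach described in this abstract (summing counts per million over a small set of Y-linked genes) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the four gene IDs come from the abstract, while the function names, the dictionary input format, and the `threshold_cpm` cutoff of 1.0 are assumptions chosen for the example.

```python
from typing import Dict

# Y-linked marker genes named in the abstract.
MARKER_GENES = (
    "ENSBIXG00000029763",
    "ENSBIXG00000029774",
    "ENSBIXG00000029788",
    "ENSBIXG00000029892",
)


def cumulative_marker_cpm(counts: Dict[str, int]) -> float:
    """Sum counts per million (CPM) over the Y-linked marker genes.

    `counts` maps gene ID -> raw read count for one sample.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    marker_reads = sum(counts.get(gene, 0) for gene in MARKER_GENES)
    return marker_reads * 1_000_000 / total


def infer_sex(counts: Dict[str, int], threshold_cpm: float = 1.0) -> str:
    """Call a sample male if cumulative marker CPM reaches the threshold.

    The default threshold is a placeholder assumption; a real cutoff
    should be calibrated on samples of known sex.
    """
    if cumulative_marker_cpm(counts) >= threshold_cpm:
        return "male"
    return "female"


# Toy samples with made-up counts, for illustration only.
male_like = {"ENSBIXG00000029763": 50, "GENE_A": 999_950}
female_like = {"GENE_A": 1_000_000}
print(infer_sex(male_like), infer_sex(female_like))
```

Because the abstract reports a large fold difference between male and female cumulative values, a single fixed cutoff is plausible, but the appropriate value depends on library depth and mapping choices.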
|
13
|
Pinto BJ, Gamble T, Smith CH, Wilson MA. A lizard is never late: squamate genomics as a recent catalyst for understanding sex chromosome and microchromosome evolution. bioRxiv 2023:2023.01.20.524006. [PMID: 37034614 PMCID: PMC10081179 DOI: 10.1101/2023.01.20.524006] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/19/2023]
Abstract
In 2011, the first high-quality genome assembly of a squamate reptile (lizard or snake) was published for the green anole. Dozens of genome assemblies were subsequently published over the next decade, yet these assemblies were largely inadequate for answering fundamental questions regarding genome evolution in squamates due to their lack of contiguity or annotation. As the "genomics age" was beginning to hit its stride in many organismal study systems, progress in squamates was largely stagnant following the publication of the green anole genome. In fact, zero high-quality (chromosome-level) squamate genomes were published between the years 2012 and 2017. However, since 2018, an exponential increase in high-quality genome assemblies has materialized with 24 additional high-quality genomes published for species across the squamate tree of life. As the field of squamate genomics is rapidly evolving, we provide a systematic review from an evolutionary genomics perspective. We collated a near-complete list of publicly available squamate genome assemblies from more than half a dozen international and third-party repositories and systematically evaluated them with regard to their overall quality, phylogenetic breadth, and usefulness for continuing to provide accurate and efficient insights into genome evolution across squamate reptiles. This review both highlights and catalogs the currently available genomic resources in squamates and their ability to address broader questions in vertebrates, specifically sex chromosome and microchromosome evolution, while addressing why squamates may have received less historical focus and what has caused their progress in genomics to lag behind peer taxa.
Affiliation(s)
- Brendan J Pinto
- School of Life Sciences, Arizona State University, Tempe, AZ USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ USA
- Department of Zoology, Milwaukee Public Museum, Milwaukee, WI USA
| | - Tony Gamble
- Department of Zoology, Milwaukee Public Museum, Milwaukee, WI USA
- Department of Biological Sciences, Marquette University, Milwaukee WI USA
- Bell Museum of Natural History, University of Minnesota, St Paul, MN USA
| | - Chase H Smith
- Department of Integrative Biology, University of Texas at Austin, Austin, TX, USA
| | - Melissa A Wilson
- School of Life Sciences, Arizona State University, Tempe, AZ USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ USA
- Center for Mechanisms of Evolution, Biodesign Institute, Tempe, AZ USA
|
14
|
Burley JT, Orzechowski SCM, Sin SYW, Edwards SV. Whole-genome phylogeography of the blue-faced honeyeater (Entomyzon cyanotis) and discovery and characterization of a neo-Z chromosome. Mol Ecol 2023; 32:1248-1270. [PMID: 35797346 DOI: 10.1111/mec.16604] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/22/2022] [Revised: 06/22/2022] [Accepted: 07/04/2022] [Indexed: 11/28/2022]
Abstract
Whole-genome surveys of genetic diversity and geographic variation often yield unexpected discoveries of novel structural variation, which long-read DNA sequencing can help clarify. Here, we report on whole-genome phylogeography of a bird exhibiting classic vicariant geographies across Australia and New Guinea, the blue-faced honeyeater (Entomyzon cyanotis), and the discovery and characterization of a novel neo-Z chromosome by long-read sequencing. Using short-read genome-wide SNPs, we inferred population divergence events within E. cyanotis across the Carpentarian and other biogeographic barriers during the Pleistocene (~0.3-1.7 Ma). Evidence for introgression between nonsister populations supports a hypothesis of reticulate evolution around a triad of dynamic barriers around Pleistocene Lake Carpentaria between Australia and New Guinea. During this phylogeographic survey, we discovered a large (134 Mbp) neo-Z chromosome and we explored its diversity, divergence and introgression landscape. We show that, as in some sylvioid passerine birds, a fusion occurred between chromosome 5 and the Z chromosome to form a neo-Z chromosome; and in E. cyanotis, the ancestral pseudoautosomal region (PAR) appears nonrecombinant between Z and W, along with most of the fused chromosome 5. The added recombination-suppressed portion of the neo-Z (~37.2 Mbp) displays reduced diversity and faster population genetic differentiation compared with the ancestral-Z. Yet, the new PAR (~17.4 Mbp) shows elevated diversity and reduced differentiation compared to autosomes, potentially resulting from introgression. In our case, long-read sequencing helped clarify the genomic landscape of population divergence on autosomes and sex chromosomes in a species where prior knowledge of genome structure was still incomplete.
Affiliation(s)
- John T Burley
- Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts, USA; Department of Evolutionary Biology, Evolutionary Biology Centre (EBC), Uppsala University, Uppsala, Sweden; Department of Ecology Evolution and Organismal Biology, Brown University, Providence, Rhode Island, USA; Institute at Brown for Environment and Society, Brown University, Providence, Rhode Island, USA
| | | | - Simon Yung Wa Sin
- Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts, USA; School of Biological Sciences, The University of Hong Kong, Hong Kong SAR, China
| | - Scott V Edwards
- Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts, USA
|
15
|
Pinto BJ, O’Connor B, Schatz MC, Zarate S, Wilson MA. Concerning the eXclusion in human genomics: The choice of sex chromosome representation in the human genome drastically affects number of identified variants. bioRxiv 2023:2023.02.22.529542. [PMID: 36865318 PMCID: PMC9980147 DOI: 10.1101/2023.02.22.529542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2023]
Abstract
Over the past 30 years, a community of scientists has pieced together every base pair of the human reference genome from telomere to telomere. Interestingly, most human genomics studies omit more than 5% of the genome from their analyses. Under 'normal' circumstances, omitting any chromosome(s) from analysis of the human genome would be reason for concern, the exception being the sex chromosomes. Sex chromosomes in eutherians share an evolutionary origin as an ancestral pair of autosomes. In humans, they share three regions of high sequence identity (~98-100%), which, along with the unique transmission patterns of the sex chromosomes, introduce technical artifacts into genomic analyses. However, the human X chromosome bears numerous important genes, including more "immune response" genes than any other chromosome, which makes its exclusion irresponsible when sex differences across human diseases are widespread. To better characterize the effect that including/excluding the X chromosome may have on variants called, we conducted a pilot study on the Terra cloud platform to replicate a subset of standard genomic practices using both the CHM13 reference genome and a sex chromosome complement-aware (SCC-aware) reference genome. We compared quality of variant calling, expression quantification, and allele-specific expression using these two reference genome versions across 50 human samples from the Genotype-Tissue-Expression consortium annotated as females. We found that after correction, the whole X chromosome (100%) can generate reliable variant calls, allowing for the inclusion of the whole genome in human genomics analyses as a departure from the status quo of omitting the sex chromosomes from empirical and clinical genomics studies.
Affiliation(s)
- Brendan J. Pinto
- School of Life Sciences, Arizona State University, Tempe AZ 85282 USA
- Center for Evolution and Medicine, Arizona State University, Tempe AZ 85282 USA
- Department of Zoology, Milwaukee Public Museum, Milwaukee, WI 53233 USA
| | | | - Michael C. Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218 USA
| | - Samantha Zarate
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218 USA
| | - Melissa A. Wilson
- School of Life Sciences, Arizona State University, Tempe AZ 85282 USA
- Center for Evolution and Medicine, Arizona State University, Tempe AZ 85282 USA
- The Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe AZ 85282 USA
|
16
|
Hyden B, Feng K, Yates TB, Jawdy S, Cereghino C, Smart LB, Muchero W. De Novo Assembly and Annotation of 11 Diverse Shrub Willow (Salix) Genomes Reveals Novel Gene Organization in Sex-Linked Regions. Int J Mol Sci 2023; 24:2904. [PMID: 36769224 PMCID: PMC9917877 DOI: 10.3390/ijms24032904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 01/13/2023] [Accepted: 01/31/2023] [Indexed: 02/05/2023] Open
Abstract
Poplar and willow species in the Salicaceae are dioecious, yet have been shown to use different sex determination systems located on different chromosomes. Willows in the subgenus Vetrix are interesting for comparative studies of sex determination systems, yet genomic resources for these species are still quite limited. Only a few annotated reference genome assemblies are available, despite many species in use in breeding programs. Here we present de novo assemblies and annotations of 11 shrub willow genomes from six species. Copy number variation of candidate sex determination genes within each genome was characterized and revealed remarkable differences in putative master regulator gene duplication and deletion. We also analyzed copy number and expression of candidate genes involved in floral secondary metabolism, and identified substantial variation across genotypes, which can be used for parental selection in breeding programs. Lastly, we report on a genotype that produces only female descendants and identified gene presence/absence variation in the mitochondrial genome that may be responsible for this unusual inheritance.
Affiliation(s)
- Brennan Hyden
- Horticulture Section, School of Integrative Plant Science, Cornell University, Geneva, NY 14456, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
| | - Kai Feng
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
| | - Timothy B. Yates
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
| | - Sara Jawdy
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
| | - Chelsea Cereghino
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
| | - Lawrence B. Smart
- Horticulture Section, School of Integrative Plant Science, Cornell University, Geneva, NY 14456, USA
| | - Wellington Muchero
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
| |
|
17
|
Why does the X chromosome lag behind autosomes in GWAS findings? PLoS Genet 2023; 19:e1010472. [PMID: 36848382 PMCID: PMC9997976 DOI: 10.1371/journal.pgen.1010472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 03/09/2023] [Accepted: 02/15/2023] [Indexed: 03/01/2023] Open
Abstract
The X-chromosome is among the largest human chromosomes. It differs from autosomes by a number of important features including hemizygosity in males, an almost complete inactivation of one copy in females, and unique patterns of recombination. We used data from the Catalog of Published Genome Wide Association Studies to compare densities of the GWAS-detected SNPs on the X-chromosome and autosomes. The density of GWAS-detected SNPs on the X-chromosome is 6-fold lower than the density of GWAS-detected SNPs on autosomes. Differences between the X-chromosome and autosomes cannot be explained by differences in the overall SNP density, lower X-chromosome coverage by genotyping platforms or low call rate of X-chromosomal SNPs. Similar differences in the density of GWAS-detected SNPs were found in female-only GWASs (e.g. ovarian cancer GWASs). We hypothesized that the lower density of GWAS-detected SNPs on the X-chromosome compared to autosomes is not a result of a methodological bias, e.g. differences in coverage or call rates, but has a real underlying biological reason: a lower density of functional SNPs on the X-chromosome versus autosomes. This hypothesis is supported by the observation that (i) the overall SNP density of the X-chromosome is lower than the SNP density on autosomes and that (ii) the density of genic SNPs on the X-chromosome is lower than on autosomes while densities of intergenic SNPs are similar.
|
18
|
Ciesielski TH, Bartlett J, Iyengar SK, Williams SM. Hemizygosity can reveal variant pathogenicity on the X-chromosome. Hum Genet 2023; 142:11-19. [PMID: 35994124 PMCID: PMC9840679 DOI: 10.1007/s00439-022-02478-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/10/2022] [Accepted: 08/10/2022] [Indexed: 01/24/2023]
Abstract
Pathogenic variants on the X-chromosome can have more severe consequences for hemizygous males, while heterozygous females can avoid severe consequences due to diploidy and the capacity for nonrandom expression. Thus, when an allele is more common in females this could indicate that it increases the probability of early death in the male hemizygous state, which can be considered a measure of pathogenicity. Importantly, large-scale genomic data now make it possible to compare allele proportions between the sexes. To discover pathogenic variants on the X-chromosome, we analyzed exome data from 125,748 ancestrally diverse participants in the Genome Aggregation Database (gnomAD). After filtering out duplicates and extremely rare variants, 44,606 of the original 348,221 remained for analysis. We divided the proportion of variant alleles in females by the proportion in males for all variant sites, and then placed each variant into one of three a priori categories: (1) Reference (primarily synonymous and intronic), (2) Unlikely-to-be-tolerated (primarily missense), and (3) Least-likely-to-be-tolerated (primarily frameshift). To assess the impact of ploidy, we compared the distribution of these ratios between pseudoautosomal and non-pseudoautosomal regions. In the non-pseudoautosomal regions, mean female-to-male ratios were lowest among Reference (2.40), greater for Unlikely-to-be-tolerated (2.77) and highest for Least-likely-to-be-tolerated (3.28) variants. Corresponding ratios were lower in the pseudoautosomal regions (1.52, 1.57, and 1.68, respectively), with the most extreme ratio being just below 11. Because pathogenic effects in the pseudoautosomal regions should not drive ratio increases, this maximum ratio provides an upper bound for baseline noise. In the non-pseudoautosomal regions, 319 variants had a ratio over 11. In sum, we identified a measure with a dataset-specific threshold for identifying pathogenicity in non-pseudoautosomal X-chromosome variants: the female-to-male allele proportion ratio.
Affiliation(s)
- Timothy H. Ciesielski
- The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH; Mary Ann Swetland Center for Environmental Health at Case Western Reserve University School of Medicine, Cleveland, OH; Ronin Institute, Montclair, NJ
| | - Jacquelaine Bartlett
- The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH
| | - Sudha K. Iyengar
- The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH; The Department of Genetics and Genome Sciences at Case Western Reserve University School of Medicine, Cleveland, OH; Cleveland Institute for Computational Biology, Cleveland, OH
| | - Scott M. Williams
- The Department of Population and Quantitative Health Sciences at Case Western Reserve University School of Medicine, Cleveland, OH; The Department of Genetics and Genome Sciences at Case Western Reserve University School of Medicine, Cleveland, OH; Cleveland Institute for Computational Biology, Cleveland, OH
| |
|
19
|
Olney KC, Plaisier SB, Phung TN, Silasi M, Perley L, O'Bryan J, Ramirez L, Kliman HJ, Wilson MA. Sex differences in early and term placenta are conserved in adult tissues. Biol Sex Differ 2022; 13:74. [PMID: 36550527 PMCID: PMC9773522 DOI: 10.1186/s13293-022-00470-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2022] [Accepted: 09/19/2022] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND: Pregnancy complications vary based on the fetus's genetic sex, which may, in part, be modulated by the placenta. Furthermore, developmental differences early in life can have lifelong health outcomes. Yet, sex differences in gene expression within the placenta at different timepoints throughout pregnancy and comparisons to adult tissues remain poorly characterized. METHODS: Here, we collect and characterize sex differences in gene expression in term placentas (≥ 36.6 weeks; 23 male XY and 27 female XX). These are compared with sex differences in previously collected first trimester placenta samples and 42 non-reproductive adult tissues from GTEx. RESULTS: We identify 268 and 53 sex-differentially expressed genes in the uncomplicated late first trimester and term placentas, respectively. Of the 53 sex-differentially expressed genes observed in the term placentas, 31 are also sex-differentially expressed in the late first trimester placentas. Furthermore, sex differences in gene expression in term placentas are highly correlated with sex differences in the late first trimester placentas. We found that sex-differential gene expression in the term placenta is significantly correlated with sex differences in gene expression in 42 non-reproductive adult tissues (correlation coefficient ranged from 0.892 to 0.957), with the highest correlation in brain tissues. Sex differences in gene expression were largely driven by gene expression on the sex chromosomes. We further show that some gametologous genes (genes with functional copies on X and Y) will have different inferred sex differences if the X-linked gene expression in females is compared to the sum of the X-linked and Y-linked gene expression in males. CONCLUSIONS: We find that sex differences in gene expression are conserved in late first trimester and term placentas and that these sex differences are conserved in adult tissues. We demonstrate that there are sex differences associated with innate immune response in late first trimester placentas but there is no significant difference in gene expression of innate immune genes between sexes in healthy full-term placentas. Finally, sex differences are predominantly driven by expression from sex-linked genes.
Affiliation(s)
- Kimberly C Olney
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85282, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, 85282, USA
| | - Seema B Plaisier
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85282, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, 85282, USA
| | - Tanya N Phung
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85282, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, 85282, USA
| | - Michelle Silasi
- Division of Maternal-Fetal Medicine, Department of Obstetrics and Gynecology, Mercy Hospital St. Louis, St. Louis, MO, 63141, USA
| | - Lauren Perley
- Department of Obstetrics, Gynecology and Reproductive Sciences, Yale University School of Medicine, New Haven, CT, 06520, USA
| | - Jane O'Bryan
- Department of Obstetrics, Gynecology and Reproductive Sciences, Yale University School of Medicine, New Haven, CT, 06520, USA
| | - Lucia Ramirez
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85282, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, 85282, USA
| | - Harvey J Kliman
- Department of Obstetrics, Gynecology and Reproductive Sciences, Yale University School of Medicine, New Haven, CT, 06520, USA
| | - Melissa A Wilson
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85282, USA.
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ, 85282, USA.
- The Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ, 85282, USA.
| |
|
20
|
Muñoz-Barrera A, Rubio-Rodríguez LA, Díaz-de Usera A, Jáspez D, Lorenzo-Salazar JM, González-Montelongo R, García-Olivares V, Flores C. From Samples to Germline and Somatic Sequence Variation: A Focus on Next-Generation Sequencing in Melanoma Research. Life (Basel) 2022; 12:1939. [PMID: 36431075 PMCID: PMC9695713 DOI: 10.3390/life12111939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 11/12/2022] [Accepted: 11/16/2022] [Indexed: 11/24/2022] Open
Abstract
Next-generation sequencing (NGS) applications have flourished in the last decade, permitting the identification of cancer driver genes and profoundly expanding the possibilities of genomic studies of cancer, including melanoma. Here we aimed to present a technical review across many of the methodological approaches brought by the use of NGS applications with a focus on assessing germline and somatic sequence variation. We provide cautionary notes and discuss key technical details involved in library preparation, the most common problems with the samples, and guidance to circumvent them. We also provide an overview of the sequence-based methods for cancer genomics, exposing the pros and cons of targeted sequencing vs. exome or whole-genome sequencing (WGS), the fundamentals of the most common commercial platforms, and a comparison of throughputs and key applications. Details of the steps and the main software involved in the bioinformatics processing of the sequencing results, from preprocessing to variant prioritization and filtering, are also provided in the context of the full spectrum of genetic variation (SNVs, indels, CNVs, structural variation, and gene fusions). Finally, we put the emphasis on selected bioinformatic pipelines behind (a) short-read WGS identification of small germline and somatic variants, (b) detection of gene fusions from transcriptomes, and (c) de novo assembly of genomes from long-read WGS data. Overall, we provide comprehensive guidance across the main methodological procedures involved in obtaining sequencing results for the most common short- and long-read NGS platforms, highlighting key applications in melanoma research.
Affiliation(s)
- Adrián Muñoz-Barrera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Luis A. Rubio-Rodríguez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Ana Díaz-de Usera
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
| | - David Jáspez
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - José M. Lorenzo-Salazar
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Rafaela González-Montelongo
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Víctor García-Olivares
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
| | - Carlos Flores
- Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
- Research Unit, Hospital Universitario Nuestra Señora de Candelaria, 38010 Santa Cruz de Tenerife, Spain
- CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, 28029 Madrid, Spain
- Facultad de Ciencias de la Salud, Universidad Fernando de Pessoa Canarias, 35450 Las Palmas de Gran Canaria, Spain
| |
|
21
|
Grenn FP, Makarious MB, Bandres-Ciga S, Iwaki H, Singleton AB, Nalls MA, Blauwendraat C. Analysis of Y chromosome haplogroups in Parkinson's disease. Brain Commun 2022; 4:fcac277. [PMID: 36387750 PMCID: PMC9665271 DOI: 10.1093/braincomms/fcac277] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/03/2022] [Revised: 07/01/2022] [Accepted: 10/27/2022] [Indexed: 11/13/2022] Open
Abstract
Parkinson's disease is a complex neurodegenerative disorder that is about 1.5 times more prevalent in males than females. Extensive work has been done to identify the genetic risk factors behind Parkinson's disease on autosomes and more recently on Chromosome X, but work remains to be done on the male-specific Y chromosome. In an effort to explore the role of the Y chromosome in Parkinson's disease, we analysed whole-genome sequencing data from the Accelerating Medicines Partnership-Parkinson's disease initiative (1466 cases and 1664 controls), genotype data from NeuroX (3491 cases and 3232 controls) and genotype data from UKBiobank (182 517 controls, 1892 cases and 3783 proxy cases), all consisting of male European ancestry samples. We classified sample Y chromosomes by haplogroup using three different tools for comparison (Snappy, Yhaplo and Y-LineageTracker) and meta-analysed this data to identify haplogroups associated with Parkinson's disease. This was followed up with a Y-chromosome association study to identify specific variants associated with disease. We also analysed blood-based RNASeq data obtained from the Accelerating Medicines Partnership-Parkinson's disease initiative (1020 samples) and RNASeq data obtained from the North American Brain Expression Consortium (171 samples) to identify Y-chromosome genes differentially expressed in cases, controls, specific haplogroups and specific tissues. RNASeq analyses suggest Y-chromosome gene expression differs between brain and blood tissues but does not differ significantly in cases, controls or specific haplogroups. Overall, we did not find any strong associations between Y-chromosome genetics and Parkinson's disease, suggesting the explanation for the increased prevalence in males may lie elsewhere.
Affiliation(s)
- Francis P Grenn
- Molecular Genetics Section, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
| | - Mary B Makarious
- Molecular Genetics Section, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
- Department of Clinical and Movement Neurosciences, UCL Queen Square Institute of Neurology, London, UK
- UCL Movement Disorders Centre, University College London, London, UK
| | - Sara Bandres-Ciga
- Molecular Genetics Section, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
| | - Hirotaka Iwaki
- Molecular Genetics Section, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Data Tecnica International, Washington, DC, USA
| | - Andrew B Singleton
- Molecular Genetics Section, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
| | - Mike A Nalls
- Molecular Genetics Section, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Data Tecnica International, Washington, DC, USA
| | - Cornelis Blauwendraat
- Center for Alzheimer’s and Related Dementias (CARD), National Institute on Aging and National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, USA
- Integrative Neurogenomics Unit, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
| |
|
22
|
Phung TN, Olney KC, Pinto BJ, Silasi M, Perley L, O’Bryan J, Kliman HJ, Wilson MA. X chromosome inactivation in the human placenta is patchy and distinct from adult tissues. HGG ADVANCES 2022; 3:100121. [PMID: 35712697 PMCID: PMC9194956 DOI: 10.1016/j.xhgg.2022.100121] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 01/18/2022] [Accepted: 05/16/2022] [Indexed: 11/24/2022] Open
Abstract
In humans, one of the X chromosomes in genetic females is inactivated by a process called X chromosome inactivation (XCI). Variation in XCI across the placenta may contribute to observed sex differences and variability in pregnancy outcomes. However, XCI has predominantly been studied in human adult tissues. Here, we sequenced and analyzed DNA and RNA from two locations from 30 full-term pregnancies. Implementing an allele-specific approach to examine XCI, we report evidence that XCI in the human placenta is patchy, with large patches of either maternal or paternal X chromosomes inactivated. Further, using similar measurements, we show that this is in contrast to adult tissues, which generally exhibit mosaic X inactivation, where bulk samples exhibit both maternal and paternal X chromosome expression. Further, by comparing skewed samples in placenta and adult tissues, we identify genes that are uniquely inactivated or expressed in the placenta compared with adult tissues, highlighting the need for tissue-specific maps of XCI.
Affiliation(s)
- Tanya N. Phung
- Center for Evolution and Medicine, Arizona State University, PO Box 874501, Tempe, AZ 85282, USA
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ 85282, USA
| | - Kimberly C. Olney
- Center for Evolution and Medicine, Arizona State University, PO Box 874501, Tempe, AZ 85282, USA
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ 85282, USA
| | - Brendan J. Pinto
- Center for Evolution and Medicine, Arizona State University, PO Box 874501, Tempe, AZ 85282, USA
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ 85282, USA
- Department of Zoology, Milwaukee Public Museum, Milwaukee, WI 53233, USA
| | - Michelle Silasi
- Department of Maternal-Fetal Medicine, Mercy Hospital St. Louis, St. Louis, MO 63141, USA
| | - Lauren Perley
- Department of Obstetrics, Gynecology and Reproductive Sciences, Yale University School of Medicine, New Haven, CT 06520, USA
| | - Jane O’Bryan
- Department of Obstetrics, Gynecology and Reproductive Sciences, Yale University School of Medicine, New Haven, CT 06520, USA
| | - Harvey J. Kliman
- Department of Obstetrics, Gynecology and Reproductive Sciences, Yale University School of Medicine, New Haven, CT 06520, USA
| | - Melissa A. Wilson
- Center for Evolution and Medicine, Arizona State University, PO Box 874501, Tempe, AZ 85282, USA
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ 85282, USA
- The Biodesign Center for Mechanisms of Evolution, Arizona State University, PO Box 874501, Tempe, AZ 85282, USA
| |
|
23
|
Halabian R, Makałowski W. A Map of 3' DNA Transduction Variants Mediated by Non-LTR Retroelements on 3202 Human Genomes. BIOLOGY 2022; 11:1032. [PMID: 36101413 PMCID: PMC9311842 DOI: 10.3390/biology11071032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/07/2022] [Revised: 07/05/2022] [Accepted: 07/06/2022] [Indexed: 05/03/2023]
Abstract
As one of the major structural constituents, mobile elements comprise more than half of the human genome, among which Alu, L1, and SVA elements are still active and continue to generate new offspring. One of the major characteristics of L1 and SVA elements is their ability to co-mobilize adjacent downstream sequences to new loci in a process called 3' DNA transduction. Transductions influence the structure and content of the genome in different ways, such as increasing genome variation, exon shuffling, and gene duplication. Moreover, given their mutagenicity capability, 3' transductions are often involved in tumorigenesis or in the development of some diseases. In this study, we analyzed 3202 genomes sequenced at high coverage by the New York Genome Center to catalog and characterize putative 3' transduced segments mediated by L1s and SVAs. Here, we present a genome-wide map of inter/intrachromosomal 3' transduction variants, including their genomic and functional location, length, progenitor location, and allelic frequency across 26 populations. In total, we identified 7103 polymorphic L1s and 3040 polymorphic SVAs. Of these, 268 and 162 variants were annotated as high-confidence L1 and SVA 3' transductions, respectively, with lengths that ranged from 7 to 997 nucleotides. We found specific loci within chromosomes X, 6, 7, and 6_GL000253v2_alt as master L1s and SVAs that had yielded more transductions, among others. Together, our results demonstrate the dynamic nature of transduction events within the genome and among individuals and their contribution to the structural variations of the human genome.
Affiliation(s)
| | - Wojciech Makałowski
- Institute of Bioinformatics, Faculty of Medicine, University of Münster, 48149 Münster, Germany;
| |
|
24
|
Santos C, Mendes T, Antunes A. The genes from the pseudoautosomal region 1 (PAR1) of the mammalian sex chromosomes: Synteny, phylogeny and selection. Genomics 2022; 114:110419. [PMID: 35753589 DOI: 10.1016/j.ygeno.2022.110419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2021] [Revised: 06/10/2022] [Accepted: 06/20/2022] [Indexed: 11/04/2022]
Abstract
Sex chromosomes recombine only within their homologous region, the pseudoautosomal region (PAR), represented by PAR1 and PAR2, which behave like an autosome in both pairing and recombination. PAR1, common to most eutherian mammals, is located at the terminus of the short arm of the sex chromosomes and exhibits recombination rates ~20 times higher than the autosomes. Here, we assessed the interspecific evolutionary genomic dynamics of 15 PAR1 genes across 41 mammalian genera (representing six orders). The strong negative selection detected in most of the assessed groups reinforces the presence of evolutionary constraints imposed by the important functions of the PAR1 genes. Indeed, mutations in these genes are associated with various diseases in humans, including stature problems (Klinefelter Syndrome), leukemia and mental diseases. Yet, a few genes exhibiting positive selection (ω-value >1) were detected in Rodentia (ASMT and ZBED1) and Primates (CRLF2 and CSF2RA). Rodents have the smallest described PAR1, while that of simian primates/humans underwent a 3- to 5-fold size reduction. The assessment of PAR1 gene synteny revealed differences among the mammalian species, especially in the order Rodentia, where chromosomal translocations from the sex chromosomes to the autosomes were observed. Such syntenic changes may be evidence of the rapid evolution of rodents, as previously reported in other papers and also reflected in their increased branch lengths in the phylogenetic analyses. In conclusion, we suggest that genome migration is an important factor influencing the evolution of mammals and may result in changes in the selective pressures operating on the genome.
Affiliation(s)
- Carla Santos
- Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Avenida General Norton de Matos, s/n, 4450-208 Porto, Portugal; Institute of Biomedical Sciences Abel Salazar (ICBAS), University of Porto, Portugal
| | - Tito Mendes
- Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Avenida General Norton de Matos, s/n, 4450-208 Porto, Portugal
| | - Agostinho Antunes
- Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Avenida General Norton de Matos, s/n, 4450-208 Porto, Portugal; Department of Biology, Faculty of Sciences, University of Porto, Porto, Portugal.
| |
|
25
|
Niu YN, Roberts EG, Denisko D, Hoffman MM. Assessing and assuring interoperability of a genomics file format. Bioinformatics 2022; 38:3327-3336. [PMID: 35575355 PMCID: PMC9237710 DOI: 10.1093/bioinformatics/btac327] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2022] [Revised: 03/30/2022] [Accepted: 05/11/2022] [Indexed: 12/01/2022] Open
Abstract
Motivation Bioinformatics software tools operate largely through the use of specialized genomics file formats. Often these formats lack formal specification, making it difficult or impossible for the creators of these tools to robustly test them for correct handling of input and output. This causes problems in interoperability between different tools that, at best, wastes time and frustrates users. At worst, interoperability issues could lead to undetected errors in scientific results. Results We developed a new verification system, Acidbio, which tests for correct behavior in bioinformatics software packages. We crafted tests to unify correct behavior when tools encounter various edge cases—potentially unexpected inputs that exemplify the limits of the format. To analyze the performance of existing software, we tested the input validation of 80 Bioconda packages that parsed the Browser Extensible Data (BED) format. We also used a fuzzing approach to automatically perform additional testing. Of 80 software packages examined, 75 achieved less than 70% correctness on our test suite. We categorized multiple root causes for the poor performance of different types of software. Fuzzing detected other errors that the manually designed test suite could not. We also created a badge system that developers can use to indicate more precisely which BED variants their software accepts and to advertise the software’s performance on the test suite. Availability and implementation Acidbio is available at https://github.com/hoffmangroup/acidbio. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yi Nian Niu
- Princess Margaret Cancer Centre University Health Network, Toronto, ON, M5G 2C1, Canada
| | - Eric G Roberts
- Princess Margaret Cancer Centre University Health Network, Toronto, ON, M5G 2C1, Canada
| | - Danielle Denisko
- Princess Margaret Cancer Centre University Health Network, Toronto, ON, M5G 2C1, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada
| | - Michael M Hoffman
- Princess Margaret Cancer Centre University Health Network, Toronto, ON, M5G 2C1, Canada; Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada; Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada; Vector Institute, Toronto, ON, M5G 1M1, Canada
| |
|
26
|
Carey SB, Lovell JT, Jenkins J, Leebens-Mack J, Schmutz J, Wilson MA, Harkess A. Representing sex chromosomes in genome assemblies. CELL GENOMICS 2022; 2. [PMID: 35720975 PMCID: PMC9205529 DOI: 10.1016/j.xgen.2022.100132] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/08/2023]
Abstract
Sex chromosomes have evolved hundreds of independent times across eukaryotes. As genome sequencing, assembly, and scaffolding techniques rapidly improve, it is now feasible to build fully phased sex chromosome assemblies. Despite technological advances enabling phased assembly of whole chromosomes, there are currently no standards for representing sex chromosomes when publicly releasing a genome. Furthermore, most computational analysis tools are unable to efficiently investigate their unique biology relative to autosomes. We discuss a diversity of sex chromosome systems and consider the challenges of representing sex chromosome pairs in genome assemblies. By addressing these issues now as technologies for full phasing of chromosomal assemblies are maturing, we can collectively ensure that future genome analysis toolkits can be broadly applied to all eukaryotes with diverse types of sex chromosome systems. Here we provide best practice guidelines for presenting a genome assembly that contains sex chromosomes. These guidelines can also be applied to other non-recombining genomic regions, such as S-loci in plants and mating-type loci in fungi and algae.
Affiliation(s)
- Sarah B Carey
- Department of Crop, Soil, and Environmental Sciences, Auburn University, Auburn, AL 36849, USA.,HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - John T Lovell
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - Jerry Jenkins
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| | - Jim Leebens-Mack
- Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
| | - Jeremy Schmutz
- HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA.,US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Melissa A Wilson
- School of Life Sciences, Center for Evolution and Medicine, The Biodesign Institute, Arizona State University, Tempe, AZ 85287, USA
| | - Alex Harkess
- Department of Crop, Soil, and Environmental Sciences, Auburn University, Auburn, AL 36849, USA.,HudsonAlpha Institute for Biotechnology, Huntsville, AL 35806, USA
| |
|
27
|
Estimating bonobo (Pan paniscus) and chimpanzee (Pan troglodytes) evolutionary history from nucleotide site patterns. Proc Natl Acad Sci U S A 2022; 119:e2200858119. [PMID: 35452306 PMCID: PMC9170072 DOI: 10.1073/pnas.2200858119] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022] Open
Abstract
There is genomic evidence of widespread admixture in deep time between many closely related species, including humans. Our closest living relatives, bonobos and chimpanzees, may also exhibit such patterns. However, assessing the exact degree of interbreeding remains challenging because previous studies have resulted in multiple inconsistent demographic models. We use an approach that addresses these gaps by analyzing all lineages, simultaneously estimating parameters, and comparing previously proposed models. We find evidence of considerable introgression from western into eastern chimpanzees. We also show more breeding females than males and evidence of male-biased dispersal in western chimpanzees. These findings highlight the extent of admixture in bonobo and chimpanzee evolutionary history and are consistent with substantial differences between past and present chimpanzee biogeography.
Admixture appears increasingly ubiquitous in the evolutionary history of various taxa, including humans. Such gene flow likely also occurred among our closest living relatives: bonobos (Pan paniscus) and chimpanzees (Pan troglodytes). However, our understanding of their evolutionary history has been limited by studies that do not consider all Pan lineages or do not analyze all lineages simultaneously, resulting in conflicting demographic models. Here, we investigate this gap in knowledge using nucleotide site patterns calculated from whole-genome sequences from the autosomes of 71 bonobos and chimpanzees, representing all five extant Pan lineages. We estimated demographic parameters and compared all previously proposed demographic models for this clade. We further considered sex bias in Pan evolutionary history by analyzing the site patterns from the X chromosome. We show that 1) 21% of autosomal DNA in eastern chimpanzees derives from western chimpanzee introgression and that 2) all four chimpanzee lineages share a common ancestor about 987,000 y ago, much earlier than previous estimates. 
In addition, we suggest that 3) there was male reproductive skew throughout Pan evolutionary history and find evidence of 4) male-biased dispersal from western to eastern chimpanzees. Collectively, these results offer insight into bonobo and chimpanzee evolutionary history and suggest considerable differences between current and historic chimpanzee biogeography.
|
28
|
Fisher JL, Jones EF, Flanary VL, Williams AS, Ramsey EJ, Lasseigne BN. Considerations and challenges for sex-aware drug repurposing. Biol Sex Differ 2022; 13:13. [PMID: 35337371 PMCID: PMC8949654 DOI: 10.1186/s13293-022-00420-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 01/11/2022] [Accepted: 03/06/2022] [Indexed: 01/09/2023] Open
Abstract
Sex differences are essential factors in disease etiology and manifestation in many diseases such as cardiovascular disease, cancer, and neurodegeneration [33]. The biological influence of sex differences (including genomic, epigenetic, hormonal, immunological, and metabolic differences between males and females) and the lack of biomedical studies considering sex differences in their study design have led to several policies. For example, the National Institutes of Health's (NIH) sex as a biological variable (SABV) and the Sex and Gender Equity in Research (SAGER) policies motivate researchers to consider sex differences [204]. However, drug repurposing, a promising alternative to traditional drug discovery by identifying novel uses for FDA-approved drugs, lacks sex-aware methods that can improve the identification of drugs that have sex-specific responses [7, 11, 14, 33]. Sex-aware drug repurposing methods either select drug candidates that are more efficacious in one sex or deprioritize drug candidates based on whether they are predicted to cause a sex-bias adverse event (SBAE), an unintended therapeutic effect that is more likely to occur in one sex. Computational drug repurposing methods are encouraging approaches to develop for sex-aware drug repurposing because they can prioritize sex-specific drug candidates or SBAEs at lower cost and time than traditional drug discovery. Sex-aware methods currently exist for clinical, genomic, and transcriptomic information [1, 7, 155]. They have not expanded to other data types, such as DNA variation, which has been beneficial in other drug repurposing methods that do not consider sex [114]. Additionally, some sex-aware methods suffer from poorer performance because a disproportionate number of male and female samples are available to train computational methods [7]. However, there is development potential for several different categories (i.e., data mining, ligand binding predictions, molecular associations, and networks). 
Low-dimensional representations of molecular association and network approaches are also especially promising candidates for future sex-aware drug repurposing methodologies because they reduce the multiple hypothesis testing burden and capture sex-specific variation better than the other methods [151, 159]. Here we review how sex influences drug response, the current state of drug repurposing including with respect to sex-bias drug response, and how model organism study design choices influence drug repurposing validation.
Affiliation(s)
- Jennifer L. Fisher
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294 USA
| | - Emma F. Jones
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294 USA
| | - Victoria L. Flanary
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294 USA
| | - Avery S. Williams
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294 USA
| | - Elizabeth J. Ramsey
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294 USA
| | - Brittany N. Lasseigne
- Department of Cell, Developmental and Integrative Biology, Heersink School of Medicine, University of Alabama at Birmingham, Birmingham, AL 35294 USA
| |
|
29
|
Liu S, Zeng Y, Wang C, Zhang Q, Chen M, Wang X, Wang L, Lu Y, Guo H, Bu F. seGMM: A New Tool for Gender Determination From Massively Parallel Sequencing Data. Front Genet 2022; 13:850804. [PMID: 35309142 PMCID: PMC8930203 DOI: 10.3389/fgene.2022.850804] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/08/2022] [Accepted: 02/10/2022] [Indexed: 11/18/2022] Open
Abstract
In clinical genetic testing, checking the concordance between self-reported gender and genotype-inferred gender from genomic data is a significant quality control measure because mismatched gender due to sex chromosomal abnormalities or misregistration of clinical information can significantly affect molecular diagnosis and treatment decisions. Targeted gene sequencing (TGS) is widely recommended as a first-tier diagnostic step in clinical genetic testing. However, the existing gender-inference tools are optimized for whole genome and whole exome data and are neither adequate nor accurate for analyzing TGS data. In this study, we validated a new gender-inference tool, seGMM, which uses unsupervised clustering (Gaussian mixture model) to determine the gender of a sample. The seGMM tool can also identify sex chromosomal abnormalities in samples by aligning the sequencing reads from the genotype data. The seGMM tool consistently demonstrated >99% gender-inference accuracy in a publicly available 1,000-gene panel dataset from the 1,000 Genomes project, an in-house 785 hearing loss gene panel dataset of 16,387 samples, and a 187 autism risk gene panel dataset from the Autism Clinical and Genetic Resources in China (ACGC) database. The performance and accuracy of seGMM were significantly higher for the targeted gene sequencing (TGS), whole exome sequencing (WES), and whole genome sequencing (WGS) datasets compared to other existing gender-inference tools such as PLINK, seXY, and XYalign. The results of seGMM were confirmed by the short tandem repeat analysis of the sex chromosome marker gene, amelogenin. Furthermore, our data showed that seGMM accurately identified sex chromosomal abnormalities in the samples. In conclusion, the seGMM tool shows great potential in clinical genetics by determining the sex chromosomal karyotypes of samples from massively parallel sequencing data with high accuracy.
Affiliation(s)
- Sihan Liu
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
| | - Yuanyuan Zeng
- School of Medicine, National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
| | - Chao Wang
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
| | - Qian Zhang
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
| | - Meilin Chen
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
| | - Xiaolu Wang
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
| | - Lanchen Wang
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
| | - Yu Lu
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
- *Correspondence: Yu Lu; Hui Guo; Fengxiao Bu
| | - Hui Guo
- Center for Medical Genetics and Hunan Provincial Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, China
| | - Fengxiao Bu
- Institute of Rare Diseases, West China Hospital of Sichuan University, Chengdu, China
| |
|
30
|
Pharmacogenomic analysis of a genetically distinct Indigenous population. THE PHARMACOGENOMICS JOURNAL 2022; 22:100-108. [PMID: 34824386 DOI: 10.1038/s41397-021-00262-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/01/2021] [Revised: 11/08/2021] [Accepted: 11/10/2021] [Indexed: 11/09/2022]
Abstract
Indigenous Australians face a disproportionately severe burden of chronic disease relative to other Australians, with elevated rates of morbidity and mortality. While genomics technologies are slowly gaining momentum in personalised treatments for many, a lack of pharmacogenomic research in Indigenous peoples could delay adoption. Appropriately implementing pharmacogenomics in clinical care necessitates an understanding of the frequencies of pharmacologically relevant genetic variants within Indigenous populations. We analysed whole-genome sequence data from 187 individuals from the Tiwi Islands and characterised the pharmacogenomic landscape of this population. Specifically, we compared variant profiles and allelic distributions of previously described pharmacologically significant genes and variants with other population groups. We identified 22 translationally relevant pharmacogenomic variants and 18 clinically actionable guidelines with implications for drug dosing and treatment of conditions including heart disease, diabetes and cancer. We specifically observed increased poor and intermediate metabolizer phenotypes in the CYP2C9 (PM:19%, IM:44%) and CYP2C19 (PM:18%, IM:44%) genes.
|
31
|
Hansen CCR, Westfall KM, Pálsson S. Evaluation of four methods to identify the homozygotic sex chromosome in small populations. BMC Genomics 2022; 23:160. [PMID: 35209843 PMCID: PMC8867824 DOI: 10.1186/s12864-022-08393-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/10/2021] [Accepted: 02/15/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Whole genomes are commonly assembled into a collection of scaffolds and often lack annotations of autosomes, sex chromosomes, and organelle genomes (i.e., mitochondrial and chloroplast). As these chromosome types differ in effective population size and can have highly disparate evolutionary histories, it is imperative to take this information into account when analysing genomic variation. Here we assessed the accuracy of four methods for identifying the homogametic sex chromosome in a small population using two whole genome sequences (WGS) and 133 RAD sequences of white-tailed eagles (Haliaeetus albicilla): i) difference in read depth per scaffold in a male and a female, ii) heterozygosity per scaffold in a male and a female, iii) mapping to the reference genome of a related species (chicken) with annotated sex chromosomes, and iv) analysis of SNP-loadings from a principal components analysis (PCA), based on the low-depth RADseq data. RESULTS The best performing approach was the reference mapping (method iii), which identified 98.12% of the expected homogametic sex chromosome (Z). Read depth per scaffold (method i) identified 86.41% of the homogametic sex chromosome with few false positives. SNP-loading scores (method iv) identified 78.6% of the Z-chromosome and had a false positive discovery rate of more than 10%. Heterozygosity per scaffold (method ii) did not provide clear results due to a lack of diversity in both the Z and autosomal chromosomes, and potential interference from the heterogametic sex chromosome (W). The evaluation of these methods also revealed 10 Mb of putative PAR and gametologous regions. CONCLUSION Identification of the homogametic sex chromosome in a small population is best accomplished by reference mapping or examining differences in read depth between sexes.
Affiliation(s)
| | - Kristen M Westfall
- Department of Life and Environmental Sciences, University of Iceland, Reykjavik, Iceland.,Current: Fisheries and Oceans Canada, Pacific Biological Station, Nanaimo, BC, Canada
| | - Snæbjörn Pálsson
- Department of Life and Environmental Sciences, University of Iceland, Reykjavik, Iceland
| |
|
32
|
Ramos L, Antunes A. Decoding sex: Elucidating sex determination and how high-quality genome assemblies are untangling the evolutionary dynamics of sex chromosomes. Genomics 2022; 114:110277. [PMID: 35104609 DOI: 10.1016/j.ygeno.2022.110277] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/07/2021] [Revised: 12/22/2021] [Accepted: 01/26/2022] [Indexed: 11/28/2022]
Abstract
Sexual reproduction is a diverse and widespread process. In gonochoristic species, the differentiation of sexes occurs through diverse mechanisms, influenced by environmental and genetic factors. In most vertebrates, a master-switch gene is responsible for triggering a sex determination network. However, only a few genes have acquired master-switch functions, and this process is associated with the evolution of sex-chromosomes, which have a significant influence in evolution. Additionally, their highly repetitive regions impose challenges for high-quality sequencing, even using high-throughput, state-of-the-art techniques. Here, we review the mechanisms involved in sex determination and their role in the evolution of species, particularly vertebrates, focusing on sex chromosomes and the challenges involved in sequencing these genomic elements. We also address the improvements provided by the growth of sequencing projects, by generating a massive number of near-gapless, telomere-to-telomere, chromosome-level, phased assemblies, increasing the number and quality of sex-chromosome sequences available for further studies.
Affiliation(s)
- Luana Ramos
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal; Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
| | - Agostinho Antunes
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos, s/n, 4450-208 Porto, Portugal; Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal.
| |
|
33
|
Borden ES, Adams AC, Buetow KH, Wilson MA, Bauman JE, Curiel-Lewandrowski C, Chow HHS, LaFleur BJ, Hastings KT. Shared Gene Expression and Immune Pathway Changes Associated with Progression from Nevi to Melanoma. Cancers (Basel) 2021; 14:cancers14010003. [PMID: 35008167 PMCID: PMC8749980 DOI: 10.3390/cancers14010003] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/30/2021] [Revised: 12/16/2021] [Accepted: 12/20/2021] [Indexed: 02/07/2023] Open
Abstract
Simple Summary Melanoma is a deadly skin cancer, and the incidence of melanoma is rising. Chemoprevention, using small molecule drugs to prevent the development of cancer, is a key strategy that could reduce the burden of melanoma on society. The long-term goal of our study is to develop a gene signature biomarker of progression from nevi to melanoma. We found that a small number of genes can distinguish nevi from melanoma and identified shared genes and immune-related pathways that are associated with progression from nevi to melanoma across independent datasets. This study demonstrates (1) a novel approach to aid melanoma chemoprevention trials by using a gene signature as a surrogate endpoint and (2) the feasibility of determining a gene signature biomarker of melanoma progression. Abstract There is a need to identify molecular biomarkers of melanoma progression to assist the development of chemoprevention strategies to lower melanoma incidence. Using datasets containing gene expression for dysplastic nevi and melanoma or melanoma arising in a nevus, we performed differential gene expression analysis and regularized regression models to identify genes and pathways that were associated with progression from nevi to melanoma. A small number of genes distinguished nevi from melanoma. Differential expression of seven genes was identified between nevi and melanoma in three independent datasets. C1QB, CXCL9, CXCL10, DFNA5 (GSDME), FCGR1B, and PRAME were increased in melanoma, and SCGB1D2 was decreased in melanoma, compared to dysplastic nevi or nevi that progressed to melanoma. Further supporting an association with melanomagenesis, these genes demonstrated a linear change in expression from benign nevi to dysplastic nevi to radial growth phase melanoma to vertical growth phase melanoma. The genes associated with melanoma progression showed significant enrichment of multiple pathways related to the immune system. 
This study demonstrates (1) a novel application of bioinformatic approaches to aid clinical trials of melanoma chemoprevention and (2) the feasibility of determining a gene signature biomarker of melanomagenesis.
Affiliation(s)
- Elizabeth S. Borden
- Department of Basic Medical Sciences, University of Arizona College of Medicine Phoenix, Phoenix, AZ 85004, USA
- Phoenix Veterans Affairs Health Care System, Phoenix, AZ 85012, USA
| | - Anngela C. Adams
- Department of Basic Medical Sciences, University of Arizona College of Medicine Phoenix, Phoenix, AZ 85004, USA
- Phoenix Veterans Affairs Health Care System, Phoenix, AZ 85012, USA
| | - Kenneth H. Buetow
- School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ 85281, USA
| | - Melissa A. Wilson
- School of Life Sciences, Arizona State University, Tempe, AZ 85281, USA
- Center for Evolution and Medicine, Arizona State University, Tempe, AZ 85281, USA
| | - Julie E. Bauman
- Department of Medicine, University of Arizona College of Medicine Tucson, Tucson, AZ 85724, USA
- University of Arizona Cancer Center, University of Arizona, Tucson, AZ 85724, USA
| | - Clara Curiel-Lewandrowski
- Department of Medicine, University of Arizona College of Medicine Tucson, Tucson, AZ 85724, USA
- University of Arizona Cancer Center, University of Arizona, Tucson, AZ 85724, USA
| | - H.-H. Sherry Chow
- Department of Medicine, University of Arizona College of Medicine Tucson, Tucson, AZ 85724, USA
- University of Arizona Cancer Center, University of Arizona, Tucson, AZ 85724, USA
| | | | - Karen Taraszka Hastings
- Department of Basic Medical Sciences, University of Arizona College of Medicine Phoenix, Phoenix, AZ 85004, USA
- Phoenix Veterans Affairs Health Care System, Phoenix, AZ 85012, USA
- University of Arizona Cancer Center, University of Arizona, Tucson, AZ 85724, USA
- Correspondence: Tel.: +1-602-827-2106
| |
|
34
|
Wilson MA. The Y chromosome and its impact on health and disease. Hum Mol Genet 2021; 30:R296-R300. [PMID: 34328177 PMCID: PMC8490013 DOI: 10.1093/hmg/ddab215] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/10/2021] [Revised: 07/19/2021] [Accepted: 07/20/2021] [Indexed: 11/14/2022] Open
Abstract
The Y chromosome is the most gene-deficient chromosome in the human genome (though not the smallest chromosome) and has largely been sequestered away from large-scale studies of the effects of genetics on human health. Here I review the literature, focusing on the last 2 years, for recent evidence of the role of the Y chromosome in protecting from or contributing to disease. Although many studies have focused on Y chromosome gene copy number and variants in fertility, the role of the Y chromosome in human health is now known to extend to many other conditions, including the development of multiple cancers and Alzheimer's disease. I further include a discussion of current technology and methods for analyzing Y chromosome variation. The true role of the Y chromosome and associated genetic variants in human disease will only become clear when the Y chromosome is integrated into larger studies of human genetic variation, rather than being analyzed in isolation.
Affiliation(s)
- Melissa A Wilson
- School of Life Sciences, Center for Evolution and Medicine, Biodesign Center for Mechanisms of Evolution, Arizona State University, Tempe, AZ 85282 USA
| |
|
35
|
Unique evolutionary trajectories of breast cancers with distinct genomic and spatial heterogeneity. Sci Rep 2021; 11:10571. [PMID: 34011996 PMCID: PMC8134446 DOI: 10.1038/s41598-021-90170-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2021] [Accepted: 04/30/2021] [Indexed: 11/08/2022] Open
Abstract
Breast cancers exhibit intratumoral heterogeneity associated with disease progression and therapeutic resistance. To define the sources and the extent of heterogeneity, we performed an in-depth analysis of the genomic architecture of three chemoradiation-naïve breast cancers with well-defined clinical features including variable ER, PR, ERBB2 receptor expression and two distinct pathogenic BRCA2mut genotypes. The latter included a germ line carrier and a patient with a somatic variant. In each case we combined DNA content-based flow cytometry with whole exome sequencing and genome wide copy number variant (CNV) analysis of distinct populations sorted from multiple (4–18) mapped biopsies within the tumors and involved lymph nodes. Interrogating flow-sorted tumor populations from each biopsy provided an objective method to distinguish fixed and variable genomic lesions in each tumor. Notably, we show that tumors exploit CNVs to fix mutations and deletions in distinct populations throughout each tumor. The identification of fixed genomic lesions that are shared or unique within each tumor has broad implications for the study of tumor heterogeneity, including the presence of tumor markers and therapeutic targets, and of candidate neoepitopes in breast and other solid tumors that can advance more effective treatment and clinical management of patients with disease.
|
36
|
Flynn E, Chang A, Altman RB. Large-scale labeling and assessment of sex bias in publicly available expression data. BMC Bioinformatics 2021; 22:168. [PMID: 33784977 PMCID: PMC8011224 DOI: 10.1186/s12859-021-04070-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/01/2020] [Accepted: 03/08/2021] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Women are at more than 1.5-fold higher risk for clinically relevant adverse drug events. While this higher prevalence is partially due to gender-related effects, biological sex differences likely also impact drug response. Publicly available gene expression databases provide a unique opportunity for examining drug response at a cellular level. However, missingness and heterogeneity of metadata prevent large-scale identification of drug exposure studies and limit assessments of sex bias. To address this, we trained organism-specific models to infer sample sex from gene expression data, and used entity normalization to map metadata cell line and drug mentions to existing ontologies. Using this method, we inferred sex labels for 450,371 human and 245,107 mouse microarray and RNA-seq samples from refine.bio. RESULTS Overall, we find a slight female bias (52.1%) in human samples and a male bias (62.5%) in mouse samples; this corresponds to a majority of mixed sex studies in humans and single sex studies in mice, split between female-only and male-only (25.8% vs. 18.9% in human and 21.6% vs. 31.1% in mouse, respectively). In drug studies, we find limited evidence for sex-sampling bias overall; however, specific categories of drugs, including human cancer and mouse nervous system drugs, are enriched in female-only and male-only studies, respectively. We leverage our expression-based sex labels to further examine the complexity of cell line sex and assess the frequency of metadata sex label misannotations (2-5%). CONCLUSIONS Our results demonstrate limited overall sex bias, while highlighting high bias in specific subfields and underscoring the importance of including sex labels to better understand the underlying biology. We make our inferred and normalized labels, along with flags for misannotated samples, publicly available to catalyze the routine use of sex as a study variable in future analyses.
Affiliation(s)
- Emily Flynn
- Biomedical Informatics Training Program, Stanford University, Stanford, CA, USA
| | - Annie Chang
- Program in Human Biology, Stanford University, Stanford, CA, USA
| | - Russ B Altman
- Department of Bioengineering, Stanford University, Stanford, CA, USA.
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Department of Medicine, Stanford University, Stanford, CA, USA.
| |
|
37
|
Richmond PA, Kaye AM, Kounkou GJ, Av-Shalom TV, Wasserman WW. Demonstrating the utility of flexible sequence queries against indexed short reads with FlexTyper. PLoS Comput Biol 2021; 17:e1008815. [PMID: 33750951 PMCID: PMC8016220 DOI: 10.1371/journal.pcbi.1008815] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2020] [Revised: 04/01/2021] [Accepted: 02/17/2021] [Indexed: 11/26/2022] Open
Abstract
Across the life sciences, processing next generation sequencing data commonly relies upon a computationally expensive process where reads are mapped onto a reference sequence. Prior to such processing, however, there is a vast amount of information that can be ascertained from the reads, potentially obviating the need for processing, or allowing optimized mapping approaches to be deployed. Here, we present a method termed FlexTyper which facilitates a “reverse mapping” approach in which high throughput sequence queries, in the form of k-mer searches, are run against indexed short-read datasets in order to extract useful information. This reverse mapping approach enables the rapid counting of target sequences of interest. We demonstrate FlexTyper’s utility for recovering depth of coverage, and accurate genotyping of SNP sites across the human genome. We show that genotyping unmapped reads can correctly inform a sample’s population, sex, and relatedness in a family setting. Detection of pathogen sequences within RNA-seq data was sensitive and accurate, performing comparably to existing methods, but with increased flexibility. We present two examples of ways in which this flexibility allows the analysis of genome features not well-represented in a linear reference. First, we analyze contigs from African genome sequencing studies, showing how they distribute across families from three distinct populations. Second, we show how gene-marking k-mers for the killer immune receptor locus allow allele detection in a region that is challenging for standard read mapping pipelines. The future adoption of the reverse mapping approach represented by FlexTyper will be enabled by more efficient methods for FM-index generation and biology-informed collections of reference queries. In the long-term, selection of population-specific references or weighting of edges in pan-population reference genome graphs will be possible using the FlexTyper approach. 
FlexTyper is available at https://github.com/wassermanlab/OpenFlexTyper.
In the past 15 years, next generation sequencing technology has revolutionized our capacity to process and analyze DNA sequencing data. From agriculture to medicine, this technology is enabling a deeper understanding of the blueprint of life. Next generation sequencing data is composed of short sequences of DNA, referred to as “reads”, which are often shorter than 200 base pairs making them many orders of magnitude smaller than the entirety of a human genome. Gaining insights from this data has typically leveraged a reference-guided mapping approach, where the reads are aligned to a reference genome and then post-processed to gain actionable information such as presence or absence of genomic sequence, or variation between the reference genome and the sequenced sample. Many experts in the field of genomics have concluded that selecting a single, linear reference genome for mapping reads against is limiting, and several current research endeavors are focused on exploring options for improved analysis methods to unlock the full utility of sequencing data. Among these improvements are the usage of sex-matched genomes, population-specific reference genomes, and emergent graph-based reference pan-genomes. However, advanced methods that use raw DNA sequencing data to inform the choice of reference genome and guide the alignment of reads to enriched reference genomes are needed. Here we develop a method termed FlexTyper, which creates a searchable index of the short read data and enables flexible, user-guided queries to provide valuable insights without the need for reference-guided mapping. 
We demonstrate the utility of our method by identifying sample ancestry and sex in human whole genome sequencing data, detecting viral pathogen reads in RNA-seq data, finding African-enriched genome regions absent from the global reference, and resolving killer-cell immune receptor alleles that are complex to discern using standard read mapping. We anticipate early adoption of FlexTyper within analysis pipelines as a pre-mapping component, and further envision that the bioinformatics and genomics community will leverage the tool for creative uses of sequence queries from unmapped data.
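The reverse-mapping idea in this abstract (query k-mers counted against an index of the reads, then used to genotype a SNP) can be illustrated with a minimal Python sketch. It is not FlexTyper's implementation: a plain in-memory k-mer table stands in for the FM-index, and the function names `build_kmer_index` and `genotype_snp`, the toy reads, and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_kmer_index(reads, k):
    """Count every k-mer on both strands of the reads.

    A Counter stands in for the FM-index here (an assumption for brevity).
    """
    index = Counter()
    for read in reads:
        for strand in (read, reverse_complement(read)):
            for i in range(len(strand) - k + 1):
                index[strand[i:i + k]] += 1
    return index


def genotype_snp(index, ref_kmer, alt_kmer, min_count=1):
    """Call a genotype from counts of k-mers spanning the ref and alt alleles."""
    has_ref = index[ref_kmer] >= min_count
    has_alt = index[alt_kmer] >= min_count
    if has_ref and has_alt:
        return "0/1"
    if has_ref:
        return "0/0"
    if has_alt:
        return "1/1"
    return "./."  # neither allele observed


# Toy reads (hypothetical): one carries the reference allele, one the alternate.
reads = ["ACGTACGTTA", "ACGTACCTTA"]
index = build_kmer_index(reads, k=5)
call = genotype_snp(index, ref_kmer="ACGTT", alt_kmer="ACCTT")
```

The point of the sketch is the direction of the lookup: the reads are indexed once, and any number of short queries can then be counted without aligning the reads to a reference.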
Affiliation(s)
- Phillip Andrew Richmond
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, Canada
| | - Alice Mary Kaye
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, Canada
| | - Godfrain Jacques Kounkou
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, Canada
| | - Tamar Vered Av-Shalom
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, Canada
| | - Wyeth W. Wasserman
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, Canada
|
38
|
Martignano F, Munagala U, Crucitta S, Mingrino A, Semeraro R, Del Re M, Petrini I, Magi A, Conticello SG. Nanopore sequencing from liquid biopsy: analysis of copy number variations from cell-free DNA of lung cancer patients. Mol Cancer 2021; 20:32. [PMID: 33579306 PMCID: PMC7881593 DOI: 10.1186/s12943-021-01327-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 09/03/2020] [Accepted: 01/27/2021] [Indexed: 12/11/2022] Open
Abstract
In the "precision oncology" era the characterization of tumor genetic features is a pivotal step in cancer patients' management. Liquid biopsy approaches, such as analysis of cell-free DNA from plasma, represent a powerful and noninvasive strategy to obtain information about the genomic status of the tumor. Sequencing-based analyses of cell-free DNA, currently performed with second generation sequencers, are extremely powerful but poorly scalable and not always accessible also due to instrumentation costs. Third generation sequencing platforms, such as Nanopore sequencers, aim at overcoming these obstacles but, unfortunately, are not designed for cell-free DNA analysis. Here we present a customized workflow to exploit low-coverage Nanopore sequencing for the detection of copy number variations from plasma of cancer patients. Whole genome molecular karyotypes of 6 lung cancer patients and 4 healthy subjects were successfully produced with as few as 2 million reads, and common lung-related copy number alterations were readily detected. This is the first successful use of Nanopore sequencing for copy number profiling from plasma DNA. In this context, Nanopore represents a reliable alternative to Illumina sequencing, with the advantages of minute instrumentation costs and extremely short analysis time. The availability of protocols for Nanopore-based cell-free DNA analysis will make this analysis finally accessible, exploiting the full potential of liquid biopsy both for research and clinical purposes.
Affiliation(s)
- Filippo Martignano
- Core Research Laboratory, ISPRO, Florence, Italy; Department of Medical Biotechnologies, University of Siena, Siena, Italy
| | - Uday Munagala
- Core Research Laboratory, ISPRO, Florence, Italy; Department of Neuroscience, Psychology, Pharmacology and Child Health (NEUROFARBA), University of Florence, Largo Brambilla 3, 50134, Florence, Italy
| | - Stefania Crucitta
- Unit of Clinical Pharmacology and Pharmacogenetics, Department of Clinical and Experimental Medicine, University of Pisa, Pisa, Italy
| | - Alessandra Mingrino
- Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
| | - Roberto Semeraro
- Unit of Clinical Pharmacology and Pharmacogenetics, Department of Clinical and Experimental Medicine, University of Pisa, Pisa, Italy
| | - Marzia Del Re
- Unit of Clinical Pharmacology and Pharmacogenetics, Department of Clinical and Experimental Medicine, University of Pisa, Pisa, Italy
| | - Iacopo Petrini
- Unit of Respiratory Medicine, Department of Critical Area and Surgical, Medical and Molecular Pathology, University Hospital of Pisa, Pisa, Italy
| | - Alberto Magi
- Department of Information Engineering, University of Florence, Florence, Italy
| | - Silvestro G Conticello
- Core Research Laboratory, ISPRO, Florence, Italy; Institute of Clinical Physiology, National Research Council, Pisa, Italy.
|
39
|
Cechova M. Probably Correct: Rescuing Repeats with Short and Long Reads. Genes (Basel) 2020; 12:48. [PMID: 33396198 PMCID: PMC7823596 DOI: 10.3390/genes12010048] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/09/2020] [Revised: 12/23/2020] [Accepted: 12/24/2020] [Indexed: 02/07/2023] Open
Abstract
Ever since the introduction of high-throughput sequencing following the human genome project, assembling short reads into a reference of sufficient quality posed a significant problem as a large portion of the human genome, an estimated 50-69%, is repetitive. As a result, a sizable proportion of sequencing reads is multi-mapping, i.e., without a unique placement in the genome. The two key parameters for whether or not a read is multi-mapping are the read length and genome complexity. Long reads are now able to span difficult, heterochromatic regions, including full centromeres, and characterize chromosomes from "telomere to telomere". Moreover, identical reads or repeat arrays can be differentiated based on their epigenetic marks, such as methylation patterns, aiding in the assembly process. This is despite the fact that long reads still contain a modest percentage of sequencing errors, disorienting the aligners and assemblers both in accuracy and speed. Here, I review the proposed and implemented solutions to the repeat resolution and the multi-mapping read problem, as well as the downstream consequences of reference choice, repeat masking, and proper representation of sex chromosomes. I also consider the forthcoming challenges and solutions with regard to long reads, where we expect the shift from the problem of repeat localization within a single individual to the problem of repeat positioning within pangenomes.
Affiliation(s)
- Monika Cechova
- Genetics and Reproductive Biotechnologies, Veterinary Research Institute, Central European Institute of Technology (CEITEC), 621 00 Brno, Czech Republic
|
40
|
Lopes-Ramos CM, Quackenbush J, DeMeo DL. Genome-Wide Sex and Gender Differences in Cancer. Front Oncol 2020; 10:597788. [PMID: 33330090 PMCID: PMC7719817 DOI: 10.3389/fonc.2020.597788] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 08/22/2020] [Accepted: 10/19/2020] [Indexed: 12/12/2022] Open
Abstract
Despite their known importance in clinical medicine, differences based on sex and gender are among the least studied factors affecting cancer susceptibility, progression, survival, and therapeutic response. In particular, the molecular mechanisms driving sex differences are poorly understood, and so most approaches to precision medicine use mutational or other genomic data to assign therapy without considering how the sex of the individual might influence therapeutic efficacy. The mandate by the National Institutes of Health that research studies include sex as a biological variable has begun to expand our understanding of its importance. Sex differences in cancer may arise due to a combination of environmental, genetic, and epigenetic factors, as well as differences in gene regulation and expression. Extensive sex differences occur genome-wide, and ultimately influence cancer biology and outcomes. In this review, we summarize the current state of knowledge about sex-specific genetic and genome-wide influences in cancer, describe how differences in response to environmental exposures and genetic and epigenetic alterations alter the trajectory of the disease, and provide insights into the importance of integrative analyses in understanding the interplay of sex and genomics in cancer. In particular, we will explore some of the emerging analytical approaches, such as the use of network methods, that are providing a deeper understanding of the drivers of differences based on sex and gender. Better understanding these complex factors and their interactions will improve cancer prevention, treatment, and outcomes for all individuals.
Affiliation(s)
- Camila M Lopes-Ramos
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States
| | - John Quackenbush
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States; Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, United States; Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States
| | - Dawn L DeMeo
- Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, United States; Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, United States
|
41
|
Turkmen AS, Lin S. Detecting X-linked common and rare variant effects in family-based sequencing studies. Genet Epidemiol 2020; 45:36-45. [PMID: 32864779 DOI: 10.1002/gepi.22352] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/23/2019] [Revised: 06/26/2020] [Accepted: 08/03/2020] [Indexed: 11/08/2022]
Abstract
The breakthroughs in next generation sequencing have allowed us to access data consisting of both common and rare variants, and in particular to investigate the impact of rare genetic variation on complex diseases. Although rare genetic variants are thought to be important components in explaining genetic mechanisms of many diseases, discovering these variants remains challenging, and most studies are restricted to population-based designs. Further, despite the shift in the field of genome-wide association studies (GWAS) towards studying rare variants due to the "missing heritability" phenomenon, little is known about rare X-linked variants associated with complex diseases. For instance, there is evidence that X-linked genes are highly involved in brain development and cognition when compared with autosomal genes; however, like most GWAS for other complex traits, previous GWAS for mental diseases have provided poor resources for identifying rare variant associations on the X chromosome. In this paper, we address the two issues described above by proposing a method that can be used to test X-linked variants using sequencing data on families. Our method is much more general than existing methods, as it can be applied to detect both common and rare variants, and is applicable to autosomes as well. Our simulation study shows that the method is efficient, and exhibits good operational characteristics. An application to the University of Miami Study on Genetics of Autism and Related Disorders also yielded encouraging results.
Affiliation(s)
- Asuman S Turkmen
- Statistics Department, The Ohio State University, Columbus, Ohio; Statistics Department, The Ohio State University, Newark, Ohio
| | - Shili Lin
- Statistics Department, The Ohio State University, Columbus, Ohio
|
42
|
Phung TN, Lenkiewicz E, Malasi S, Sharma A, Anderson KS, Wilson MA, Pockaj BA, Barrett MT. Unique genomic and neoepitope landscapes across tumors: a study across time, tissues, and space within a single lynch syndrome patient. Sci Rep 2020; 10:12190. [PMID: 32699259 PMCID: PMC7376229 DOI: 10.1038/s41598-020-68939-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/05/2020] [Accepted: 06/29/2020] [Indexed: 12/12/2022] Open
Abstract
Lynch syndrome (LS) arises in patients with pathogenic germline variants in DNA mismatch repair genes. LS is the most common inherited cancer predisposition condition and confers an elevated lifetime risk of multiple cancers, notably colorectal and endometrial carcinomas. A distinguishing feature of LS-associated tumors is accumulation of variants targeting microsatellite repeats and the potential for high tumor-specific neoepitope levels. Recurrent somatic variants targeting a small subset of genes have been identified in tumors with microsatellite instability. Notably, these include frameshifts that can activate immune responses and provide vaccine targets to affect the lifetime cancer risk associated with LS. However, the presence and persistence of targeted neoepitopes across multiple tumors in single LS patients have not been rigorously studied. Here we profiled the genomic landscapes of five distinct treatment-naïve tumors, a papillary transitional cell renal cell carcinoma, a duodenal carcinoma, two metachronous colorectal carcinomas, and multi-regional sampling in a triple-negative breast tumor, arising in a LS patient over 10 years. Our analyses suggest each tumor evolves a unique complement of variants and that vaccines based on potential neoepitopes from one tissue may not be effective across all tumors that can arise during the lifetime of LS patients.
Affiliation(s)
- Tanya N Phung
- School of Life Sciences, Arizona State University, Tempe, AZ, 85282, USA; Center for Evolution and Medicine, Arizona State University, Tempe, AZ, 85282, USA
| | - Elizabeth Lenkiewicz
- Division of Hematology-Oncology, Mayo Clinic in Arizona, Scottsdale, AZ, 85259, USA
| | - Smriti Malasi
- Division of Hematology-Oncology, Mayo Clinic in Arizona, Scottsdale, AZ, 85259, USA
| | - Amit Sharma
- The Biodesign Institute, Arizona State University, Tempe, AZ, 85282, USA
| | - Karen S Anderson
- School of Life Sciences, Arizona State University, Tempe, AZ, 85282, USA; Division of Hematology-Oncology, Mayo Clinic in Arizona, Scottsdale, AZ, 85259, USA; The Biodesign Institute, Arizona State University, Tempe, AZ, 85282, USA
| | - Melissa A Wilson
- School of Life Sciences, Arizona State University, Tempe, AZ, 85282, USA; Center for Evolution and Medicine, Arizona State University, Tempe, AZ, 85282, USA; The Biodesign Institute, Arizona State University, Tempe, AZ, 85282, USA
| | - Barbara A Pockaj
- Division of General Surgery, Section of Surgical Oncology, Mayo Clinic in Arizona, Phoenix, AZ, 85054, USA
| | - Michael T Barrett
- Division of Hematology-Oncology, Mayo Clinic in Arizona, Scottsdale, AZ, 85259, USA.
|
43
|
Olney KC, Brotman SM, Andrews JP, Valverde-Vesling VA, Wilson MA. Reference genome and transcriptome informed by the sex chromosome complement of the sample increase ability to detect sex differences in gene expression from RNA-Seq data. Biol Sex Differ 2020; 11:42. [PMID: 32693839 PMCID: PMC7374973 DOI: 10.1186/s13293-020-00312-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 02/05/2020] [Accepted: 06/17/2020] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Human X and Y chromosomes share an evolutionary origin and, as a consequence, sequence similarity. We investigated whether the sequence homology between the X and Y chromosomes affects the alignment of RNA-Seq reads and estimates of differential expression. We tested the effects of using reference genomes and reference transcriptomes informed by the sex chromosome complement of the sample's genome on the measurements of RNA-Seq abundance and sex differences in expression. RESULTS The default genome includes the entire human reference genome (GRCh38), including the entire sequence of the X and Y chromosomes. We created two sex chromosome complement informed reference genomes. One sex chromosome complement informed reference genome was used for samples that lacked a Y chromosome; for this reference genome version, we hard-masked the entire Y chromosome. For the other sex chromosome complement informed reference genome, to be used for samples with a Y chromosome, we hard-masked only the pseudoautosomal regions of the Y chromosome, because these regions are duplicated identically in the reference genome on the X chromosome. We analyzed the transcript abundance in the whole blood, brain cortex, breast, liver, and thyroid tissues from 20 genetic female (46, XX) and 20 genetic male (46, XY) samples. Each sample was aligned twice: once to the default reference genome and then independently aligned to a reference genome informed by the sex chromosome complement of the sample, repeated using two different read aligners, HISAT and STAR. We then quantified sex differences in gene expression using featureCounts to get the raw count estimates followed by Limma/Voom for normalization and differential expression. We additionally created sex chromosome complement informed transcriptome references for use in pseudo-alignment using Salmon. 
Transcript abundance was quantified twice for each sample: once to the default target transcripts and then independently to target transcripts informed by the sex chromosome complement of the sample. CONCLUSIONS We show that regardless of the choice of the read aligner, using an alignment protocol informed by the sex chromosome complement of the sample results in higher expression estimates on the pseudoautosomal regions of the X chromosome in both genetic male and genetic female samples, as well as an increased number of unique genes being called as differentially expressed between the sexes. We additionally show that using a pseudo-alignment approach informed on the sex chromosome complement of the sample eliminates Y-linked expression in female XX samples.
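The two sex chromosome complement informed references described in this abstract can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' pipeline: the reference is held as an in-memory dictionary, the helper names `hard_mask` and `sex_informed_reference` are hypothetical, regions are 0-based half-open intervals, and the pseudoautosomal coordinates in the example are made up (real GRCh38 coordinates would have to be supplied by the user).

```python
def hard_mask(seq, regions):
    """Replace each 0-based half-open [start, end) region of seq with N."""
    chars = list(seq)
    for start, end in regions:
        end = min(end, len(chars))  # never mask past the end of the sequence
        for i in range(start, end):
            chars[i] = "N"
    return "".join(chars)


def sex_informed_reference(reference, has_y, y_par_regions, y_name="chrY"):
    """Return a copy of the reference masked for the sample's sex chromosomes.

    has_y=False: the entire Y chromosome is hard-masked.
    has_y=True:  only the Y pseudoautosomal regions are hard-masked, since
                 the same sequence is already present on the X chromosome.
    """
    masked = dict(reference)
    y_seq = reference[y_name]
    if has_y:
        masked[y_name] = hard_mask(y_seq, y_par_regions)
    else:
        masked[y_name] = "N" * len(y_seq)
    return masked


# Toy sequences and toy PAR coordinates (hypothetical, for illustration only).
reference = {"chrX": "ACGTACGTAC", "chrY": "ACGTTTGGCA"}
toy_par = [(0, 2), (8, 10)]
xx_reference = sex_informed_reference(reference, has_y=False, y_par_regions=toy_par)
xy_reference = sex_informed_reference(reference, has_y=True, y_par_regions=toy_par)
```

Each sample would then be aligned to whichever masked reference matches its sex chromosome complement, so that reads can no longer be split between identical X and Y sequence.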
Affiliation(s)
- Kimberly C Olney
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85287-4501, USA; Center for Evolution and Medicine, Arizona State University, Tempe, AZ, 85282, USA
| | - Sarah M Brotman
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85287-4501, USA; Department of Genetics, University of North Carolina, Chapel Hill, NC, 27599, USA
| | - Jocelyn P Andrews
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85287-4501, USA; College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, CA, 91766, USA
| | | | - Melissa A Wilson
- School of Life Sciences, Arizona State University, PO Box 874501, Tempe, AZ, 85287-4501, USA; Center for Evolution and Medicine, Arizona State University, Tempe, AZ, 85282, USA; Center for Mechanisms of Evolution, The Biodesign Institute, Arizona State University, Tempe, AZ, 85282, USA.
|
44
|
Prokopenko D, Hecker J, Kirchner R, Chapman BA, Hoffman O, Mullin K, Hide W, Bertram L, Laird N, DeMeo DL, Lange C, Tanzi RE. Identification of Novel Alzheimer's Disease Loci Using Sex-Specific Family-Based Association Analysis of Whole-Genome Sequence Data. Sci Rep 2020; 10:5029. [PMID: 32193444 PMCID: PMC7081222 DOI: 10.1038/s41598-020-61883-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 08/12/2019] [Accepted: 02/17/2020] [Indexed: 11/21/2022] Open
Abstract
With the advent of whole-genome sequencing (WGS) studies, family-based designs enable sex-specific analysis approaches that can be applied to only affected individuals; tests using family-based designs are attractive because they are completely robust against the effects of population substructure. These advantages make family-based association tests (FBATs) that use siblings as well as parents especially suited for the analysis of late-onset diseases such as Alzheimer's Disease (AD). However, the application of FBATs to assess sex-specific effects can require additional filtering steps, as sensitivity to sequencing errors is amplified in this type of analysis. Here, we illustrate the implementation of robust analysis approaches and additional filtering steps that can minimize the chances of false-positive findings due to sex-specific sequencing errors. We apply this approach to two family-based AD datasets and identify four novel loci (GRID1, RIOK3, MCPH1, ZBTB7C) showing sex-specific association with AD risk. Following stringent quality control filtering, the strongest candidate is ZBTB7C (P_inter = 1.83 × 10^-7), in which the minor allele of rs1944572 confers increased risk for AD in females and protection in males. ZBTB7C encodes the Zinc Finger and BTB Domain Containing 7C, a transcriptional repressor of membrane metalloproteases (MMP). Members of this MMP family were implicated in AD neuropathology.
Affiliation(s)
- Dmitry Prokopenko
- Genetics and Aging Unit and McCance Center for Brain Health, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Harvard Medical School, Boston, MA, USA
| | - Julian Hecker
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Rory Kirchner
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Brad A Chapman
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Oliver Hoffman
- Department of Clinical Pathology, University of Melbourne, Victoria, 3000, Melbourne, Australia
| | - Kristina Mullin
- Genetics and Aging Unit and McCance Center for Brain Health, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
| | - Winston Hide
- Harvard Medical School, Boston, MA, USA
- Department of Neuroscience, Sheffield Institute for Translational Neurosciences, University of Sheffield, Sheffield, UK
- Department of Pathology, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, MA, US
| | - Lars Bertram
- Lübeck Interdisciplinary Platform for Genome Analytics, Institutes of Neurogenetics and Cardiogenetics, University of Lübeck, Lübeck, Germany
- Department of Psychology, University of Oslo, Oslo, Norway
| | - Nan Laird
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
| | - Dawn L DeMeo
- Harvard Medical School, Boston, MA, USA
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
- Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
| | - Christoph Lange
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
- Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA.
| | - Rudolph E Tanzi
- Genetics and Aging Unit and McCance Center for Brain Health, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA.
- Harvard Medical School, Boston, MA, USA.
|
45
|
Dash HR, Rawat N, Das S. Alternatives to amelogenin markers for sex determination in humans and their forensic relevance. Mol Biol Rep 2020; 47:2347-2360. [DOI: 10.1007/s11033-020-05268-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/20/2019] [Accepted: 01/20/2020] [Indexed: 12/15/2022]
|
46
|
Sex Determination Using RNA-Sequencing Analyses in Early Prenatal Pig Development. Genes (Basel) 2019; 10:genes10121010. [PMID: 31817322 PMCID: PMC6947224 DOI: 10.3390/genes10121010] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/30/2019] [Revised: 11/20/2019] [Accepted: 11/27/2019] [Indexed: 12/18/2022] Open
Abstract
Sexual dimorphism is a relevant factor in animal science, since it can affect the gene expression of economically important traits. However, a transcriptome study of the prenatal phase may target a period of development in which male and female conceptuses are not yet phenotypically divergent. Therefore, it would be useful if sex identification could be performed using transcriptome data, with no need for extra techniques. In this study, the sex of pig conceptuses (embryos at 25 days old and fetuses at 35 days old) was determined from read counts per million (CPM) of Y chromosome-linked genes that were discrepant among samples. Ten genes were used: DDX3Y, KDM5D, ZFY, EIF2S3Y, EIF1AY, LOC110255320, LOC110257894, LOC396706, LOC100625207, and LOC110255257. Conceptuses whose summed CPM for these genes (ΣCPMchrY) was greater than 400 were classified as males, and those with ΣCPMchrY below 2 were classified as females. It was demonstrated that sex identification can be performed at early stages of pig development from RNA-sequencing analysis of genes mapped on the Y chromosome. Additionally, these results reinforce that sex determination is a mechanism conserved across mammals, highlighting the importance of using pigs as an animal model to study sex determination during human prenatal development.
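The classification rule in this abstract (sum the CPM of ten Y-linked genes, call male above 400 and female below 2) is simple enough to sketch in Python. The gene list and the two thresholds come from the abstract; the function names `counts_per_million` and `classify_sex`, the "undetermined" label for intermediate values, and the toy count tables are assumptions added for illustration.

```python
# Y-linked genes and thresholds as listed in the abstract.
Y_GENES = [
    "DDX3Y", "KDM5D", "ZFY", "EIF2S3Y", "EIF1AY",
    "LOC110255320", "LOC110257894", "LOC396706",
    "LOC100625207", "LOC110255257",
]


def counts_per_million(counts):
    """Convert raw read counts per gene to counts per million (CPM)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def classify_sex(counts, y_genes=Y_GENES, male_min=400.0, female_max=2.0):
    """Classify a sample from the summed CPM of Y-linked genes."""
    cpm = counts_per_million(counts)
    y_sum = sum(cpm.get(gene, 0.0) for gene in y_genes)
    if y_sum > male_min:
        return "male"
    if y_sum < female_max:
        return "female"
    # The abstract gives no rule for values in between (assumed label).
    return "undetermined"


# Toy count tables (hypothetical); ACTB stands in for all other genes.
male_like = {"DDX3Y": 300, "KDM5D": 200, "ACTB": 999500}
female_like = {"DDX3Y": 1, "ACTB": 999999}
ambiguous = {"DDX3Y": 50, "ACTB": 999950}
calls = [classify_sex(s) for s in (male_like, female_like, ambiguous)]
```

Leaving a gap between the two thresholds means a sample with intermediate Y-linked expression is flagged rather than forced into either class.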
|
47
|
Wilson MA, Buetow KH. Novel Mechanisms of Cancer Emerge When Accounting for Sex as a Biological Variable. Cancer Res 2019; 80:27-29. [PMID: 31722998 DOI: 10.1158/0008-5472.can-19-2634] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 08/27/2019] [Revised: 10/23/2019] [Accepted: 11/05/2019] [Indexed: 11/16/2022]
Abstract
There is a large gap between the aspiration of considering sex as a biological variable and the execution of such studies, particularly in genomic studies of human cancer. This represents a lost opportunity to identify sex-specific molecular etiologies that may underpin the dramatic sex differences in cancer incidence and outcome. There are conceptual and practical challenges associated with considering sex as a biological variable, including the definition of sex itself and the need for novel study designs. A better understanding of cancer mechanisms, resulting in improved outcomes, will reward the effort invested in incorporating sex as a biological variable.
Affiliation(s)
- Melissa A Wilson
- School of Life Sciences, Arizona State University, Tempe, Arizona; Center for Evolution and Medicine, Arizona State University, Tempe, Arizona
| | - Kenneth H Buetow
- School of Life Sciences, Arizona State University, Tempe, Arizona; Center for Evolution and Medicine, Arizona State University, Tempe, Arizona
|
48
|
Natri HM, Wilson MA, Buetow KH. Distinct molecular etiologies of male and female hepatocellular carcinoma. BMC Cancer 2019; 19:951. [PMID: 31615477 PMCID: PMC6794913 DOI: 10.1186/s12885-019-6167-2] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 07/02/2019] [Accepted: 09/16/2019] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND Sex differences in cancer occurrence and mortality are evident across tumor types; men exhibit higher rates of incidence and often poorer responses to treatment. Targeted approaches to the treatment of tumors that account for these sex differences require the characterization and understanding of the fundamental biological mechanisms that differentiate them. Hepatocellular carcinoma (HCC) is the second leading cause of cancer death worldwide, with the incidence rapidly rising. HCC exhibits a male bias in occurrence and mortality, but previous studies have failed to explore the sex-specific dysregulation of gene expression in HCC.
METHODS Here, we characterize the sex-shared and sex-specific regulatory changes in HCC tumors in the TCGA LIHC cohort using combined and sex-stratified differential expression and eQTL analyses.
RESULTS By using a sex-specific differential expression analysis of tumor and tumor-adjacent samples, we uncovered etiologically relevant genes and pathways differentiating male and female HCC. While both sexes exhibited activation of pathways related to apoptosis and cell cycle, males and females differed in the activation of several signaling pathways, with females showing PPAR pathway enrichment while males showed PI3K, PI3K/AKT, FGFR, EGFR, NGF, GF1R, Rap1, DAP12, and IL-2 signaling pathway enrichment. Using eQTL analyses, we discovered germline variants with differential effects on tumor gene expression between the sexes. Of the discovered eQTLs, 24.3% exhibit differential effects between the sexes, illustrating the substantial role of sex in modifying the effects of eQTLs in HCC. The genes that showed sex-specific dysregulation in tumors and those that harbored a sex-specific eQTL converge in clinically relevant pathways, suggesting that the molecular etiologies of male and female HCC are partially driven by differential genetic effects on gene expression.
CONCLUSIONS Sex-stratified analyses detect sex-specific molecular etiologies of HCC. Overall, our results provide new insight into the role of inherited genetic regulation of transcription in modulating sex differences in HCC etiology and provide a framework for future studies on sex-biased cancers.
Affiliation(s)
- Heini M Natri
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA.
- Melissa A Wilson
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA.
- Kenneth H Buetow
- Center for Evolution and Medicine, School of Life Sciences, Arizona State University, Tempe, AZ, USA.
|
49
|
Anderson K, Cañadas-Garre M, Chambers R, Maxwell AP, McKnight AJ. The Challenges of Chromosome Y Analysis and the Implications for Chronic Kidney Disease. Front Genet 2019; 10:781. [PMID: 31552093 PMCID: PMC6737325 DOI: 10.3389/fgene.2019.00781] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/12/2019] [Accepted: 07/24/2019] [Indexed: 12/17/2022]
Abstract
The role of chromosome Y in chronic kidney disease (CKD) remains unknown, as chromosome Y is typically excluded from genetic analysis in CKD. The complex, sex-specific presentation of CKD could be influenced by chromosome Y genetic variation, but there is limited published research available to confirm or reject this hypothesis. Although chromosome Y has traditionally been thought to be associated with male-specific disease, evidence linking chromosome Y genetic variation to common complex disorders highlights a potential gap in CKD research. Chromosome Y variation has been associated with cardiovascular disease, a condition closely linked to CKD and one with a very similar sexual dimorphism. Relatively few sources of genetic variation in chromosome Y have been examined in CKD. The association between chromosome Y aneuploidy and CKD has never been explored comprehensively, while analyses of microdeletions, copy number variation, and single-nucleotide polymorphisms in CKD have been largely limited to the autosomes or chromosome X. In many studies, it is unclear whether the analyses excluded chromosome Y or simply did not report negative results. Lack of imputation, poor cross-study comparability, and the requirement for separate or additional analyses compared with the autosomes mean that chromosome Y is under-investigated in the context of CKD. Limitations in genotyping arrays could be overcome through whole-chromosome sequencing of chromosome Y, which may allow analysis of many different types of genetic variation across the chromosome to determine whether chromosome Y genetic variation is associated with CKD.
Affiliation(s)
- Kerry Anderson
- Epidemiology and Public Health Research Group, Centre for Public Health, Queen's University of Belfast, c/o Regional Genetics Centre, Belfast City Hospital, Belfast, United Kingdom.
- Marisa Cañadas-Garre
- Epidemiology and Public Health Research Group, Centre for Public Health, Queen's University of Belfast, c/o Regional Genetics Centre, Belfast City Hospital, Belfast, United Kingdom.
- Robyn Chambers
- Epidemiology and Public Health Research Group, Centre for Public Health, Queen's University of Belfast, c/o Regional Genetics Centre, Belfast City Hospital, Belfast, United Kingdom.
- Alexander Peter Maxwell
- Epidemiology and Public Health Research Group, Centre for Public Health, Queen's University of Belfast, c/o Regional Genetics Centre, Belfast City Hospital, Belfast, United Kingdom; Regional Nephrology Unit, Belfast City Hospital, Belfast, United Kingdom.
- Amy Jayne McKnight
- Epidemiology and Public Health Research Group, Centre for Public Health, Queen's University of Belfast, c/o Regional Genetics Centre, Belfast City Hospital, Belfast, United Kingdom.
|