1
Alkhalil SS, Almanaa TN, Altamimi RA, Abdalla M, El-Arabey AA. Interactions between microbiota and uterine corpus endometrial cancer: A bioinformatic investigation of potential immunotherapy. PLoS One 2024; 19:e0312590. [PMID: 39475915 PMCID: PMC11524446 DOI: 10.1371/journal.pone.0312590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2024] [Accepted: 10/09/2024] [Indexed: 11/02/2024] Open
Abstract
Microorganisms in the gut and other niches may contribute to carcinogenesis while also altering cancer immune surveillance and therapeutic response. However, determining the impact of genetic variations and their interplay with the intestinal microbial environment remains difficult and unresolved. Here, we examined the frequency of thirteen mutant genes that caused aberrant gut in thirty different types of cancer using The Cancer Genome Atlas (TCGA) database. Notably, our findings show that all these mutated genes are quite frequent in uterine corpus endometrial cancer (UCEC). Further, these mutant genes are implicated in the infiltration of different subsets of immune cells within the Tumor Microenvironment (TME) of UCEC patients. The top-ranking mutant genes that promote immune cell invasion into the TME of UCEC patients were PGLYRP2, OLFM4, and TLR5. In this regard, we used the same deconvolution of the TCGA database to analyze the microbes that have a strong association with immune cell invasion into the TME of UCEC patients. Several bacteria and viruses have been linked to the invasion of immune cells, such as memory B cells and regulatory T cells (Tregs), into the TME of UCEC patients. As a result, our findings pave the way for future research into generating novel immunizations against bacteria or viruses as immunotherapy for UCEC patients.
Affiliation(s)
- Samia S. Alkhalil
  - Department of Medical Laboratory Sciences, College of Applied Medical Sciences, Shaqra University, Alquwayiyah, Riyadh, Saudi Arabia
- Taghreed N. Almanaa
  - Department of Botany and Microbiology, College of Science, King Saud University, Riyadh, Saudi Arabia
- Raghad A. Altamimi
  - Department of Botany and Microbiology, College of Science, King Saud University, Riyadh, Saudi Arabia
- Mohnad Abdalla
  - Pediatric Research Institute, Children’s Hospital Affiliated to Shandong University, Jinan, China
- Amr Ahmed El-Arabey
  - Department of Pharmacology and Toxicology, Faculty of Pharmacy, Al-Azhar University, Cairo, Egypt
  - Center of Bee Research and its Products (CBRP), Unit of Bee Research and Honey Production, King Khalid University, Abha, Saudi Arabia
  - Applied College, King Khalid University, Abha, Saudi Arabia
2
Saha K, Nielsen GI, Nandani R, Zhang Y, Kong L, Ye P, An W. YY1 is a transcriptional activator of the mouse LINE-1 Tf subfamily. Nucleic Acids Res 2024:gkae949. [PMID: 39460630 DOI: 10.1093/nar/gkae949] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2023] [Revised: 09/07/2024] [Accepted: 10/08/2024] [Indexed: 10/28/2024] Open
Abstract
Long interspersed element type 1 (LINE-1, L1) is an active autonomous transposable element in human and mouse genomes. L1 transcription is controlled by an internal RNA polymerase II promoter in the 5' untranslated region (5'UTR) of a full-length L1. It has been shown that transcription factor YY1 binds to a conserved sequence at the 5' end of the human L1 5'UTR and primarily dictates where transcription initiates. Putative YY1-binding motifs have been predicted in the 5'UTRs of two distinct mouse L1 subfamilies, Tf and Gf. Using site-directed mutagenesis, in vitro binding and gene knockdown assays, we experimentally tested the role of YY1 in mouse L1 transcription. Our results indicate that the Tf, but not the Gf, subfamily harbors functional YY1-binding sites in 5'UTR monomers and that YY1 functions as a transcriptional activator for the mouse Tf subfamily. Activation of Tf transcription by YY1 during early embryogenesis is also supported by a reanalysis of published zygotic knockdown data. Furthermore, YY1-binding motifs are solely responsible for the synergistic interaction between Tf monomers, consistent with a model wherein distant monomers act as enhancers for mouse L1 transcription. The abundance of YY1-binding sites in Tf elements also raises important implications for gene regulation across the genome.
Affiliation(s)
- Karabi Saha
  - Department of Pharmaceutical Sciences, South Dakota State University, 1055 Campanile Ave, Brookings, SD 57007, USA
- Grace I Nielsen
  - Department of Pharmaceutical Sciences, South Dakota State University, 1055 Campanile Ave, Brookings, SD 57007, USA
- Raj Nandani
  - Department of Pharmaceutical Sciences, South Dakota State University, 1055 Campanile Ave, Brookings, SD 57007, USA
- Yizi Zhang
  - Department of Pharmaceutical Sciences, South Dakota State University, 1055 Campanile Ave, Brookings, SD 57007, USA
- Lingqi Kong
  - Department of Pharmaceutical Sciences, South Dakota State University, 1055 Campanile Ave, Brookings, SD 57007, USA
- Ping Ye
  - Department of Pharmaceutical Sciences, South Dakota State University, 1055 Campanile Ave, Brookings, SD 57007, USA
- Wenfeng An
  - Department of Pharmaceutical Sciences, South Dakota State University, 1055 Campanile Ave, Brookings, SD 57007, USA
3
Goldberg LR, Baskin BM, Adla Y, Beierle JA, Kelliher JC, Yao EJ, Kirkpatrick SL, Reed ER, Jenkins DF, Cox J, Luong AM, Luttik KP, Scotellaro JA, Drescher TA, Crotts SB, Yazdani N, Ferris MT, Johnson WE, Mulligan MK, Bryant CD. Atp1a2 and Kcnj9 are candidate genes underlying sensitivity to oxycodone-induced locomotor activation and withdrawal-induced anxiety-like behaviors in C57BL/6 substrains. bioRxiv 2024:2024.04.16.589731. [PMID: 38798314 PMCID: PMC11123399 DOI: 10.1101/2024.04.16.589731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Opioid use disorder is heritable, yet its genetic etiology is largely unknown. C57BL/6J and C57BL/6NJ mouse substrains exhibit phenotypic diversity in the context of limited genetic diversity which together can facilitate genetic discovery. Here, we found C57BL/6NJ mice were less sensitive to oxycodone (OXY)-induced locomotor activation versus C57BL/6J mice in a conditioned place preference paradigm. Narrow-sense heritability was estimated at 0.22-0.31, implicating suitability for genetic analysis. Quantitative trait locus (QTL) mapping in an F2 cross identified a chromosome 1 QTL explaining 7-12% of the variance in OXY locomotion and anxiety-like withdrawal in the elevated plus maze. A second QTL for EPM withdrawal behavior on chromosome 5 near Gabra2 (alpha-2 subunit of GABA-A receptor) explained 9% of the variance. To narrow the chromosome 1 locus, we generated recombinant lines spanning 163-181 Mb, captured the QTL for OXY locomotor traits and withdrawal, and fine-mapped a 2.45-Mb region (170.16-172.61 Mb). Transcriptome analysis identified five, localized striatal cis-eQTL transcripts and two were confirmed at the protein level (KCNJ9, ATP1A2). Kcnj9 codes for a potassium channel (GIRK3) that is a major effector of mu opioid receptor signaling. Atp1a2 codes for a subunit of a Na+/K+ ATPase enzyme that regulates neuronal excitability and shows functional adaptations following chronic opioid administration. To summarize, we identified two candidate genes underlying the physiological and behavioral properties of opioids, with direct preclinical relevance to investigators employing these widely used substrains and clinical relevance to human genetic studies of opioid use disorder.
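The 7-12% and 9% figures in the abstract above are percentages of phenotypic variance explained (PVE) by a QTL. The following is a minimal sketch of how PVE relates to a LOD score, assuming a single-QTL model and the standard conversion PVE = 1 - 10^(-2·LOD/n) found in QTL mapping texts. The function names and the cohort size `n_mice` are hypothetical illustrations, and this is not the authors' analysis code.

```python
import math


def pve_from_lod(lod: float, n: int) -> float:
    """Proportion of variance explained by a single QTL, given its LOD score
    and the number of phenotyped individuals n."""
    return 1.0 - 10.0 ** (-2.0 * lod / n)


def lod_from_pve(pve: float, n: int) -> float:
    """Inverse conversion: the LOD score that corresponds to a given PVE."""
    return -(n / 2.0) * math.log10(1.0 - pve)


# Hypothetical F2 cohort size, chosen only for illustration (not from the paper).
n_mice = 400

for pve in (0.07, 0.09, 0.12):
    lod = lod_from_pve(pve, n_mice)
    print(f"PVE {pve:.0%} with n={n_mice} corresponds to LOD {lod:.2f}")
```

Under this relation, the same PVE maps to a larger LOD as the cohort grows, which is one reason modest-effect loci like these need a reasonably large F2 cross to be detected.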
Affiliation(s)
- Lisa R. Goldberg
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
  - Graduate Program in Biomolecular Pharmacology, Department of Pharmacology, Physiology & Biophysics, Boston University Chobanian and Avedisian School of Medicine, Boston, MA USA
- Britahny M. Baskin
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
  - T32 Training Program on Development of Medications for Substance Use Disorder, Center for Drug Discovery, Northeastern University
- Yahia Adla
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
- Jacob A. Beierle
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
  - Graduate Program in Biomolecular Pharmacology, Department of Pharmacology, Physiology & Biophysics, Boston University Chobanian and Avedisian School of Medicine, Boston, MA USA
  - Transformative Training Program in Addiction Science, Boston University
- Julia C. Kelliher
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
- Emily J. Yao
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
- Stacey L. Kirkpatrick
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
- Eric R. Reed
  - Graduate Program in Bioinformatics, Boston University, Boston, MA USA
- David F. Jenkins
  - Graduate Program in Bioinformatics, Boston University, Boston, MA USA
- Jiayi Cox
  - Genetics and Graduate Program in Genetics and Genomics, Program in Biomedical Sciences, Boston University Chobanian & Avedisian School of Medicine
- Alexander M. Luong
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
- Kimberly P. Luttik
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
- Julia A. Scotellaro
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
  - Undergraduate Research Opportunity Program (UROP), Boston University
- Timothy A. Drescher
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
- Sydney B. Crotts
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
- Neema Yazdani
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
  - Graduate Program in Biomolecular Pharmacology, Department of Pharmacology, Physiology & Biophysics, Boston University Chobanian and Avedisian School of Medicine, Boston, MA USA
  - Transformative Training Program in Addiction Science, Boston University
- Martin T. Ferris
  - Department of Genetics, University of North Carolina, Chapel Hill, NC USA
- W. Evan Johnson
  - Division of Infectious Disease, Department of Medicine, Center for Data Science, Rutgers University, New Jersey, USA
- Megan K. Mulligan
  - Department of Genetics, Genomics, and Informatics, University of Tennessee Health Science Center, Memphis, TN USA
- Camron D. Bryant
  - Laboratory of Addiction Genetics, Department of Pharmaceutical Sciences and Center for Drug Discovery, Northeastern University, Boston, MA USA
  - T32 Training Program on Development of Medications for Substance Use Disorder, Center for Drug Discovery, Northeastern University
4
Beichman AC, Zhu L, Harris K. The Evolutionary Interplay of Somatic and Germline Mutation Rates. Annu Rev Biomed Data Sci 2024; 7:83-105. [PMID: 38669515 DOI: 10.1146/annurev-biodatasci-102523-104225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/28/2024]
Abstract
Novel sequencing technologies are making it increasingly possible to measure the mutation rates of somatic cell lineages. Accurate germline mutation rate measurement technologies have also been available for a decade, making it possible to assess how this fundamental evolutionary parameter varies across the tree of life. Here, we review some classical theories about germline and somatic mutation rate evolution that were formulated using principles of population genetics and the biology of aging and cancer. We find that somatic mutation rate measurements, while still limited in phylogenetic diversity, seem consistent with the theory that selection to preserve the soma is proportional to life span. However, germline and somatic theories make conflicting predictions regarding which species should have the most accurate DNA repair. Resolving this conflict will require carefully measuring how mutation rates scale with time and cell division and achieving a better understanding of mutation rate pleiotropy among cell types.
Affiliation(s)
- Annabel C Beichman
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Luke Zhu
  - Department of Bioengineering, University of Washington, Seattle, Washington, USA
- Kelley Harris
  - Computational Biology Division, Fred Hutchinson Cancer Center, Seattle, Washington, USA
  - Department of Genome Sciences, University of Washington, Seattle, Washington, USA
5
Deaville LA, Berrens RV. Technology to the rescue: how to uncover the role of transposable elements in preimplantation development. Biochem Soc Trans 2024; 52:1349-1362. [PMID: 38752836 PMCID: PMC11346443 DOI: 10.1042/bst20231262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 04/23/2024] [Accepted: 04/24/2024] [Indexed: 06/27/2024]
Abstract
Transposable elements (TEs) are highly expressed in preimplantation development. Preimplantation development is the phase when the cells of the early embryo undergo the first cell fate choice and change from being totipotent to pluripotent. A range of studies have advanced our understanding of TEs in preimplantation, as well as their epigenetic regulation and functional roles. However, many questions remain about the implications of TE expression during early development. Challenges originate first due to the abundance of TEs in the genome, and second because of the limited cell numbers in preimplantation. Here we review the most recent technological advancements promising to shed light onto the role of TEs in preimplantation development. We explore novel avenues to identify genomic TE insertions and improve our understanding of the regulatory mechanisms and roles of TEs and their RNA and protein products during early development.
Affiliation(s)
- Lauryn A. Deaville
  - Institute for Developmental and Regenerative Medicine, Oxford University, IMS-Tetsuya Nakamura Building, Old Road Campus, Roosevelt Dr, Oxford OX3 7TY, U.K.
  - Department of Paediatrics, Oxford University, Level 2, Children's Hospital, John Radcliffe Headington, Oxford OX3 9DU, U.K.
  - MRC Weatherall Institute of Molecular Medicine, Oxford University, John Radcliffe Hospital, Oxford OX3 9DS, U.K.
- Rebecca V. Berrens
  - Institute for Developmental and Regenerative Medicine, Oxford University, IMS-Tetsuya Nakamura Building, Old Road Campus, Roosevelt Dr, Oxford OX3 7TY, U.K.
  - Department of Paediatrics, Oxford University, Level 2, Children's Hospital, John Radcliffe Headington, Oxford OX3 9DU, U.K.
6
D'Alessandro A, Keele GR, Hay A, Nemkov T, Earley EJ, Stephenson D, Vincent M, Deng X, Stone M, Dzieciatkowska M, Hansen KC, Kleinman S, Spitalnik SL, Roubinian NH, Norris PJ, Busch MP, Page GP, Stockwell BR, Churchill GA, Zimring JC. Ferroptosis regulates hemolysis in stored murine and human red blood cells. bioRxiv 2024:2024.06.11.598512. [PMID: 38915523 PMCID: PMC11195277 DOI: 10.1101/2024.06.11.598512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
Red blood cell (RBC) metabolism regulates hemolysis during aging in vivo and in the blood bank. Here, we leveraged a diversity outbred mouse population to map the genetic drivers of fresh/stored RBC metabolism and extravascular hemolysis upon storage and transfusion in 350 mice. We identify the ferrireductase Steap3 as a critical regulator of a ferroptosis-like process of lipid peroxidation. Steap3 polymorphisms were associated with RBC iron content, in vitro hemolysis, and in vivo extravascular hemolysis both in mice and 13,091 blood donors from the Recipient Epidemiology and Donor evaluation Study. Using metabolite Quantitative Trait Loci analyses, we identified a network of gene products (FADS1/2, EPHX2 and LPCAT3) - enriched in donors of African descent - associated with oxylipin metabolism in stored human RBCs and related to Steap3 or its transcriptional regulator, the tumor protein TP53. Genetic variants were associated with lower in vivo hemolysis in thousands of single-unit transfusion recipients.
Highlights
- Steap3 regulates lipid peroxidation and extravascular hemolysis in 350 diversity outbred mice
- Steap3 SNPs are linked to RBC iron, hemolysis, vesiculation in 13,091 blood donors
- mQTL analyses of oxylipins identified ferroptosis-related gene products FADS1/2, EPHX2, LPCAT3
- Ferroptosis markers are linked to hemoglobin increments in transfusion recipients
7
Boyboy BAG, Ichiyanagi K. Insertion of short L1 sequences generates inter-strain histone acetylation differences in the mouse. Mob DNA 2024; 15:11. [PMID: 38730323 PMCID: PMC11084082 DOI: 10.1186/s13100-024-00321-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 04/17/2024] [Indexed: 05/12/2024] Open
Abstract
BACKGROUND: Gene expression divergence between populations and between individuals can emerge from genetic variations within the genes and/or in the cis regulatory elements. Since epigenetic modifications regulate gene expression, it is conceivable that epigenetic variations in cis regulatory elements can also be a source of gene expression divergence.
RESULTS: In this study, we compared histone acetylation (namely, H3K9ac) profiles in two mouse strains of different subspecies origin, C57BL/6J (B6) and MSM/Ms (MSM), as well as their F1 hybrids. This identified 319 regions of strain-specific acetylation, about half of which were observed between the alleles of F1 hybrids. While the allele-specific presence of the interferon regulatory factor 3 (IRF3) binding sequence was associated with allele-specific histone acetylation, we also revealed that B6-specific insertions of a short 3' fragment of LINE-1 (L1) retrotransposon occur within or proximal to MSM-specific acetylated regions. Furthermore, even in hyperacetylated domains, flanking regions of non-polymorphic 3' L1 fragments were hypoacetylated, suggesting a general activity of the 3' L1 fragment to induce hypoacetylation. Indeed, we confirmed the binding of the 3' region of L1 by three Krüppel-associated box domain-containing zinc finger proteins (KZFPs), which interact with histone deacetylases. These results suggest that even a short insertion of L1 would be excluded from gene- and acetylation-rich regions by natural selection. Finally, mRNA-seq analysis for F1 hybrids was carried out, which disclosed a link between allele-specific promoter/enhancer acetylation and gene expression.
CONCLUSIONS: This study disclosed a number of genetic changes that have changed the histone acetylation levels during the evolution of mouse subspecies, a part of which is associated with gene expression changes. Insertions of even a very short L1 fragment can decrease the acetylation level in their neighboring regions and thereby have been counter-selected in gene-rich regions, which may explain a long-standing mystery of discrete genomic distribution of LINEs and SINEs.
Affiliation(s)
- Beverly Ann G Boyboy
  - Laboratory of Genome and Epigenome Dynamics, Department of Animal Sciences, Graduate School of Bioagricultural Sciences, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, 464-8601, Japan
- Kenji Ichiyanagi
  - Laboratory of Genome and Epigenome Dynamics, Department of Animal Sciences, Graduate School of Bioagricultural Sciences, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, 464-8601, Japan
8
Baldarelli RM, Smith CL, Ringwald M, Richardson JE, Bult CJ. Mouse Genome Informatics: an integrated knowledgebase system for the laboratory mouse. Genetics 2024; 227:iyae031. [PMID: 38531069 PMCID: PMC11075557 DOI: 10.1093/genetics/iyae031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2023] [Accepted: 02/13/2024] [Indexed: 03/28/2024] Open
Abstract
Mouse Genome Informatics (MGI) is a federation of expertly curated information resources designed to support experimental and computational investigations into genetic and genomic aspects of human biology and disease using the laboratory mouse as a model system. The Mouse Genome Database (MGD) and the Gene Expression Database (GXD) are core MGI databases that share data and system architecture. MGI serves as the central community resource of integrated information about mouse genome features, variation, expression, gene function, phenotype, and human disease models acquired from peer-reviewed publications, author submissions, and major bioinformatics resources. To facilitate integration and standardization of data, biocuration scientists annotate using terms from controlled metadata vocabularies and biological ontologies (e.g. Mammalian Phenotype Ontology, Mouse Developmental Anatomy, Disease Ontology, Gene Ontology, etc.), and by applying international community standards for gene, allele, and mouse strain nomenclature. MGI serves basic scientists, translational researchers, and data scientists by providing access to FAIR-compliant data in both human-readable and compute-ready formats. The MGI resource is accessible at https://informatics.jax.org. Here, we present an overview of the core data types represented in MGI and highlight recent enhancements to the resource with a focus on new data and functionality for MGD and GXD.
Affiliation(s)
- Carol J Bult
  - The Jackson Laboratory, Bar Harbor, ME 04609, USA
9
O’Connor C, Keele GR, Martin W, Stodola T, Gatti D, Hoffman BR, Korstanje R, Churchill GA, Reinholdt LG. Unraveling the genetics of arsenic toxicity with cellular morphology QTL. PLoS Genet 2024; 20:e1011248. [PMID: 38662777 PMCID: PMC11075906 DOI: 10.1371/journal.pgen.1011248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 05/07/2024] [Accepted: 04/03/2024] [Indexed: 05/08/2024] Open
Abstract
The health risks that arise from environmental exposures vary widely within and across human populations, and these differences are largely determined by genetic variation and gene-by-environment (gene-environment) interactions. However, risk assessment in laboratory mice typically involves isogenic strains and therefore, does not account for these known genetic effects. In this context, genetically heterogenous cell lines from laboratory mice are promising tools for population-based screening because they provide a way to introduce genetic variation in risk assessment without increasing animal use. Cell lines from genetic reference populations of laboratory mice offer genetic diversity, power for genetic mapping, and potentially, predictive value for in vivo experimentation in genetically matched individuals. To explore this further, we derived a panel of fibroblast lines from a genetic reference population of laboratory mice (the Diversity Outbred, DO). We then used high-content imaging to capture hundreds of cell morphology traits in cells exposed to the oxidative stress-inducing arsenic metabolite monomethylarsonous acid (MMAIII). We employed dose-response modeling to capture latent parameters of response and we then used these parameters to identify several hundred cell morphology quantitative trait loci (cmQTL). Response cmQTL encompass genes with established associations with cellular responses to arsenic exposure, including Abcc4 and Txnrd1, as well as novel gene candidates like Xrcc2. Moreover, baseline trait cmQTL highlight the influence of natural variation on fundamental aspects of nuclear morphology. We show that the natural variants influencing response include both coding and non-coding variation, and that cmQTL haplotypes can be used to predict response in orthogonal cell lines. 
Our study sheds light on the major molecular initiating events of oxidative stress that are under genetic regulation, including the NRF2-mediated antioxidant response, cellular detoxification pathways, DNA damage repair response, and cell death trajectories.
Affiliation(s)
- Callan O’Connor
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
  - Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
- Gregory R. Keele
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
  - RTI International, Research Triangle Park, Durham, North Carolina, United States of America
- Whitney Martin
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Timothy Stodola
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Daniel Gatti
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Brian R. Hoffman
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
- Ron Korstanje
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
  - Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
- Gary A. Churchill
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
  - Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
- Laura G. Reinholdt
  - The Jackson Laboratory, Bar Harbor, Maine, United States of America
  - Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
10
Dumont BL, Gatti DM, Ballinger MA, Lin D, Phifer-Rixey M, Sheehan MJ, Suzuki TA, Wooldridge LK, Frempong HO, Lawal RA, Churchill GA, Lutz C, Rosenthal N, White JK, Nachman MW. Into the Wild: A novel wild-derived inbred strain resource expands the genomic and phenotypic diversity of laboratory mouse models. PLoS Genet 2024; 20:e1011228. [PMID: 38598567 PMCID: PMC11034653 DOI: 10.1371/journal.pgen.1011228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 04/22/2024] [Accepted: 03/18/2024] [Indexed: 04/12/2024] Open
Abstract
The laboratory mouse has served as the premier animal model system for both basic and preclinical investigations for over a century. However, laboratory mice capture only a subset of the genetic variation found in wild mouse populations, ultimately limiting the potential of classical inbred strains to uncover phenotype-associated variants and pathways. Wild mouse populations are reservoirs of genetic diversity that could facilitate the discovery of new functional and disease-associated alleles, but the scarcity of commercially available, well-characterized wild mouse strains limits their broader adoption in biomedical research. To overcome this barrier, we have recently developed, sequenced, and phenotyped a set of 11 inbred strains derived from wild-caught Mus musculus domesticus. Each of these "Nachman strains" immortalizes a unique wild haplotype sampled from one of five environmentally distinct locations across North and South America. Whole genome sequence analysis reveals that each strain carries between 4.73-6.54 million single nucleotide differences relative to the GRCm39 mouse reference, with 42.5% of variants in the Nachman strain genomes absent from current classical inbred mouse strain panels. We phenotyped the Nachman strains on a customized pipeline to assess the scope of disease-relevant neurobehavioral, biochemical, physiological, metabolic, and morphological trait variation. The Nachman strains exhibit significant inter-strain variation in >90% of 1119 surveyed traits and expand the range of phenotypic diversity captured in classical inbred strain panels. These novel wild-derived inbred mouse strain resources are set to empower new discoveries in both basic and preclinical research.
Collapse
Affiliation(s)
- Beth L. Dumont
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
- Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
- Graduate School of Biomedical Science and Engineering, The University of Maine, Orono, Maine, United States of America
| | - Daniel M. Gatti
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
| | - Mallory A. Ballinger
- Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, New York, United States of America
| | - Dana Lin
- Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee, United States of America
| | - Megan Phifer-Rixey
- Department of Biology, Drexel University, Philadelphia, Pennsylvania, United States of America
| | - Michael J. Sheehan
- Department of Neurobiology and Behavior, Cornell University, Ithaca, New York, United States of America
| | - Taichi A. Suzuki
- College of Health Solutions and Biodesign Center for Health Through Microbiomes, Arizona State University, Tempe, Arizona, United States of America
| | - Lydia K. Wooldridge
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
| | - Hilda Opoku Frempong
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
- Graduate School of Biomedical Science and Engineering, The University of Maine, Orono, Maine, United States of America
| | - Raman Akinyanju Lawal
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
| | - Gary A. Churchill
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
- Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
- Graduate School of Biomedical Science and Engineering, The University of Maine, Orono, Maine, United States of America
| | - Cathleen Lutz
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
| | - Nadia Rosenthal
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
- Graduate School of Biomedical Sciences, Tufts University, Boston, Massachusetts, United States of America
- Graduate School of Biomedical Science and Engineering, The University of Maine, Orono, Maine, United States of America
- National Heart and Lung Institute, Imperial College London, London, United Kingdom
| | - Jacqueline K. White
- The Jackson Laboratory, 600 Main Street, Bar Harbor, Maine, United States of America
| | - Michael W. Nachman
- Department of Integrative Biology, Museum of Vertebrate Zoology, and Center for Computational Biology, University of California, Berkeley, Berkeley, California, United States of America
| |
|
11
|
Sasani TA, Quinlan AR, Harris K. Epistasis between mutator alleles contributes to germline mutation spectrum variability in laboratory mice. eLife 2024; 12:RP89096. [PMID: 38381482 PMCID: PMC10942616 DOI: 10.7554/elife.89096] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/22/2024] Open
Abstract
Maintaining germline genome integrity is essential and enormously complex. Although many proteins are involved in DNA replication, proofreading, and repair, mutator alleles have largely eluded detection in mammals. DNA replication and repair proteins often recognize sequence motifs or excise lesions at specific nucleotides. Thus, we might expect that the spectrum of de novo mutations - the frequencies of C>T, A>G, etc. - will differ between genomes that harbor either a mutator or wild-type allele. Previously, we used quantitative trait locus mapping to discover candidate mutator alleles in the DNA repair gene Mutyh that increased the C>A germline mutation rate in a family of inbred mice known as the BXDs (Sasani et al., 2022, Ashbrook et al., 2021). In this study we developed a new method to detect alleles associated with mutation spectrum variation and applied it to mutation data from the BXDs. We discovered an additional C>A mutator locus on chromosome 6 that overlaps Ogg1, a DNA glycosylase involved in the same base-excision repair network as Mutyh (David et al., 2007). Its effect depends on the presence of a mutator allele near Mutyh, and BXDs with mutator alleles at both loci have greater numbers of C>A mutations than those with mutator alleles at either locus alone. Our new methods for analyzing mutation spectra reveal evidence of epistasis between germline mutator alleles and may be applicable to mutation data from humans and other model organisms.
Affiliation(s)
- Thomas A Sasani
- Department of Human Genetics, University of Utah, Salt Lake City, United States
| | - Aaron R Quinlan
- Department of Human Genetics, University of Utah, Salt Lake City, United States
- Department of Biomedical Informatics, University of Utah, Salt Lake City, United States
| | - Kelley Harris
- Department of Genome Sciences, University of Washington, Seattle, United States
- Herbold Computational Biology Program, Fred Hutch Cancer Center, Seattle, United States
| |
|
12
|
Lanciano S, Philippe C, Sarkar A, Pratella D, Domrane C, Doucet AJ, van Essen D, Saccani S, Ferry L, Defossez PA, Cristofari G. Locus-level L1 DNA methylation profiling reveals the epigenetic and transcriptional interplay between L1s and their integration sites. Cell Genomics 2024; 4:100498. [PMID: 38309261 PMCID: PMC10879037 DOI: 10.1016/j.xgen.2024.100498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 07/20/2023] [Accepted: 01/09/2024] [Indexed: 02/05/2024]
Abstract
Long interspersed element 1 (L1) retrotransposons are implicated in human disease and evolution. Their global activity is repressed by DNA methylation, but deciphering the regulation of individual copies has been challenging. Here, we combine short- and long-read sequencing to unveil L1 methylation heterogeneity across cell types, families, and individual loci and elucidate key principles involved. We find that the youngest primate L1 families are specifically hypomethylated in pluripotent stem cells and the placenta but not in most tumors. Locally, intronic L1 methylation is intimately associated with gene transcription. Conversely, the L1 methylation state can propagate to the proximal region up to 300 bp. This phenomenon is accompanied by the binding of specific transcription factors, which drive the expression of L1 and chimeric transcripts. Finally, L1 hypomethylation alone is typically insufficient to trigger L1 expression due to redundant silencing pathways. Our results illuminate the epigenetic and transcriptional interplay between retrotransposons and their host genome.
Affiliation(s)
- Sophie Lanciano
- University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
| | - Claude Philippe
- University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
| | - Arpita Sarkar
- University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
| | - David Pratella
- University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
| | - Cécilia Domrane
- University Paris Cité, CNRS, Epigenetics and Cell Fate, Paris, France
| | - Aurélien J Doucet
- University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
| | - Dominic van Essen
- University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
| | - Simona Saccani
- University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
| | - Laure Ferry
- University Paris Cité, CNRS, Epigenetics and Cell Fate, Paris, France
| | | | - Gael Cristofari
- University Cote d'Azur, INSERM, CNRS, Institute for Research on Cancer and Aging of Nice (IRCAN), Nice, France
| |
|
13
|
Audano PA, Beck CR. Small polymorphisms are a source of ancestral bias in structural variant breakpoint placement. Genome Res 2024; 34:7-19. [PMID: 38176712 PMCID: PMC10904011 DOI: 10.1101/gr.278203.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 01/02/2024] [Indexed: 01/06/2024]
Abstract
High-quality genome assemblies and sophisticated algorithms have increased sensitivity for a wide range of variant types, and breakpoint accuracy for structural variants (SVs, ≥50 bp) has improved to near base pair precision. Despite these advances, many SV breakpoint locations are subject to systematic bias affecting variant representation. To understand why SV breakpoints are inconsistent across samples, we reanalyzed 64 phased haplotypes constructed from long-read assemblies released by the Human Genome Structural Variation Consortium (HGSVC). We identify 882 SV insertions and 180 SV deletions with variable breakpoints not anchored in tandem repeats (TRs) or segmental duplications (SDs). SVs called from aligned sequencing reads increase breakpoint disagreements by 2×-16×. Sequence accuracy had a minimal impact on breakpoints, but we observe a strong effect of ancestry. We confirm that SNP and indel polymorphisms are enriched at shifted breakpoints and are also absent from variant callsets. Breakpoint homology increases the likelihood of imprecise SV calls and the distance they are shifted, and tandem duplications are the most heavily affected SVs. Because graph genome methods normalize SV calls across samples, we investigated graphs generated by two different methods and find the resulting breakpoints are subject to other technical biases affecting breakpoint accuracy. The breakpoint inconsistencies we characterize affect ∼5% of the SVs called in a human genome and can impact variant interpretation and annotation. These limitations underscore a need for algorithm development to improve SV databases, mitigate the impact of ancestry on breakpoints, and increase the value of callsets for investigating breakpoint features.
Affiliation(s)
- Peter A Audano
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032, USA
| | - Christine R Beck
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032, USA
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, University of Connecticut Health Center, Farmington, Connecticut 06030, USA
| |
|
14
|
Ball RL, Bogue MA, Liang H, Srivastava A, Ashbrook DG, Lamoureux A, Gerring MW, Hatoum AS, Kim MJ, He H, Emerson J, Berger AK, Walton DO, Sheppard K, El Kassaby B, Castellanos F, Kunde-Ramamoorthy G, Lu L, Bluis J, Desai S, Sundberg BA, Peltz G, Fang Z, Churchill GA, Williams RW, Agrawal A, Bult CJ, Philip VM, Chesler EJ. GenomeMUSter mouse genetic variation service enables multitrait, multipopulation data integration and analysis. Genome Res 2024; 34:145-159. [PMID: 38290977 PMCID: PMC10903950 DOI: 10.1101/gr.278157.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 01/10/2024] [Indexed: 02/01/2024]
Abstract
Hundreds of inbred mouse strains and intercross populations have been used to characterize the function of genetic variants that contribute to disease. Thousands of disease-relevant traits have been characterized in mice and made publicly available. New strains and populations including consomics, the Collaborative Cross, expanded BXD, and inbred wild-derived strains add to existing complex disease mouse models, mapping populations, and sensitized backgrounds for engineered mutations. The genome sequences of inbred strains, along with dense genotypes from others, enable integrated analysis of trait-variant associations across populations, but these analyses are hampered by the sparsity of genotypes available. Moreover, the data are not readily interoperable with other resources. To address these limitations, we created a uniformly dense variant resource by harmonizing multiple data sets. Missing genotypes were imputed using the Viterbi algorithm with a data-driven technique that incorporates local phylogenetic information, an approach that is extendable to other model organisms. The result is a web- and programmatically accessible data service called GenomeMUSter, comprising single-nucleotide variants covering 657 strains at 106.8 million segregating sites. Interoperation with phenotype databases, analytic tools, and other resources enables a wealth of applications, including multitrait, multipopulation meta-analysis. We show this in cross-species comparisons of type 2 diabetes and substance use disorder meta-analyses, leveraging mouse data to characterize the likely role of human variant effects in disease. Other applications include refinement of mapped loci and prioritization of strain backgrounds for disease modeling to further unlock extant mouse diversity for genetic and genomic studies in health and disease.
Affiliation(s)
- Robyn L Ball
- The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | - Molly A Bogue
- The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | | | - Anuj Srivastava
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut 06032, USA
| | - David G Ashbrook
- University of Tennessee Health Science Center, Memphis, Tennessee 38163, USA
| | | | | | - Alexander S Hatoum
- Psychological and Brain Sciences, Washington University in St. Louis, St. Louis, Missouri 63130, USA
- Artificial Intelligence and the Internet of Things Institute, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Matthew J Kim
- University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| | - Hao He
- The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | - Jake Emerson
- The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | | | | | | | | | | | | | - Lu Lu
- University of Tennessee Health Science Center, Memphis, Tennessee 38163, USA
| | - John Bluis
- The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | - Sejal Desai
- The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | | | - Gary Peltz
- Department of Anesthesia, Pain and Perioperative Medicine, Stanford University School of Medicine, Stanford, California 94305, USA
| | - Zhuoqing Fang
- Department of Anesthesia, Pain and Perioperative Medicine, Stanford University School of Medicine, Stanford, California 94305, USA
| | | | - Robert W Williams
- University of Tennessee Health Science Center, Memphis, Tennessee 38163, USA
| | - Arpana Agrawal
- Department of Psychiatry, Washington University School of Medicine, St. Louis, Missouri 63110, USA
| | - Carol J Bult
- The Jackson Laboratory, Bar Harbor, Maine 04609, USA
| | | | | |
|
15
|
Kozak KM, Escalona M, Chumchim N, Fairbairn C, Marimuthu MPA, Nguyen O, Sahasrabudhe R, Seligmann W, Conroy C, Patton JL, Bowie RCK, Nachman MW. A highly contiguous genome assembly for the pocket mouse Perognathus longimembris longimembris. J Hered 2024; 115:130-138. [PMID: 37793045 PMCID: PMC10838119 DOI: 10.1093/jhered/esad060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 09/30/2023] [Indexed: 10/06/2023] Open
Abstract
The little pocket mouse, Perognathus longimembris, and its nine congeners are small heteromyid rodents found in arid and seasonally arid regions of Western North America. The genus is characterized by behavioral and physiological adaptations to dry and often harsh environments, including nocturnality, seasonal torpor, food caching, enhanced osmoregulation, and a well-developed sense of hearing. Here we present a genome assembly of Perognathus longimembris longimembris generated from PacBio HiFi long read and Omni-C chromatin-proximity sequencing as part of the California Conservation Genomics Project. The assembly has a length of 2.35 Gb, contig N50 of 11.6 Mb, scaffold N50 of 73.2 Mb, and includes 93.8% of the BUSCO Glires genes. Interspersed repetitive elements constitute 41.2% of the genome. A comparison with the highly endangered Pacific pocket mouse, P. l. pacificus, reveals broad synteny. These new resources will enable studies of local adaptation, genetic diversity, and conservation of threatened taxa.
Affiliation(s)
- Krzysztof M Kozak
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, United States
| | - Merly Escalona
- Department of Biomolecular Engineering, University of California–Santa Cruz, Santa Cruz, CA 95064, United States
| | - Noravit Chumchim
- DNA Technologies and Expression Analysis Core Laboratory, Genome Center, University of California, Davis, CA 95616, United States
| | - Colin Fairbairn
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, United States
| | - Mohan P A Marimuthu
- DNA Technologies and Expression Analysis Core Laboratory, Genome Center, University of California, Davis, CA 95616, United States
| | - Oanh Nguyen
- DNA Technologies and Expression Analysis Core Laboratory, Genome Center, University of California, Davis, CA 95616, United States
| | - Ruta Sahasrabudhe
- DNA Technologies and Expression Analysis Core Laboratory, Genome Center, University of California, Davis, CA 95616, United States
| | - William Seligmann
- Department of Ecology and Evolutionary Biology, University of California, Santa Cruz, Santa Cruz, CA 95064, United States
| | - Chris Conroy
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, United States
| | - James L Patton
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, United States
| | - Rauri C K Bowie
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, United States
| | - Michael W Nachman
- Museum of Vertebrate Zoology and Department of Integrative Biology, University of California, Berkeley, CA 94720, United States
| |
|
16
|
Stévant I, Gonen N, Poulat F. Transposable elements acquire time- and sex-specific transcriptional and epigenetic signatures along mouse fetal gonad development. Front Cell Dev Biol 2024; 11:1327410. [PMID: 38283992 PMCID: PMC10811072 DOI: 10.3389/fcell.2023.1327410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Accepted: 12/20/2023] [Indexed: 01/30/2024] Open
Abstract
Gonadal sex determination in mice is a complex and dynamic process, which is crucial for the development of functional reproductive organs. The expression of genes involved in this process is regulated by a variety of genetic and epigenetic mechanisms. Recently, there has been increasing evidence that transposable elements (TEs), which are a class of mobile genetic elements, play a significant role in regulating gene expression during embryogenesis and organ development. In this study, we aimed to investigate the involvement of TEs in the regulation of gene expression during mouse embryonic gonadal development. Through bioinformatics analysis, we aimed to identify and characterize specific TEs that operate as regulatory elements for sex-specific genes, as well as their potential mechanisms of regulation. We identified TE loci expressed in a time- and sex-specific manner along fetal gonad development that correlate positively and negatively with nearby gene expression, suggesting that their expression is integrated into the gonadal regulatory network. Moreover, chromatin accessibility and histone post-translational modification analyses in differentiating supporting cells revealed that TEs are acquiring a sex-specific signature for promoter-, enhancer-, and silencer-like elements, with some of them being proximal to critical sex-determining genes. Altogether, our study introduces TEs as the new potential players in the gene regulatory network that controls gonadal development in mammals.
Affiliation(s)
- Isabelle Stévant
- The Mina and Everard Goodman Faculty of Life Sciences and the Institute of Nanotechnology and Advanced Materials, Bar-Ilan University, Ramat Gan, Israel
- Institute of Human Genetics, CNRS UMR9002 University of Montpellier, Montpellier, France
| | - Nitzan Gonen
- The Mina and Everard Goodman Faculty of Life Sciences and the Institute of Nanotechnology and Advanced Materials, Bar-Ilan University, Ramat Gan, Israel
| | - Francis Poulat
- Institute of Human Genetics, CNRS UMR9002 University of Montpellier, Montpellier, France
| |
|
17
|
Saha K, Nielsen GI, Nandani R, Kong L, Ye P, An W. YY1 is a transcriptional activator of mouse LINE-1 Tf subfamily. bioRxiv 2024:2024.01.03.573552. [PMID: 38260579 PMCID: PMC10802269 DOI: 10.1101/2024.01.03.573552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
Long interspersed element type 1 (LINE-1, L1) is an active autonomous transposable element (TE) in the human genome. The first step of L1 replication is transcription, which is controlled by an internal RNA polymerase II promoter in the 5' untranslated region (UTR) of a full-length L1. It has been shown that transcription factor YY1 binds to a conserved sequence motif at the 5' end of the human L1 5'UTR and dictates where transcription initiates but not the level of transcription. Putative YY1-binding motifs have been predicted in the 5'UTRs of two distinct mouse L1 subfamilies, Tf and Gf. Using site-directed mutagenesis, in vitro binding, and gene knockdown assays, we experimentally tested the role of YY1 in mouse L1 transcription. Our results indicate that the Tf, but not the Gf, subfamily harbors functional YY1-binding sites in its 5'UTR monomers. In contrast to its role in human L1, YY1 functions as a transcriptional activator for the mouse Tf subfamily. Furthermore, YY1-binding motifs are solely responsible for the synergistic interaction between monomers, consistent with a model wherein distant monomers act as enhancers for mouse L1 transcription. The abundance of YY1-binding sites in Tf elements also raises important implications for gene regulation at the genomic level.
Affiliation(s)
- Karabi Saha
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Grace I. Nielsen
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Raj Nandani
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Lingqi Kong
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Ping Ye
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| | - Wenfeng An
- Department of Pharmaceutical Sciences, South Dakota State University, Brookings, SD 57007, USA
| |
|
18
|
O'Connor C, Keele GR, Martin W, Stodola T, Gatti D, Hoffman BR, Korstanje R, Churchill GA, Reinholdt LG. Cell morphology QTL reveal gene by environment interactions in a genetically diverse cell population. bioRxiv 2023:2023.11.18.567597. [PMID: 38014303 PMCID: PMC10680806 DOI: 10.1101/2023.11.18.567597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Genetically heterogeneous cell lines from laboratory mice are promising tools for population-based screening as they offer power for genetic mapping and, potentially, predictive value for in vivo experimentation in genetically matched individuals. To explore this further, we derived a panel of fibroblast lines from a genetic reference population of laboratory mice (the Diversity Outbred, DO). We then used high-content imaging to capture hundreds of cell morphology traits in cells exposed to the oxidative stress-inducing arsenic metabolite monomethylarsonous acid (MMAIII). We employed dose-response modeling to capture latent parameters of response and we then used these parameters to identify several hundred cell morphology quantitative trait loci (cmQTL). Response cmQTL encompass genes with established associations with cellular responses to arsenic exposure, including Abcc4 and Txnrd1, as well as novel gene candidates like Xrcc2. Moreover, baseline trait cmQTL highlight the influence of natural variation on fundamental aspects of nuclear morphology. We show that the natural variants influencing response include both coding and non-coding variation, and that cmQTL haplotypes can be used to predict response in orthogonal cell lines. Our study sheds light on the major molecular initiating events of oxidative stress that are under genetic regulation, including the NRF2-mediated antioxidant response, cellular detoxification pathways, DNA damage repair response, and cell death trajectories.
Affiliation(s)
- Callan O'Connor
- The Jackson Laboratory, Bar Harbor, ME 04609, USA
- Graduate School of Biomedical Sciences, Tufts University, Boston, MA 02111, USA
| | - Gregory R Keele
- The Jackson Laboratory, Bar Harbor, ME 04609, USA
- RTI International, RTP, NC 27709, USA
| | | | | | - Daniel Gatti
- The Jackson Laboratory, Bar Harbor, ME 04609, USA
| | | | | | | | - Laura G Reinholdt
- The Jackson Laboratory, Bar Harbor, ME 04609, USA
- Graduate School of Biomedical Sciences, Tufts University, Boston, MA 02111, USA
| |
|
19
|
Tam PLF, Leung D. The Molecular Impacts of Retrotransposons in Development and Diseases. Int J Mol Sci 2023; 24:16418. [PMID: 38003607 PMCID: PMC10671454 DOI: 10.3390/ijms242216418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Revised: 11/11/2023] [Accepted: 11/14/2023] [Indexed: 11/26/2023] Open
Abstract
Retrotransposons are invasive genetic elements that constitute substantial portions of mammalian genomes. They have the potential to influence nearby gene expression through their cis-regulatory sequences, reverse transcription machinery, and the ability to mold higher-order chromatin structures. Due to their multifaceted functions, it is crucial for host fitness to maintain strict regulation of these parasitic sequences to ensure proper growth and development. This review explores how subsets of retrotransposons have undergone evolutionary exaptation to enhance the complexity of mammalian genomes. It also highlights the significance of regulating these elements, drawing on recent studies conducted in human and murine systems.
Affiliation(s)
- Phoebe Lut Fei Tam
- Division of Life Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR, China
| | - Danny Leung
- Division of Life Science, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR, China
- Center for Epigenomics Research, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong SAR, China
| |
|
20
|
Sasani TA, Quinlan AR, Harris K. Epistasis between mutator alleles contributes to germline mutation spectra variability in laboratory mice. bioRxiv 2023:2023.04.25.537217. [PMID: 37162999 PMCID: PMC10168256 DOI: 10.1101/2023.04.25.537217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Maintaining germline genome integrity is essential and enormously complex. Although many proteins are involved in DNA replication, proofreading, and repair [1], mutator alleles have largely eluded detection in mammals. DNA replication and repair proteins often recognize sequence motifs or excise lesions at specific nucleotides. Thus, we might expect that the spectrum of de novo mutations - the frequencies of C>T, A>G, etc. - will differ between genomes that harbor either a mutator or wild-type allele. Previously, we used quantitative trait locus mapping to discover candidate mutator alleles in the DNA repair gene Mutyh that increased the C>A germline mutation rate in a family of inbred mice known as the BXDs [2,3]. In this study we developed a new method to detect alleles associated with mutation spectrum variation and applied it to mutation data from the BXDs. We discovered an additional C>A mutator locus on chromosome 6 that overlaps Ogg1, a DNA glycosylase involved in the same base-excision repair network as Mutyh [4]. Its effect depended on the presence of a mutator allele near Mutyh, and BXDs with mutator alleles at both loci had greater numbers of C>A mutations than those with mutator alleles at either locus alone. Our new methods for analyzing mutation spectra reveal evidence of epistasis between germline mutator alleles and may be applicable to mutation data from humans and other model organisms.
Affiliation(s)
| | - Aaron R. Quinlan
- Department of Human Genetics, University of Utah; Department of Biomedical Informatics, University of Utah · Funded by NIH/NHGRI R01HG012252
| | - Kelley Harris
- Department of Genome Sciences, University of Washington · Funded by NIH/NIGMS R35GM133428; Burroughs Wellcome Career Award at the Scientific Interface; Searle Scholarship; Pew Scholarship; Sloan Fellowship; Allen Discovery Center for Cell Lineage Tracing
| |
|
21
|
Dumont BL, Gatti D, Ballinger MA, Lin D, Phifer-Rixey M, Sheehan MJ, Suzuki TA, Wooldridge LK, Frempong HO, Churchill G, Lutz C, Rosenthal N, White JK, Nachman MW. Into the Wild: A novel wild-derived inbred strain resource expands the genomic and phenotypic diversity of laboratory mouse models. bioRxiv 2023:2023.09.21.558738. [PMID: 37790321 PMCID: PMC10542534 DOI: 10.1101/2023.09.21.558738] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 10/05/2023]
Abstract
The laboratory mouse has served as the premier animal model system for both basic and preclinical investigations for a century. However, laboratory mice capture a narrow subset of the genetic variation found in wild mouse populations. This consideration inherently restricts the scope of potential discovery in laboratory models and narrows the pool of potentially identified phenotype-associated variants and pathways. Wild mouse populations are reservoirs of predicted functional and disease-associated alleles, but the sparsity of commercially available, well-characterized wild mouse strains limits their broader adoption in biomedical research. To overcome this barrier, we have recently imported, sequenced, and phenotyped a set of 11 wild-derived inbred strains developed from wild-caught Mus musculus domesticus. Each of these "Nachman strains" immortalizes a unique wild haplotype sampled from five environmentally diverse locations across North and South America: Saratoga Springs, New York, USA; Gainesville, Florida, USA; Manaus, Brazil; Tucson, Arizona, USA; and Edmonton, Alberta, Canada. Whole genome sequence analysis reveals that each strain carries between 4.73 and 6.54 million single nucleotide differences relative to the mouse reference assembly, with 42.5% of variants in the Nachman strain genomes absent from classical inbred mouse strains. We phenotyped the Nachman strains on a customized pipeline to assess the scope of disease-relevant neurobehavioral, biochemical, physiological, metabolic, and morphological trait variation. The Nachman strains exhibit significant inter-strain variation in >90% of 1119 surveyed traits and expand the range of phenotypic diversity captured in classical inbred strain panels alone. Taken together, our work introduces a novel wild-derived inbred mouse strain resource that will enable new discoveries in basic and preclinical research.
These strains are currently available through The Jackson Laboratory Repository under laboratory code NachJ.
Affiliation(s)
- Beth L Dumont
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
  - Tufts University, Graduate School of Biomedical Sciences, 136 Harrison Ave, Boston, MA, 02111, USA
  - The University of Maine, Graduate School of Biomedical Science and Engineering, 5775 Stodder Hall, Room 46, Orono, ME, 04469, USA
- Daniel Gatti
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
- Mallory A Ballinger
  - Department of Integrative Biology, Center for Computational Biology, and Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, CA 94720, USA
- Dana Lin
  - Department of Integrative Biology, Center for Computational Biology, and Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, CA 94720, USA
- Michael J Sheehan
  - Department of Neurobiology and Behavior, Cornell University, Ithaca, NY 14853, USA
- Taichi A Suzuki
  - College of Health Solutions and Biodesign Center for Health Through Microbiomes, Arizona State University, Tempe, AZ 85281, USA
- Hilda Opoku Frempong
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
  - The University of Maine, Graduate School of Biomedical Science and Engineering, 5775 Stodder Hall, Room 46, Orono, ME, 04469, USA
- Gary Churchill
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
  - Tufts University, Graduate School of Biomedical Sciences, 136 Harrison Ave, Boston, MA, 02111, USA
  - The University of Maine, Graduate School of Biomedical Science and Engineering, 5775 Stodder Hall, Room 46, Orono, ME, 04469, USA
- Cathleen Lutz
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
- Nadia Rosenthal
  - The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
  - Tufts University, Graduate School of Biomedical Sciences, 136 Harrison Ave, Boston, MA, 02111, USA
  - The University of Maine, Graduate School of Biomedical Science and Engineering, 5775 Stodder Hall, Room 46, Orono, ME, 04469, USA
- Michael W Nachman
  - Department of Integrative Biology, Center for Computational Biology, and Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, CA 94720, USA
|
22
|
Gerdes P, Chan D, Lundberg M, Sanchez-Luque FJ, Bodea GO, Ewing AD, Faulkner GJ, Richardson SR. Locus-resolution analysis of L1 regulation and retrotransposition potential in mouse embryonic development. Genome Res 2023; 33:1465-1481. [PMID: 37798118 PMCID: PMC10620060 DOI: 10.1101/gr.278003.123] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/25/2023] [Accepted: 08/21/2023] [Indexed: 10/07/2023]
Abstract
Mice harbor ∼2800 intact copies of the retrotransposon Long Interspersed Element 1 (L1). The in vivo retrotransposition capacity of an L1 copy is defined by both its sequence integrity and epigenetic status, including DNA methylation of the monomeric units constituting young mouse L1 promoters. Locus-specific L1 methylation dynamics during development may therefore elucidate and explain spatiotemporal niches of endogenous retrotransposition but remain unresolved. Here, we interrogate the retrotransposition efficiency and epigenetic fate of source (donor) L1s, identified as mobile in vivo. We show that promoter monomer loss consistently attenuates the relative retrotransposition potential of their offspring (daughter) L1 insertions. We also observe that most donor/daughter L1 pairs are efficiently methylated upon differentiation in vivo and in vitro. We use Oxford Nanopore Technologies (ONT) long-read sequencing to resolve L1 methylation genome-wide and at individual L1 loci, revealing a distinctive "smile" pattern in methylation levels across the L1 promoter region. Using Pacific Biosciences (PacBio) SMRT sequencing of L1 5' RACE products, we then examine DNA methylation dynamics at the mouse L1 promoter in parallel with transcription start site (TSS) distribution at locus-specific resolution. Together, our results offer a novel perspective on the interplay between epigenetic repression, L1 evolution, and genome stability.
Affiliation(s)
- Patricia Gerdes
  - Mater Research Institute - University of Queensland, TRI Building, Woolloongabba, Queensland 4102, Australia
- Dorothy Chan
  - Mater Research Institute - University of Queensland, TRI Building, Woolloongabba, Queensland 4102, Australia
- Mischa Lundberg
  - Mater Research Institute - University of Queensland, TRI Building, Woolloongabba, Queensland 4102, Australia
  - The University of Queensland Diamantina Institute, The University of Queensland, Woolloongabba, Queensland 4102, Australia
  - Translational Bioinformatics, Commonwealth Scientific and Industrial Research Organisation, Sydney, New South Wales 2113, Australia
- Francisco J Sanchez-Luque
  - Mater Research Institute - University of Queensland, TRI Building, Woolloongabba, Queensland 4102, Australia
  - GENYO. Centre for Genomics and Oncological Research (Pfizer-University of Granada-Andalusian Regional Government), PTS Granada, 18016, Spain
  - MRC Human Genetics Unit, Institute of Genetics and Cancer (IGC), University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom
- Gabriela O Bodea
  - Mater Research Institute - University of Queensland, TRI Building, Woolloongabba, Queensland 4102, Australia
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland 4072, Australia
- Adam D Ewing
  - Mater Research Institute - University of Queensland, TRI Building, Woolloongabba, Queensland 4102, Australia
- Geoffrey J Faulkner
  - Mater Research Institute - University of Queensland, TRI Building, Woolloongabba, Queensland 4102, Australia
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland 4072, Australia
- Sandra R Richardson
  - Mater Research Institute - University of Queensland, TRI Building, Woolloongabba, Queensland 4102, Australia
|
23
|
Xie H, Li W, Guo Y, Su X, Chen K, Wen L, Tang F. Long-read-based single sperm genome sequencing for chromosome-wide haplotype phasing of both SNPs and SVs. Nucleic Acids Res 2023; 51:8020-8034. [PMID: 37351613 PMCID: PMC10450174 DOI: 10.1093/nar/gkad532] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 12/05/2022] [Revised: 06/01/2023] [Accepted: 06/09/2023] [Indexed: 06/24/2023] Open
Abstract
Although localized haploid phasing can be achieved using long-read genome sequencing without parental data, reliable chromosome-scale phasing remains a great challenge. Given that sperm is a natural haploid cell, single-sperm genome sequencing can provide a chromosome-wide phase signal. Due to the limitation of read length, current short-read-based single-sperm genome sequencing methods can only achieve SNP haplotyping and come with difficulties in detecting and haplotyping structural variations (SVs) in complex genomic regions. To overcome these limitations, we developed a long-read-based single-sperm genome sequencing method and a corresponding data analysis pipeline that can accurately identify crossover events and chromosome-level aneuploidies in single sperm and efficiently detect SVs within individual sperm cells. Importantly, without parental genome information, our method can accurately conduct de novo phasing of heterozygous SVs as well as SNPs from male individuals at the whole chromosome scale. The accuracy for phasing of SVs was as high as 98.59% using 100 single sperm cells, and the accuracy for phasing of SNPs was as high as 99.95%. Additionally, our method reliably enabled deduction of the repeat expansions of haplotype-resolved STRs/VNTRs in single sperm cells. Our method provides a new opportunity for studying haplotype-related genetics in mammals.
Affiliation(s)
- Haoling Xie
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
  - Changping Laboratory, Yard 28, Science Park Road, Changping District, Beijing 102206, China
- Wen Li
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Yuqing Guo
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Xinjie Su
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Kexuan Chen
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Lu Wen
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
- Fuchou Tang
  - Biomedical Pioneering Innovation Center, School of Life Sciences, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Beijing Advanced Innovation Center for Genomics (ICG), Ministry of Education Key Laboratory of Cell Proliferation and Differentiation, Beijing 100871, China
  - Changping Laboratory, Yard 28, Science Park Road, Changping District, Beijing 102206, China
|
24
|
Ball RL, Bogue MA, Liang H, Srivastava A, Ashbrook DG, Lamoureux A, Gerring MW, Hatoum AS, Kim M, He H, Emerson J, Berger AK, Walton DO, Sheppard K, Kassaby BE, Castellanos F, Kunde-Ramamoorthy G, Lu L, Bluis J, Desai S, Sundberg BA, Peltz G, Fang Z, Churchill GA, Williams RW, Agrawal A, Bult CJ, Philip VM, Chesler EJ. GenomeMUSter mouse genetic variation service enables multi-trait, multi-population data integration and analyses. bioRxiv 2023:2023.08.08.552506. [PMID: 37609331 PMCID: PMC10441370 DOI: 10.1101/2023.08.08.552506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2023]
Abstract
Hundreds of inbred laboratory mouse strains and intercross populations have been used to functionalize genetic variants that contribute to disease. Thousands of disease-relevant traits have been characterized in mice and made publicly available. New strains and populations, including the Collaborative Cross, expanded BXD, and inbred wild-derived strains, add to the set of complex disease mouse models, genetic mapping resources, and sensitized backgrounds against which to evaluate engineered mutations. The genome sequences of many inbred strains, along with dense genotypes from others, could allow integrated analysis of trait-variant associations across populations, but these analyses are not feasible due to the sparsity of genotypes available. Moreover, the data are not readily interoperable with other resources. To address these limitations, we created a uniformly dense data resource by harmonizing multiple variant datasets. Missing genotypes were imputed using the Viterbi algorithm with a data-driven technique that incorporates local phylogenetic information, an approach that is extensible to other model organism species. The result is a web- and programmatically-accessible data service called GenomeMUSter (https://muster.jax.org), comprising allelic data covering 657 strains at 106.8M segregating sites. Interoperation with phenotype databases, analytic tools, and other resources enables a wealth of applications, including multi-trait, multi-population meta-analysis. We demonstrate this in a cross-species comparison of the meta-analysis of Type 2 Diabetes and of substance use disorders, resulting in the more specific characterization of the role of human variant effects in light of mouse phenotype data. Other applications include refinement of mapped loci and prioritization of strain backgrounds for disease modeling to further unlock extant mouse diversity for genetic and genomic studies in health and disease.
|
25
|
Audano PA, Beck CR. Small allelic variants are a source of ancestral bias in structural variant breakpoint placement. bioRxiv 2023:2023.06.25.546295. [PMID: 37425850 PMCID: PMC10327140 DOI: 10.1101/2023.06.25.546295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
High-quality genome assemblies and sophisticated algorithms have increased sensitivity for a wide range of variant types, and breakpoint accuracy for structural variants (SVs, ≥ 50 bp) has improved to near base-pair precision. Despite these advances, many SVs in unique regions of the genome are subject to systematic bias that affects breakpoint location. This ambiguity leads to less accurate variant comparisons across samples, and it obscures true breakpoint features needed for mechanistic inferences. To understand why SVs are not consistently placed, we reanalyzed 64 phased haplotypes constructed from long-read assemblies released by the Human Genome Structural Variation Consortium (HGSVC). We identified variable breakpoints for 882 SV insertions and 180 SV deletions not anchored in tandem repeats (TRs) or segmental duplications (SDs). While this is unexpectedly high for genome assemblies in unique loci, we find read-based callsets from the same sequencing data yielded 1,566 insertions and 986 deletions with inconsistent breakpoints also not anchored in TRs or SDs. When we investigated causes for breakpoint inaccuracy, we found sequence and assembly errors had minimal impact, but we observed a strong effect of ancestry. We confirmed that polymorphic mismatches and small indels are enriched at shifted breakpoints and that these polymorphisms are generally lost when breakpoints shift. Long tracts of homology, such as SVs mediated by transposable elements, increase the likelihood of imprecise SV calls and the distance they are shifted. Tandem Duplication (TD) breakpoints are the most heavily affected SV class with 14% of TDs placed at different locations across haplotypes. While graph genome methods normalize SV calls across many samples, the resulting breakpoints are sometimes incorrect, highlighting a need to tune graph methods for breakpoint accuracy. The breakpoint inconsistencies we characterize collectively affect ~5% of the SVs called in a human genome and underscore a need for algorithm development to improve SV databases, mitigate the impact of ancestry on breakpoint placement, and increase the value of callsets for investigating mutational processes.
Affiliation(s)
- Peter A Audano
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Christine R Beck
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
  - Department of Genetics and Genome Sciences, Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT, USA
|