1
Olli S, Lam NT, Hiljanen S, Kettunen T, Haikonen L, Hyvönen HM, Kiebler A, Köngäs I, Minkkinen S, Pöykiö V, Sannikka V, Vesa R, Wehrenberg G, Prost S, Prous M. Large mitochondrial genomes in tenthredinid sawflies (Hymenoptera, Tenthredinidae). Mitochondrial DNA A DNA Mapp Seq Anal 2024:1-9. [PMID: 39526637 DOI: 10.1080/24701394.2024.2427206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Accepted: 11/04/2024] [Indexed: 11/16/2024]
Abstract
We sequenced and assembled mitochondrial genomes of three tenthredinid sawflies (Euura poecilonota, E. striata, and Dolerus timidus) using Oxford Nanopore Technologies' MinION. The Canu assembler produced circular assemblies (23,000-40,000 bp). Still, errors were found in the highly repetitive non-coding control region because of the fragmented DNA which led to no reads spanning the complete control region, preventing its reliable assembly. Based on the non-repetitive coding region's sequencing coverage, we estimate the lengths of mitochondrial genomes of E. poecilonota, D. timidus, and E. striata to be about 30,000 bp, 31,000 bp, and 37,000 bp and control region to be 15,000 bp, 16,000 bp, and 22,000 bp respectively. All standard bilaterian mitochondrial genes are in the same order and orientation, except trnQ, which is on the minus strand in Euura and the plus strand in Dolerus. Using published tenthredinid genome data, we show that control region lengths are often underestimated.
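The coverage-based length estimate described above can be illustrated with a small sketch. The abstract does not spell out the exact procedure, so this assumes a simple relation: if reads are sampled roughly uniformly from the circular genome, the total number of mitochondrial read bases divided by the mean depth over the reliably assembled coding region approximates the genome length, and subtracting the coding-region length leaves the control region. The function names and input numbers are hypothetical and are not the study's data.

```python
def estimate_mito_length(total_mito_bases: int, coding_mean_depth: float) -> float:
    """Approximate genome length as total read bases / depth of the single-copy coding region.

    Assumption (not from the source): reads are drawn uniformly along the genome,
    so depth in the coding region reflects depth everywhere.
    """
    if coding_mean_depth <= 0:
        raise ValueError("coding_mean_depth must be positive")
    return total_mito_bases / coding_mean_depth


def estimate_control_region(genome_length: float, coding_length: int) -> float:
    """Control region length is whatever remains after the coding region."""
    return genome_length - coding_length


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    total_bases = 6_000_000   # bases in reads assigned to the mitochondrion
    depth = 200.0             # mean depth across the coding region
    coding_len = 15_000       # assembled coding-region length in bp

    genome_len = estimate_mito_length(total_bases, depth)
    control_len = estimate_control_region(genome_len, coding_len)
    print(f"estimated genome length: {genome_len:.0f} bp")
    print(f"estimated control region: {control_len:.0f} bp")
```

The estimate is only as good as the uniform-sampling assumption; nuclear copies of mitochondrial sequence or length-biased read loss would shift it.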
Affiliation(s)
- Suvi Olli
  - Ecology and Genetics Research Unit, University of Oulu, Oulu, Finland
- Nok Ting Lam
  - Ecology and Genetics Research Unit, University of Oulu, Oulu, Finland
- Siri Hiljanen
  - Ecology and Genetics Research Unit, University of Oulu, Oulu, Finland
- Taru Kettunen
  - Faculty of Science, University of Oulu, Oulu, Finland
- Angelika Kiebler
  - Ecology and Genetics Research Unit, University of Oulu, Oulu, Finland
- Ida Köngäs
  - Faculty of Science, University of Oulu, Oulu, Finland
- Veera Pöykiö
  - Faculty of Science, University of Oulu, Oulu, Finland
- Ville Sannikka
  - Faculty of Science, University of Oulu, Oulu, Finland
  - Faculty of Science, University of Turku, Turku, Finland
- Ronja Vesa
  - Faculty of Science, University of Oulu, Oulu, Finland
- Gerrit Wehrenberg
  - Ecology and Genetics Research Unit, University of Oulu, Oulu, Finland
- Stefan Prost
  - Ecology and Genetics Research Unit, University of Oulu, Oulu, Finland
  - South African National Biodiversity Institute, National Zoological Garden, Pretoria, South Africa
  - Central Research Laboratories, Natural History Museum Vienna, Vienna, Austria
- Marko Prous
  - Museum of Natural History, University of Tartu, Tartu, Estonia
2
Dowell JA, Bowsher AW, Jamshad A, Shah R, Burke JM, Donovan LA, Mason CM. Historic breeding practices contribute to germplasm divergence in leaf specialized metabolism and ecophysiology in cultivated sunflower (Helianthus annuus). AMERICAN JOURNAL OF BOTANY 2024; 111:e16420. [PMID: 39483110 DOI: 10.1002/ajb2.16420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Revised: 07/09/2024] [Accepted: 07/09/2024] [Indexed: 11/03/2024]
Abstract
PREMISE: The use of hybrid breeding systems to increase crop yields has been the cornerstone of modern agriculture and is exemplified in the breeding and improvement of cultivated sunflower (Helianthus annuus). However, it is poorly understood what effect supporting separate breeding pools in such systems, combined with continued selection for yield, may have on leaf ecophysiology and specialized metabolite variation.
METHODS: We analyzed 288 lines of cultivated H. annuus to examine the genomic basis of several specialized metabolites and agronomically important traits across major heterotic groups.
RESULTS: Heterotic group identity supports phenotypic divergences between fertility restoring and cytoplasmic male-sterility maintainer lines in leaf ecophysiology and specialized metabolism. However, the divergence is not associated with physical linkage to nuclear genes that support current hybrid breeding practices in cultivated H. annuus. Additionally, we identified four genomic regions associated with leaf ecophysiology and specialized metabolism that colocalize with previously identified QTLs for quantitative self-compatibility traits and with S-protein homolog (SPH) proteins, a recently discovered family of proteins associated with self-incompatibility and self/nonself recognition in Papaver rhoeas (common poppy) with suggested conserved downstream mechanisms among eudicots.
CONCLUSIONS: Further work is necessary to confirm the self-incompatibility mechanisms in cultivated H. annuus and their relationship to the integrative and polygenic architecture of leaf ecophysiology and specialized metabolism in cultivated sunflower. However, because self-compatibility is a derived quantitative trait in cultivated H. annuus, trait linkage to divergent phenotypic traits may have partially arisen as a potential unintended consequence of historical breeding practices and selection for yield.
Affiliation(s)
- Jordan A Dowell
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, 70802, LA, USA
  - Department of Biology, University of Central Florida, Orlando, 32816, FL, USA
- Alan W Bowsher
  - Department of Plant Biology, University of Georgia, Athens, 30602, GA, USA
- Amna Jamshad
  - Department of Plant Biology, University of Georgia, Athens, 30602, GA, USA
- Rahul Shah
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, 37232, TN, USA
- John M Burke
  - Department of Plant Biology, University of Georgia, Athens, 30602, GA, USA
  - The Plant Center, University of Georgia, Athens, 30602, GA, USA
- Lisa A Donovan
  - Department of Plant Biology, University of Georgia, Athens, 30602, GA, USA
- Chase M Mason
  - Department of Biology, University of Central Florida, Orlando, 32816, FL, USA
  - Department of Plant Biology, University of Georgia, Athens, 30602, GA, USA
  - Department of Biology, University of British Columbia Okanagan, Kelowna, B.C. 9 V1V1V7, Canada
3
Waneka G, Pate B, Monroe JG, Sloan DB. Exploring the Relationship Between Gene Expression and Low-Frequency Somatic Mutations in Arabidopsis with Duplex Sequencing. Genome Biol Evol 2024; 16:evae213. [PMID: 39365161 PMCID: PMC11489876 DOI: 10.1093/gbe/evae213] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 09/07/2024] [Accepted: 09/10/2024] [Indexed: 10/05/2024] Open
Abstract
Intragenomic mutation rates can vary dramatically due to transcription-associated mutagenesis or transcription-coupled repair, which vary based on local epigenomic modifications that are nonuniformly distributed across genomes. One feature associated with decreased mutation is higher expression level, which depends on environmental cues. To understand the magnitude of expression-dependent mutation rate variation, we perturbed expression through a heat treatment in Arabidopsis thaliana. We quantified gene expression to identify differentially expressed genes, which we then targeted for mutation detection using duplex sequencing. This approach provided a highly accurate measurement of the frequency of rare somatic mutations in vegetative plant tissues, which has been a recent source of uncertainty. Somatic mutations in plants may be useful for understanding drivers of DNA damage and repair in the germline since plants experience late germline segregation and both somatic and germline cells share common repair machinery. We included mutant lines lacking mismatch repair (MMR) and base excision repair (BER) capabilities to understand how repair mechanisms may drive biased mutation accumulation. We found wild-type (WT) and BER mutant mutation frequencies to be very low (mean variant frequency 1.8 × 10⁻⁸ and 2.6 × 10⁻⁸, respectively), while MMR mutant frequencies were significantly elevated (1.13 × 10⁻⁶). Interestingly, in the MMR mutant lines, there was no difference in the somatic mutation frequencies between temperature treatments or between highly versus lowly expressed genes. The extremely low somatic variant frequencies in WT plants indicate that larger datasets will be needed to address fundamental evolutionary questions about whether environmental change leads to gene-specific changes in mutation rate.
Affiliation(s)
- Gus Waneka
  - Department of Biology, Colorado State University, Fort Collins, CO, USA
- Braden Pate
  - Department of Biology, Colorado State University, Fort Collins, CO, USA
- J Grey Monroe
  - Department of Plant Sciences, University of California, Davis, Davis, CA, USA
- Daniel B Sloan
  - Department of Biology, Colorado State University, Fort Collins, CO, USA
4
Souza FHS, Perez MF, Ferreira PHN, Bertollo LAC, Ezaz T, Charlesworth D, Cioffi MB. Multiple karyotype differences between populations of the Hoplias malabaricus (Teleostei; Characiformes), a species complex in the gray area of the speciation process. Heredity (Edinb) 2024; 133:216-226. [PMID: 39039117 PMCID: PMC11437160 DOI: 10.1038/s41437-024-00707-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 07/15/2024] [Accepted: 07/16/2024] [Indexed: 07/24/2024] Open
Abstract
Neotropical fishes exhibit remarkable karyotype diversity, whose evolution is poorly understood. Here, we studied genetic differences in 60 individuals, from 11 localities of one species, the wolf fish Hoplias malabaricus, from populations that include six different "karyomorphs". These differ in Y-X chromosome differentiation, and, in several cases, by fusions with autosomes that have resulted in multiple sex chromosomes. Other differences are also observed in diploid chromosome numbers and morphologies. In an attempt to start understanding how this diversity was generated, we analyzed within- and between-population differences in a genome-wide sequence data set. We detect clear genotype differences between karyomorphs. Even in sympatry, samples with different karyomorphs differ more in sequence than samples from allopatric populations of the same karyomorph, suggesting that they represent populations that are to some degree reproductively isolated. However, sequence divergence between populations with different karyomorphs is remarkably low, suggesting that chromosome rearrangements may have evolved during a brief evolutionary time. We suggest that the karyotypic differences probably evolved in allopatry, in small populations that would have allowed rapid fixation of rearrangements, and that they became sympatric after their differentiation. Further studies are needed to test whether the karyotype differences contribute to reproductive isolation detected between some H. malabaricus karyomorphs.
Affiliation(s)
- Fernando H S Souza
  - Laboratory of Evolutionary Cytogenetics, Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, SP, Brazil
- Manolo F Perez
  - Laboratory of Evolutionary Cytogenetics, Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, SP, Brazil
- Pedro H N Ferreira
  - Laboratory of Evolutionary Cytogenetics, Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, SP, Brazil
- Luiz A C Bertollo
  - Laboratory of Evolutionary Cytogenetics, Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, SP, Brazil
- Tariq Ezaz
  - Institute for Applied Ecology, University of Canberra, Canberra, NSW, Australia
- Deborah Charlesworth
  - Institute for Evolutionary Biology, Ashworth Laboratories, King's Buildings, University of Edinburgh, Edinburgh, UK
- Marcelo B Cioffi
  - Laboratory of Evolutionary Cytogenetics, Department of Genetics and Evolution, Federal University of São Carlos, São Carlos, SP, Brazil
5
Roberts M, Josephs EB. Previously unmeasured genetic diversity explains part of Lewontin's paradox in a k-mer-based meta-analysis of 112 plant species. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.17.594778. [PMID: 38798362 PMCID: PMC11118579 DOI: 10.1101/2024.05.17.594778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
At the molecular level, most evolution is expected to be neutral. A key prediction of this expectation is that the level of genetic diversity in a population should scale with population size. However, as was noted by Richard Lewontin in 1974 and reaffirmed by later studies, the slope of the population size-diversity relationship in nature is much weaker than expected under neutral theory. We hypothesize that one contributor to this paradox is that current methods relying on single nucleotide polymorphisms (SNPs) called from aligning short reads to a reference genome underestimate levels of genetic diversity in many species. To test this idea, we calculated nucleotide diversity (π) and k-mer-based metrics of genetic diversity across 112 plant species, amounting to over 205 terabases of DNA sequencing data from 27,488 individual plants. We then compared how these different metrics correlated with proxies of population size that account for both range size and population density variation across species. We found that our population size proxies scaled anywhere from about 3 to over 20 times faster with k-mer diversity than nucleotide diversity after adjusting for evolutionary history, mating system, life cycle habit, cultivation status, and invasiveness. The relationship between k-mer diversity and population size proxies also remains significant after correcting for genome size, whereas the analogous relationship for nucleotide diversity does not. These results suggest that variation not captured by common SNP-based analyses explains part of Lewontin's paradox in plants.
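To make the contrast between the two kinds of metric concrete, here is a toy sketch. Nucleotide diversity is computed in its standard form (mean pairwise differences per aligned site). The abstract does not define the k-mer-based metrics used, so mean pairwise Jaccard dissimilarity between k-mer sets stands in here as an assumption; the function names and the three short sequences are illustrative only.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean number of pairwise differences per site for aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return diffs / (len(pairs) * length)


def kmer_set(seq: str, k: int) -> set[str]:
    """All distinct substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_dissimilarity(seqs: list[str], k: int) -> float:
    """Mean pairwise Jaccard dissimilarity of k-mer sets (a stand-in metric, see lead-in)."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    sets = [kmer_set(s, k) for s in seqs]
    total = 0.0
    n_pairs = 0
    for a, b in combinations(sets, 2):
        union = a | b
        # Two empty k-mer sets are treated as identical.
        total += 0.0 if not union else 1.0 - len(a & b) / len(union)
        n_pairs += 1
    return total / n_pairs


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]  # hypothetical aligned haplotypes
    print("pi:", nucleotide_diversity(toy))
    print("k-mer dissimilarity (k=3):", kmer_dissimilarity(toy, 3))
```

Unlike π, the k-mer metric needs no alignment to a reference, which is why it can register variation that reference-based SNP calling misses.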
Affiliation(s)
- Miles Roberts
  - Genetics and Genome Sciences Program, Michigan State University, East Lansing, MI
- Emily B. Josephs
  - Department of Plant Biology, Michigan State University, East Lansing, MI
  - Ecology, Evolution, and Behavior Program, Michigan State University, East Lansing, MI
  - Plant Resilience Institute, Michigan State University, East Lansing, MI
6
Kileeg Z, Wang P, Mott GA. Chromosome-Scale Assembly and Annotation of Eight Arabidopsis thaliana Ecotypes. Genome Biol Evol 2024; 16:evae169. [PMID: 39101619 PMCID: PMC11327923 DOI: 10.1093/gbe/evae169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2024] [Revised: 07/25/2024] [Accepted: 07/27/2024] [Indexed: 08/06/2024] Open
Abstract
The plant Arabidopsis thaliana is a model system used by researchers through much of plant research. Recent efforts have focused on discovering the genomic variation found in naturally occurring ecotypes isolated from around the world. These ecotypes have come from diverse climates and therefore have faced and adapted to a variety of abiotic and biotic stressors. The sequencing and comparative analysis of these genomes can offer insight into the adaptive strategies of plants. While there are a large number of ecotype genome sequences available, the majority were created using short-read technology. Mapping of short-reads containing structural variation to a reference genome bereft of that variation leads to incorrect mapping of those reads, resulting in a loss of genetic information and introduction of false heterozygosity. For this reason, long-read de novo sequencing of genomes is required to resolve structural variation events. In this article, we sequenced the genomes of eight natural variants of A. thaliana using nanopore sequencing. This resulted in highly contiguous assemblies with >95% of the genome contained within five contigs. The sequencing results from this study include five ecotypes from relict and African populations, an area of untapped genetic diversity. With this study, we increase the knowledge of diversity we have across A. thaliana ecotypes and contribute to ongoing production of an A. thaliana pan-genome.
Affiliation(s)
- Zachary Kileeg
  - Department of Biological Sciences, University of Toronto-Scarborough, Toronto, Canada
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
- Pauline Wang
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
  - Centre for the Analysis of Genome Evolution & Function, University of Toronto, Toronto, Canada
- G Adam Mott
  - Department of Biological Sciences, University of Toronto-Scarborough, Toronto, Canada
  - Department of Cell and Systems Biology, University of Toronto, Toronto, Canada
  - Centre for the Analysis of Genome Evolution & Function, University of Toronto, Toronto, Canada
7
Pierotti S, Welz B, Osuna-López M, Fitzgerald T, Wittbrodt J, Birney E. Genotype imputation in F2 crosses of inbred lines. BIOINFORMATICS ADVANCES 2024; 4:vbae107. [PMID: 39077633 PMCID: PMC11286293 DOI: 10.1093/bioadv/vbae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2024] [Revised: 06/04/2024] [Accepted: 07/22/2024] [Indexed: 07/31/2024]
Abstract
Motivation: Crosses among inbred lines are a fundamental tool for the discovery of genetic loci associated with phenotypes of interest. In organisms for which large reference panels or SNP chips are not available, imputation from low-pass whole-genome sequencing is an effective method for obtaining genotype data from a large number of individuals. To date, a structured analysis of the conditions required for optimal genotype imputation has not been performed.
Results: We report a systematic exploration of the effect of several design variables on imputation performance in F2 crosses of inbred medaka lines using the imputation software STITCH. We determined that, depending on the number of samples, imputation performance reaches a plateau when increasing the per-sample sequencing coverage. We also systematically explored the trade-offs between cost, imputation accuracy, and sample numbers. We developed a computational pipeline to streamline the process, enabling other researchers to perform a similar cost-benefit analysis on their population of interest.
Availability and implementation: The source code for the pipeline is available at https://github.com/birneylab/stitchimpute. While our pipeline has been developed and tested for an F2 population, the software can also be used to analyse populations with a different structure.
Affiliation(s)
- Saul Pierotti
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, Cambridge CB101SD, United Kingdom
- Bettina Welz
  - Centre for Organismal Studies (COS), Heidelberg University, Heidelberg 69120, Germany
- Mireia Osuna-López
  - Genomics Core Facility, European Molecular Biology Laboratory (EMBL), Heidelberg 69117, Germany
- Tomas Fitzgerald
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, Cambridge CB101SD, United Kingdom
- Joachim Wittbrodt
  - Centre for Organismal Studies (COS), Heidelberg University, Heidelberg 69120, Germany
- Ewan Birney
  - European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Hinxton, Cambridge CB101SD, United Kingdom
8
Cornetti L, Fields PD, Du Pasquier L, Ebert D. Long-term balancing selection for pathogen resistance maintains trans-species polymorphisms in a planktonic crustacean. Nat Commun 2024; 15:5333. [PMID: 38909039 PMCID: PMC11193740 DOI: 10.1038/s41467-024-49726-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 06/18/2024] [Indexed: 06/24/2024] Open
Abstract
Balancing selection is an evolutionary process that maintains genetic polymorphisms at selected loci and strongly reduces the likelihood of allele fixation. When allelic polymorphisms that predate speciation events are maintained independently in the resulting lineages, a pattern of trans-species polymorphisms may occur. Trans-species polymorphisms have been identified for loci related to mating systems and the MHC, but they are generally rare. Trans-species polymorphisms in disease loci are believed to be a consequence of long-term host-parasite coevolution by balancing selection, the so-called Red Queen dynamics. Here we scan the genomes of three crustaceans with a divergence of over 15 million years and identify 11 genes containing identical-by-descent trans-species polymorphisms with the same polymorphisms in all three species. Four of these genes display molecular footprints of balancing selection and have a function related to immunity. Three of them are located in or close to loci involved in resistance to a virulent bacterial pathogen, Pasteuria, with which the Daphnia host is known to coevolve. This provides rare evidence of trans-species polymorphisms for loci known to be functionally relevant in interactions with a widespread and highly specific parasite. These findings support the theory that specific antagonistic coevolution is able to maintain genetic diversity over millions of years.
Affiliation(s)
- Luca Cornetti
  - Department of Environmental Sciences, Zoology, University of Basel, Basel, Switzerland
  - Syngenta Crop Protection AG, Stein, Switzerland
- Peter D Fields
  - Department of Environmental Sciences, Zoology, University of Basel, Basel, Switzerland
- Louis Du Pasquier
  - Department of Environmental Sciences, Zoology, University of Basel, Basel, Switzerland
- Dieter Ebert
  - Department of Environmental Sciences, Zoology, University of Basel, Basel, Switzerland
9
Minadakis N, Kaderli L, Horvath R, Bourgeois Y, Xu W, Thieme M, Woods DP, Roulin AC. Polygenic architecture of flowering time and its relationship with local environments in the grass Brachypodium distachyon. Genetics 2024; 227:iyae042. [PMID: 38504651 PMCID: PMC11075549 DOI: 10.1093/genetics/iyae042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2024] [Revised: 01/12/2024] [Accepted: 03/07/2024] [Indexed: 03/21/2024] Open
Abstract
Synchronizing the timing of reproduction with the environment is crucial in the wild. Among the multiple mechanisms annual plants evolved to sense their environment, the requirement of cold-mediated vernalization is a major process that prevents individuals from flowering during winter. In many annual plants including crops, both a long and short vernalization requirement can be observed within species, resulting in so-called early-(spring) and late-(winter) flowering genotypes. Here, using the grass model Brachypodium distachyon, we explored the link between flowering-time-related traits (vernalization requirement and flowering time), environmental variation, and diversity at flowering-time genes by combining measurements under greenhouse and outdoor conditions. These experiments confirmed that B. distachyon natural accessions display large differences regarding vernalization requirements and ultimately flowering time. We underline significant, albeit quantitative, effects of current environmental conditions on flowering-time-related traits. While disentangling the confounding effects of population structure on flowering-time-related traits remains challenging, population genomics analyses indicate that well-characterized flowering-time genes may contribute significantly to flowering-time variation and display signs of polygenic selection. Flowering-time genes, however, do not colocalize with genome-wide association peaks obtained with outdoor measurements, suggesting that additional genetic factors contribute to flowering-time variation in the wild. Altogether, our study fosters our understanding of the polygenic architecture of flowering time in a natural grass system and opens new avenues of research to investigate the gene-by-environment interaction at play for this trait.
Affiliation(s)
- Nikolaos Minadakis
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstr. 107, 8008 Zürich, Switzerland
- Lars Kaderli
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstr. 107, 8008 Zürich, Switzerland
- Robert Horvath
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstr. 107, 8008 Zürich, Switzerland
- Yann Bourgeois
  - DIADE, University of Montpellier, CIRAD, IRD, 34 000 Montpellier, France
- Wenbo Xu
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstr. 107, 8008 Zürich, Switzerland
- Michael Thieme
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstr. 107, 8008 Zürich, Switzerland
- Daniel P Woods
  - Department of Plant Sciences, University of California-Davis, 104 Robbins Hall, Davis, CA 95616, USA
  - Howard Hughes Medical Institute, 4000 Jones Bridge Rd, Chevy Chase, MD 20815, USA
- Anne C Roulin
  - Department of Plant and Microbial Biology, University of Zürich, Zollikerstr. 107, 8008 Zürich, Switzerland
10
Jiang J, Xu YC, Zhang ZQ, Chen JF, Niu XM, Hou XH, Li XT, Wang L, Zhang YE, Ge S, Guo YL. Forces driving transposable element load variation during Arabidopsis range expansion. THE PLANT CELL 2024; 36:840-862. [PMID: 38036296 PMCID: PMC10980350 DOI: 10.1093/plcell/koad296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 10/25/2023] [Accepted: 11/06/2023] [Indexed: 12/02/2023]
Abstract
Genetic load refers to the accumulated and potentially life-threatening deleterious mutations in populations. Understanding the mechanisms underlying genetic load variation of transposable element (TE) insertion, a major large-effect mutation, during range expansion is an intriguing question in biology. Here, we used 1,115 global natural accessions of Arabidopsis (Arabidopsis thaliana) to study the driving forces of TE load variation during its range expansion. TE load increased with range expansion, especially in the recently established Yangtze River basin population. Effective population size, which explains 62.0% of the variance in TE load, high transposition rate, and selective sweeps contributed to TE accumulation in the expanded populations. We genetically mapped and identified multiple candidate causal genes and TEs, and revealed the genetic architecture of TE load variation. Overall, this study reveals the variation in TE genetic load during Arabidopsis expansion and highlights the causes of TE load variation from the perspectives of both population genetics and quantitative genetics.
Affiliation(s)
- Juan Jiang
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - China National Botanical Garden, Beijing 100093, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Yong-Chao Xu
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - China National Botanical Garden, Beijing 100093, China
- Zhi-Qin Zhang
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - China National Botanical Garden, Beijing 100093, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Jia-Fu Chen
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - China National Botanical Garden, Beijing 100093, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Xiao-Min Niu
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - China National Botanical Garden, Beijing 100093, China
- Xing-Hui Hou
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - China National Botanical Garden, Beijing 100093, China
- Xin-Tong Li
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - China National Botanical Garden, Beijing 100093, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Li Wang
  - Agricultural Synthetic Biology Center, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518000, China
- Yong E Zhang
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
  - State Key Laboratory of Integrated Management of Pest Insects and Rodents & Key Laboratory of the Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Song Ge
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - China National Botanical Garden, Beijing 100093, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Ya-Long Guo
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - China National Botanical Garden, Beijing 100093, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
11
Lesack KJ, Wasmuth JD. The impact of FASTQ and alignment read order on structural variant calling from long-read sequencing data. PeerJ 2024; 12:e17101. [PMID: 38500526 PMCID: PMC10946394 DOI: 10.7717/peerj.17101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Accepted: 02/21/2024] [Indexed: 03/20/2024] Open
Abstract
Background Structural variant (SV) calling from DNA sequencing data has been challenging due to several factors, including the ambiguity of short-read alignments, multiple complex SVs in the same genomic region, and the lack of "truth" datasets for benchmarking. Additionally, caller choice, parameter settings, and alignment method are known to affect SV calling. However, the impact of FASTQ read order on SV calling has not been explored for long-read data. Results Here, we used PacBio DNA sequencing data from 15 Caenorhabditis elegans strains and four Arabidopsis thaliana ecotypes to evaluate the sensitivity of different SV callers on FASTQ read order. Comparisons of variant call format files generated from the original and permutated FASTQ files demonstrated that the order of input data affected the SVs predicted by each caller. In particular, pbsv was highly sensitive to the order of the input data, especially at the highest depths where over 70% of the SV calls generated from pairs of differently ordered FASTQ files were in disagreement. These demonstrate that read order sensitivity is a complex, multifactorial process, as the differences observed both within and between species varied considerably according to the specific combination of aligner, SV caller, and sequencing depth. In addition to the SV callers being sensitive to the input data order, the SAMtools alignment sorting algorithm was identified as a source of variability following read order randomization. Conclusion The results of this study highlight the sensitivity of SV calling on the order of reads encoded in FASTQ files, which has not been recognized in long-read approaches. These findings have implications for the replication of SV studies and the development of consistent SV calling protocols. 
Our study suggests that researchers should pay attention to the input order sensitivity of read alignment sorting methods when analyzing long-read sequencing data for SV calling, as mitigating a source of variability could facilitate future replication work. These results also raise important questions surrounding the relationship between SV caller read order sensitivity and tool performance. Therefore, tool developers should also consider input order sensitivity as a potential source of variability during the development and benchmarking of new and improved methods for SV calling.
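The read order permutation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names (parse_fastq, shuffle_fastq, format_fastq) and the seed handling are hypothetical, and the sketch assumes unwrapped four-line FASTQ records. It only shows how a FASTQ file can be reordered reproducibly without altering any read.

```python
import random
from typing import Iterable, List, Tuple

# One FASTQ record: header line, sequence, separator line, quality string.
FastqRecord = Tuple[str, str, str, str]


def parse_fastq(lines: Iterable[str]) -> List[FastqRecord]:
    """Group lines into four-line FASTQ records (assumes unwrapped records)."""
    stripped = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if len(stripped) % 4 != 0:
        raise ValueError("FASTQ input must contain a multiple of four lines")
    records: List[FastqRecord] = []
    for i in range(0, len(stripped), 4):
        header, seq, plus, qual = stripped[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record number {i // 4 + 1}")
        records.append((header, seq, plus, qual))
    return records


def shuffle_fastq(records: List[FastqRecord], seed: int) -> List[FastqRecord]:
    """Return a permuted copy of the records; a fixed seed makes it repeatable."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled


def format_fastq(records: Iterable[FastqRecord]) -> str:
    """Serialize records back to FASTQ text, one record per four lines."""
    return "".join("\n".join(record) + "\n" for record in records)
```

Running the same aligner and SV caller on the original file and on several seeded permutations, then comparing the resulting VCF files, is one way to check a workflow for the order sensitivity reported above.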
Affiliation(s)
- Kyle J. Lesack
- Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
- Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
| | - James D. Wasmuth
- Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
- Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
| |
|
12
|
Waneka G, Pate B, Monroe JG, Sloan DB. Investigating low frequency somatic mutations in Arabidopsis with Duplex Sequencing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.31.578196. [PMID: 38352550 PMCID: PMC10862904 DOI: 10.1101/2024.01.31.578196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024]
Abstract
Mutations are the source of novel genetic diversity but can also lead to disease and maladaptation. The conventional view is that mutations occur randomly with respect to their environment-specific fitness consequences. However, intragenomic mutation rates can vary dramatically due to transcription coupled repair and based on local epigenomic modifications, which are non-uniformly distributed across genomes. One sequence feature associated with decreased mutation is higher expression level, which can vary depending on environmental cues. To understand whether the association between expression level and mutation rate creates a systematic relationship with environment-specific fitness effects, we perturbed expression through a heat treatment in Arabidopsis thaliana. We quantified gene expression to identify differentially expressed genes, which we then targeted for mutation detection using Duplex Sequencing. This approach provided a highly accurate measurement of the frequency of rare somatic mutations in vegetative plant tissues, which has been a recent source of uncertainty in plant mutation research. We included mutant lines lacking mismatch repair (MMR) and base excision repair (BER) capabilities to understand how repair mechanisms may drive biased mutation accumulation. We found wild type (WT) and BER mutant mutation frequencies to be very low (mean variant frequency 1.8 × 10⁻⁸ and 2.6 × 10⁻⁸, respectively), while MMR mutant frequencies were significantly elevated (1.13 × 10⁻⁶). These results show that somatic variant frequencies are extremely low in WT plants, indicating that larger datasets will be needed to address the fundamental evolutionary question as to whether environmental change leads to gene-specific changes in mutation rate.
Affiliation(s)
- Gus Waneka
- Department of Biology, Colorado State University, Fort Collins, Colorado, USA
| | - Braden Pate
- Department of Biology, Colorado State University, Fort Collins, Colorado, USA
| | - J Grey Monroe
- Department of Plant Sciences, University of California, Davis, Davis, CA USA
| | - Daniel B Sloan
- Department of Biology, Colorado State University, Fort Collins, Colorado, USA
| |
|
13
|
Butkovic A, Ellis TJ, Gonzalez R, Jaegle B, Nordborg M, Elena SF. Genetic basis of Arabidopsis thaliana responses to infection by naïve and adapted isolates of turnip mosaic virus. eLife 2024; 12:RP89749. [PMID: 38240739 PMCID: PMC10945600 DOI: 10.7554/elife.89749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2024] Open
Abstract
Plant viruses account for enormous agricultural losses worldwide, and the most effective way to combat them is to identify genetic material conferring plant resistance to these pathogens. Aiming to identify genetic associations with responses to infection, we screened a large panel of Arabidopsis thaliana natural inbred lines for four disease-related traits caused by infection by A. thaliana-naïve and -adapted isolates of the natural pathogen turnip mosaic virus (TuMV). We detected a strong, replicable association in a 1.5 Mb region on chromosome 2 with a 10-fold increase in relative risk of systemic necrosis. The region contains several plausible causal genes as well as abundant structural variation, including an insertion of a Copia transposon into a Toll/interleukin receptor (TIR-NBS-LRR) coding for a gene involved in defense, that could be either a driver or a consequence of the disease-resistance locus. When inoculated with TuMV, loss-of-function mutant plants of this gene exhibited different symptoms than wild-type plants. The direction and severity of symptom differences depended on the adaptation history of the virus. This increase in symptom severity was specific for infections with the adapted isolate. Necrosis-associated alleles are found worldwide, and their distribution is consistent with a trade-off between resistance during viral outbreaks and a cost of resistance otherwise, leading to negative frequency-dependent selection.
Affiliation(s)
- Anamarija Butkovic
- Instituto de Biología Integrativa de Sistemas (I2SysBio), CSIC-Universitat de València, Parc Científic UV, València, Spain
| | - Thomas James Ellis
- Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna BioCenter, Doktor-Bohr-Gasse, Vienna, Austria
| | - Ruben Gonzalez
- Instituto de Biología Integrativa de Sistemas (I2SysBio), CSIC-Universitat de València, Parc Científic UV, València, Spain
| | - Benjamin Jaegle
- Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna BioCenter, Doktor-Bohr-Gasse, Vienna, Austria
| | - Magnus Nordborg
- Gregor Mendel Institute (GMI), Austrian Academy of Sciences, Vienna BioCenter, Doktor-Bohr-Gasse, Vienna, Austria
| | - Santiago F Elena
- Instituto de Biología Integrativa de Sistemas (I2SysBio), CSIC-Universitat de València, Parc Científic UV, València, Spain
- The Santa Fe Institute, Santa Fe, United States
| |
|
14
|
Dallaire X, Bouchard R, Hénault P, Ulmo-Diaz G, Normandeau E, Mérot C, Bernatchez L, Moore JS. Widespread Deviant Patterns of Heterozygosity in Whole-Genome Sequencing Due to Autopolyploidy, Repeated Elements, and Duplication. Genome Biol Evol 2023; 15:evad229. [PMID: 38085037 PMCID: PMC10752349 DOI: 10.1093/gbe/evad229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/30/2023] [Indexed: 12/28/2023] Open
Abstract
Most population genomic tools rely on accurate single nucleotide polymorphism (SNP) calling and filtering to meet their underlying assumptions. However, genomic complexity, resulting from structural variants, paralogous sequences, and repetitive elements, presents significant challenges in assembling contiguous reference genomes. Consequently, short-read resequencing studies can encounter mismapping issues, leading to SNPs that deviate from Mendelian expected patterns of heterozygosity and allelic ratio. In this study, we employed the ngsParalog software to identify such deviant SNPs in whole-genome sequencing (WGS) data with low (1.5×) to intermediate (4.8×) coverage for four species: Arctic Char (Salvelinus alpinus), Lake Whitefish (Coregonus clupeaformis), Atlantic Salmon (Salmo salar), and the American Eel (Anguilla rostrata). The analyses revealed that deviant SNPs accounted for 22% to 62% of all SNPs in salmonid datasets and approximately 11% in the American Eel dataset. These deviant SNPs were particularly concentrated within repetitive elements and genomic regions that had recently undergone rediploidization in salmonids. Additionally, narrow peaks of elevated coverage were ubiquitous along all four reference genomes, encompassed most deviant SNPs, and could be partially associated with transposons and tandem repeats. Including these deviant SNPs in genomic analyses led to highly distorted site frequency spectra, underestimated pairwise FST values, and overestimated nucleotide diversity. Considering the widespread occurrence of deviant SNPs arising from a variety of sources, their important impact in estimating population parameters, and the availability of effective tools to identify them, we propose that excluding deviant SNPs from WGS datasets is required to improve genomic inferences for a wide range of taxa and sequencing depths.
Affiliation(s)
- Xavier Dallaire
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
- Centre d'Études Nordiques, Université Laval, Québec, Canada
| | - Raphael Bouchard
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
- Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
| | - Philippe Hénault
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
- Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
| | - Gabriela Ulmo-Diaz
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
- Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
| | - Eric Normandeau
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
- Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
- Plateforme de bio-informatique de l’IBIS, Université Laval, Québec, Canada
| | - Claire Mérot
- CNRS, UMR 6553 ECOBIO, Université de Rennes, Rennes, France
| | - Louis Bernatchez
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
- Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
| | - Jean-Sébastien Moore
- Institut de biologie intégrative et des systèmes, Université Laval, Québec, Canada
- Centre d'Études Nordiques, Université Laval, Québec, Canada
- Ressources Aquatique Québec, Université de Rimouski, Rimouski, Canada
| |
|
15
|
Bramsiepe J, Krabberød AK, Bjerkan KN, Alling RM, Johannessen IM, Hornslien KS, Miller JR, Brysting AK, Grini PE. Structural evidence for MADS-box type I family expansion seen in new assemblies of Arabidopsis arenosa and A. lyrata. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2023; 116:942-961. [PMID: 37517071 DOI: 10.1111/tpj.16401] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/22/2023] [Revised: 05/24/2023] [Accepted: 07/13/2023] [Indexed: 08/01/2023]
Abstract
Arabidopsis thaliana diverged from A. arenosa and A. lyrata at least 6 million years ago. The three species differ by genome-wide polymorphisms and morphological traits. The species are to a high degree reproductively isolated, but hybridization barriers are incomplete. A special type of hybridization barrier is based on the triploid endosperm of the seed, where embryo lethality is caused by endosperm failure to support the developing embryo. The MADS-box type I family of transcription factors is specifically expressed in the endosperm and has been proposed to play a role in endosperm-based hybridization barriers. The gene family is well known for its high evolutionary duplication rate, as well as being regulated by genomic imprinting. Here we address MADS-box type I gene family evolution and the role of type I genes in the context of hybridization. Using two de-novo assembled and annotated chromosome-level genomes of A. arenosa and A. lyrata ssp. petraea we analyzed the MADS-box type I gene family in Arabidopsis to predict orthologs, copy number, and structural genomic variation related to the type I loci. Our findings were compared to gene expression profiles sampled before and after the transition to endosperm cellularization in order to investigate the involvement of MADS-box type I loci in endosperm-based hybridization barriers. We observed substantial differences in type-I expression in the endosperm of A. arenosa and A. lyrata ssp. petraea, suggesting a genetic cause for the endosperm-based hybridization barrier between A. arenosa and A. lyrata ssp. petraea.
Affiliation(s)
- Jonathan Bramsiepe
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
- CEES, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
| | - Anders K Krabberød
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
| | - Katrine N Bjerkan
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
- CEES, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
| | - Renate M Alling
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
- CEES, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
| | - Ida M Johannessen
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
| | - Karina S Hornslien
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
| | - Jason R Miller
- College of STEM, Shepherd University, Shepherdstown, West Virginia, 25443-5000, USA
| | - Anne K Brysting
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
- CEES, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
| | - Paul E Grini
- Section for Genetics and Evolutionary Biology, Department of Biosciences, University of Oslo, 0316, Oslo, Norway
| |
|
16
|
Pisupati R, Nizhynska V, Mollá Morales A, Nordborg M. On the causes of gene-body methylation variation in Arabidopsis thaliana. PLoS Genet 2023; 19:e1010728. [PMID: 37141384 DOI: 10.1371/journal.pgen.1010728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 05/16/2023] [Accepted: 03/31/2023] [Indexed: 05/06/2023] Open
Abstract
Gene-body methylation (gbM) refers to sparse CG methylation of coding regions, which is especially prominent in evolutionarily conserved house-keeping genes. It is found in both plants and animals, but is directly and stably (epigenetically) inherited over multiple generations in the former. Studies in Arabidopsis thaliana have demonstrated that plants originating from different parts of the world exhibit genome-wide differences in gbM, which could reflect direct selection on gbM, but which could also reflect an epigenetic memory of ancestral genetic and/or environmental factors. Here we look for evidence of such factors in F2 plants resulting from a cross between a southern Swedish line with low gbM and a northern Swedish line with high gbM, grown at two different temperatures. Using bisulfite-sequencing data with nucleotide-level resolution on hundreds of individuals, we confirm that CG sites are either methylated (nearly 100% methylation across sampled cells) or unmethylated (approximately 0% methylation across sampled cells), and show that the higher level of gbM in the northern line is due to more sites being methylated. Furthermore, methylation variants almost always show Mendelian segregation, consistent with their being directly and stably inherited through meiosis. To explore how the differences between the parental lines could have arisen, we focused on somatic deviations from the inherited state, distinguishing between gains (relative to the inherited 0% methylation) and losses (relative to the inherited 100% methylation) at each site in the F2 generation. We demonstrate that deviations predominantly affect sites that differ between the parental lines, consistent with these sites being more mutable. Gains and losses behave very differently in terms of the genomic distribution, and are influenced by the local chromatin state. 
We find clear evidence for different trans-acting genetic polymorphisms affecting gains and losses, with those affecting gains showing strong environmental interactions (G×E). Direct effects of the environment were minimal. In conclusion, we show that genetic and environmental factors can change gbM at a cellular level, and hypothesize that these factors can also lead to transgenerational differences between individuals via the inclusion of such changes in the zygote. If true, this could explain the geographic pattern of gbM with selection, and would cast doubt on estimates of epimutation rates from inbred lines in constant environments.
Affiliation(s)
- Rahul Pisupati
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna BioCenter (VBC), Vienna, Austria
- Vienna Graduate School of Population Genetics, Institut für Populationsgenetik, Vetmeduni, Vienna, Austria
| | - Viktoria Nizhynska
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna BioCenter (VBC), Vienna, Austria
| | - Almudena Mollá Morales
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna BioCenter (VBC), Vienna, Austria
| | - Magnus Nordborg
- Gregor Mendel Institute, Austrian Academy of Sciences, Vienna BioCenter (VBC), Vienna, Austria
| |
|