1
Mirchandani CD, Shultz AJ, Thomas GWC, Smith SJ, Baylis M, Arnold B, Corbett-Detig R, Enbody E, Sackton TB. A Fast, Reproducible, High-throughput Variant Calling Workflow for Population Genomics. Mol Biol Evol 2024; 41:msad270. [PMID: 38069903 PMCID: PMC10764099 DOI: 10.1093/molbev/msad270] [Received: 06/22/2023] [Revised: 10/27/2023] [Accepted: 11/22/2023] [Indexed: 01/05/2024]
Abstract
The increasing availability of genomic resequencing data sets and high-quality reference genomes across the tree of life presents exciting opportunities for comparative population genomic studies. However, substantial challenges prevent the simple reuse of data across different studies and species, arising from variability in variant calling pipelines, data quality, and the need for computationally intensive reanalysis. Here, we present snpArcher, a flexible and highly efficient workflow designed for the analysis of genomic resequencing data in nonmodel organisms. snpArcher provides a standardized variant calling pipeline and includes modules for variant quality control, data visualization, variant filtering, and other downstream analyses. Implemented in Snakemake, snpArcher is user-friendly, reproducible, and designed to be compatible with high-performance computing clusters and cloud environments. To demonstrate the flexibility of this pipeline, we applied snpArcher to 26 public resequencing data sets from nonmammalian vertebrates. These variant data sets are hosted publicly to enable future comparative population genomic analyses. With its extensibility and the availability of public data sets, snpArcher will contribute to a broader understanding of genetic variation across species by facilitating the rapid use and reuse of large genomic data sets.
Affiliation(s)
- Cade D Mirchandani
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
  - Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Allison J Shultz
  - Ornithology Department, Natural History Museum of Los Angeles County, Los Angeles, CA 90007, USA
- Sara J Smith
  - Informatics Group, Harvard University, Cambridge, MA, USA
  - Biology, Mount Royal University, Calgary, AB T3E 6K6, Canada
- Mara Baylis
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
  - Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Brian Arnold
  - Department of Ecology and Evolutionary Biology, Princeton University, Princeton, NJ, USA
  - Center for Statistics and Machine Learning, Princeton University, Princeton, NJ, USA
- Russ Corbett-Detig
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
  - Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
- Erik Enbody
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA
2
Kato S, Arakaki S, Nagano AJ, Kikuchi K, Hirase S. Genomic landscape of introgression from the ghost lineage in a gobiid fish uncovers the generality of forces shaping hybrid genomes. Mol Ecol 2023. [PMID: 38047388 DOI: 10.1111/mec.17216] [Received: 07/14/2023] [Revised: 09/23/2023] [Accepted: 10/26/2023] [Indexed: 12/05/2023]
Abstract
Extinct lineages can leave legacies in the genomes of extant lineages through ancient introgressive hybridization. The patterns of genomic survival of these extinct lineages provide insight into the role of extinct lineages in current biodiversity. However, our understanding of the genomic landscape of introgression from extinct lineages remains limited due to challenges associated with locating the traces of unsampled 'ghost' extinct lineages without ancient genomes. Herein, we conducted population genomic analyses on the East China Sea (ECS) lineage of Chaenogobius annularis, which was suspected to have originated from ghost introgression, with the aim of elucidating its genomic origins and characterizing its landscape of introgression. By combining phylogeographic analysis and demographic modelling, we demonstrated that the ECS lineage originated from ancient hybridization with an extinct ghost lineage. Forward simulations based on the estimated demography indicated that the statistic γ of the HyDe analysis can be used to distinguish differences in local introgression rates in our data. Consistent with introgression between extant organisms, we found reduced introgression from the extinct lineage in regions with low recombination rates and in regions of functional importance, suggesting that linked selection, which has eliminated ancestry from the extinct lineage, plays a role in shaping the hybrid genome. Moreover, we identified enrichment of repetitive elements in regions associated with ghost introgression, a pattern that was hitherto little known but was also observed in the re-analysis of published data on introgression between extant organisms. Overall, our findings underscore the unexpected similarities in the characteristics of introgression landscapes across different taxa, even in cases of ghost introgression.
Affiliation(s)
- Shuya Kato
  - Fisheries Laboratory, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Hamamatsu, Shizuoka, Japan
- Seiji Arakaki
  - Amakusa Marine Biological Laboratory, Kyushu University, Amakusa, Kumamoto, Japan
- Atsushi J Nagano
  - Department of Life Sciences, Faculty of Agriculture, Ryukoku University, Ōtsu, Shiga, Japan
  - Institute for Advanced Biosciences, Keio University, Tsuruoka, Yamagata, Japan
- Kiyoshi Kikuchi
  - Fisheries Laboratory, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Hamamatsu, Shizuoka, Japan
- Shotaro Hirase
  - Fisheries Laboratory, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Hamamatsu, Shizuoka, Japan
3
Resolving species-level diversity of Beringiana and Sinanodonta mussels (Bivalvia: Unionidae) in the Japanese archipelago using genome-wide data. Mol Phylogenet Evol 2022; 175:107563. [PMID: 35809852 DOI: 10.1016/j.ympev.2022.107563] [Received: 11/07/2021] [Revised: 06/04/2022] [Accepted: 06/27/2022] [Indexed: 11/22/2022]
Abstract
Accurate species identification is of primary importance in ecology and evolutionary biology. For a long time, the unionid mussels Beringiana and Sinanodonta have puzzled researchers trying to unravel their diversity because of their poorly discernible morphologies. A recent study conducted species delineation of unionid mussels based on mitochondrial DNA variation, opening up a new avenue to grasp the species diversity of the mussels. However, mtDNA-based classification may not align with species boundaries because mtDNA is prone to introgression and incomplete lineage sorting, which cause discordance between species affiliation and gene phylogeny. In this study, we evaluated the validity of the mtDNA-based classification of the unionid mussels Beringiana and Sinanodonta in Japan using mitochondrial sequence data, double digest restriction site-associated DNA library (ddRAD) sequencing, and morphological data. We found significant inconsistencies between the mitochondrial and nuclear DNA phylogenies, casting doubt on the reliability of the mtDNA-based classification in this group. In addition, the nuclear DNA phylogeny revealed that there are at least two unionid lineages hidden in the mtDNA phylogeny. Although molecular dating indicates that Beringiana and Sinanodonta diverged >35 million years ago, their shell morphologies are often indistinguishable. Specifically, morphological analyses revealed the parallel appearance of nearly identical ball-like shell forms in the two genera in Lake Biwa, which further complicates species identification and the interpretation of morphological evolution in unionid mussels. Our study adds to a growing body of literature showing that accurate species identification of unionid mussels is difficult when using morphological characters alone. Although mtDNA-based classification is a simple and convenient way to classify unionid mussels, considerable caution is warranted for its application in ecological and evolutionary studies.
4
Yamazaki D, Ito S, Miura O, Sasaki T, Chiba S. High-throughput SNPs dataset reveal restricted population connectivity of marine gastropod within the narrow distribution range of peripheral oceanic islands. Sci Rep 2022; 12:2119. [PMID: 35136087 PMCID: PMC8825847 DOI: 10.1038/s41598-022-05026-z] [Received: 07/16/2021] [Accepted: 12/29/2021] [Indexed: 11/25/2022]
Abstract
Molecular studies based on high-resolution genetic markers help us grasp the factors shaping the genetic structure of marine organisms. Ecological factors linked to life history traits have often explained the process of genetic structuring in open and connectable oceanic environments. In addition, population genetic divergence can be affected by habitat fragmentation, ocean currents, and past geographical events. In the present study, we demonstrated the genetic differentiation of the marine gastropod Monodonta sp. within a narrow range of peripheral oceanic islands, the Ogasawara Islands. Genetic analyses were performed not only with a mitochondrial DNA marker but also with a high-throughput SNP dataset obtained by ddRAD-seq. The mtDNA analyses did not show genetic divergence among populations, whereas the SNP dataset detected population genetic differentiation. Population demographic analyses and gene flow estimation suggested that the genetic structure was formed by sea-level fluctuations associated with past climate change and regulated by temporal oceanographic conditions. These findings provide important insights into population genetic patterns in open and connectable environments.
Affiliation(s)
- Daishi Yamazaki
  - Center for Northeast Asian Studies, Tohoku University, 41 Kawauchi, Aoba-ku, Sendai, Miyagi, 980-8576, Japan
- Shun Ito
  - Graduate School of Life Science, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai, Miyagi, 980-8577, Japan
- Osamu Miura
  - Faculty of Agriculture and Marine Science, Kochi University, 200 Monobe, Nankoku, Kochi, 783-8502, Japan
- Tetsuro Sasaki
  - Institute of Boninology, Chichijima-Aza-Nishimachi, Ogasawara, Tokyo, 100-2101, Japan
- Satoshi Chiba
  - Center for Northeast Asian Studies, Tohoku University, 41 Kawauchi, Aoba-ku, Sendai, Miyagi, 980-8576, Japan
  - Graduate School of Life Science, Tohoku University, 2-1-1 Katahira, Aoba-ku, Sendai, Miyagi, 980-8577, Japan