1
|
A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats. Cell Genomics 2024; 4:100527. [PMID: 38537634 PMCID: PMC11019364 DOI: 10.1016/j.xgen.2024.100527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 12/26/2023] [Accepted: 02/29/2024] [Indexed: 04/09/2024]
Abstract
The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared with its predecessor. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomics datasets. We jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ~20.0 million sequence variations, of which 18,700 are predicted to potentially impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, along with the extensive analysis of genomic variations among rat strains, enhances our understanding of the rat genome, providing researchers with an expanded resource for studies involving rats.
|
2
|
The genome and population genomics of allopolyploid Coffea arabica reveal the diversification history of modern coffee cultivars. Nat Genet 2024; 56:721-731. [PMID: 38622339 PMCID: PMC11018527 DOI: 10.1038/s41588-024-01695-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 02/23/2024] [Indexed: 04/17/2024]
Abstract
Coffea arabica, an allotetraploid hybrid of Coffea eugenioides and Coffea canephora, is the source of approximately 60% of coffee products worldwide, and its cultivated accessions have undergone several population bottlenecks. We present chromosome-level assemblies of a di-haploid C. arabica accession and modern representatives of its diploid progenitors, C. eugenioides and C. canephora. The three species exhibit largely conserved genome structures between diploid parents and descendant subgenomes, with no obvious global subgenome dominance. We find evidence for a founding polyploidy event 350,000-610,000 years ago, followed by several pre-domestication bottlenecks, resulting in narrow genetic variation. A split between wild accessions and cultivar progenitors occurred ~30.5 thousand years ago, followed by a period of migration between the two populations. Analysis of modern varieties, including lines historically introgressed with C. canephora, highlights their breeding histories and loci that may contribute to pathogen resistance, laying the groundwork for future genomics-based breeding of C. arabica.
|
3
|
Dissecting the genetic architecture of quantitative traits using genome-wide identity-by-descent sharing. Mol Ecol 2024; 33:e17299. [PMID: 38380534 DOI: 10.1111/mec.17299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2023] [Revised: 01/08/2024] [Accepted: 01/22/2024] [Indexed: 02/22/2024]
Abstract
Additive and dominance genetic variances underlying the expression of quantitative traits are important quantities for predicting short-term responses to selection, but they are notoriously challenging to estimate in most non-model wild populations. Specifically, large-sized or panmictic populations may be characterized by low variance in genetic relatedness among individuals which, in turn, can prevent accurate estimation of quantitative genetic parameters. We used estimates of genome-wide identity-by-descent (IBD) sharing from autosomal SNP loci to estimate quantitative genetic parameters for ecologically important traits in nine-spined sticklebacks (Pungitius pungitius) from a large, outbred population. Using empirical and simulated datasets, with varying sample sizes and pedigree complexity, we assessed the performance of different crossing schemes in estimating additive genetic variance and heritability for all traits. We found that low variance in relatedness characteristic of wild outbred populations with high migration rate can impair the estimation of quantitative genetic parameters and bias heritability estimates downwards. On the other hand, the use of a half-sib/full-sib design allowed precise estimation of genetic variance components and revealed significant additive variance and heritability for all measured traits, with negligible dominance contributions. Genome-partitioning and QTL mapping analyses revealed that most traits had a polygenic basis and were controlled by genes at multiple chromosomes. Furthermore, different QTL contributed to variation in the same traits in different populations suggesting heterogeneous underpinnings of parallel evolution at the phenotypic level. Our results provide important guidelines for future studies aimed at estimating adaptive potential in the wild, particularly for those conducted in outbred large-sized populations.
|
4
|
A revamped rat reference genome improves the discovery of genetic diversity in laboratory rats. bioRxiv 2023:2023.04.13.536694. [PMID: 37214860 PMCID: PMC10197727 DOI: 10.1101/2023.04.13.536694] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/24/2023]
Abstract
The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared to its predecessor. Gene annotations are now more complete, significantly improving the mapping precision of genomic, transcriptomic, and proteomics data sets. We jointly analyzed 163 short-read whole genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ~20.0 million sequence variations, of which 18.7 thousand are predicted to potentially impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, along with the extensive analysis of genomic variations among rat strains, enhances our understanding of the rat genome, providing researchers with an expanded resource for studies involving rats.
|
5
|
Evolutionary dynamics of genome size and content during the adaptive radiation of Heliconiini butterflies. Nat Commun 2023; 14:5620. [PMID: 37699868 PMCID: PMC10497600 DOI: 10.1038/s41467-023-41412-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/25/2022] [Accepted: 08/30/2023] [Indexed: 09/14/2023] Open
Abstract
Heliconius butterflies, a speciose genus of Müllerian mimics, represent a classic example of an adaptive radiation that includes a range of derived dietary, life history, physiological and neural traits. However, key lineages within the genus, and across the broader Heliconiini tribe, lack genomic resources, limiting our understanding of how adaptive and neutral processes shaped genome evolution during their radiation. Here, we generate highly contiguous genome assemblies for nine Heliconiini, 29 additional reference-assembled genomes, and improve 10 existing assemblies. Altogether, we provide a dataset of annotated genomes for a total of 63 species, including 58 species within the Heliconiini tribe. We use this extensive dataset to generate a robust and dated heliconiine phylogeny, describe major patterns of introgression, explore the evolution of genome architecture, and the genomic basis of key innovations in this enigmatic group, including an assessment of the evolution of putative regulatory regions at the Heliconius stem. Our work illustrates how the increased resolution provided by such dense genomic sampling improves our power to generate and test gene-phenotype hypotheses, and precisely characterize how genomes evolve.
|
6
|
Inbreeding depression in an outbred stickleback population. Mol Ecol 2023. [PMID: 37000426 DOI: 10.1111/mec.16946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 03/22/2023] [Accepted: 03/27/2023] [Indexed: 04/01/2023]
Abstract
Inbreeding depression refers to the reduced fitness of offspring produced by genetically-related individuals and is expected to be rare in large, outbred populations. When it occurs, marked fitness loss is possible as large populations can carry a substantial load of recessive harmful mutations which are normally sheltered at the heterozygous state. Using experimental cross data and genome-wide identity-by-descent (IBD) relationships from an outbred marine nine-spined stickleback (Pungitius pungitius) population, we documented a significant decrease in offspring survival probability with increasing parental IBD sharing associated with an average inbreeding load (B) of 10.5. Interestingly, we found that this relationship was also underlined by a positive effect of paternal inbreeding coefficient on offspring survival, suggesting that certain combinations of parental inbreeding and genetic relatedness among mates may promote offspring survival. Our results demonstrate the potential for substantial inbreeding load in an outbred population and emphasize the need to consider fine-scale genetic relatedness in future studies of inbreeding depression in the wild.
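To give a sense of scale for the inbreeding load reported in the abstract above, here is a minimal sketch of what B = 10.5 would imply under the standard log-linear model, in which survival declines as exp(-B·F) with the offspring inbreeding coefficient F. Reading B as the slope of -ln(survival) on F is an assumption, since the abstract does not state the fitted model. The function name and the example F values are hypothetical.

```python
import math

def relative_survival(inbreeding_load, f):
    """Survival of offspring with inbreeding coefficient f, relative to f = 0.

    Assumes ln(survival) = -(A + B * f), so the ratio to outbred offspring
    is exp(-B * f). This is an illustrative model, not the paper's analysis.
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("inbreeding coefficient must lie in [0, 1]")
    return math.exp(-inbreeding_load * f)

B = 10.5  # average inbreeding load reported in the abstract
for f in (0.0, 0.0625, 0.125, 0.25):  # illustrative values only
    print(f"F = {f:.4f}: relative survival = {relative_survival(B, f):.3f}")
```

Under this assumed model, even modest increases in F translate into steep drops in expected survival, which is why a load of this size is notable for an outbred population.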
|
7
|
Evidence for a single, ancient origin of a genus-wide alternative life history strategy. Science Advances 2023; 9:eabq3713. [PMID: 36947619 PMCID: PMC10032607 DOI: 10.1126/sciadv.abq3713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 02/21/2023] [Indexed: 06/18/2023]
Abstract
Understanding the evolutionary origins and factors maintaining alternative life history strategies (ALHS) within species is a major goal of evolutionary research. While alternative alleles causing discrete ALHS are expected to purge or fix over time, one-third of the ~90 species of Colias butterflies are polymorphic for a female-limited ALHS called Alba. Whether Alba arose once, evolved in parallel, or has been exchanged among taxa is currently unknown. Using comparative genome-wide association study (GWAS) and population genomic analyses, we placed the genetic basis of Alba in a time-calibrated phylogenomic framework, revealing that Alba evolved once near the base of the genus and has been subsequently maintained via introgression and balancing selection. CRISPR-Cas9 mutagenesis was then used to verify a putative cis-regulatory region of Alba, which we identified using phylogenetic footprinting. We hypothesize that this cis-regulatory region acts as a modular enhancer for the induction of the Alba ALHS, which has likely facilitated its long evolutionary persistence.
|
8
|
Fragmented habitat compensates for the adverse effects of genetic bottleneck. Curr Biol 2023; 33:1009-1018.e7. [PMID: 36822202 DOI: 10.1016/j.cub.2023.01.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2022] [Revised: 12/01/2022] [Accepted: 01/19/2023] [Indexed: 02/25/2023]
Abstract
In the face of the human-caused biodiversity crisis, understanding the theoretical basis of conservation efforts of endangered species and populations has become increasingly important. According to population genetics theory, population subdivision helps organisms retain genetic diversity, crucial for adaptation in a changing environment. Habitat topography is thought to be important for generating and maintaining population subdivision, but empirical cases are needed to test this assumption. We studied Saimaa ringed seals, landlocked in a labyrinthine lake and recovering from a drastic bottleneck, with additional samples from three other ringed seal subspecies. Using whole-genome sequences of 145 seals, we analyzed the distribution of variation and genetic relatedness among the individuals in relation to the habitat shape. Despite a severe history of genetic bottlenecks with prevalent homozygosity in Saimaa ringed seals, we found evidence for the population structure mirroring the subregions of the lake. Our genome-wide analyses showed that the subpopulations had retained unique variation and largely complementary patterns of homozygosity, highlighting the significance of habitat connectivity in conservation biology and the power of genomic tools in understanding its impact. The central role of the population substructure in preserving genetic diversity at the metapopulation level was confirmed by simulations. Integration of genetic analyses in conservation decisions gives hope to Saimaa ringed seals and other endangered species in fragmented habitats.
|
9
|
Genome properties of key oil palm (Elaeis guineensis Jacq.) breeding populations. J Appl Genet 2022; 63:633-650. [PMID: 35691996 DOI: 10.1007/s13353-022-00708-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2022] [Revised: 05/26/2022] [Accepted: 06/04/2022] [Indexed: 11/29/2022]
Abstract
A good knowledge of the genome properties of the populations makes it possible to optimize breeding methods, in particular genomic selection (GS). In oil palm (Elaeis guineensis Jacq.), the world's main source of vegetable oil, this would provide insight into the promising GS results obtained so far. The present study considered two complex breeding populations, Deli and La Mé, with 943 individuals and 7324 single-nucleotide polymorphisms (SNPs) from genotyping-by-sequencing. Linkage disequilibrium (LD), haplotype sharing, effective size (Ne), and fixation index (Fst) were investigated. A genetic linkage map spanning 1778.52 cM and with a recombination rate of 2.85 cM/Mbp was constructed. The LD at r² = 0.3, considered the minimum to get reliable GS results, spanned over 1.05 cM/0.22 Mbp in Deli and 0.9 cM/0.21 Mbp in La Mé. The significant degree of differentiation existing between Deli and La Mé was confirmed by the high Fst value (0.53), the pattern of correlation of SNP heterozygosity and allele frequency among populations, and the decrease of persistence of LD and of haplotype sharing among populations with increasing SNP distance. However, the level of resemblance between the two populations over short genomic distances (correlation of r values between populations >0.6 for SNPs separated by <0.5 cM/1 kbp and percentage of common haplotypes >40% for haplotypes <3600 bp/0.20 cM) likely explains the superiority of GS models ignoring the parental origin of marker alleles over models taking this information into account. The two populations had low Ne (<5). Population-specific genetic maps and reference genomes are recommended for future studies.
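The map length and genome-wide recombination rate quoted in the abstract above are linked by simple division. The sketch below back-calculates the physical length the two figures imply. That length is not stated in the abstract, so it is an inference for illustration, and the function names are hypothetical.

```python
def recombination_rate(map_length_cm, physical_length_mbp):
    """Average recombination rate in cM per Mbp."""
    if physical_length_mbp <= 0:
        raise ValueError("physical length must be positive")
    return map_length_cm / physical_length_mbp

def implied_physical_length(map_length_cm, rate_cm_per_mbp):
    """Physical length (Mbp) implied by a map length and an average rate."""
    if rate_cm_per_mbp <= 0:
        raise ValueError("rate must be positive")
    return map_length_cm / rate_cm_per_mbp

MAP_LENGTH_CM = 1778.52  # linkage map length from the abstract
RATE_CM_PER_MBP = 2.85   # recombination rate from the abstract

# Back-calculated value, not a figure reported in the abstract.
implied = implied_physical_length(MAP_LENGTH_CM, RATE_CM_PER_MBP)
print(f"Implied mapped physical length: {implied:.0f} Mbp")
```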
|
10
|
The genetic basis of structural colour variation in mimetic Heliconius butterflies. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200505. [PMID: 35634924 PMCID: PMC9149798 DOI: 10.1098/rstb.2020.0505] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022] Open
Abstract
Structural colours, produced by the reflection of light from ultrastructures, have evolved multiple times in butterflies. Unlike pigmentary colours and patterns, little is known about the genetic basis of these colours. Reflective structures on wing-scale ridges are responsible for iridescent structural colour in many butterflies, including the Müllerian mimics Heliconius erato and Heliconius melpomene. Here, we quantify aspects of scale ultrastructure variation and colour in crosses between iridescent and non-iridescent subspecies of both of these species and perform quantitative trait locus (QTL) mapping. We show that iridescent structural colour has a complex genetic basis in both species, with offspring from crosses having a wide variation in blue colour (both hue and brightness) and scale structure measurements. We detect two different genomic regions in each species that explain modest amounts of this variation, with a sex-linked QTL in H. erato but not H. melpomene. We also find differences between species in the relationships between structure and colour, overall suggesting that these species have followed different evolutionary trajectories in their evolution of structural colour. We then identify genes within the QTL intervals that are differentially expressed between subspecies and/or wing regions, revealing likely candidates for genes controlling structural colour formation. This article is part of the theme issue ‘Genetic basis of adaptation and speciation: from loci to causative mutations’.
|
11
|
Recombination landscape dimorphism and sex chromosome evolution in the dioecious plant Rumex hastatulus. Philos Trans R Soc Lond B Biol Sci 2022; 377:20210226. [PMID: 35306892 PMCID: PMC8935318 DOI: 10.1098/rstb.2021.0226] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/01/2023] Open
Abstract
There is growing evidence from diverse taxa for sex differences in the genomic landscape of recombination, but the causes and consequences of these differences remain poorly understood. Strong recombination landscape dimorphism between the sexes could have important implications for the dynamics of sex chromosome evolution because low recombination in the heterogametic sex can favour the spread of sexually antagonistic alleles. Here, we present a sex-specific linkage map and revised genome assembly of Rumex hastatulus and provide the first evidence and characterization of sex differences in recombination landscape in a dioecious plant. We present data on significant sex differences in recombination, with regions of very low recombination in males covering over half of the genome. This pattern is evident on both sex chromosomes and autosomes, suggesting that pre-existing differences in recombination may have contributed to sex chromosome formation and divergence. Our analysis of segregation distortion suggests that haploid selection due to pollen competition occurs disproportionately in regions with low male recombination. We hypothesize that sex differences in the recombination landscape have contributed to the formation of a large heteromorphic pair of sex chromosomes in R. hastatulus, but more comparative analyses of recombination will be important to investigate this hypothesis further. This article is part of the theme issue 'Sex determination and sex chromosome evolution in land plants'.
|
12
|
Improved chromosome-level genome assembly of the Glanville fritillary butterfly (Melitaea cinxia) integrating Pacific Biosciences long reads and a high-density linkage map. Gigascience 2022; 11:6505122. [PMID: 35022701 PMCID: PMC8756199 DOI: 10.1093/gigascience/giab097] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/02/2020] [Revised: 05/03/2021] [Accepted: 12/14/2021] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND The Glanville fritillary (Melitaea cinxia) butterfly is a model system for metapopulation dynamics research in fragmented landscapes. Here, we provide a chromosome-level assembly of the butterfly's genome produced from Pacific Biosciences sequencing of a pool of males, combined with a linkage map from population crosses. RESULTS The final assembly size of 484 Mb is an increase of 94 Mb on the previously published genome. Estimation of the completeness of the genome with BUSCO indicates that the genome contains 92-94% of the BUSCO genes in complete and single copies. We predicted 14,810 genes using the MAKER pipeline and manually curated 1,232 of these gene models. CONCLUSIONS The genome and its annotated gene models are a valuable resource for future comparative genomics, molecular biology, transcriptome, and genetics studies on this species.
|
13
|
Automated improvement of stickleback reference genome assemblies with Lep-Anchor software. Mol Ecol Resour 2021; 21:2166-2176. [PMID: 33955177 DOI: 10.1111/1755-0998.13404] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/16/2020] [Revised: 04/12/2021] [Accepted: 04/13/2021] [Indexed: 01/06/2023]
Abstract
We describe an integrative approach to improve contiguity and haploidy of a reference genome assembly and demonstrate its impact with practical examples. With two novel features of Lep-Anchor software and a combination of dense linkage maps, overlap detection and bridging long reads, we generated an improved assembly of the nine-spined stickleback (Pungitius pungitius) reference genome. We were able to remove a significant number of haplotypic contigs, detect more genetic variation and improve the contiguity of the genome, especially that of the X chromosome. However, improved scaffolding cannot correct for mosaicism of erroneously assembled contigs, demonstrated by a de novo assembly of a 1.6-Mbp inversion. Qualitatively similar gains were obtained with the genome of the three-spined stickleback (Gasterosteus aculeatus). Since the utility of genome-wide sequencing data in biological research depends heavily on the quality of the reference genome, the improved and fully automated approach described here should be helpful in refining reference genome assemblies.
|
14
|
Abstract
The comma butterfly (Polygonia c-album, Nymphalidae, Lepidoptera) is a model insect species, most notably in the study of phenotypic plasticity and plant-insect coevolutionary interactions. In order to facilitate the integration of genomic tools with a diverse body of ecological and evolutionary research, we assembled the genome of a Swedish comma using 10X sequencing, scaffolding with matepair data, genome polishing, and assignment to linkage groups using a high-density linkage map. The resulting genome is 373 Mb in size, with a scaffold N50 of 11.7 Mb and contig N50 of 11.2 Mb. The genome contained 90.1% of single-copy Lepidopteran orthologs in a BUSCO analysis of 5,286 genes. A total of 21,004 gene models were annotated on the genome using RNA-Seq data from larval and adult tissue in combination with proteins from the Arthropoda database, resulting in a high-quality annotation for which functional annotations were generated. We further documented the quality of the chromosomal assembly via synteny assessment with Melitaea cinxia. The resulting annotated, chromosome-level genome will provide an important resource for investigating coevolutionary dynamics and comparative analyses in Lepidoptera.
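Several abstracts in this list report scaffold or contig N50 values. As a reminder of what that statistic measures, the following is a minimal sketch of the usual definition: the length L such that sequences of length at least L together cover at least half of the total assembly. The function name and the toy lengths are hypothetical and are not taken from any of the assemblies listed here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; N50 is the length of
    the sequence at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Toy example with made-up contig lengths (arbitrary units).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, it says nothing about whether the sequences are assembled correctly, which is why these abstracts pair it with BUSCO completeness or synteny checks.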
|
15
|
Genetic population structure constrains local adaptation in sticklebacks. Mol Ecol 2021; 30:1946-1961. [PMID: 33464655 DOI: 10.1111/mec.15808] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 07/31/2020] [Revised: 11/19/2020] [Accepted: 01/08/2021] [Indexed: 12/20/2022]
Abstract
Repeated and independent adaptation to specific environmental conditions from standing genetic variation is common. However, if genetic variation is limited, the evolution of similar locally adapted traits may be restricted to genetically different and potentially less optimal solutions or prevented from happening altogether. Using a quantitative trait locus (QTL) mapping approach, we identified the genomic regions responsible for the repeated pelvic reduction (PR) in three crosses between nine-spined stickleback populations expressing full and reduced pelvic structures. In one cross, PR mapped to linkage group 7 (LG7) containing the gene Pitx1, known to control pelvic reduction also in the three-spined stickleback. In the two other crosses, PR was polygenic and attributed to 10 novel QTL, of which 90% were unique to specific crosses. When screening the genomes from 27 different populations for deletions in the Pitx1 regulatory element, these were only found in the population in which PR mapped to LG7, even though the morphological data indicated large-effect QTL for PR in several other populations as well. Consistent with the available theory and simulations parameterized on empirical data, we hypothesize that the observed variability in genetic architecture of PR is due to heterogeneity in the spatial distribution of standing genetic variation caused by >2× stronger population structuring among freshwater populations and >10× stronger genetic isolation by distance in the sea in nine-spined sticklebacks as compared to three-spined sticklebacks.
|
16
|
A novel terpene synthase controls differences in anti-aphrodisiac pheromone production between closely related Heliconius butterflies. PLoS Biol 2021; 19:e3001022. [PMID: 33465061 PMCID: PMC7815096 DOI: 10.1371/journal.pbio.3001022] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/12/2020] [Accepted: 11/30/2020] [Indexed: 02/07/2023] Open
Abstract
Plants and insects often use the same compounds for chemical communication, but not much is known about the genetics of convergent evolution of chemical signals. The terpene (E)-β-ocimene is a common component of floral scent and is also used by the butterfly Heliconius melpomene as an anti-aphrodisiac pheromone. While the biosynthesis of terpenes has been described in plants and microorganisms, few terpene synthases (TPSs) have been identified in insects. Here, we study the recent divergence of 2 species, H. melpomene and Heliconius cydno, which differ in the presence of (E)-β-ocimene; combining linkage mapping, gene expression, and functional analyses, we identify 2 novel TPSs. Furthermore, we demonstrate that one, HmelOS, is able to synthesise (E)-β-ocimene in vitro. We find no evidence for TPS activity in HcydOS (HmelOS ortholog of H. cydno), suggesting that the loss of (E)-β-ocimene in this species is the result of coding, not regulatory, differences. The TPS enzymes we discovered are unrelated to previously described plant and insect TPSs, demonstrating that chemical convergence has independent evolutionary origins. Plants and insects often use the same compounds for chemical communication, but little is known about the convergent evolution of such chemical signals. This study identifies a novel terpene synthase involved in production of an anti-aphrodisiac pheromone by the butterfly Heliconius melpomene. This enzyme is unrelated to other insect terpene synthases, providing evidence that the ability to synthesise terpenes has arisen multiple times independently within the insects.
|
17
|
Lep-Anchor: automated construction of linkage map anchored haploid genomes. Bioinformatics 2020; 36:2359-2364. [PMID: 31913460 DOI: 10.1093/bioinformatics/btz978] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/10/2019] [Revised: 12/12/2019] [Accepted: 01/02/2020] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION Linkage mapping provides a practical way to anchor de novo genome assemblies into chromosomes and to detect chimeric or otherwise erroneous contigs. Such anchoring improves with a higher number of markers and individuals, as long as the mapping software can handle all the information. The recent software Lep-MAP3 can robustly construct linkage maps for millions of genotyped markers and on thousands of individuals, providing optimal maps for genome anchoring. For such large datasets, an automated and robust genome anchoring tool is especially valuable and can significantly reduce the intensive computational and manual work involved. RESULTS Here, we present the software Lep-Anchor (LA) to anchor genome assemblies automatically using dense linkage maps. As the main novelty, it takes into account the uncertainty of the linkage map positions caused by low recombination regions, cross type or poor mapping data quality. Furthermore, it can automatically detect and cut chimeric contigs, and use contig-contig, single read or alternative genome assembly alignments as additional information on contig order and orientations and to collapse haplotype contigs. We demonstrate the performance of LA using real data and show that it outperforms ALLMAPS on anchoring completeness and speed. Accuracy-wise LA and ALLMAPS are about equal, but at the expense of lower completeness of ALLMAPS. The software Chromonomer was faster than the other two methods but has major limitations and is lower in accuracy. We also show that with additional information, such as contig-contig and read alignments, the anchoring completeness can be improved by up to 70% without significant loss in accuracy. Based on simulated data, we conclude that the anchoring accuracy can be improved by utilizing information about map position uncertainty. Accuracy is the rate of contigs in correct orientation and completeness is the number of contigs with inferred orientation.
AVAILABILITY AND IMPLEMENTATION Lep-Anchor is available with the source code under GNU general public license from http://sourceforge.net/projects/lep-anchor. All the scripts and code used to produce the reported results are included with Lep-Anchor.
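The Lep-Anchor abstract above defines accuracy as the rate of contigs in correct orientation and completeness as the number of contigs with inferred orientation. The sketch below shows one way to compute both from a table of inferred and true orientations. It is not code from Lep-Anchor: the data layout and function name are hypothetical, and taking accuracy over only the contigs that received an orientation is an assumption about the denominator.

```python
def anchoring_metrics(inferred, truth):
    """Compute (accuracy, completeness) for contig orientations.

    inferred: dict mapping contig name to '+', '-', or None (not oriented)
    truth:    dict mapping contig name to the true orientation, '+' or '-'

    completeness = number of contigs with an inferred orientation
    accuracy     = share of those contigs whose orientation is correct
                   (assumed denominator; None if nothing was oriented)
    """
    oriented = {name: o for name, o in inferred.items() if o is not None}
    completeness = len(oriented)
    if completeness == 0:
        return None, 0
    correct = sum(1 for name, o in oriented.items() if truth[name] == o)
    return correct / completeness, completeness

# Made-up example: four contigs, one left unoriented, one oriented wrongly.
inferred = {"c1": "+", "c2": "-", "c3": None, "c4": "+"}
truth = {"c1": "+", "c2": "+", "c3": "-", "c4": "+"}
accuracy, completeness = anchoring_metrics(inferred, truth)
print(f"accuracy = {accuracy:.2f}, completeness = {completeness}")
```

Keeping the two measures separate matters for the comparison in the abstract, since a tool can raise its accuracy simply by leaving difficult contigs unoriented.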
|
18
|
Limited genetic parallels underlie convergent evolution of quantitative pattern variation in mimetic butterflies. J Evol Biol 2020; 33:1516-1529. [DOI: 10.1111/jeb.13704] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/11/2020] [Revised: 08/05/2020] [Accepted: 09/04/2020] [Indexed: 01/28/2023]
|
19
|
ELIMÄKI Locus Is Required for Vertical Proprioceptive Response in Birch Trees. Curr Biol 2020; 30:589-599.e5. [PMID: 32004453 DOI: 10.1016/j.cub.2019.12.016] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/22/2019] [Revised: 11/08/2019] [Accepted: 12/05/2019] [Indexed: 11/28/2022]
Abstract
Tree architecture has evolved to support a top-heavy above-ground biomass, but this integral feature poses a weight-induced challenge to trunk stability. Maintaining an upright stem is expected to require vertical proprioception through feedback between sensing stem weight and responding with radial growth. Despite its apparent importance, the principle by which plant stems respond to vertical loading forces remains largely unknown. Here, by manipulating the stem weight of downy birch (Betula pubescens) trees, we show that cambial development is modulated systemically along the stem. We carried out a genetic study on the underlying regulation by combining an accelerated birch flowering program with a recessive mutation at the ELIMÄKI locus (EKI), which causes a mechanically defective response to weight stimulus resulting in stem collapse after just 3 months. We observed delayed wood morphogenesis in eki compared with WT, along with a more mechanically elastic cambial zone and radial compression of xylem cell size, indicating that rapid tissue differentiation is critical for cambial growth under mechanical stress. Furthermore, the touch-induced mechanosensory pathway was transcriptionally misregulated in eki, indicating that the ELIMÄKI locus is required to integrate the weight-growth feedback regulation. By studying this birch mutant, we were able to dissect vertical proprioception from the gravitropic response associated with reaction wood formation. Our study provides evidence for both local and systemic responses to mechanical stimuli during secondary plant development.
|
20
|
A High-Quality Assembly of the Nine-Spined Stickleback (Pungitius pungitius) Genome. Genome Biol Evol 2019. [PMID: 31687752 DOI: 10.1101/741751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/13/2023]
Abstract
The Gasterosteidae fish family hosts several species that are important models for eco-evolutionary, genetic, and genomic research. In particular, a wealth of genetic and genomic data has been generated for the three-spined stickleback (Gasterosteus aculeatus), the "ecology's supermodel," whereas the genomic resources for the nine-spined stickleback (Pungitius pungitius) have remained relatively scarce. Here, we report a high-quality chromosome-level genome assembly of P. pungitius consisting of 5,303 contigs (N50 = 1.2 Mbp) with a total size of 521 Mbp. These contigs were mapped to 21 linkage groups using a high-density linkage map, yielding a final assembly with 98.5% BUSCO completeness. A total of 25,062 protein-coding genes were annotated, and about 23% of the assembly was found to consist of repetitive elements. A comprehensive analysis of repetitive elements uncovered centromere-specific tandem repeats and provided insights into the evolution of retrotransposons. A multigene phylogenetic analysis inferred a divergence time of about 26 million years ago (Ma) between nine- and three-spined sticklebacks, which is far older than the commonly assumed estimate of 13 Ma. Compared with the three-spined stickleback, we identified an additional duplication of several genes in the hemoglobin cluster. Sequencing data from populations adapted to different environments indicated potential copy number variations in hemoglobin genes. Furthermore, genome-wide synteny comparisons between three- and nine-spined sticklebacks identified chromosomal rearrangements underlying the karyotypic differences between the two species. The high-quality chromosome-scale assembly of the nine-spined stickleback genome obtained with long-read sequencing technology provides a crucial resource for comparative and population genomic investigations of stickleback fishes and teleosts.
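The contig N50 quoted above is a length-weighted median: the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the assembly size. A minimal sketch of that calculation follows. The function name and the toy lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the cumulative sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths in bp (not taken from the assembly).
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))
```

With these toy values the total is 1,500 bp, and the two longest contigs (500 + 400) are the first to cover half of it, so the function returns 400.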
|
21
|
A High-Quality Assembly of the Nine-Spined Stickleback (Pungitius pungitius) Genome. Genome Biol Evol 2019; 11:3291-3308. [PMID: 31687752 PMCID: PMC7145574 DOI: 10.1093/gbe/evz240] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Accepted: 10/30/2019] [Indexed: 12/22/2022]
|
22
|
The genetic architecture of adaptation: convergence and pleiotropy in Heliconius wing pattern evolution. Heredity (Edinb) 2019; 123:138-152. [PMID: 30670842 PMCID: PMC6781118 DOI: 10.1038/s41437-018-0180-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 09/06/2018] [Revised: 12/14/2018] [Accepted: 12/17/2018] [Indexed: 12/14/2022]
Abstract
Unravelling the genetic basis of adaptive traits is a major challenge in evolutionary biology. Doing so informs our understanding of evolution towards an adaptive optimum, the distribution of locus effect sizes, and the influence of genetic architecture on the evolvability of a trait. In the Müllerian co-mimics Heliconius melpomene and Heliconius erato some Mendelian loci affecting mimicry shifts are well known. However, several phenotypes in H. melpomene remain to be mapped, and the quantitative genetics of colour pattern variation has rarely been analysed. Here we use quantitative trait loci (QTL) analyses of crosses between H. melpomene races from Peru and Suriname to map, for the first time, the control of the broken band phenotype to WntA and identify a ~100 kb region controlling this variation. Additionally, we map variation in basal forewing red-orange pigmentation to a locus centred around the gene ventral veins lacking (vvl). The locus also appears to affect medial band shape variation as it was previously known to do in H. erato. This adds to the list of homologous regions controlling convergent phenotypes between these two species. Finally we show that Heliconius wing-patterning genes are strikingly pleiotropic among wing pattern traits. Our results demonstrate how genetic architecture can shape, aid and constrain adaptive evolution.
|
23
|
Early Sex-Chromosome Evolution in the Diploid Dioecious Plant Mercurialis annua. Genetics 2019; 212:815-835. [PMID: 31113811 PMCID: PMC6614902 DOI: 10.1534/genetics.119.302045] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 02/22/2019] [Accepted: 05/13/2019] [Indexed: 12/30/2022]
Abstract
Suppressed recombination allows divergence between homologous sex chromosomes and the functionality of their genes. Here, we reveal patterns of the earliest stages of sex-chromosome evolution in the diploid dioecious herb Mercurialis annua on the basis of cytological analysis, de novo genome assembly and annotation, genetic mapping, exome resequencing of natural populations, and transcriptome analysis. The genome assembly contained 34,105 expressed genes, of which 10,076 were assigned to linkage groups. Genetic mapping and exome resequencing of individuals across the species range both identified the largest linkage group, LG1, as the sex chromosome. Although the sex chromosomes of M. annua are karyotypically homomorphic, we estimate that about one-third of the Y chromosome, containing 568 transcripts and spanning 22.3 cM in the corresponding female map, has ceased recombining. Nevertheless, we found limited evidence for Y-chromosome degeneration in terms of gene loss and pseudogenization, and most X- and Y-linked genes appear to have diverged in the period subsequent to speciation between M. annua and its sister species M. huetii, which shares the same sex-determining region. Taken together, our results suggest that the M. annua Y chromosome has at least two evolutionary strata: a small old stratum shared with M. huetii, and a more recent larger stratum that is probably unique to M. annua and that stopped recombining ∼1 MYA. Patterns of gene expression within the nonrecombining region are consistent with the idea that sexually antagonistic selection may have played a role in favoring suppressed recombination.
|
24
|
Author Correction: Genome sequencing and population genomic analyses provide insights into the adaptive landscape of silver birch. Nat Genet 2019; 51:1187-1189. [PMID: 31197270 PMCID: PMC8076037 DOI: 10.1038/s41588-019-0442-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]
|
25
|
Unprecedented reorganization of holocentric chromosomes provides insights into the enigma of lepidopteran chromosome evolution. SCIENCE ADVANCES 2019; 5:eaau3648. [PMID: 31206013 PMCID: PMC6561736 DOI: 10.1126/sciadv.aau3648] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 06/11/2018] [Accepted: 05/03/2019] [Indexed: 05/04/2023]
Abstract
Chromosome evolution presents an enigma in the mega-diverse Lepidoptera. Most species exhibit constrained chromosome evolution with nearly identical haploid chromosome counts and chromosome-level gene collinearity among species more than 140 million years divergent. However, a few species possess radically inflated chromosomal counts due to extensive fission and fusion events. To address this enigma of constraint in the face of an exceptional ability to change, we investigated an unprecedented reorganization of the standard lepidopteran chromosome structure in the green-veined white butterfly (Pieris napi). We find that gene content in P. napi has been extensively rearranged in large collinear blocks, which until now have been masked by a haploid chromosome number close to the lepidopteran average. We observe that ancient chromosome ends have been maintained and collinear blocks are enriched for functionally related genes suggesting both a mechanism and a possible role for selection in determining the boundaries of these genome-wide rearrangements.
|
26
|
Abstract
BACKGROUND: With long reads getting even longer and cheaper, large-scale sequencing projects can be accomplished without short reads at an affordable cost. Due to the high error rates and less mature tools, de novo assembly of long reads is still challenging and often results in a large collection of contigs. Dense linkage maps are collections of markers whose location on the genome is approximately known. They therefore provide long-range information that has the potential to greatly aid de novo assembly. Previously, linkage maps have been used to detect misassemblies and to manually order contigs. However, no fully automated tools exist to incorporate linkage maps in assembly; instead, large amounts of manual labour are needed to order the contigs into chromosomes. RESULTS: We formulate the genome assembly problem in the presence of linkage maps and present the first method for guided genome assembly using linkage maps. Our method is based on an additional cleaning step added to the assembly. We show that it can simplify the underlying assembly graph, resulting in more contiguous assemblies and reducing the number of misassemblies when compared to de novo assembly. CONCLUSIONS: We present the first method to integrate linkage maps directly into genome assembly. With a modest increase in runtime, our method improves contiguity and correctness of genome assembly.
|
27
|
Abstract
The evolution of new species is made easier when traits under divergent ecological selection are also mating cues. Such ecological mating cues are now considered more common than previously thought, but we still know little about the genetic changes underlying their evolution or more generally about the genetic basis for assortative mating behaviors. Both tight physical linkage and the existence of large-effect preference loci will strengthen genetic associations between behavioral and ecological barriers, promoting the evolution of assortative mating. The warning patterns of Heliconius melpomene and H. cydno are under disruptive selection due to increased predation of nonmimetic hybrids and are used during mate recognition. We carried out a genome-wide quantitative trait locus (QTL) analysis of preference behaviors between these species and showed that divergent male preference has a simple genetic basis. We identify three QTLs that together explain a large proportion (approximately 60%) of the difference in preference behavior observed between the parental species. One of these QTLs is just 1.2 (0-4.8) centiMorgans (cM) from the major color pattern gene optix, and, individually, all three have a large effect on the preference phenotype. Genomic divergence between H. cydno and H. melpomene is high but broadly heterogenous, and admixture is reduced at the preference-optix color pattern locus but not the other preference QTLs. The simple genetic architecture we reveal will facilitate the evolution and maintenance of new species despite ongoing gene flow by coupling behavioral and ecological aspects of reproductive isolation.
|
28
|
Bracketing phenogenotypic limits of mammalian hybridization. ROYAL SOCIETY OPEN SCIENCE 2018; 5:180903. [PMID: 30564397 PMCID: PMC6281900 DOI: 10.1098/rsos.180903] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 06/06/2018] [Accepted: 10/29/2018] [Indexed: 05/09/2023]
Abstract
An increasing number of mammalian species have been shown to have a history of hybridization and introgression based on genetic analyses. Only relatively few fossils, however, preserve genetic material, and morphology must be used to identify the species and determine whether morphologically intermediate fossils could represent hybrids. Because dental and cranial fossils are typically the key body parts studied in mammalian palaeontology, here we bracket the potential for phenotypically extreme hybridizations by examining uniquely preserved cranio-dental material of a captive hybrid between grey and ringed seals. We analysed how distinct these species are genetically and morphologically, how easy it is to identify the hybrids using morphology and whether comparable hybridizations happen in the wild. We show that the genetic distance between these species is more than twice the modern human-Neanderthal distance, but still within that of morphologically similar species pairs known to hybridize. By contrast, morphological and developmental analyses show grey and ringed seals to be highly disparate, and that the hybrid is a predictable intermediate. Genetic analyses of the parent populations reveal introgression in the wild, suggesting that grey-ringed seal hybridization is not limited to captivity. Taken together, we postulate that there is considerable potential for mammalian hybridization between phenotypically disparate taxa.
|
29
|
Evolution at two time frames: Polymorphisms from an ancient singular divergence event fuel contemporary parallel evolution. PLoS Genet 2018; 14:e1007796. [PMID: 30422983 PMCID: PMC6258555 DOI: 10.1371/journal.pgen.1007796] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 03/13/2018] [Revised: 11/27/2018] [Accepted: 10/30/2018] [Indexed: 01/12/2023]
Abstract
When environments change, populations may adapt surprisingly fast, repeatedly and even at microgeographic scales. There is increasing evidence that such cases of rapid parallel evolution are fueled by standing genetic variation, but the source of this genetic variation remains poorly understood. In the saltmarsh beetle Pogonus chalceus, short-winged 'tidal' and long-winged 'seasonal' ecotypes have diverged in response to contrasting hydrological regimes and can be repeatedly found along the Atlantic European coast. By analyzing genomic variation across the beetles' distribution, we reveal that alleles selected in the tidal ecotype are spread across the genome and evolved during a singular and, likely, geographically isolated divergence event, within the last 190 Kya. Due to subsequent admixture, the ancient and differentially selected alleles are currently polymorphic in most populations across its range, which could potentially allow for the fast evolution of one ecotype from a small number of random individuals, as low as 5 to 15, from a population of the other ecotype. Our results suggest that cases of fast parallel ecological divergence can be the result of evolution at two different time frames: divergence in the past, followed by repeated selection on the same divergently evolved alleles after admixture. These findings highlight the importance of an ancient and, likely, allopatric divergence event for driving the rate and direction of contemporary fast evolution under gene flow. This mechanism is potentially driven by periods of geographic isolation imposed by large-scale environmental changes such as glacial cycles.
|
30
|
Linkage disequilibrium clustering-based approach for association mapping with tightly linked genomewide data. Mol Ecol Resour 2018; 18:809-824. [DOI: 10.1111/1755-0998.12893] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 10/25/2017] [Revised: 04/05/2018] [Accepted: 04/06/2018] [Indexed: 02/05/2023]
|
31
|
Inferring dispersal across a fragmented landscape using reconstructed families in the Glanville fritillary butterfly. Evol Appl 2017. [DOI: 10.1111/eva.12552] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 01/23/2023]
|
32
|
Lep-MAP3: robust linkage mapping even for low-coverage whole genome sequencing data. Bioinformatics 2017; 33:3726-3732. [DOI: 10.1093/bioinformatics/btx494] [Citation(s) in RCA: 208] [Impact Index Per Article: 29.7] [Received: 04/24/2017] [Accepted: 08/01/2017] [Indexed: 11/13/2022]
|
33
|
Construction of Ultradense Linkage Maps with Lep-MAP2: Stickleback F2 Recombinant Crosses as an Example. Genome Biol Evol 2015; 8:78-93. [PMID: 26668116 PMCID: PMC4758246 DOI: 10.1093/gbe/evv250] [Citation(s) in RCA: 94] [Impact Index Per Article: 10.4] [Indexed: 01/25/2023]
Abstract
High-density linkage maps are important tools for genome biology and evolutionary genetics by quantifying the extent of recombination, linkage disequilibrium, and chromosomal rearrangements across chromosomes, sexes, and populations. They provide one of the best ways to validate and refine de novo genome assemblies, with the power to identify errors in assemblies increasing with marker density. However, assembly of high-density linkage maps is still challenging due to software limitations. We describe Lep-MAP2, a software for ultradense genome-wide linkage map construction. Lep-MAP2 can handle various family structures and can account for achiasmatic meiosis to gain linkage map accuracy. Simulations show that Lep-MAP2 outperforms other available mapping software both in computational efficiency and accuracy. When applied to two large F2-generation recombinant crosses between two nine-spined stickleback (Pungitius pungitius) populations, it produced two high-density (∼6 markers/cM) linkage maps containing 18,691 and 20,054 single nucleotide polymorphisms. The two maps showed a high degree of synteny, but female maps were 1.5–2 times longer than male maps in all linkage groups, suggesting genome-wide recombination suppression in males. Comparison with the genome sequence of the three-spined stickleback (Gasterosteus aculeatus) revealed a high degree of interspecific synteny with a low frequency (<5%) of interchromosomal rearrangements. However, a fairly large (ca. 10 Mb) translocation from autosome to sex chromosome was detected in both maps. These results illustrate the utility and novel features of Lep-MAP2 in assembling high-density linkage maps, and their usefulness in revealing evolutionarily interesting properties of genomes, such as strong genome-wide sex bias in recombination rates.
|
34
|
Flight-induced changes in gene expression in the Glanville fritillary butterfly. Mol Ecol 2015; 24:4886-900. [DOI: 10.1111/mec.13359] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 04/17/2014] [Revised: 08/24/2015] [Accepted: 08/25/2015] [Indexed: 12/15/2022]
|
35
|
Genome-wide SNP identification for the construction of a high-resolution genetic map of Japanese flounder (Paralichthys olivaceus): applications to QTL mapping of Vibrio anguillarum disease resistance and comparative genomic analysis. DNA Res 2015; 22:161-70. [PMID: 25762582 PMCID: PMC4401326 DOI: 10.1093/dnares/dsv001] [Citation(s) in RCA: 102] [Impact Index Per Article: 11.3] [Received: 11/18/2014] [Accepted: 02/01/2015] [Indexed: 12/18/2022]
Abstract
High-resolution genetic maps are essential for fine mapping of complex traits, genome assembly, and comparative genomic analysis. Single-nucleotide polymorphisms (SNPs) are the primary molecular markers used for genetic map construction. In this study, we identified 13,362 SNPs evenly distributed across the Japanese flounder (Paralichthys olivaceus) genome. Of these SNPs, 12,712 high-confidence SNPs were subjected to high-throughput genotyping and assigned to 24 consensus linkage groups (LGs). The total length of the genetic linkage map was 3,497.29 cM with an average distance of 0.47 cM between loci, thereby representing the densest genetic map currently reported for Japanese flounder. Nine positive quantitative trait loci (QTLs) forming two main clusters for Vibrio anguillarum disease resistance were detected. All QTLs could explain 5.1-8.38% of the total phenotypic variation. Synteny analysis of the QTL regions on the genome assembly revealed 12 immune-related genes, among them 4 genes strongly associated with V. anguillarum disease resistance. In addition, 246 genome assembly scaffolds with an average size of 21.79 Mb were anchored onto the LGs; these scaffolds, comprising 522.99 Mb, represented 95.78% of assembled genomic sequences. The mapped assembly scaffolds in Japanese flounder were used for genome synteny analyses against zebrafish (Danio rerio) and medaka (Oryzias latipes). Flounder and medaka were found to possess almost one-to-one synteny, whereas flounder and zebrafish exhibited a multi-syntenic correspondence. The newly developed high-resolution genetic map, which will facilitate QTL mapping, scaffold assembly, and genome synteny analysis of Japanese flounder, marks a milestone in the ongoing genome project for this species.
|
36
|
The Glanville fritillary genome retains an ancient karyotype and reveals selective chromosomal fusions in Lepidoptera. Nat Commun 2014; 5:4737. [PMID: 25189940 PMCID: PMC4164777 DOI: 10.1038/ncomms5737] [Citation(s) in RCA: 153] [Impact Index Per Article: 15.3] [Received: 04/11/2014] [Accepted: 07/17/2014] [Indexed: 12/30/2022]
Abstract
Previous studies have reported that chromosome synteny in Lepidoptera has been well conserved, yet the number of haploid chromosomes varies widely from 5 to 223. Here we report the genome (393 Mb) of the Glanville fritillary butterfly (Melitaea cinxia; Nymphalidae), a widely recognized model species in metapopulation biology and eco-evolutionary research, which has the putative ancestral karyotype of n=31. Using phylogenetic analyses of Nymphalidae and of other Lepidoptera, combined with orthologue-level comparisons of chromosomes, we conclude that the ancestral lepidopteran karyotype has been n=31 for at least 140 My. We show that fusion chromosomes have retained the ancestral chromosome segments and very few rearrangements have occurred across the fusion sites. The same, shortest ancestral chromosomes have independently participated in fusion events in species with smaller karyotypes. The short chromosomes have a higher rearrangement rate than long ones. These characteristics highlight distinctive features of the evolutionary dynamics of butterflies and moths. Butterflies and moths (Lepidoptera) vary in chromosome number. Here, the authors sequence the genome of the Glanville fritillary butterfly, Melitaea cinxia, show it has the ancestral lepidopteran karyotype and provide insight into how chromosomal fusions have shaped karyotype evolution in butterflies and moths.
|
37
|
Abstract
MOTIVATION: Current high-throughput sequencing technologies allow cost-efficient genotyping of millions of single nucleotide polymorphisms (SNPs) for hundreds of samples. However, the tools that are currently available for constructing linkage maps are not well suited for large datasets. Linkage maps of large datasets would be helpful in de novo genome assembly by facilitating comprehensive genome validation and refinement by enabling chimeric scaffold detection, as well as in family-based linkage and association studies, quantitative trait locus mapping, analysis of genome synteny and other complex genomic data analyses. RESULTS: We describe a novel tool, called Lepidoptera-MAP (Lep-MAP), for constructing accurate linkage maps with ultradense genome-wide SNP data. Lep-MAP is fast and memory efficient and largely automated, requiring minimal user interaction. It simultaneously uses data on multiple outbred families and can increase linkage map accuracy by taking into account achiasmatic meiosis, a special feature of Lepidoptera and some other taxa with no recombination in one sex (no recombination in females in Lepidoptera). We demonstrate that Lep-MAP outperforms other methods on real and simulated data. We construct a genome-wide linkage map of the Glanville fritillary butterfly (Melitaea cinxia) with over 40 000 SNPs. The data were generated with a novel in-house SOLiD restriction site-associated DNA tag sequencing protocol, which is described in the online supplementary material. AVAILABILITY AND IMPLEMENTATION: Java source code under the GNU General Public License, together with the compiled classes and the datasets, is available from http://sourceforge.net/users/lep-map.
|
38
|
DNA-binding specificities of human transcription factors. Cell 2013; 152:327-39. [PMID: 23332764 DOI: 10.1016/j.cell.2012.12.009] [Citation(s) in RCA: 855] [Impact Index Per Article: 77.7] [Received: 05/25/2012] [Revised: 08/18/2012] [Accepted: 12/03/2012] [Indexed: 12/23/2022]
Abstract
Although the proteins that read the gene regulatory code, transcription factors (TFs), have been largely identified, it is not well known which sequences TFs can recognize. We have analyzed the sequence-specific binding of human TFs using high-throughput SELEX and ChIP sequencing. A total of 830 binding profiles were obtained, describing 239 distinctly different binding specificities. The models represent the majority of human TFs, approximately doubling the coverage compared to existing systematic studies. Our results reveal additional specificity determinants for a large number of factors for which a partial specificity was known, including a commonly observed A- or T-rich stretch that flanks the core motifs. Global analysis of the data revealed that homodimer orientation and spacing preferences, and base-stacking interactions, have a larger role in TF-DNA binding than previously appreciated. We further describe a binding model incorporating these features that is required to understand binding of TFs to DNA.
|
39
|
Abstract
We investigated inbreeding depression and genetic load in a small (N_e ∼ 100) population of the Glanville fritillary butterfly (Melitaea cinxia), which has been completely isolated on a small island [Pikku Tytärsaari (PT)] in the Baltic Sea for at least 75 y. As a reference, we studied conspecific populations from the well-studied metapopulation in the Åland Islands (ÅL), 400 km away. A large population in Saaremaa, Estonia, was used as a reference for estimating genetic diversity and N_e. We investigated 58 traits related to behavior, development, morphology, reproductive performance, and metabolism. The PT population exhibited high genetic load (L = 1 − W_PT/W_ÅL) in a range of fitness-related traits including adult weight (L = 0.12), flight metabolic rate (L = 0.53), egg viability (L = 0.37), and lifetime production of eggs in an outdoor population cage (L = 0.70). These results imply extensive fixation of deleterious recessive mutations, supported by greatly reduced diversity in microsatellite markers and immediate recovery (heterosis) of egg viability and flight metabolic rate in crosses with other populations. There was no significant inbreeding depression in most traits due to one generation of full-sib mating. Resting metabolic rate was significantly elevated in PT males, which may be related to their short lifespan (L = 0.25). The demographic history and the effective size of the PT population place it in the part of the parameter space in which models predict mutation accumulation. This population exemplifies the increasingly common situation in fragmented landscapes, in which small and completely isolated populations are vulnerable to extinction due to high genetic load.
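The genetic load defined in this abstract is one minus the ratio of a trait's mean value in the isolated population (PT) to its mean in the reference population (ÅL). A small worked sketch follows. The function name and the input values are hypothetical, chosen only so that the result matches the order of magnitude reported for egg viability; they are not measurements from the study.

```python
def genetic_load(w_focal, w_reference):
    """Genetic load L = 1 - W_focal / W_reference for one fitness-related trait."""
    if w_reference <= 0:
        raise ValueError("reference trait value must be positive")
    return 1.0 - w_focal / w_reference


# Hypothetical trait means (arbitrary units), not data from the paper:
# a focal value of 0.63 against a reference of 1.00 gives a load of 0.37.
load = genetic_load(w_focal=0.63, w_reference=1.00)
print(round(load, 2))
```

A load of 0 would mean the isolated population performs as well as the reference, and values approaching 1 indicate an almost complete loss of performance in that trait.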
|
40
|
Rule-based induction method for haplotype comparison and identification of candidate disease loci. Genome Med 2012; 4:21. [PMID: 22429919 PMCID: PMC3446271 DOI: 10.1186/gm320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2011] [Accepted: 03/19/2012] [Indexed: 11/21/2022]
Abstract
There is a need for methods that are able to identify rare variants that cause low or moderate penetrance disease susceptibility. To answer this need, we introduce a rule-based haplotype comparison method, Haplous, which identifies haplotypes within multiple samples from phased genotype data and compares them within and between sample groups. We demonstrate that Haplous is able to accurately identify haplotypes that are identical by descent, exclude common haplotypes in the studied population and select rare haplotypes from the data. Our analysis of three families with multiple individuals affected by lymphoma identified several interesting haplotypes shared by distantly related patients.
|
41
|
Finding significant matches of position weight matrices in linear time. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 8:69-79. [PMID: 21071798 DOI: 10.1109/tcbb.2009.35] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 05/30/2023]
Abstract
Position weight matrices are an important method for modeling signals or motifs in biological sequences, both in DNA and protein contexts. In this paper, we present fast algorithms for the problem of finding significant matches of such matrices. Our algorithms are of the online type, and they generalize classical multipattern matching, filtering, and superalphabet techniques of combinatorial string matching to the problem of weight matrix matching. Several variants of the algorithms are developed, including multiple matrix extensions that perform the search for several matrices in one scan through the sequence database. Experimental performance evaluation is provided to compare the new techniques against each other as well as against some other online and index-based algorithms proposed in the literature. Compared to the brute-force O(mn) approach, our solutions can be faster by a factor that is proportional to the matrix length m. Our multiple-matrix filtration algorithm had the best performance in the experiments. On a current PC, this algorithm finds significant matches (p = 0.0001) of the 123 JASPAR matrices in the human genome in about 18 minutes.
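To make the brute-force O(mn) baseline concrete, here is a minimal Python sketch of position weight matrix scanning, alongside a simple lookahead bound that abandons a window as soon as it can no longer reach the threshold. This is an illustrative early-termination filter, not the algorithms of the paper. The matrix layout (one dictionary per column), the function names, and the toy scores are assumptions made for the example, and the sequence is assumed to contain only A, C, G, and T.

```python
from typing import Dict, List, Tuple

# Assumed layout: one dict per motif column, mapping base -> score.
Matrix = List[Dict[str, float]]


def scan_brute_force(seq: str, pwm: Matrix, threshold: float) -> List[Tuple[int, float]]:
    """Score every length-m window in full: O(mn) work."""
    m = len(pwm)
    hits = []
    for i in range(len(seq) - m + 1):
        score = sum(pwm[j][seq[i + j]] for j in range(m))
        if score >= threshold:
            hits.append((i, score))
    return hits


def scan_lookahead(seq: str, pwm: Matrix, threshold: float) -> List[Tuple[int, float]]:
    """Same hits as the brute-force scan, but stop scoring a window early
    once even the best possible remaining columns cannot reach the threshold."""
    m = len(pwm)
    # best_rest[j] = maximum score obtainable from columns j..m-1.
    best_rest = [0.0] * (m + 1)
    for j in range(m - 1, -1, -1):
        best_rest[j] = best_rest[j + 1] + max(pwm[j].values())
    hits = []
    for i in range(len(seq) - m + 1):
        score = 0.0
        for j in range(m):
            score += pwm[j][seq[i + j]]
            if score + best_rest[j + 1] < threshold:
                break  # this window can no longer qualify
        else:
            hits.append((i, score))
    return hits


# Toy matrix that favors the motif "ACG" (integer-valued scores).
toy_pwm: Matrix = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]
toy_seq = "TTACGACGTACC"
print(scan_brute_force(toy_seq, toy_pwm, 6))
print(scan_lookahead(toy_seq, toy_pwm, 6))
```

The lookahead version is still O(mn) in the worst case; it only saves work when the threshold is strict enough that most windows fail within the first few columns. With real-valued log-odds scores, a small tolerance in the bound comparison may be needed to avoid pruning borderline windows through rounding.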
|
42
|
Abstract
MOODS (MOtif Occurrence Detection Suite) is a software package for matching position weight matrices against DNA sequences. MOODS implements state-of-the-art online matching algorithms, achieving considerably faster scanning speed than with a simple brute-force search. MOODS is written in C++, with bindings for the popular BioPerl and Biopython toolkits. It can easily be adapted for different purposes and integrated into existing workflows. It can also be used as a C++ library. Availability: The package with documentation and examples of usage is available at http://www.cs.helsinki.fi/group/pssmfind. The source code is also available under the terms of a GNU General Public License (GPL).
|