1
Rich KD, Srivastava S, Muthye VR, Wasmuth JD. Identification of potential molecular mimicry in pathogen-host interactions. PeerJ 2023; 11:e16339. [PMID: 37953771 PMCID: PMC10637249 DOI: 10.7717/peerj.16339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 10/02/2023] [Indexed: 11/14/2023] Open
Abstract
Pathogens have evolved sophisticated strategies to manipulate host signaling pathways, including the phenomenon of molecular mimicry, where pathogen-derived biomolecules imitate host biomolecules. In this study, we resurrected, updated, and optimized a sequence-based bioinformatics pipeline to identify potential molecular mimicry candidates between humans and 32 pathogenic species whose proteomes' 3D structure predictions were available at the start of this study. We observed considerable variation in the number of mimicry candidates across pathogenic species, with pathogenic bacteria exhibiting fewer candidates compared to fungi and protozoans. Further analysis revealed that the candidate mimicry regions were enriched in solvent-accessible regions, highlighting their potential functional relevance. We identified a total of 1,878 mimicked regions in 1,439 human proteins, and clustering analysis indicated diverse target proteins across pathogen species. The human proteins containing mimicked regions revealed significant associations between these proteins and various biological processes, with an emphasis on host extracellular matrix organization and cytoskeletal processes. However, immune-related proteins were underrepresented as targets of mimicry. Our findings provide insights into the broad range of host-pathogen interactions mediated by molecular mimicry and highlight potential targets for further investigation. This comprehensive analysis contributes to our understanding of the complex mechanisms employed by pathogens to subvert host defenses and we provide a resource to assist researchers in the development of novel therapeutic strategies.
Affiliation(s)
- Kaylee D. Rich
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
- Shruti Srivastava
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
- Viraj R. Muthye
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
- James D. Wasmuth
  - Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, Canada
  - Host-Parasite Interactions Research Training Network, University of Calgary, Calgary, Alberta, Canada
2
Wang N, Khan S, Elo LL. VarSCAT: A computational tool for sequence context annotations of genomic variants. PLoS Comput Biol 2023; 19:e1010727. [PMID: 37566612 PMCID: PMC10446208 DOI: 10.1371/journal.pcbi.1010727] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 08/23/2023] [Accepted: 07/20/2023] [Indexed: 08/13/2023] Open
Abstract
The sequence contexts of genomic variants play important roles in understanding the biological significance of variants and potential sequencing-related variant calling issues. However, methods for assessing the diverse sequence contexts of genomic variants such as tandem repeats and unambiguous annotations have been limited. Herein, we describe the Variant Sequence Context Annotation Tool (VarSCAT) for annotating the sequence contexts of genomic variants, including breakpoint ambiguities, flanking bases of variants, wildtype/mutated DNA sequences, variant nomenclatures, distances between adjacent variants, tandem repeat regions, and custom annotation with user-customizable options. Our analyses demonstrate that VarSCAT is more versatile and customizable than the currently available methods or strategies for annotating variants in short tandem repeat (STR) regions or insertions and deletions (indels) with breakpoint ambiguity. Variant sequence context annotations of high-confidence human variant sets with VarSCAT revealed that more than 75% of all human individual germline and clinically relevant indels have breakpoint ambiguities. Moreover, we illustrate that more than 80% of human individual germline small variants in STR regions are indels and that the sizes of these indels correlated with STR motif sizes. VarSCAT is available from https://github.com/elolab/VarSCAT.
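To illustrate what "breakpoint ambiguity" means for an indel, the following is a minimal sketch and not VarSCAT's own implementation. It assumes a deletion described by 0-based, half-open coordinates on a reference string; the function name `deletion_shift_range` is hypothetical. It finds the leftmost and rightmost start positions at which a deletion of the same length yields the same mutated sequence.

```python
def deletion_shift_range(ref: str, start: int, end: int) -> tuple[int, int]:
    """Return (leftmost_start, rightmost_start) for deleting ref[start:end].

    Coordinates are 0-based and half-open (an assumption of this sketch).
    Every start position in the returned range gives the same mutated sequence.
    """
    if not 0 <= start < end <= len(ref):
        raise ValueError("deletion coordinates fall outside the reference")

    # Shift left while the base entering the window equals the base leaving it.
    left_start, left_end = start, end
    while left_start > 0 and ref[left_start - 1] == ref[left_end - 1]:
        left_start -= 1
        left_end -= 1

    # Shift right under the mirrored condition.
    right_start, right_end = start, end
    while right_end < len(ref) and ref[right_start] == ref[right_end]:
        right_start += 1
        right_end += 1

    return left_start, right_start


def has_breakpoint_ambiguity(ref: str, start: int, end: int) -> bool:
    """A deletion is ambiguous when it can be placed at more than one position."""
    leftmost, rightmost = deletion_shift_range(ref, start, end)
    return leftmost != rightmost
```

For example, deleting one "CA" unit from the repeat in `GATCACACAG` can be reported at any start position from 3 to 7, whereas deleting the first base of `GATTACA` has a single valid placement.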
Affiliation(s)
- Ning Wang
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
  - InFLAMES Research Flagship Center, University of Turku, Turku, Finland
- Sofia Khan
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
- Laura L. Elo
  - Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
  - InFLAMES Research Flagship Center, University of Turku, Turku, Finland
  - Institute of Biomedicine, University of Turku, Turku, Finland
3
Tao Y, Luo H, Xu J, Cruickshank A, Zhao X, Teng F, Hathorn A, Wu X, Liu Y, Shatte T, Jordan D, Jing H, Mace E. Extensive variation within the pan-genome of cultivated and wild sorghum. Nat Plants 2021; 7:766-773. [PMID: 34017083 DOI: 10.1038/s41477-021-00925-x] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 06/02/2020] [Accepted: 04/21/2021] [Indexed: 05/18/2023]
Abstract
Sorghum is a drought-tolerant staple crop for half a billion people in Africa and Asia, an important source of animal feed throughout the world and a biofuel feedstock of growing importance. Cultivated sorghum and its inter-fertile wild relatives constitute the primary gene pool for sorghum. Understanding and characterizing the diversity within this valuable resource is fundamental for its effective utilization in crop improvement. Here, we report analysis of a sorghum pan-genome to explore genetic diversity within the sorghum primary gene pool. We assembled 13 genomes representing cultivated sorghum and its wild relatives, and integrated them with 3 other published genomes to generate a pan-genome of 44,079 gene families with 222.6 Mb of new sequence identified. The pan-genome displays substantial gene-content variation, with 64% of gene families showing presence/absence variation among genomes. Comparisons between core genes and dispensable genes suggest that dispensable genes are important for sorghum adaptation. Extensive genetic variation was uncovered within the pan-genome, and the distribution of these variations was influenced by variation of recombination rate and transposable element content across the genome. We identified presence/absence variants that were under selection during sorghum domestication and improvement, and demonstrated that such variation had important phenotypic outcomes that could contribute to crop improvement. The constructed sorghum pan-genome represents an important resource for sorghum improvement and gene discovery.
Affiliation(s)
- Yongfu Tao
  - Queensland Alliance for Agriculture and Food Innovation (QAAFI), Hermitage Research Facility, The University of Queensland, Warwick, Queensland, Australia
- Hong Luo
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing, China
- Jiabao Xu
  - BGI Genomics, BGI-Shenzhen, Shenzhen, China
- Alan Cruickshank
  - Hermitage Research Facility, Agri-Science Queensland, Department of Agriculture and Fisheries (DAF), Warwick, Queensland, Australia
- Xianrong Zhao
  - Queensland Alliance for Agriculture and Food Innovation (QAAFI), Hermitage Research Facility, The University of Queensland, Warwick, Queensland, Australia
- Fei Teng
  - BGI Genomics, BGI-Shenzhen, Shenzhen, China
- Adrian Hathorn
  - Queensland Alliance for Agriculture and Food Innovation (QAAFI), Hermitage Research Facility, The University of Queensland, Warwick, Queensland, Australia
- Xiaoyuan Wu
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing, China
- Yuanming Liu
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Tracey Shatte
  - Hermitage Research Facility, Agri-Science Queensland, Department of Agriculture and Fisheries (DAF), Warwick, Queensland, Australia
- David Jordan
  - Queensland Alliance for Agriculture and Food Innovation (QAAFI), Hermitage Research Facility, The University of Queensland, Warwick, Queensland, Australia
- Haichun Jing
  - Key Laboratory of Plant Resources, Institute of Botany, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Emma Mace
  - Queensland Alliance for Agriculture and Food Innovation (QAAFI), Hermitage Research Facility, The University of Queensland, Warwick, Queensland, Australia
  - Hermitage Research Facility, Agri-Science Queensland, Department of Agriculture and Fisheries (DAF), Warwick, Queensland, Australia
4
The Genome of the Human Pathogen Candida albicans Is Shaped by Mutation and Cryptic Sexual Recombination. mBio 2018; 9:mBio.01205-18. [PMID: 30228236 PMCID: PMC6143739 DOI: 10.1128/mbio.01205-18] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 01/08/2023] Open
Abstract
The opportunistic fungal pathogen Candida albicans lacks a conventional sexual program and is thought to evolve, at least primarily, through the clonal acquisition of genetic changes. Here, we performed an analysis of heterozygous diploid genomes from 21 clinical isolates to determine the natural evolutionary processes acting on the C. albicans genome. Mutation and recombination shaped the genomic landscape among the C. albicans isolates. Strain-specific single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) clustered across the genome. Additionally, loss-of-heterozygosity (LOH) events contributed substantially to genotypic variation, with most long-tract LOH events extending to the ends of the chromosomes suggestive of repair via break-induced replication. Consistent with a model of inheritance by descent, most polymorphisms were shared between closely related strains. However, some isolates contained highly mosaic genomes consistent with strains having experienced interclade recombination during their evolutionary history. A detailed examination of mitochondrial genomes also revealed clear examples of interclade recombination among sequenced strains. These analyses therefore establish that both (para)sexual recombination and mitotic mutational processes drive evolution of this important pathogen. To further facilitate the study of C. albicans genomes, we also introduce an online platform, SNPMap, to examine SNP patterns in sequenced isolates.
IMPORTANCE: Mutations introduce variation into the genome upon which selection can act. Defining the nature of these changes is critical for determining species evolution, as well as for understanding the genetic changes driving important cellular processes. The heterozygous diploid fungus Candida albicans is both a frequent commensal organism and a prevalent opportunistic pathogen. A prevailing theory is that C. albicans evolves primarily through the gradual buildup of mitotic mutations, and a pressing issue is whether sexual or parasexual processes also operate within natural populations. Here, we establish that the C. albicans genome evolves by a combination of localized mutation and both short-tract and long-tract loss-of-heterozygosity (LOH) events within the sequenced isolates. Mutations are more prevalent within noncoding and heterozygous regions and LOH increases towards chromosome ends. Furthermore, we provide evidence for genetic exchange between isolates, establishing that sexual or parasexual processes have contributed to the diversity of both nuclear and mitochondrial genomes.
5
Zhai Y, Bouchard-Côté A. A Poissonian Model of Indel Rate Variation for Phylogenetic Tree Inference. Syst Biol 2018; 66:698-714. [PMID: 28204784 DOI: 10.1093/sysbio/syx033] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/04/2015] [Accepted: 01/27/2017] [Indexed: 01/22/2023] Open
Abstract
While indel rate variation has been observed and analyzed in detail, it is not taken into account by current indel-aware phylogenetic reconstruction methods. In this work, we introduce a continuous time stochastic process, the geometric Poisson indel process, that generalizes the Poisson indel process by allowing insertion and deletion rates to vary across sites. We design an efficient algorithm for computing the probability of a given multiple sequence alignment based on our new indel model. We describe a method to construct phylogeny estimates from a fixed alignment using neighbor joining. Using simulation studies, we show that ignoring indel rate variation may have a detrimental effect on the accuracy of the inferred phylogenies, and that our proposed method can sidestep this issue by inferring latent indel rate categories. We also show that our phylogenetic inference method may be more stable to taxa subsampling than methods that either ignore indels or indel rate variation. [evolutionary stochastic process; indel rate variation; Poisson indel process; TKF91.].
Affiliation(s)
- Yongliang Zhai
  - Department of Statistics, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
- Alexandre Bouchard-Côté
  - Department of Statistics, University of British Columbia, Vancouver, British Columbia, V6T 1Z4, Canada
6
Sun XQ, Li DH, Xue JY, Yang SH, Zhang YM, Li MM, Hang YY. Insertion DNA Accelerates Meiotic Interchromosomal Recombination in Arabidopsis thaliana. Mol Biol Evol 2016; 33:2044-53. [PMID: 27189569 DOI: 10.1093/molbev/msw087] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/26/2022] Open
Abstract
Nucleotide insertions/deletions are ubiquitous in eukaryotic genomes, and the resulting hemizygous (unpaired) DNA has significant, heritable effects on adjacent DNA. However, little is known about the genetic behavior of insertion DNA. Here, we describe a binary transgenic system to study the behavior of insertion DNA during meiosis. Transgenic Arabidopsis lines were generated to carry two different defective reporter genes on nonhomologous chromosomes, designated as "recipient" and "donor" lines. Double hemizygous plants (harboring unpaired DNA) were produced by crossing between the recipient and the donor, and double homozygous lines (harboring paired DNA) via self-pollination. The transfer of the donor's unmutated sequence to the recipient generated a functional β-glucuronidase gene, which could be visualized by histochemical staining and corroborated by polymerase chain reaction amplification and sequencing. More than 673 million seedlings were screened, and the results showed that meiotic ectopic recombination in the hemizygous lines occurred at a frequency >6.49-fold higher than that in the homozygous lines. Gene conversion might have been exclusively or predominantly responsible for the gene correction events. The direct measurement of ectopic recombination events provided evidence that an insertion, in the absence of an allelic counterpart, could scan the entire genome for homologous counterparts with which to pair. Furthermore, the unpaired (hemizygous) architectures could accelerate ectopic recombination between itself and interchromosomal counterparts. We suggest that the ectopic recombination accelerated by hemizygous architectures may be a general mechanism for interchromosomal recombination through ubiquitously dispersed repeat sequences in plants, ultimately contributing to genetic renovation and eukaryotic evolution.
Affiliation(s)
- Xiao-Qin Sun
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Ding-Hong Li
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Jia-Yu Xue
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Si-Hai Yang
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Yan-Mei Zhang
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Mi-Mi Li
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
- Yue-Yu Hang
  - Jiangsu Key Laboratory for the Research and Utilization of Plant Resources, Institute of Botany, Jiangsu Province and Chinese Academy of Sciences, Nanjing, China
7
Zhu W, Cooper DN, Zhao Q, Wang Y, Liu R, Li Q, Férec C, Wang Y, Chen JM. Concurrent nucleotide substitution mutations in the human genome are characterized by a significantly decreased transition/transversion ratio. Hum Mutat 2015; 36:333-41. [PMID: 25546635 DOI: 10.1002/humu.22749] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 08/05/2014] [Accepted: 12/17/2014] [Indexed: 01/16/2023]
Abstract
There is accumulating evidence that the number of multiple-nucleotide substitutions (MNS) occurring in closely spaced sites in eukaryotic genomes is significantly higher than would be predicted from the random accumulation of independently generated single-nucleotide substitutions (SNS). Although this excess can in principle be accounted for by the concept of transient hypermutability, a general mutational signature of concurrent MNS mutations has not so far been evident. Employing a dataset (N = 449) of "concurrent" double MNS mutations causing human inherited disease, we have identified just such a mutational signature: concurrently generated double MNS mutations exhibit a more than twofold lower transition/transversion ratio (termed R_Ts/Tv) than independently generated de novo SNS mutations (<0.80 vs. 2.10; P = 2.69 × 10^-14). We replicated this novel finding through a similar analysis employing two double MNS variant datasets with differing abundances of concurrent events (150,521 variants with both substitutions on the same haplotypic lineage vs. 94,875 variants whose component substitutions were on different haplotypic lineages) plus 5,430,874 SNS variants, all being derived from the whole-genome sequencing of seven Chinese individuals. Evaluation of the newly observed mutational signature in diverse contexts provides solid support for the postulated role of translesion synthesis DNA polymerases in transient hypermutability.
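The transition/transversion ratio referred to in this abstract can be computed from a list of single-base substitutions. The sketch below is only an illustration of the standard definition (transitions are purine-to-purine or pyrimidine-to-pyrimidine changes; all other substitutions are transversions). It is not the authors' pipeline, and the function names `is_transition` and `ts_tv_ratio` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for A<->G or C<->T changes; False for any transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = {ref, alt}
    if ref == alt or not bases <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    return bases <= PURINES or bases <= PYRIMIDINES


def ts_tv_ratio(substitutions) -> float:
    """Ratio of transition count to transversion count over (ref, alt) pairs."""
    pairs = list(substitutions)
    transitions = sum(is_transition(ref, alt) for ref, alt in pairs)
    transversions = len(pairs) - transitions
    if transversions == 0:
        raise ValueError("ratio is undefined without at least one transversion")
    return transitions / transversions
```

With three transitions and one transversion, for instance `[("A", "G"), ("C", "T"), ("T", "C"), ("A", "T")]`, the ratio is 3.0. A ratio below 1, as reported for the concurrent double substitutions, means transversions outnumber transitions.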
Affiliation(s)
- Wenjuan Zhu
  - Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, China
8
Zhang Z, Mao L, Chen H, Bu F, Li G, Sun J, Li S, Sun H, Jiao C, Blakely R, Pan J, Cai R, Luo R, Van de Peer Y, Jacobsen E, Fei Z, Huang S. Genome-Wide Mapping of Structural Variations Reveals a Copy Number Variant That Determines Reproductive Morphology in Cucumber. Plant Cell 2015; 27:1595-604. [PMID: 26002866 PMCID: PMC4498199 DOI: 10.1105/tpc.114.135848] [Citation(s) in RCA: 91] [Impact Index Per Article: 9.1] [Received: 12/22/2014] [Revised: 03/26/2015] [Accepted: 04/30/2015] [Indexed: 05/18/2023]
Abstract
Structural variations (SVs) represent a major source of genetic diversity. However, the functional impact and formation mechanisms of SVs in plant genomes remain largely unexplored. Here, we report a nucleotide-resolution SV map of cucumber (Cucumis sativus) that comprises 26,788 SVs based on deep resequencing of 115 diverse accessions. The largest proportion of cucumber SVs was formed through nonhomologous end-joining rearrangements, and the occurrence of SVs is closely associated with regions of high nucleotide diversity. These SVs affect the coding regions of 1676 genes, some of which are associated with cucumber domestication. Based on the map, we discovered a copy number variation (CNV) involving four genes that defines the Female (F) locus and gives rise to gynoecious cucumber plants, which bear only female flowers and set fruit at almost every node. The CNV arose from a recent 30.2-kb duplication at a meiotically unstable region, likely via microhomology-mediated break-induced replication. The SV set provides a snapshot of structural variations in plants and will serve as an important resource for exploring genes underlying key traits and for facilitating practical breeding in cucumber.
Affiliation(s)
- Zhonghua Zhang
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops of the Ministry of Agriculture, Sino-Dutch Joint Laboratory of Horticultural Genomics, Beijing 100081, China
- Linyong Mao
  - Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, New York 14853
- Huiming Chen
  - Hunan Vegetable Research Institute, Hunan Academy of Agricultural Sciences, Changsha 410125, China
- Fengjiao Bu
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops of the Ministry of Agriculture, Sino-Dutch Joint Laboratory of Horticultural Genomics, Beijing 100081, China
  - Agricultural Genomic Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Guangcun Li
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops of the Ministry of Agriculture, Sino-Dutch Joint Laboratory of Horticultural Genomics, Beijing 100081, China
  - Shandong Academy of Agricultural Sciences, Jinan 250100, China
- Jinjing Sun
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops of the Ministry of Agriculture, Sino-Dutch Joint Laboratory of Horticultural Genomics, Beijing 100081, China
- Shuai Li
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops of the Ministry of Agriculture, Sino-Dutch Joint Laboratory of Horticultural Genomics, Beijing 100081, China
- Honghe Sun
  - Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, New York 14853
- Chen Jiao
  - Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, New York 14853
- Rachel Blakely
  - Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, New York 14853
- Junsong Pan
  - Shanghai Jiaotong University, Shanghai 200240, China
- Run Cai
  - Shanghai Jiaotong University, Shanghai 200240, China
- Ruibang Luo
  - Department of Computer Science, University of Hong Kong, Hong Kong 999077, China
- Yves Van de Peer
  - Department of Plant Systems Biology, VIB, 9052 Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
  - Genomics Research Institute, University of Pretoria, Pretoria 0028, South Africa
- Evert Jacobsen
  - Department of Plant Sciences, Laboratory of Plant Breeding, Wageningen University and Research Centre, 6700AA Wageningen, The Netherlands
- Zhangjun Fei
  - Boyce Thompson Institute for Plant Research, Cornell University, Ithaca, New York 14853
  - USDA-ARS Robert W. Holley Center for Agriculture and Health, Ithaca, New York 14853
- Sanwen Huang
  - Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops of the Ministry of Agriculture, Sino-Dutch Joint Laboratory of Horticultural Genomics, Beijing 100081, China
  - Agricultural Genomic Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
9
Characterization of 26 deletion CNVs reveals the frequent occurrence of micro-mutations within the breakpoint-flanking regions and frequent repair of double-strand breaks by templated insertions derived from remote genomic regions. Hum Genet 2015; 134:589-603. [DOI: 10.1007/s00439-015-1539-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 01/12/2015] [Accepted: 03/05/2015] [Indexed: 10/23/2022]
10
Hoffmann A, Griffin P, Dillon S, Catullo R, Rane R, Byrne M, Jordan R, Oakeshott J, Weeks A, Joseph L, Lockhart P, Borevitz J, Sgrò C. A framework for incorporating evolutionary genomics into biodiversity conservation and management. Climate Change Responses 2015. [DOI: 10.1186/s40665-014-0009-x] [Citation(s) in RCA: 126] [Impact Index Per Article: 12.6] [Indexed: 12/26/2022]
11
Haag ES, Thomas CG. Fundamentals of Comparative Genome Analysis in Caenorhabditis Nematodes. Methods Mol Biol 2015; 1327:11-21. [PMID: 26423964 DOI: 10.1007/978-1-4939-2842-2_2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 12/25/2022]
Abstract
The genome of the nematode Caenorhabditis elegans was the first of any animal to be sequenced completely, and it remains the "gold standard" for completeness and annotations. Even before the C. elegans genome was completed, however, biologists began examining the generality of its features in the genomes of other Caenorhabditis species. With many such genomes now sequenced and available via WormBase, C. elegans researchers are often confronted with how to interpret comparative genomic data. In this article, we present practical approaches to addressing several common issues, including possible sources of error in homology annotations, the often complex relationships between sequence similarity, orthology, paralogy, and gene family evolution, the impact of sexual mode on genome assemblies and content, and the determination and use of synteny as a tool.
Affiliation(s)
- Eric S Haag
  - Department of Biology, University of Maryland, 1210 Biology-Psychology Building, College Park, MD, 20742, USA
- Cristel G Thomas
  - Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada, M5S 3B2
12
Lenz C, Haerty W, Golding GB. Increased substitution rates surrounding low-complexity regions within primate proteins. Genome Biol Evol 2014; 6:655-65. [PMID: 24572016 PMCID: PMC3971593 DOI: 10.1093/gbe/evu042] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/30/2022] Open
Abstract
Previous studies have found that DNA-flanking low-complexity regions (LCRs) have an increased substitution rate. Here, the substitution rate was confirmed to increase in the vicinity of LCRs in several primate species, including humans. This effect was also found among human sequences from the 1000 Genomes Project. A strong correlation was found between average substitution rate per site and distance from the LCR, as well as the proportion of genes with gaps in the alignment at each site and distance from the LCR. Along with substitution rates, dN/dS ratios were also determined for each site, and the proportion of sites undergoing negative selection was found to have a negative relationship with distance from the LCR.
Affiliation(s)
- Carolyn Lenz
  - Department of Biology, McMaster University, Hamilton, Ontario, Canada
13
Huang W, Massouras A, Inoue Y, Peiffer J, Ràmia M, Tarone AM, Turlapati L, Zichner T, Zhu D, Lyman RF, Magwire MM, Blankenburg K, Carbone MA, Chang K, Ellis LL, Fernandez S, Han Y, Highnam G, Hjelmen CE, Jack JR, Javaid M, Jayaseelan J, Kalra D, Lee S, Lewis L, Munidasa M, Ongeri F, Patel S, Perales L, Perez A, Pu L, Rollmann SM, Ruth R, Saada N, Warner C, Williams A, Wu YQ, Yamamoto A, Zhang Y, Zhu Y, Anholt RR, Korbel JO, Mittelman D, Muzny DM, Gibbs RA, Barbadilla A, Johnston JS, Stone EA, Richards S, Deplancke B, Mackay TF. Natural variation in genome architecture among 205 Drosophila melanogaster Genetic Reference Panel lines. Genome Res 2014; 24:1193-208. [PMID: 24714809 PMCID: PMC4079974 DOI: 10.1101/gr.171546.113] [Citation(s) in RCA: 434] [Impact Index Per Article: 39.5] [Received: 12/20/2013] [Accepted: 04/01/2014] [Indexed: 12/30/2022]
Abstract
The Drosophila melanogaster Genetic Reference Panel (DGRP) is a community resource of 205 sequenced inbred lines, derived to improve our understanding of the effects of naturally occurring genetic variation on molecular and organismal phenotypes. We used an integrated genotyping strategy to identify 4,853,802 single nucleotide polymorphisms (SNPs) and 1,296,080 non-SNP variants. Our molecular population genomic analyses show higher deletion than insertion mutation rates and stronger purifying selection on deletions. Weaker selection on insertions than deletions is consistent with our observed distribution of genome size determined by flow cytometry, which is skewed toward larger genomes. Insertion/deletion and single nucleotide polymorphisms are positively correlated with each other and with local recombination, suggesting that their nonrandom distributions are due to hitchhiking and background selection. Our cytogenetic analysis identified 16 polymorphic inversions in the DGRP. Common inverted and standard karyotypes are genetically divergent and account for most of the variation in relatedness among the DGRP lines. Intriguingly, variation in genome size and many quantitative traits are significantly associated with inversions. Approximately 50% of the DGRP lines are infected with Wolbachia, and four lines have germline insertions of Wolbachia sequences, but effects of Wolbachia infection on quantitative traits are rarely significant. The DGRP complements ongoing efforts to functionally annotate the Drosophila genome. Indeed, 15% of all D. melanogaster genes segregate for potentially damaged proteins in the DGRP, and genome-wide analyses of quantitative traits identify novel candidate genes. The DGRP lines, sequence data, genotypes, quality scores, phenotypes, and analysis and visualization tools are publicly available.
Affiliation(s)
- Wen Huang
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Andreas Massouras
  - Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Yutaka Inoue
  - Center for Education in Liberal Arts and Sciences, Osaka University, Osaka-fu, 560-0043 Japan
- Jason Peiffer
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Miquel Ràmia
  - Genomics, Bioinformatics and Evolution Group, Institut de Biotecnologia i de Biomedicina (IBB), Department of Genetics and Microbiology, Campus Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Aaron M. Tarone
  - Department of Entomology, Texas A&M University, College Station, Texas 77843, USA
- Lavanya Turlapati
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Thomas Zichner
  - Genome Biology Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
- Dianhui Zhu
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Richard F. Lyman
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Michael M. Magwire
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Kerstin Blankenburg
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Mary Anna Carbone
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Kyle Chang
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Lisa L. Ellis
  - Department of Entomology, Texas A&M University, College Station, Texas 77843, USA
- Sonia Fernandez
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Yi Han
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Gareth Highnam
  - Virginia Tech Virginia Bioinformatics Institute and Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia 24061, USA
- Carl E. Hjelmen
  - Department of Entomology, Texas A&M University, College Station, Texas 77843, USA
- John R. Jack
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Mehwish Javaid
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Joy Jayaseelan
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Divya Kalra
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Sandy Lee
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Lora Lewis
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Mala Munidasa
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Fiona Ongeri
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Shohba Patel
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Lora Perales
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Agapito Perez
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- LingLing Pu
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Stephanie M. Rollmann
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Robert Ruth
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Nehad Saada
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Crystal Warner
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Aneisa Williams
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Yuan-Qing Wu
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Akihiko Yamamoto
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Yiqing Zhang
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Yiming Zhu
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Robert R.H. Anholt
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Jan O. Korbel
  - Genome Biology Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
- David Mittelman
  - Virginia Tech Virginia Bioinformatics Institute and Department of Biological Sciences, Virginia Tech, Blacksburg, Virginia 24061, USA
- Donna M. Muzny
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Richard A. Gibbs
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Antonio Barbadilla
  - Genomics, Bioinformatics and Evolution Group, Institut de Biotecnologia i de Biomedicina (IBB), Department of Genetics and Microbiology, Campus Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- J. Spencer Johnston
  - Department of Entomology, Texas A&M University, College Station, Texas 77843, USA
- Eric A. Stone
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
- Stephen Richards
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030 USA
- Bart Deplancke
  - Laboratory of Systems Biology and Genetics, Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
- Trudy F.C. Mackay
  - Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina 27595, USA
|
14
|
Cutter AD, Jovelin R, Dey A. Molecular hyperdiversity and evolution in very large populations. Mol Ecol 2013; 22:2074-95. [PMID: 23506466 PMCID: PMC4065115 DOI: 10.1111/mec.12281] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 10/26/2012] [Revised: 01/24/2013] [Accepted: 01/29/2013] [Indexed: 02/06/2023]
Abstract
The genomic density of sequence polymorphisms critically affects the sensitivity of inferences about ongoing sequence evolution, function and demographic history. Most animal and plant genomes have relatively low densities of polymorphisms, but some species are hyperdiverse with neutral nucleotide heterozygosity exceeding 5%. Eukaryotes with extremely large populations, mimicking bacterial and viral populations, present novel opportunities for studying molecular evolution in sexually reproducing taxa with complex development. In particular, hyperdiverse species can help answer controversial questions about the evolution of genome complexity, the limits of natural selection, modes of adaptation and subtleties of the mutation process. However, such systems have some inherent complications and here we identify topics in need of theoretical developments. Close relatives of the model organisms Caenorhabditis elegans and Drosophila melanogaster provide known examples of hyperdiverse eukaryotes, encouraging functional dissection of resulting molecular evolutionary patterns. We recommend how best to exploit hyperdiverse populations for analysis, for example, in quantifying the impact of noncrossover recombination in genomes and for determining the identity and micro-evolutionary selective pressures on noncoding regulatory elements.
Affiliation(s)
- Asher D Cutter
- Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada.
|