1. Mingrone J, Susko E, Bielawski J. Smoothed Bootstrap Aggregation for Assessing Selection Pressure at Amino Acid Sites. Mol Biol Evol 2016; 33:2976-2989. [PMID: 27486222 DOI: 10.1093/molbev/msw160] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022]
Abstract
To detect positive selection at individual amino acid sites, most methods use an empirical Bayes approach. After parameters of a Markov process of codon evolution are estimated via maximum likelihood, they are passed to Bayes' formula to compute the posterior probability that a site evolved under positive selection. A difficulty with this approach is that parameter estimates with large errors can negatively impact Bayesian classification. By assigning priors to some parameters, Bayes Empirical Bayes (BEB) mitigates this problem. However, as implemented, it imposes uniform priors, which causes it to be overly conservative in some cases. When standard regularity conditions are not met and parameter estimates are unstable, inference, even under BEB, can be negatively impacted. We present an alternative to BEB called smoothed bootstrap aggregation (SBA), which bootstraps site patterns from an alignment of protein-coding DNA sequences to accommodate the uncertainty in the parameter estimates. We show that deriving the correction for parameter uncertainty from the data in hand, in combination with kernel smoothing techniques, improves site-specific inference of positive selection. We compare BEB to SBA by simulation and real data analysis. Simulation results show that SBA balances accuracy and power at least as well as BEB, and when parameter estimates are unstable, the performance gap between BEB and SBA can widen in favor of SBA. SBA is applicable to a wide variety of other inference problems in molecular evolution.
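The empirical Bayes step and the bootstrap idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' SBA implementation: the function names, the site-class weights, and the per-class site likelihoods are hypothetical, the kernel smoothing step is omitted, and only the class weights are assumed to vary between bootstrap replicates. In a real analysis each replicate would be refit by maximum likelihood under a codon model.

```python
import random


def site_posteriors(class_weights, site_likelihoods):
    """Bayes' formula: P(class k | site) is proportional to
    weight_k * L(site | class k)."""
    joint = [w * l for w, l in zip(class_weights, site_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def bootstrap_columns(columns, rng):
    """Resample alignment columns (site patterns) with replacement."""
    return rng.choices(columns, k=len(columns))


def aggregate_posterior(replicate_weights, site_likelihoods, positive_class):
    """Average the positive-selection posterior over bootstrap replicates.

    replicate_weights holds one (hypothetical) vector of estimated class
    weights per bootstrap replicate.
    """
    probs = [
        site_posteriors(weights, site_likelihoods)[positive_class]
        for weights in replicate_weights
    ]
    return sum(probs) / len(probs)


if __name__ == "__main__":
    # Toy numbers: three site classes, the last one standing for omega > 1.
    likelihoods = [0.01, 0.02, 0.08]
    print(site_posteriors([0.7, 0.2, 0.1], likelihoods))

    # Hypothetical weight estimates from three bootstrap replicates.
    replicates = [[0.7, 0.2, 0.1], [0.75, 0.2, 0.05], [0.6, 0.25, 0.15]]
    print(aggregate_posterior(replicates, likelihoods, positive_class=2))
```

Averaging over replicates lets unstable parameter estimates pull the site posterior toward less extreme values, which is the intuition behind correcting for parameter uncertainty from the data themselves.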
Affiliation(s)
- Joseph Mingrone: Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada; Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, NS, Canada
- Edward Susko: Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada; Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, NS, Canada
- Joseph Bielawski: Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada; Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, NS, Canada; Department of Biology, Dalhousie University, Halifax, NS, Canada
2. Isaza JP, Galván AL, Polanco V, Huang B, Matveyev AV, Serrano MG, Manque P, Buck GA, Alzate JF. Revisiting the reference genomes of human pathogenic Cryptosporidium species: reannotation of C. parvum Iowa and a new C. hominis reference. Sci Rep 2015; 5:16324. [PMID: 26549794 PMCID: PMC4637869 DOI: 10.1038/srep16324] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 05/16/2015] [Accepted: 10/08/2015] [Indexed: 11/09/2022]
Abstract
Cryptosporidium parvum and C. hominis are the most relevant species of this genus for human health. Both cause a self-limiting diarrhea in immunocompetent individuals, but cause potentially life-threatening disease in the immunocompromised. Despite the importance of these pathogens, only one reference genome of each has been analyzed and published. These two reference genomes were sequenced using automated capillary sequencing; as of yet, no next-generation sequencing technology has been applied to improve their assemblies and annotations. For C. hominis, the main challenge that prevents a larger number of genomes from being sequenced is its resistance to axenic culture. In the present study, we employed next-generation technology to analyse the genomic DNA and RNA to generate a new reference genome sequence of a C. hominis strain isolated directly from human stool and a new genome annotation of the C. parvum Iowa reference genome.
Affiliation(s)
- Juan P Isaza: Grupo de Parasitología, Facultad de Medicina, Universidad de Antioquia, Carrera 53 No. 61-30, Medellin, Antioquia 05001, Colombia; Centro Nacional de Secuenciación Genómica-CNSG, Universidad de Antioquia, Carrera 53 No. 61-30, Medellin, Antioquia 05001, Colombia
- Ana Luz Galván: Grupo de Parasitología, Facultad de Medicina, Universidad de Antioquia, Carrera 53 No. 61-30, Medellin, Antioquia 05001, Colombia
- Victor Polanco: Universidad Mayor de Chile-Centro de Genómica y Bioinformatica, Camino La piramide 5750 Huechuraba, Santiago de Chile, 8580000, Chile
- Bernice Huang: Virginia Commonwealth University - Center for the Study of Biological Complexity, 1101 E. Marshall St., Virginia 23298-0678, US
- Andrey V Matveyev: Virginia Commonwealth University - Center for the Study of Biological Complexity, 1101 E. Marshall St., Virginia 23298-0678, US
- Myrna G Serrano: Virginia Commonwealth University - Center for the Study of Biological Complexity, 1101 E. Marshall St., Virginia 23298-0678, US
- Patricio Manque: Universidad Mayor de Chile-Centro de Genómica y Bioinformatica, Camino La piramide 5750 Huechuraba, Santiago de Chile, 8580000, Chile
- Gregory A Buck: Virginia Commonwealth University - Center for the Study of Biological Complexity, 1101 E. Marshall St., Virginia 23298-0678, US
- Juan F Alzate: Grupo de Parasitología, Facultad de Medicina, Universidad de Antioquia, Carrera 53 No. 61-30, Medellin, Antioquia 05001, Colombia; Centro Nacional de Secuenciación Genómica-CNSG, Universidad de Antioquia, Carrera 53 No. 61-30, Medellin, Antioquia 05001, Colombia
3. Angelis K, Dos Reis M, Yang Z. Bayesian estimation of nonsynonymous/synonymous rate ratios for pairwise sequence comparisons. Mol Biol Evol 2014; 31:1902-13. [PMID: 24748652 PMCID: PMC4069626 DOI: 10.1093/molbev/msu142] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 01/05/2023]
Abstract
The nonsynonymous/synonymous rate ratio (ω = dN/dS) is an important measure of the mode and strength of natural selection acting on nonsynonymous mutations in protein-coding genes. The simplest such analysis is the estimation of the dN/dS ratio using two sequences. Both heuristic counting methods and the maximum-likelihood (ML) method based on a codon substitution model are widely used for such analysis. However, these methods do not have nice statistical properties, as the estimates can be zero or infinity in some data sets, so that their means and variances are infinite. In large genome-scale comparisons, such extreme estimates (either 0 or ∞) of ω and sequence distance (t) are common. Here, we implement a Bayesian method to estimate ω and t in pairwise sequence comparisons. Using a combination of computer simulation and real data analysis, we show that the Bayesian estimates have better statistical properties than the ML estimates, because the prior on ω and t shrinks the posterior of those parameters away from extreme values. We also calculate the posterior probability for ω > 1 as a Bayesian alternative to the likelihood ratio test. The new method is computationally efficient and may be useful for genome-scale comparisons of protein-coding gene sequences.
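The extreme estimates mentioned in this abstract are easy to reproduce with a simplified counting-style calculation. The sketch below is an illustration only, not any published counting method: the function name and inputs are hypothetical, dN and dS are taken as raw proportions of differences per site, and no multiple-hit correction is applied.

```python
import math


def omega_ratio(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Naive omega = dN/dS from raw difference counts (no correction)."""
    dn = nonsyn_diffs / nonsyn_sites
    ds = syn_diffs / syn_sites
    if ds == 0:
        # No synonymous differences: the ratio is infinite, or undefined
        # when there are no nonsynonymous differences either.
        return math.inf if dn > 0 else math.nan
    return dn / ds


if __name__ == "__main__":
    print(omega_ratio(4, 200, 8, 100))  # an ordinary, finite estimate
    print(omega_ratio(0, 200, 5, 80))   # no nonsynonymous differences
    print(omega_ratio(3, 200, 0, 80))   # no synonymous differences
```

For short or highly similar sequence pairs, zero counts in either category are common, which is why a prior that shrinks ω away from 0 and ∞ gives estimates with better-behaved means and variances.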
Affiliation(s)
- Konstantinos Angelis: Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Mario Dos Reis: Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
- Ziheng Yang: Department of Genetics, Evolution and Environment, University College London, London, United Kingdom
4. Jenkins MC, Widmer G, O'Brien C, Bauchan G, Murphy C, Santin M, Fayer R. A highly divergent 33 kDa Cryptosporidium parvum antigen. J Parasitol 2014; 100:527-31. [PMID: 24601821 DOI: 10.1645/13-433.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]
Abstract
Previous studies comparing the genome sequences of Cryptosporidium parvum with Cryptosporidium hominis identified a number of highly divergent genes that might reflect positive selection for host specificity. In the present study, the C. parvum DNA sequence cgd8-5370, which encodes a protein whose amino acid sequence differs appreciably from its homologue in C. hominis, was cloned by PCR and expressed as a recombinant protein in Escherichia coli. Antisera raised against the recombinant cgd8-5370 antigen strongly recognized a unique 33 kDa protein in immunoblots from reducing and non-reducing SDS-PAGE of native C. parvum protein. However, anti-Cp33 sera did not recognize the native 33 kDa homologue in C. hominis. In an immunofluorescence assay (IFA), anti-Cp33 serum recognized an antigen in the anterior end of air-dried C. parvum sporozoites but failed to bind at any sites in C. hominis sporozoites, indicating its specificity for C. parvum. IFA staining of live C. parvum sporozoites with anti-Cp33 serum failed to bind to the parasite, indicating that the Cp33 antigen is not on the sporozoite surface, which is consistent with topology predictions based on the encoded amino acid sequence. RT-PCR analysis of cgd8-5370 mRNA before or during C. parvum oocyst excystation revealed transcripts only in excysting sporozoites. Thus, Cp33 represents one of a small number of proteins shown to differentiate C. parvum from C. hominis sporozoites and oocysts.
Affiliation(s)
- Mark C Jenkins: Environmental, Microbial, and Food Safety Laboratory, ARS, USDA, Beltsville, Maryland 20705
5. Ocaña KACS, de Oliveira D, Horta F, Dias J, Ogasawara E, Mattoso M. Exploring Molecular Evolution Reconstruction Using a Parallel Cloud Based Scientific Workflow. Advances in Bioinformatics and Computational Biology 2012. [DOI: 10.1007/978-3-642-31927-3_16] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/28/2022]
6. Bouzid M, Tyler KM, Christen R, Chalmers RM, Elwin K, Hunter PR. Multi-locus analysis of human infective Cryptosporidium species and subtypes using ten novel genetic loci. BMC Microbiol 2010; 10:213. [PMID: 20696051 PMCID: PMC2928199 DOI: 10.1186/1471-2180-10-213] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 05/12/2010] [Accepted: 08/09/2010] [Indexed: 01/27/2023]
Abstract
Background: Cryptosporidium is a protozoan parasite that causes diarrheal illness in a wide range of hosts including humans. Two species, C. parvum and C. hominis, are of primary public health relevance. Genome sequences of these two species are available and show only 3-5% sequence divergence. We investigated this sequence variability, which could correspond either to sequence gaps in the published genome sequences or to the presence of species-specific genes. Comparative genomic tools were used to identify putative species-specific genes and a subset of these genes was tested by PCR in a collection of Cryptosporidium clinical isolates and reference strains. Results: The majority of the putative species-specific genes examined were in fact common to C. parvum and C. hominis. PCR product sequence analysis revealed interesting SNPs, the majority of which were species-specific. These genetic loci allowed us to construct a robust multi-locus analysis. The Neighbour-Joining phylogenetic tree constructed clearly discriminated the previously described lineages of Cryptosporidium species and subtypes. Conclusions: Most of the genes identified as being species-specific during bioinformatics in Cryptosporidium sp. are in fact present in multiple species and only appear species-specific because of gaps in published genome sequences. Nevertheless, SNPs may offer a promising approach to studying the taxonomy of closely related species of Cryptosporidia.
Affiliation(s)
- Maha Bouzid: Biomedical Research Centre, School of Medicine, Health Policy and Practice, University of East Anglia, Norwich NR4 7TJ, UK
7. Weedall GD, Conway DJ. Detecting signatures of balancing selection to identify targets of anti-parasite immunity. Trends Parasitol 2010; 26:363-9. [PMID: 20466591 DOI: 10.1016/j.pt.2010.04.002] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Received: 12/14/2009] [Revised: 04/04/2010] [Accepted: 04/06/2010] [Indexed: 10/19/2022]
Abstract
Parasite antigen genes might evolve under frequency-dependent immune selection. The distinctive patterns of polymorphism that result can be detected using population genetic methods that test for signatures of balancing selection, allowing genes encoding important targets of immunity to be identified. Analyses can be complicated by population structures, histories and features of a parasite's genome. However, new sequencing technologies facilitate scans of polymorphism throughout parasite genomes to identify the most exceptional gene specific signatures. We focus on malaria parasites to illustrate challenges and opportunities for detecting targets of frequency-dependent immune selection to discover new potential vaccine candidates.
Affiliation(s)
- Gareth D Weedall: School of Biological Sciences, University of Liverpool, Crown Street, Liverpool, UK, L69 7ZB.
8. Aguileta G, Lengelle J, Marthey S, Chiapello H, Rodolphe F, Gendrault A, Yockteng R, Vercken E, Devier B, Fontaine MC, Wincker P, Dossat C, Cruaud C, Couloux A, Giraud T. Finding candidate genes under positive selection in Non-model species: examples of genes involved in host specialization in pathogens. Mol Ecol 2009; 19:292-306. [PMID: 20041992 DOI: 10.1111/j.1365-294x.2009.04454.x] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Abstract
Numerous genes in diverse organisms have been shown to be under positive selection, especially genes involved in reproduction, adaptation to contrasting environments, hybrid inviability, and host-pathogen interactions. Looking for genes under positive selection in pathogens has been a priority in efforts to investigate coevolution dynamics and to develop vaccines or drugs. To elucidate the functions involved in host specialization, here we aimed at identifying candidate sequences that could have evolved under positive selection among closely related pathogens specialized on different hosts. For this goal, we sequenced c. 17,000-32,000 ESTs from each of four Microbotryum species, which are fungal pathogens responsible for anther smut disease on host plants in the Caryophyllaceae. Forty-two of the 372 predicted orthologous genes showed a significant signal of positive selection, which represents a good number of candidate genes for further investigation. Sequencing 16 of these genes in 9 additional Microbotryum species confirmed that they have indeed been rapidly evolving in the pathogen species specialized on different hosts. The genes showing significant signals of positive selection were putatively involved in nutrient uptake from the host, secondary metabolite synthesis and secretion, respiration under stressful conditions and stress response, hyphal growth and differentiation, and regulation of expression by other genes. Many of these genes had transmembrane domains and may therefore also be involved in pathogen recognition by the host. Our approach thus proved fruitful and should be feasible for many non-model organisms for which candidate genes for diversifying selection are needed.
Affiliation(s)
- G Aguileta: Ecologie, Systématique et Evolution, Université Paris-Sud, F-91405 Orsay cedex, France
9. O'Connor RM, Burns PB, Ha-Ngoc T, Scarpato K, Khan W, Kang G, Ward H. Polymorphic mucin antigens CpMuc4 and CpMuc5 are integral to Cryptosporidium parvum infection in vitro. Eukaryotic Cell 2009; 8:461-9. [PMID: 19168754 PMCID: PMC2669191 DOI: 10.1128/ec.00305-08] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Received: 09/11/2008] [Accepted: 11/20/2008] [Indexed: 11/20/2022]
Abstract
Cryptosporidium, a waterborne enteric parasite, is a frequent cause of diarrheal disease outbreaks worldwide. Thus far, the few antigens shown to be important for attachment to and invasion of the host cell by Cryptosporidium are all mucin-like glycoproteins. In order to investigate other antigens that could be important for Cryptosporidium host-parasite interactions, the Cryptosporidium genome databases were mined for other mucin-like genes. A single locus of seven small mucin sequences was identified on chromosome 2 (CpMuc1 to -7). Reverse transcriptase PCR analysis demonstrated that all seven CpMucs were expressed throughout intracellular development. CpMuc4 and CpMuc5 were selected for further investigation because of the significant sequence divergence between Cryptosporidium parvum and C. hominis alleles. Rabbit anti-CpMuc5 and -CpMuc4 antibodies identified several polypeptides in C. parvum lysates, suggestive of proteolytic processing of the mucins. All polypeptides were larger than the predicted molecular weight, which is suggestive of posttranslational processing, most likely O-glycosylation. In immunofluorescence assays, both anti-CpMuc4 and -CpMuc5 antibodies reacted with the apical region of sporozoites and revealed surface-exposed epitopes. The antigens were not shed during excystation but did partition into the aqueous phase of Triton X-114 extractions. Consistent with a role in attachment and invasion, CpMuc4 and CpMuc5 could be detected binding to fixed Caco-2A cells, and anti-CpMuc4 peptide antibodies inhibited Cryptosporidium infection in vitro. Sequencing of CpMuc4 and CpMuc5 from C. hominis clinical isolates identified several polymorphic alleles. The data suggest that these antigens are integral for Cryptosporidium infection in vitro and may be potential vaccine candidates.
Affiliation(s)
- Roberta M O'Connor: Division of Geographic Medicine and Infectious Diseases, Tufts Medical Center, Boston, MA 02111, USA.