51
|
Press MO, Li H, Creanza N, Kramer G, Queitsch C, Sourjik V, Borenstein E. Genome-scale co-evolutionary inference identifies functions and clients of bacterial Hsp90. PLoS Genet 2013; 9:e1003631. [PMID: 23874229 PMCID: PMC3708813 DOI: 10.1371/journal.pgen.1003631] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 03/29/2013] [Accepted: 05/28/2013] [Indexed: 12/12/2022]
Abstract
The molecular chaperone Hsp90 is essential in eukaryotes, in which it facilitates the folding of developmental regulators and signal transduction proteins known as Hsp90 clients. In contrast, Hsp90 is not essential in bacteria, and a broad characterization of its molecular and organismal function is lacking. To enable such characterization, we used a genome-scale phylogenetic analysis to identify genes that co-evolve with bacterial Hsp90. We find that genes whose gain and loss were coordinated with Hsp90 throughout bacterial evolution tended to function in flagellar assembly, chemotaxis, and bacterial secretion, suggesting that Hsp90 may aid assembly of protein complexes. To add to the limited set of known bacterial Hsp90 clients, we further developed a statistical method to predict putative clients. We validated our predictions by demonstrating that the flagellar protein FliN and the chemotaxis kinase CheA behaved as Hsp90 clients in Escherichia coli, confirming the predicted role of Hsp90 in chemotaxis and flagellar assembly. Furthermore, normal Hsp90 function is important for wild-type motility and/or chemotaxis in E. coli. This novel function of bacterial Hsp90 agreed with our subsequent finding that Hsp90 is associated with a preference for multiple habitats and may therefore face a complex selection regime. Taken together, our results reveal previously unknown functions of bacterial Hsp90 and open avenues for future experimental exploration by implicating Hsp90 in the assembly of membrane protein complexes and adaptation to novel environments.
Affiliation(s)
- Maximilian O. Press
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
| | - Hui Li
- Zentrum für Molekulare Biologie der Universität Heidelberg, DKFZ-ZMBH Alliance, Heidelberg, Germany
| | - Nicole Creanza
- Department of Biology, Stanford University, Stanford, California, United States of America
| | - Günter Kramer
- Zentrum für Molekulare Biologie der Universität Heidelberg, DKFZ-ZMBH Alliance, Heidelberg, Germany
| | - Christine Queitsch
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- * E-mail: (CQ); (VS); (EB)
| | - Victor Sourjik
- Zentrum für Molekulare Biologie der Universität Heidelberg, DKFZ-ZMBH Alliance, Heidelberg, Germany
- * E-mail: (CQ); (VS); (EB)
| | - Elhanan Borenstein
- Department of Genome Sciences, University of Washington, Seattle, Washington, United States of America
- Department of Computer Science and Engineering, University of Washington, Seattle, Washington, United States of America
- Santa Fe Institute, Santa Fe, New Mexico, United States of America
- * E-mail: (CQ); (VS); (EB)
| |
|
52
|
Martínez A, Di Domenico M, Worsaae K. Gain of palps within a lineage of ancestrally burrowing annelids (Scalibregmatidae). ACTA ZOOL-STOCKHOLM 2013. [DOI: 10.1111/azo.12039] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 01/09/2023]
Affiliation(s)
- Alejandro Martínez
- Marine Biological Section; University of Copenhagen; Strandpromenaden 5 3000 Helsingør Denmark
| | - Maikon Di Domenico
- Biological Institute; Zoological Museum ‘Prof. Dr. Adão José Cardoso’; University of Campinas (UNICAMP); Charles Darwin s/n N. 6109 Campinas São Paulo Brazil
| | - Katrine Worsaae
- Marine Biological Section; University of Copenhagen; Strandpromenaden 5 3000 Helsingør Denmark
| |
|
53
|
Xia X. DAMBE5: a comprehensive software package for data analysis in molecular biology and evolution. Mol Biol Evol 2013; 30:1720-8. [PMID: 23564938 PMCID: PMC3684854 DOI: 10.1093/molbev/mst064] [Citation(s) in RCA: 739] [Impact Index Per Article: 67.2] [Indexed: 12/11/2022]
Abstract
Since its first release in 2001 as mainly a software package for phylogenetic analysis, data analysis for molecular biology and evolution (DAMBE) has gained many new functions that may be classified into six categories: 1) sequence retrieval, editing, manipulation, and conversion among more than 20 standard sequence formats including MEGA, NEXUS, PHYLIP, GenBank, and the new NeXML format for interoperability, 2) motif characterization and discovery functions such as position weight matrix and Gibbs sampler, 3) descriptive genomic analysis tools with improved versions of codon adaptation index, effective number of codons, protein isoelectric point profiling, RNA and protein secondary structure prediction and calculation of minimum folding energy, and genomic skew plots with optimized window size, 4) molecular phylogenetics including sequence alignment, testing substitution saturation, distance-based, maximum parsimony, and maximum-likelihood methods for tree reconstructions, testing the molecular clock hypothesis with either a phylogeny or with relative-rate tests, dating gene duplication and speciation events, choosing the best-fit substitution models, and estimating rate heterogeneity over sites, 5) phylogeny-based comparative methods for continuous and discrete variables, and 6) graphic functions including secondary structure display, optimized skew plot, hydrophobicity plot, and many other plots of amino acid properties along a protein sequence, tree display and drawing by dragging nodes to each other, and visual searching of the maximum parsimony tree. DAMBE features a graphic, user-friendly, and intuitive interface and is freely available from http://dambe.bio.uottawa.ca (last accessed April 16, 2013).
Affiliation(s)
- Xuhua Xia
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, Ottawa, Ontario, Canada
| |
|
54
|
Wang H, Huang H, Ding C, Nie F. Predicting Protein–Protein Interactions from Multimodal Biological Data Sources via Nonnegative Matrix Tri-Factorization. J Comput Biol 2013; 20:344-58. [DOI: 10.1089/cmb.2012.0273] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Indexed: 12/23/2022]
Affiliation(s)
- Hua Wang
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas
| | - Heng Huang
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas
| | - Chris Ding
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas
| | - Feiping Nie
- Department of Computer Science and Engineering, University of Texas at Arlington, Arlington, Texas
| |
|
55
|
Zhang Q, Edwards SV. The evolution of intron size in amniotes: a role for powered flight? Genome Biol Evol 2013; 4:1033-43. [PMID: 22930760 PMCID: PMC3490418 DOI: 10.1093/gbe/evs070] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.6] [Indexed: 12/28/2022]
Abstract
Intronic DNA is a major component of eukaryotic genes and genomes and can be subject to
selective constraint and have functions in gene regulation. Intron size is of particular
interest given that it is thought to be the target of a variety of evolutionary forces and
has been suggested to be linked ultimately to various phenotypic traits, such as powered
flight. Using whole-genome analyses and comparative approaches that account for
phylogenetic nonindependence, we examined interspecific variation in intron size
in three data sets encompassing from 12 to 30 amniote genomes and allowing for different
levels of genome coverage. In addition to confirming that intron size is negatively
associated with intron position and correlates with genome size, we found that on average
mammals have longer introns than birds and nonavian reptiles, a trend that is correlated
with the proliferation of repetitive elements in mammals. Two independent comparisons
between flying and nonflying sister groups both showed a reduction of intron size in
volant species, supporting an association between powered flight, or possibly the high
metabolic rates associated with flight, and reduced intron/genome size. Small intron size
in volant lineages is less easily explained as a neutral consequence of large effective
population size. In conclusion, we found that the evolution of intron size in amniotes
appears to be non-neutral, is correlated with genome size, and is likely influenced by
powered flight and associated high metabolic rates.
Affiliation(s)
- Qu Zhang
- Department of Human Evolutionary Biology, Harvard University, Cambridge, MA, USA
| | | |
|
56
|
Abstract
Co-evolution is a fundamental component of the theory of evolution and is essential for understanding the relationships between species in complex ecological networks. A wide range of co-evolution-inspired computational methods has been designed to predict molecular interactions, but it is only recently that important advances have been made. Breakthroughs in the handling of phylogenetic information and in disentangling indirect relationships have resulted in an improved capacity to predict interactions between proteins and contacts between different protein residues. Here, we review the main co-evolution-based computational approaches, their theoretical basis, potential applications and foreseeable developments.
Affiliation(s)
- David de Juan
- Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
| | | | | |
|
57
|
Barh D, Gupta K, Jain N, Khatri G, León-Sicairos N, Canizalez-Roman A, Tiwari S, Verma A, Rahangdale S, Shah Hassan S, Rodrigues dos Santos A, Ali A, Carlos Guimarães L, Thiago Jucá Ramos R, Devarapalli P, Barve N, Bakhtiar M, Kumavath R, Ghosh P, Miyoshi A, Silva A, Kumar A, Narayan Misra A, Blum K, Baumbach J, Azevedo V. Conserved host–pathogen PPIs Globally conserved inter-species bacterial PPIs based conserved host-pathogen interactome derived novel target in C. pseudotuberculosis, C. diphtheriae, M. tuberculosis, C. ulcerans, Y. pestis, and E. coli targeted by Piper betel compounds. Integr Biol (Camb) 2013; 5:495-509. [DOI: 10.1039/c2ib20206a] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 11/21/2022]
Affiliation(s)
- Debmalya Barh
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- Department of Biosciences and Biotechnology, School of Biotechnology, Fakir Mohan University, Jnan Bigyan Vihar, Balasore, Orissa, India
| | - Krishnakant Gupta
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Neha Jain
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
| | - Gourav Khatri
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Nidia León-Sicairos
- Unidad de investigacion, Facultad de Medicina, Universidad Autónoma de Sinaloa. Cedros y Sauces, Fraccionamiento Fresnos, Culiacán Sinaloa 80246, México
| | - Adrian Canizalez-Roman
- Unidad de investigacion, Facultad de Medicina, Universidad Autónoma de Sinaloa. Cedros y Sauces, Fraccionamiento Fresnos, Culiacán Sinaloa 80246, México
| | - Sandeep Tiwari
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
| | - Ankit Verma
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Sachin Rahangdale
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Syed Shah Hassan
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | | | - Amjad Ali
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Luis Carlos Guimarães
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | | | - Pratap Devarapalli
- Department of Genomic Science, School of Biological Sciences, Riverside Transit Campus, Central University of Kerala, Kasaragod, India
| | - Neha Barve
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Marriam Bakhtiar
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Ranjith Kumavath
- Department of Genomic Science, School of Biological Sciences, Riverside Transit Campus, Central University of Kerala, Kasaragod, India
| | - Preetam Ghosh
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- Department of Computer Science and Center for the Study of Biological Complexity, Virginia Commonwealth University, 401 West Main Street, Room E4234, P.O. Box 843019, Richmond, Virginia 23284-3019, USA
| | - Anderson Miyoshi
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Artur Silva
- Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém, PA, Brazil
| | - Anil Kumar
- School of Biotechnology, Devi Ahilya University, Khandwa Road Campus, Indore, MP, India
| | - Amarendra Narayan Misra
- Department of Biosciences and Biotechnology, School of Biotechnology, Fakir Mohan University, Jnan Bigyan Vihar, Balasore, Orissa, India
- Center for Life Sciences, School of Natural Sciences, Central University of Jharkhand, Ranchi, Jharkhand State, India
| | - Kenneth Blum
- Centre for Genomics and Applied Gene Technology, Institute of Integrative Omics and Applied Biotechnology (IIOAB), Nonakuri, Purba Medinipur, West Bengal-721172, India. Fax: +91-944 955 0032; Tel: +91-944 955 0032
- University of Florida, College of Medicine, Gainesville, Florida, USA
- Global Integrated Services Unit University of Vermont Center for Clinical & Translational Science, College of Medicine, Burlington, VT, USA
- Dominion Diagnostics LLC, North Kingstown, Rhode Island, USA
| | - Jan Baumbach
- Computational Biology Group Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, DK-5230 Odense, Denmark
| | - Vasco Azevedo
- Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| |
|
58
|
Abstract
BACKGROUND To derive post-genomic, neutral insight into the peptidoglycan (PG) distribution among organisms, we mined 1,644 genomes listed in the Carbohydrate-Active Enzymes database for the presence of a minimal 3-gene set that is necessary for PG metabolism. This gene set consists of one gene from the glycosyltransferase family GT28, one from family GT51 and at least one gene belonging to one of five glycoside hydrolase families (GH23, GH73, GH102, GH103 and GH104). RESULTS None of the 103 Viruses or 101 Archaea examined possessed the minimal 3-gene set, but this set was detected in 1/42 of the Eukarya members (Micromonas sp., coding for GT28, GT51 and GH103) and in 1,260/1,398 (90.1%) of Bacteria, with a 100% positive predictive value for the presence of PG. A Pearson correlation test showed that GT51 family genes were significantly associated with PG, with a value of 0.963 and a p value less than 10⁻³. This result was confirmed by a phylogenetic comparative analysis showing that the GT51-encoding gene was significantly associated with PG with a Pagel's score of 60 and 51 (percentage of error close to 0%). Phylogenetic analysis indicated that the GT51 gene history comprised eight loss events and one gain event, and suggested a dynamic ongoing process. CONCLUSIONS Genome analysis is a neutral approach to explore prospectively the presence of PG in uncultured, sequenced organisms with high predictive values.
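The screening rule described in this abstract (one GT28 gene, one GT51 gene, and at least one gene from five glycoside hydrolase families) can be sketched as a simple set test. This is only an illustration of the stated rule, not the authors' pipeline; the function name `has_minimal_pg_gene_set` and the input format (a collection of CAZy family labels per genome) are assumptions made for this example.

```python
# Hypothetical sketch: test whether a genome's CAZy family annotations
# contain the minimal 3-gene set described in the abstract.
# The input format (an iterable of family labels) is an assumption.

GH_FAMILIES = {"GH23", "GH73", "GH102", "GH103", "GH104"}


def has_minimal_pg_gene_set(families):
    """Return True if GT28, GT51 and at least one of the five GH families are present."""
    present = set(families)
    return (
        "GT28" in present
        and "GT51" in present
        and bool(present & GH_FAMILIES)
    )


# The Micromonas sp. case from the abstract codes for GT28, GT51 and GH103.
print(has_minimal_pg_gene_set(["GT28", "GT51", "GH103"]))
# A genome lacking GT51 does not satisfy the rule.
print(has_minimal_pg_gene_set(["GT28", "GH23"]))
```

A set intersection keeps the "at least one of five families" condition explicit and easy to extend if the family list changes.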
|
59
|
O'Meara BC. Evolutionary Inferences from Phylogenies: A Review of Methods. ANNUAL REVIEW OF ECOLOGY EVOLUTION AND SYSTEMATICS 2012. [DOI: 10.1146/annurev-ecolsys-110411-160331] [Citation(s) in RCA: 169] [Impact Index Per Article: 14.1] [Indexed: 11/09/2022]
Affiliation(s)
- Brian C. O'Meara
- Department of Ecology and Evolutionary Biology, University of Tennessee, Knoxville, Tennessee 37996; ,
| |
|
60
|
Dib L, Carbone A. Protein fragments: functional and structural roles of their coevolution networks. PLoS One 2012; 7:e48124. [PMID: 23139761 PMCID: PMC3489791 DOI: 10.1371/journal.pone.0048124] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 02/22/2012] [Accepted: 09/27/2012] [Indexed: 11/19/2022]
Abstract
Small protein fragments, and not just residues, can be used as basic building blocks to reconstruct networks of coevolved amino acids in proteins. Fragments often come into physical contact with one another and play a major biological role in the protein. These interactions may be of several kinds and extend beyond binding specificity, allosteric regulation and folding constraints. Indeed, coevolving fragments are indicators of important information explaining folding intermediates, peptide assembly, key mutations with known roles in genetic diseases, distinguished subfamily-dependent motifs and differentiated evolutionary pressures on protein regions. Coevolution analysis detects networks of fragment interactions and highlights a higher-order organization of fragments, demonstrating the importance of studying this structure at a deeper level. We demonstrate that it can be applied to protein families that are highly conserved or represented by few sequences, thereby enlarging the class of proteins where coevolution analysis can be performed and making large-scale coevolution studies a feasible goal.
Affiliation(s)
- Linda Dib
- Université Pierre et Marie Curie, UMR 7238, Équipe de Génomique Analytique, Paris, France
- CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
| | - Alessandra Carbone
- Université Pierre et Marie Curie, UMR 7238, Équipe de Génomique Analytique, Paris, France
- CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
| |
|
61
|
Cheng N, Mao Y, Shi Y, Tao S. Coevolution in RNA molecules driven by selective constraints: evidence from 5S rRNA. PLoS One 2012; 7:e44376. [PMID: 22973441 PMCID: PMC3433437 DOI: 10.1371/journal.pone.0044376] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 12/28/2011] [Accepted: 08/06/2012] [Indexed: 11/20/2022]
Abstract
Understanding intra-molecular coevolution helps to elucidate various structural and functional constraints acting on molecules and might have practical applications in predicting molecular structure and interactions. In this study, we used 5S rRNA as a template to investigate how selective constraints have shaped the RNA evolution. We have observed the nonrandom occurrence of paired differences along the phylogenetic trees, the high rate of compensatory evolution, and the high TIR scores (the ratio of the numbers of terminal to intermediate states), all of which indicate that significant positive selection has driven the evolution of 5S rRNA. We found three mechanisms of compensatory evolution: Watson-Crick interaction (the primary one), complex interactions between multiple sites within a stem, and interplay of stems and loops. Coevolutionary interactions between sites were observed to be highly dependent on the structural and functional environment in which they occurred. Coevolution occurred mostly in those sites closest to loops or bulges within structurally or functionally important helices, which may be under weaker selective constraints than other stem positions. Breaking these pairs would directly increase the size of the adjoining loop or bulge, causing a partial or total structural rearrangement. In conclusion, our results indicate that sequence coevolution is a direct result of maintaining optimal structural and functional integrity.
Affiliation(s)
- Nan Cheng
- State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, People’s Republic of China
- Bioinformatics Center, Northwest A&F University, Yangling, People’s Republic of China
| | - Yuanhui Mao
- State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, People’s Republic of China
| | - Youyi Shi
- College of Science, Northwest A&F University, Yangling, People’s Republic of China
| | - Shiheng Tao
- State Key Laboratory of Crop Stress Biology in Arid Areas and College of Life Sciences, Northwest A&F University, Yangling, People’s Republic of China
- Bioinformatics Center, Northwest A&F University, Yangling, People’s Republic of China
- * E-mail:
| |
|
62
|
Marazzi B, Ané C, Simon MF, Delgado-Salinas A, Luckow M, Sanderson MJ. Locating evolutionary precursors on a phylogenetic tree. Evolution 2012. [PMID: 23206146 DOI: 10.1111/j.1558-5646.2012.01720.x] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Indexed: 11/29/2022]
Abstract
Conspicuous innovations in the history of life are often preceded by more cryptic genetic and developmental precursors. In many cases, these appear to be associated with recurring origins of very similar traits in close relatives (parallelisms) or striking convergences separated by deep time (deep homologies). Although the phylogenetic distribution of gain and loss of traits hints strongly at the existence of such precursors, no models of trait evolution currently permit inference about their location on a tree. Here we develop a new stochastic model, which explicitly captures the dependency implied by a precursor and permits estimation of precursor locations. We apply it to the evolution of extrafloral nectaries (EFNs), an ecologically significant trait mediating a widespread mutualism between plants and ants. In legumes, a species-rich clade with morphologically diverse EFNs, the precursor model fits the data on EFN occurrences significantly better than conventional models. The model generates explicit hypotheses about the phylogenetic location of hypothetical precursors, which may help guide future studies of molecular genetic pathways underlying nectary position, development, and function.
Affiliation(s)
- Brigitte Marazzi
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 8572, USA
| | | | | | | | | | | |
|
63
|
Gong YN, Chen GW, Suchard MA. A novel empirical mutual information approach to identify co-evolving amino acid positions of influenza A viruses. Comput Biol Chem 2012; 39:20-8. [PMID: 22858722 DOI: 10.1016/j.compbiolchem.2012.06.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/18/2012] [Accepted: 06/22/2012] [Indexed: 11/30/2022]
Abstract
Mutual information (MI) is an approach commonly used to estimate the evolutionary correlation of 2 amino acid sites. Although several MI methods exist, prior to our contribution no systematic method had been developed to assess their performance, or to establish numerical thresholds to detect co-evolving amino acid sites. The current study performed a Markov chain Monte Carlo (MCMC) algorithm on influenza viral sequences to capture their evolutionary characteristics. A consensus maximum clade credibility (MCC) tree was estimated from the samples, together with their amino acid substitution statistics, from which we generated synthetic sequences of known dependent and independent paired amino acid sites. A pair-to-pair and influenza-specific amino acid substitution matrix (P2PFLU) incorporated into Bayesian Evolutionary Analysis Sampling Trees (BEAST) enumerated these synthetic sequences. The sequences inherited evolutionary features and co-varying characteristics from the real viral sequences, rendering these synthetic data ideal for exploring their co-evolving features. For the MI measure, we proposed a novel metric called the empirical MI (MI(Em)), which outperformed other MI measures in analysis of receiver operating characteristics (ROC). We implemented our approach on 1086 all-time PB2 sequences of influenza A H5N1 viruses, in which we found 97 sites exhibiting co-evolutionary substitution of one or more amino acid sites. In particular, PB2 451, along with eight other PB2 sites of various MI(Em) scores, was found to co-evolve with PB2 627, a known species-associated amino acid residue which plays a critical role in influenza virus replication.
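As background for the MI measure mentioned in this abstract, the following is a minimal sketch of plain (textbook) mutual information between two alignment columns. It is not the empirical MI (MI(Em)) metric proposed by the authors, whose definition is not given here; the function name `mutual_information` and the toy columns are assumptions made for illustration.

```python
# Minimal sketch of standard mutual information (in bits) between two
# alignment columns. This is NOT the authors' MI(Em) metric; it only
# illustrates the basic quantity such methods build on.
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information between two equal-length sequences of residues."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy columns (hypothetical data): perfectly co-varying vs. independent sites.
print(mutual_information("AAVV", "LLII"))
print(mutual_information("AAVV", "LILI"))
```

Raw MI of this kind is sensitive to shared phylogeny and to small sample sizes, which is why threshold calibration such as the simulation approach described in the abstract matters.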
Affiliation(s)
- Yu-Nong Gong
- Graduate Institute of Electrical Engineering, Chang Gung University, Taoyuan, Taiwan
| | | | | |
|
64
|
Guerra-Assunção JA, Enright AJ. Large-scale analysis of microRNA evolution. BMC Genomics 2012; 13:218. [PMID: 22672736 PMCID: PMC3497579 DOI: 10.1186/1471-2164-13-218] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Received: 09/06/2011] [Accepted: 02/17/2012] [Indexed: 11/17/2022]
Abstract
BACKGROUND In animals, microRNAs (miRNAs) are important genetic regulators. Animal miRNAs appear to have expanded in conjunction with an escalation in complexity during early bilaterian evolution. Their small size and high degree of similarity make them challenging for phylogenetic approaches. Furthermore, genomic locations encoding miRNAs are not clearly defined in many species. A number of studies have looked at the evolution of individual miRNA families. However, we currently lack resources for large-scale analysis of miRNA evolution. RESULTS We addressed some of these issues in order to analyse the evolution of miRNAs. We perform syntenic and phylogenetic analysis for miRNAs from 80 animal species. We present synteny maps, phylogenies and functional data for miRNAs across these species. These data represent the basis of our analyses and also act as a resource for the community. CONCLUSIONS We use these data to explore the distribution of miRNAs across phylogenetic space, characterise their birth and death, and examine functional relationships between miRNAs and other genes. These data confirm a number of previously reported findings on a larger scale and also offer novel insights into the evolution of the miRNA repertoire in animals, and its genomic organization.
Affiliation(s)
- José Afonso Guerra-Assunção
- EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
- PDBC, Instituto Gulbenkian de Ciência, Rua da Quinta Grande, 6, 2780-156, Oeiras, Portugal
| | - Anton J Enright
- EMBL - European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| |
|
65
|
Shirai LT, Saenko SV, Keller RA, Jerónimo MA, Brakefield PM, Descimon H, Wahlberg N, Beldade P. Evolutionary history of the recruitment of conserved developmental genes in association to the formation and diversification of a novel trait. BMC Evol Biol 2012; 12:21. [PMID: 22335999 PMCID: PMC3361465 DOI: 10.1186/1471-2148-12-21] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 11/21/2011] [Accepted: 02/15/2012] [Indexed: 12/31/2022]
Abstract
Background The origin and modification of novel traits are important aspects of biological diversification. Studies combining concepts and approaches of developmental genetics and evolutionary biology have uncovered many examples of the recruitment, or co-option, of genes conserved across lineages for the formation of novel, lineage-restricted traits. However, little is known about the evolutionary history of the recruitment of those genes, and of the relationship between them: for example, whether the co-option involves whole or parts of existing networks, or whether it occurs by redeployment of individual genes with de novo rewiring. We use a model novel trait, color pattern elements on butterfly wings called eyespots, to explore these questions. Eyespots have greatly diversified under natural and sexual selection, and their formation involves genetic circuitries shared across insects. Results We investigated the evolutionary history of the recruitment and co-recruitment of four conserved transcription regulators to the larval wing disc region where circular pattern elements develop. The co-localization of Antennapedia, Notch, Distal-less, and Spalt with presumptive (eye)spot organizers was examined in 13 butterfly species, providing the largest comparative dataset available for the system. We found variation between families, between subfamilies, and between tribes. Phylogenetic reconstructions by parsimony and maximum likelihood methods revealed an unambiguous evolutionary history only for Antennapedia, with a resolved single origin of eyespot-associated expression, and many homoplastic events for Notch, Distal-less, and Spalt. The flexibility in the (co-)recruitment of the targeted genes includes cases where different gene combinations are associated with morphologically similar eyespots, as well as cases where identical protein combinations are associated with very different phenotypes.
Conclusions The evolutionary history of gene (co-)recruitment is consistent with both divergence from a recruited putative ancestral network, and with independent co-option of individual genes. The diversity in the combinations of genes expressed in association with eyespot formation does not parallel diversity in characteristics of the adult phenotype. We discuss these results in the context of inferring homology. Our study underscores the importance of widening the representation of phylogenetic, morphological, and genetic diversity in order to establish general principles about the mechanisms behind the evolution of novel traits.
Affiliation(s)
- Leila T Shirai
- Instituto Gulbenkian de Ciência, Rua da Quinta Grande 6, P-2780-156 Oeiras, Portugal
|
66
|
Distinct co-evolution patterns of genes associated to DNA polymerase III DnaE and PolC. BMC Genomics 2012; 13:69. [PMID: 22333191 PMCID: PMC3814617 DOI: 10.1186/1471-2164-13-69] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 06/20/2011] [Accepted: 02/14/2012] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Bacterial genomes displaying a strong bias between the leading and the lagging strand of DNA replication encode two DNA polymerases III, DnaE and PolC, rather than a single one. Replication is a highly asymmetrical process, and the presence of two polymerases is therefore not unexpected. Using comparative genomics, we explored whether other processes have evolved in parallel with each polymerase. RESULTS Extending previous in silico heuristics for the analysis of gene co-evolution, we analyzed the function of genes clustering with dnaE and polC. Clusters were highly informative. DnaE co-evolves with the ribosome, the transcription machinery, and the core of intermediary metabolism enzymes. It is also connected to the energy-saving enzyme necessary for RNA degradation, polynucleotide phosphorylase. Most of the proteins of this co-evolving set belong to the persistent set in bacterial proteomes, that is, the set that is fairly ubiquitously distributed. In contrast, PolC co-evolves with RNA degradation enzymes that are present only in the A+T-rich Firmicutes clade, suggesting at least two origins for the degradosome. CONCLUSION DNA replication involves two machineries, DnaE and PolC. DnaE co-evolves with the core functions of bacterial life. In contrast, PolC co-evolves with a set of RNA degradation enzymes that does not derive from the degradosome identified in gamma-Proteobacteria. This suggests that at least two independent RNA degradation pathways existed in the progenote community at the end of the RNA genome world.
|
67
|
SPPS: a sequence-based method for predicting probability of protein-protein interaction partners. PLoS One 2012; 7:e30938. [PMID: 22292078 PMCID: PMC3266917 DOI: 10.1371/journal.pone.0030938] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 08/02/2011] [Accepted: 12/26/2011] [Indexed: 01/20/2023] Open
Abstract
Background The molecular network sustained by different types of interactions among proteins is widely manifested as the fundamental driving force of cellular operations. Many biological functions are determined by the crosstalk between proteins rather than by the characteristics of their individual components. Thus, the searches for protein partners in global networks are imperative when attempting to address the principles of biology. Results We have developed a web-based tool “Sequence-based Protein Partners Search” (SPPS) to explore interacting partners of proteins, by searching over a large repertoire of proteins across many species. SPPS provides a database containing more than 60,000 protein sequences with annotations and a protein-partner search engine in two modes (Single Query and Multiple Query). Two interacting proteins of human FBXO6 protein have been found using the service in the study. In addition, users can refine potential protein partner hits by using annotations and possible interactive network in the SPPS web server. Conclusions SPPS provides a new type of tool to facilitate the identification of direct or indirect protein partners which may guide scientists on the investigation of new signaling pathways. The SPPS server is available to the public at http://mdl.shsmu.edu.cn/SPPS/.
|
68
|
Pokorny L, Ho BC, Frahm JP, Quandt D, Shaw AJ. Phylogenetic analyses of morphological evolution in the gametophyte and sporophyte generations of the moss order Hookeriales (Bryopsida). Mol Phylogenet Evol 2012; 63:351-64. [PMID: 22266481 DOI: 10.1016/j.ympev.2012.01.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 08/08/2011] [Revised: 01/03/2012] [Accepted: 01/09/2012] [Indexed: 10/14/2022]
Abstract
Morphological characters from the gametophyte and sporophyte generations have been used in land plants to infer relationships and construct classifications, but sporophytes provide the vast majority of data for the systematics of vascular plants. In bryophytes both generations are well developed and characters from both are commonly used to classify these organisms. However, because morphological traits of gametophytes and sporophytes can have different genetic bases and experience different selective pressures, taxonomic emphasis on one generation or the other may yield incongruent classifications. The moss order Hookeriales has a controversial taxonomic history because previous classifications have focused almost exclusively on either gametophytes or sporophytes. The Hookeriales provide a model for comparing morphological evolution in gametophytes and sporophytes, and its impact on alternative classification systems. In this study we reconstruct relationships among mosses that are or have been included in the Hookeriales based on sequences from five gene regions, and reconstruct morphological evolution of six sporophyte and gametophyte traits that have been used to differentiate families and genera. We found that the Hookeriales, as currently circumscribed, are monophyletic and that both sporophyte and gametophyte characters are labile. We documented parallel changes and reversals in traits from both generations. This study addresses the general issue of morphological reversals to ancestral states, and resolves novel relationships in the Hookeriales.
Affiliation(s)
- L Pokorny
- Department of Biology, Duke University, 125 Science Drive, Durham, NC 27708-0338, USA.
|
69
|
Latysheva N, Junker VL, Palmer WJ, Codd GA, Barker D. The evolution of nitrogen fixation in cyanobacteria. Bioinformatics 2012; 28:603-6. [DOI: 10.1093/bioinformatics/bts008] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.9] [Indexed: 11/15/2022] Open
|
70
|
Abstract
Phylogenetic profiling involves the comparison of phylogenetic data across gene families. It is possible to construct phylogenetic trees, or related data structures, for specific gene families using a wide variety of tools and approaches. These data are then compared to determine which families have correlated or coupled evolution. The underlying assumption is that in certain cases these couplings may allow us to infer that the two families are functionally related, that is, that their function in the cell is coupled. Although this technique can be applied to noncoding genes, it is more commonly used to assess the function of protein coding genes. Examples of proteins that are functionally related include subunits of protein complexes, or enzymes that perform consecutive steps along biochemical pathways. We hypothesize that the deletion of one of the families from a genome would then indirectly affect the function of the other. Dozens of different implementations of the phylogenetic profiling technique have been developed over the past decade. These range from the first simple approaches that describe phylogenetic profiles as binary vectors to the most complex ones that attempt to model the coevolution of protein families on a phylogenetic tree. We discuss a set of these implementations and present the software and databases that are available to perform phylogenetic profiling.
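The simplest approach mentioned in the abstract, describing each family as a binary presence/absence vector across genomes, can be illustrated with a minimal Python sketch. The family names, the eight-genome profiles, and the choice of Jaccard similarity and Hamming distance are illustrative assumptions, not taken from the review; tree-aware methods would replace these scores with a model of gain and loss on a phylogeny.

```python
# Minimal sketch: comparing binary phylogenetic profiles.
# All profiles and family names below are hypothetical.

def jaccard(p, q):
    """Jaccard similarity of two binary presence/absence profiles."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0

def hamming(p, q):
    """Number of genomes in which the two profiles disagree."""
    return sum(1 for a, b in zip(p, q) if a != b)

# One entry per genome: 1 = family present, 0 = family absent.
profiles = {
    "famA": [1, 1, 0, 0, 1, 0, 1, 1],
    "famB": [1, 1, 0, 0, 1, 0, 1, 0],
    "famC": [0, 0, 1, 1, 0, 1, 0, 0],
}

if __name__ == "__main__":
    names = sorted(profiles)
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            print(x, y, jaccard(profiles[x], profiles[y]),
                  hamming(profiles[x], profiles[y]))
```

In this toy data, famA and famB differ in a single genome and would be flagged as candidates for functional coupling, whereas famC never co-occurs with either.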
Affiliation(s)
- Matteo Pellegrini
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, CA, USA.
|
71
|
Cui J, DeLuca TF, Jung JY, Wall DP. Phylogenetically informed logic relationships improve detection of biological network organization. BMC Bioinformatics 2011; 12:476. [PMID: 22172058 PMCID: PMC3402364 DOI: 10.1186/1471-2105-12-476] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/06/2011] [Accepted: 12/15/2011] [Indexed: 12/04/2022] Open
Abstract
Background A "phylogenetic profile" refers to the presence or absence of a gene across a set of organisms, and it has proven valuable for understanding gene functional relationships and network organization. Despite this success, few studies have attempted to search beyond just pairwise relationships among genes. Here we search for logic relationships involving three genes, and explore their potential application in gene network analyses. Results Taking advantage of a phylogenetic matrix constructed from the large orthologs database Roundup, we invented a method to create balanced profiles for individual triplets of genes that guarantee equal weight on the different phylogenetic scenarios of coevolution between genes. When we applied this idea to LAPP, the method to search for logic triplets of genes, the balanced profiles resulted in significant performance improvement and the discovery of hundreds of thousands more putative triplets than unadjusted profiles. We found that logic triplets detected biological network organization and identified key proteins and their functions, ranging from neighbouring proteins in local pathways, to well separated proteins in the whole pathway, and to the interactions among different pathways at the system level. Finally, our case study suggested that the directionality in a logic relationship and the profile of a triplet could disclose the connectivity between the triplet and surrounding networks. Conclusion Balanced profiles are superior to the raw profiles employed by traditional methods of phylogenetic profiling in searching for high order gene sets. Gene triplets can provide valuable information in detection of biological network organization and identification of key genes at different levels of cellular interaction.
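The idea of a logic relationship among three genes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the profiles are hypothetical, only three logic functions are tried, and plain agreement fraction is used as the score, which is not the scoring or the profile balancing used by LAPP or by the authors.

```python
# Minimal sketch: does gene C's profile follow a logic function of A and B?
# Profiles, function set, and the agreement score are illustrative assumptions.

LOGIC = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

def agreement(pred, target):
    """Fraction of genomes in which two binary profiles match."""
    return sum(x == y for x, y in zip(pred, target)) / len(target)

def best_logic(a, b, c):
    """Return the logic function of (a, b) that best reproduces c, with its score."""
    scores = {
        name: agreement([f(x, y) for x, y in zip(a, b)], c)
        for name, f in LOGIC.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]

# Hypothetical presence/absence profiles across eight genomes.
a = [1, 1, 0, 0, 1, 0, 1, 0]
b = [1, 0, 1, 0, 1, 1, 0, 0]
c = [1, 0, 0, 0, 1, 0, 0, 0]  # present only where both a and b are present

if __name__ == "__main__":
    print(best_logic(a, b, c))
    print(agreement(a, c), agreement(b, c))
```

Here neither a nor b alone matches c in every genome, but their conjunction does, which is the kind of higher-order signal a pairwise comparison would miss.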
Affiliation(s)
- Jike Cui
- Center for Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
|
72
|
Use of comparative genomics approaches to characterize interspecies differences in response to environmental chemicals: challenges, opportunities, and research needs. Toxicol Appl Pharmacol 2011; 271:372-85. [PMID: 22142766 DOI: 10.1016/j.taap.2011.11.011] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 06/08/2011] [Revised: 11/11/2011] [Accepted: 11/16/2011] [Indexed: 01/12/2023]
Abstract
A critical challenge for environmental chemical risk assessment is the characterization and reduction of uncertainties introduced when extrapolating inferences from one species to another. The purpose of this article is to explore the challenges, opportunities, and research needs surrounding the issue of how genomics data and computational and systems level approaches can be applied to inform differences in response to environmental chemical exposure across species. We propose that the data, tools, and evolutionary framework of comparative genomics be adapted to inform interspecies differences in chemical mechanisms of action. We compare and contrast existing approaches, from disciplines as varied as evolutionary biology, systems biology, mathematics, and computer science, that can be used, modified, and combined in new ways to discover and characterize interspecies differences in chemical mechanism of action which, in turn, can be explored for application to risk assessment. We consider how genetic, protein, pathway, and network information can be interrogated from an evolutionary biology perspective to effectively characterize variations in biological processes of toxicological relevance among organisms. We conclude that comparative genomics approaches show promise for characterizing interspecies differences in mechanisms of action, and further, for improving our understanding of the uncertainties inherent in extrapolating inferences across species in both ecological and human health risk assessment. To achieve long-term relevance and consistent use in environmental chemical risk assessment, improved bioinformatics tools, computational methods robust to data gaps, and quantitative approaches for conducting extrapolations across species are critically needed. Specific areas ripe for research to address these needs are recommended.
|
73
|
Basu MK, Selengut JD, Haft DH. ProPhylo: partial phylogenetic profiling to guide protein family construction and assignment of biological process. BMC Bioinformatics 2011; 12:434. [PMID: 22070167 PMCID: PMC3226654 DOI: 10.1186/1471-2105-12-434] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 07/21/2011] [Accepted: 11/09/2011] [Indexed: 12/02/2022] Open
Abstract
Background Phylogenetic profiling is a technique of scoring co-occurrence between a protein family and some other trait, usually another protein family, across a set of taxonomic groups. In spite of several refinements in recent years, the technique still invites significant improvement. To be its most effective, a phylogenetic profiling algorithm must be able to examine co-occurrences among protein families whose boundaries are uncertain within large homologous protein superfamilies. Results Partial Phylogenetic Profiling (PPP) is an iterative algorithm that scores a given taxonomic profile against the taxonomic distribution of families for all proteins in a genome. The method works through optimizing the boundary of each protein family, rather than by relying on prebuilt protein families or fixed sequence similarity thresholds. Double Partial Phylogenetic Profiling (DPPP) is a related procedure that begins with a single sequence and searches for optimal granularities for its surrounding protein family in order to generate the best query profiles for PPP. We present ProPhylo, a high-performance software package for phylogenetic profiling studies through creating individually optimized protein family boundaries. ProPhylo provides precomputed databases for immediate use and tools for manipulating the taxonomic profiles used as queries. Conclusion ProPhylo results show universal markers of methanogenesis, a new DNA phosphorothioation-dependent restriction enzyme, and efficacy in guiding protein family construction. The software and the associated databases are freely available under the open source Perl Artistic License from ftp://ftp.jcvi.org/pub/data/ppp/.
Affiliation(s)
- Malay K Basu
- J. Craig Venter Institute, Rockville, MD 20850, USA.
|
74
|
Devos N, Renner MAM, Gradstein R, Shaw AJ, Laenen B, Vanderpoorten A. Evolution of sexual systems, dispersal strategies and habitat selection in the liverwort genus Radula. New Phytol 2011; 192:225-236. [PMID: 21649662 DOI: 10.1111/j.1469-8137.2011.03783.x] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 05/14/2023]
Abstract
• Shifts in sexual systems are among the most common and important transitions in plants and are correlated with a suite of life-history traits. The evolution of sexual systems and their relationships to gametophyte size, sexual and asexual reproduction, and epiphytism are examined here in the liverwort genus Radula. • The sequence of trait acquisition and the phylogenetic correlations between those traits was investigated using comparative methods. • Shifts in sexual systems recurrently occurred from dioecy to monoecy within facultative epiphyte lineages. Production of specialized asexual gemmae was correlated to neither dioecy nor strict epiphytism. • The significant correlations among life-history traits related to sexual systems and habitat conditions suggest the existence of evolutionary trade-offs. Obligate epiphytes do not produce gemmae more frequently than facultative epiphytes and disperse by whole gametophyte fragments, presumably to avoid the sensitive protonemal stage in a habitat prone to rapid changes in moisture availability. As dispersal ranges correlate with diaspore size, this reinforces the notion that epiphytes experience strong dispersal limitations. Our results thus provide the evolutionary complement to metapopulation, metacommunity and experimental studies demonstrating trade-offs between dispersal distance, establishment ability, and life-history strategy, which may be central to the evolution of reproductive strategies in bryophytes.
Affiliation(s)
- Nicolas Devos
- Institut de Botanique, Université de Liege, B-22 Sart Tilman, B-4000 Liege, Belgium
- Biology Department, Duke University, Box 90338 Durham, NC 27708, USA
- Matt A M Renner
- National Herbarium of New South Wales, Royal Botanic Gardens Sydney, Mrs Macquaries Road, Sydney, NSW 2000, Australia
- Robbert Gradstein
- Dept. Systématique et Evolution, Muséum National d'Histoire Naturelle, 57 rue Cuvier, 75231 Paris cedex 05, France
- A Jonathan Shaw
- Biology Department, Duke University, Box 90338 Durham, NC 27708, USA
- Benjamin Laenen
- Institut de Botanique, Université de Liege, B-22 Sart Tilman, B-4000 Liege, Belgium
- Alain Vanderpoorten
- Institut de Botanique, Université de Liege, B-22 Sart Tilman, B-4000 Liege, Belgium
|
75
|
Cao S, Kumimoto RW, Siriwardana CL, Risinger JR, Holt BF. Identification and characterization of NF-Y transcription factor families in the monocot model plant Brachypodium distachyon. PLoS One 2011; 6:e21805. [PMID: 21738795 PMCID: PMC3128097 DOI: 10.1371/journal.pone.0021805] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.5] [Received: 04/14/2011] [Accepted: 06/07/2011] [Indexed: 11/19/2022] Open
Abstract
Background Nuclear Factor Y (NF-Y) is a heterotrimeric transcription factor composed of NF-YA, NF-YB and NF-YC proteins. Using the dicot plant model system Arabidopsis thaliana (Arabidopsis), NF-Y were previously shown to control a variety of agronomically important traits, including drought tolerance, flowering time, and seed development. The aim of the current research was to identify and characterize NF-Y families in the emerging monocot model plant Brachypodium distachyon (Brachypodium) with the long term goal of assisting in the translation of known dicot NF-Y functions to the grasses. Methodology/Principal Findings We identified, annotated, and further characterized 7 NF-YA, 17 NF-YB, and 12 NF-YC proteins in Brachypodium (BdNF-Y). By examining phylogenetic relationships, orthology predictions, and tissue-specific expression patterns for all 36 BdNF-Y, we proposed numerous examples of likely functional conservation between dicots and monocots. To test one of these orthology predictions, we demonstrated that a BdNF-YB with predicted orthology to Arabidopsis floral-promoting NF-Y proteins can rescue a late flowering Arabidopsis mutant. Conclusions/Significance The Brachypodium genome encodes a similar complement of NF-Y to other sequenced angiosperms. Information regarding NF-Y phylogenetic relationships, predicted orthologies, and expression patterns can facilitate their study in the grasses. The current data serves as an entry point for translating many NF-Y functions from dicots to the genetically tractable monocot model system Brachypodium. In turn, studies of NF-Y function in Brachypodium promise to be more readily translatable to the agriculturally important grasses.
Affiliation(s)
- Shuanghe Cao
- Department of Botany and Microbiology, University of Oklahoma, Norman, Oklahoma, United States of America
- Roderick W. Kumimoto
- Department of Botany and Microbiology, University of Oklahoma, Norman, Oklahoma, United States of America
- Chamindika L. Siriwardana
- Department of Botany and Microbiology, University of Oklahoma, Norman, Oklahoma, United States of America
- Jan R. Risinger
- Department of Botany and Microbiology, University of Oklahoma, Norman, Oklahoma, United States of America
- Ben F. Holt
- Department of Botany and Microbiology, University of Oklahoma, Norman, Oklahoma, United States of America
|
76
|
Currie TE, Mace R. Mode and tempo in the evolution of socio-political organization: reconciling 'Darwinian' and 'Spencerian' evolutionary approaches in anthropology. Philos Trans R Soc Lond B Biol Sci 2011; 366:1108-17. [PMID: 21357233 DOI: 10.1098/rstb.2010.0318] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Indexed: 11/12/2022] Open
Abstract
Traditional investigations of the evolution of human social and political institutions trace their ancestry back to nineteenth century social scientists such as Herbert Spencer, and have concentrated on the increase in socio-political complexity over time. More recent studies of cultural evolution have been explicitly informed by Darwinian evolutionary theory and focus on the transmission of cultural traits between individuals. These two approaches to investigating cultural change are often seen as incompatible. However, we argue that many of the defining features and assumptions of 'Spencerian' cultural evolutionary theory represent testable hypotheses that can and should be tackled within a broader 'Darwinian' framework. In this paper we apply phylogenetic comparative techniques to data from Austronesian-speaking societies of Island South-East Asia and the Pacific to test hypotheses about the mode and tempo of human socio-political evolution. We find support for three ideas often associated with Spencerian cultural evolutionary theory: (i) political organization has evolved through a regular sequence of forms, (ii) increases in hierarchical political complexity have been more common than decreases, and (iii) political organization has co-evolved with the wider presence of hereditary social stratification.
Affiliation(s)
- Thomas E Currie
- Evolutionary Cognitive Science Research Centre, Graduate School of Arts and Sciences, University of Tokyo, Tokyo, Japan.
|
77
|
Lees JG, Heriche JK, Morilla I, Ranea JA, Orengo CA. Systematic computational prediction of protein interaction networks. Phys Biol 2011; 8:035008. [PMID: 21572181 DOI: 10.1088/1478-3975/8/3/035008] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 01/29/2023]
Abstract
Determining the network of physical protein associations is an important first step in developing mechanistic evidence for elucidating biological pathways. Despite rapid advances in the field of high throughput experiments to determine protein interactions, the majority of associations remain unknown. Here we describe computational methods for significantly expanding protein association networks. We describe methods for integrating multiple independent sources of evidence to obtain higher quality predictions, and we compare the major publicly available resources that experimentalists can use.
Affiliation(s)
- J G Lees
- Research Department of Structural & Molecular Biology, University College London, London, UK.
|
78
|
Tuller T, Girshovich Y, Sella Y, Kreimer A, Freilich S, Kupiec M, Gophna U, Ruppin E. Association between translation efficiency and horizontal gene transfer within microbial communities. Nucleic Acids Res 2011; 39:4743-55. [PMID: 21343180 PMCID: PMC3113575 DOI: 10.1093/nar/gkr054] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.5] [Indexed: 01/22/2023] Open
Abstract
Horizontal gene transfer (HGT) is a major force in microbial evolution. Previous studies have suggested that a variety of factors, including restricted recombination and toxicity of foreign gene products, may act as barriers to the successful integration of horizontally transferred genes. This study identifies an additional central barrier to HGT: the lack of co-adaptation between the codon usage of the transferred gene and the tRNA pool of the recipient organism. Analyzing the genomic sequences of more than 190 microorganisms and the HGT events that have occurred between them, we show that the number of genes that were horizontally transferred between organisms is positively correlated with the similarity between their tRNA pools. Those genes that are better adapted to the tRNA pools of the target genomes tend to undergo more frequent HGT. At the community (or environment) level, organisms that share a common ecological niche tend to have similar tRNA pools. These results remain significant after controlling for diverse ecological and evolutionary parameters. Our analysis demonstrates that there are bi-directional associations between the similarity in the tRNA pools of organisms and the number of HGT events occurring between them. Similar tRNA pools between a donor and a host tend to increase the probability that a horizontally acquired gene will become fixed in its new genome. Our results also suggest that frequent HGT may be a homogenizing force that increases the similarity in the tRNA pools of organisms within the same community.
Affiliation(s)
- Tamir Tuller
- Faculty of Mathematics and Computer Science, Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Blavatnik School of Computer Science, School of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel
|
79
|
Breton S, Stewart DT, Shepardson S, Trdan RJ, Bogan AE, Chapman EG, Ruminas AJ, Piontkivska H, Hoeh WR. Novel protein genes in animal mtDNA: a new sex determination system in freshwater mussels (Bivalvia: Unionoida)? Mol Biol Evol 2010; 28:1645-59. [PMID: 21172831 DOI: 10.1093/molbev/msq345] [Citation(s) in RCA: 120] [Impact Index Per Article: 8.6] [Indexed: 11/14/2022] Open
Abstract
Mitochondrial (mt) function depends critically on optimal interactions between components encoded by mt and nuclear DNAs. Strict maternal mitochondrial DNA (mtDNA) inheritance (SMI) is thought to have evolved in animal species to maintain mito-nuclear complementarity by preventing the spread of selfish mt elements, thus typically rendering mtDNA heteroplasmy evolutionarily ephemeral. Here, we show that mtDNA intraorganismal heteroplasmy can have deterministic underpinnings and persist for hundreds of millions of years. We demonstrate that the only exception to SMI in the animal kingdom, that is, the doubly uniparental mtDNA inheritance system in bivalves, with its three-way interactions among egg mt-, sperm mt- and nucleus-encoded gene products, is tightly associated with the maintenance of separate male and female sexes (dioecy) in freshwater mussels. Specifically, this mother-through-daughter and father-through-son mtDNA inheritance system, containing highly differentiated mt genomes, is found in all dioecious freshwater mussel species. Conversely, all hermaphroditic species lack the paternally transmitted mtDNA (=possess SMI) and have heterogeneous macromutations in the recently discovered, novel protein-coding gene (F-orf) in their maternally transmitted mt genomes. Using immunoelectron microscopy, we have localized the F-open reading frame (ORF) protein, likely involved in specifying separate sexes, in mitochondria and in the nucleus. Our results support the hypothesis that proteins coded by the highly divergent maternally and paternally transmitted mt genomes could be directly involved in sex determination in freshwater mussels. Concomitantly, our study demonstrates novel features for animal mt genomes: the existence of additional, lineage-specific, mtDNA-encoded proteins with functional significance and the involvement of mtDNA-encoded proteins in extra-mt functions.
Our results open new avenues for the identification, characterization, and functional analyses of ORFs in the intergenic regions, previously defined as "noncoding," found in a large proportion of animal mt genomes.
Affiliation(s)
- Sophie Breton
- Department of Biological Sciences, Kent State University, Kent, OH, USA.
|
80
|
Ta HX, Koskinen P, Holm L. A novel method for assigning functional linkages to proteins using enhanced phylogenetic trees. Bioinformatics 2010; 27:700-6. [PMID: 21169380 DOI: 10.1093/bioinformatics/btq705] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022]
Abstract
MOTIVATION Functional linkages implicate pairwise relationships between proteins that work together to implement biological tasks. During evolution, functionally linked proteins are likely to be preserved or eliminated across a range of genomes in a correlated fashion. Based on this hypothesis, phylogenetic profiling-based approaches try to detect pairs of protein families that show similar evolutionary patterns. Traditionally, the evolutionary pattern of a protein is encoded by either a binary profile of presence and absence of this protein across species or an occurrence profile that indicates the distribution of copies of this protein across species. RESULTS In our study, we characterize each protein by its enhanced phylogenetic tree, a novel graphical model of the evolution of a protein family in which speciation and duplication events are explicitly marked. By topological comparison between enhanced phylogenetic trees, we are able to detect the functionally associated protein pairs. Because the enhanced phylogenetic trees contain more evolutionary information about proteins, our method performs better and discovers functional linkages among proteins more reliably than the conventional approaches.
Affiliation(s)
- Hung Xuan Ta
- Institute of Biotechnology, Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland.
|
81
|
Vanderpoorten A, Gradstein SR, Carine MA, Devos N. The ghosts of Gondwana and Laurasia in modern liverwort distributions. Biol Rev Camb Philos Soc 2010; 85:471-87. [PMID: 20015315 DOI: 10.1111/j.1469-185x.2009.00111.x] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
Recent advances in phylogenetics and, in particular, molecular dating, indicate that transoceanic dispersal has played an important role in shaping plant and animal distributions, obscuring any effect of tectonic history. Taxonomic sampling in biogeographic studies is, however, systematically biased towards vertebrates and higher plants and the possibility remains that a much stronger signature of ancient vicariance might be evident among other organisms, particularly among basal land plants. Here, an explicit Bayesian model-based approach was used to investigate global-scale biogeographic patterns among liverwort genera and to determine whether the patterns identified are consistent with the expectations of vicariance or dispersal scenarios. The distribution of each genus was mapped onto the phylograms describing the floristic affinities among areas in order to define the synapomorphic transitions supporting the observed groupings. The probabilities of change in a branch were calculated by implementing the Markov model of BayesTraits. The consistent ambiguity in ancestral state reconstructions returned by the unconstrained, two-rate model indicated that the overall signal in the data was weak, leading us to test the performance of competing, explicit models. The analyses resolved clades of geographic areas that are mostly consistent with the kingdoms traditionally identified for plants and animals, but with strikingly lower rates of endemism. The major split observed in the phylograms is into almost entirely Laurasian and Gondwanan clades. Other patterns recovered by the analyses, including Wallace's line and the South Atlantic Disjunction, have also traditionally been interpreted in terms of vicariance. These observations contrast with the idea that, in spore-dispersed organisms like bryophytes and pteridophytes, dispersal obscures evidence of vicariance. 
However, some discrepancies between the liverwort trees and expectations from a continental drift scenario were observed, such as the sister-group relationship of the Australian and New Zealand floras, which is supported by the co-occurrence of many genera, often endemic to these two areas. Together with an interpretation of the results within a phylogenetic context, our analyses suggest that patterns, which are at first sight consistent with an ancient vicariance hypothesis, may, in fact, conceal a complex mixture of relictual distributions and more recent, asymmetrical dispersal events. Our results provide a framework for testing specific evolutionary hypotheses concerning the extremely low levels of endemism in bryophytes and in particular, the significance of dispersal and cryptic diversification.
Affiliation(s)
- Alain Vanderpoorten
- Institute of Botany, University of Liège, B22 Sart Tilman, 4000 Liège, Belgium.
|
82
|
Nian Chua H. Prediction of Protein Function. Genomics 2010. [DOI: 10.1002/9780470711675.ch9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
|
83
|
Ferrer L, Dale JM, Karp PD. A systematic study of genome context methods: calibration, normalization and combination. BMC Bioinformatics 2010; 11:493. [PMID: 20920312 PMCID: PMC3247869 DOI: 10.1186/1471-2105-11-493] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 02/13/2010] [Accepted: 10/01/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Genome context methods have been introduced in the last decade as automatic methods to predict functional relatedness between genes in a target genome using the patterns of existence and relative locations of the homologs of those genes in a set of reference genomes. Much work has been done in the application of these methods to different bioinformatics tasks, but few papers present a systematic study of the methods and their combination necessary for their optimal use. RESULTS We present a thorough study of the four main families of genome context methods found in the literature: phylogenetic profile, gene fusion, gene cluster, and gene neighbor. We find that for most organisms the gene neighbor method outperforms the phylogenetic profile method by as much as 40% in sensitivity, being competitive with the gene cluster method at low sensitivities. Gene fusion is generally the worst performing of the four methods. A thorough exploration of the parameter space for each method is performed and results across different target organisms are presented. We propose the use of normalization procedures such as those used on microarray data for the genome context scores. We show that substantial gains can be achieved from the use of a simple normalization technique. In particular, the sensitivity of the phylogenetic profile method is improved by around 25% after normalization, resulting, to our knowledge, in the best-performing phylogenetic profile system in the literature. Finally, we show results from combining the various genome context methods into a single score. When using a cross-validation procedure to train the combiners, with both original and normalized scores as input, a decision tree combiner results in gains of up to 20% with respect to the gene neighbor method. 
Overall, this represents a gain of around 15% over what can be considered the state of the art in this area: the four original genome context methods combined using a procedure like that used in the STRING database. Unfortunately, we find that these gains disappear when the combiner is trained only with organisms that are phylogenetically distant from the target organism. CONCLUSIONS Our experiments indicate that gene neighbor is the best individual genome context method and that gains from the combination of individual methods are very sensitive to the training data used to obtain the combiner's parameters. If adequate training data is not available, using the gene neighbor score by itself instead of a combined score might be the best choice.
Affiliation(s)
- Luciana Ferrer
- Artificial Intelligence Center, SRI International, Menlo Park, California, USA
- Joseph M Dale
- Artificial Intelligence Center, SRI International, Menlo Park, California, USA
- Peter D Karp
- Artificial Intelligence Center, SRI International, Menlo Park, California, USA
|
84
|
Janga SC, Díaz-Mejía JJ, Moreno-Hagelsieb G. Network-based function prediction and interactomics: the case for metabolic enzymes. Metab Eng 2010; 13:1-10. [PMID: 20654726 DOI: 10.1016/j.ymben.2010.07.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Received: 06/02/2010] [Revised: 07/15/2010] [Accepted: 07/16/2010] [Indexed: 12/19/2022]
Abstract
As sequencing technologies increase in power, determining the functions of unknown proteins encoded by the DNA sequences so produced becomes a major challenge. Functional annotation is commonly done on the basis of amino-acid sequence similarity alone. Long after sequence similarity becomes undetectable by pair-wise comparison, profile-based identification of homologs can often succeed due to the conservation of position-specific patterns, important for a protein's three-dimensional folding and function. Nevertheless, prediction of protein function from homology-driven approaches is not without problems. Homologous proteins might evolve different functions and the power of homology detection has already started to reach its maximum. Computational methods for inferring protein function, which exploit the context of a protein in cellular networks, have come to be built on top of homology-based approaches. These network-based functional inference techniques provide both a first-hand hint of a protein's functional role and offer complementary insights to traditional methods for understanding the function of uncharacterized proteins. Most recent network-based approaches aim to integrate diverse kinds of functional interactions to boost both coverage and confidence level. These techniques not only promise to solve the moonlighting aspect of proteins by annotating proteins with multiple functions, but also increase our understanding of the interplay between different functional classes in a cell. In this article we review the state of the art in network-based function prediction and describe some of the underlying difficulties and successes. Given the volume of high-throughput data that is being reported, the time is ripe to employ these network-based approaches, which can be used to unravel the functions of the uncharacterized proteins accumulating in the genomic databases.
Affiliation(s)
- S C Janga
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, United Kingdom.
|
85
|
Genomes as documents of evolutionary history. Trends Ecol Evol 2010; 25:224-32. [DOI: 10.1016/j.tree.2009.09.007] [Citation(s) in RCA: 59] [Impact Index Per Article: 4.2] [Received: 01/21/2009] [Revised: 09/18/2009] [Accepted: 09/21/2009] [Indexed: 02/02/2023]
|
86
|
Raman K. Construction and analysis of protein-protein interaction networks. Automated Experimentation 2010; 2:2. [PMID: 20334628 PMCID: PMC2834675 DOI: 10.1186/1759-4499-2-2] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.2] [Received: 11/25/2009] [Accepted: 02/15/2010] [Indexed: 12/28/2022]
Abstract
Protein–protein interactions form the basis for a vast majority of cellular events, including signal transduction and transcriptional regulation. It is now understood that the study of interactions between cellular macromolecules is fundamental to the understanding of biological systems. Interactions between proteins have been studied through a number of high-throughput experiments and have also been predicted through an array of computational methods that leverage the vast amount of sequence data generated in the last decade. In this review, I discuss some of the important computational methods for the prediction of functional linkages between proteins. I then give a brief overview of some of the databases and tools that are useful for a study of protein–protein interactions. I also present an introduction to network theory, followed by a discussion of the parameters commonly used in analysing networks, important network topologies, as well as methods to identify important network components, based on perturbations.
Affiliation(s)
- Karthik Raman
- Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland.
|
87
|
Tuller T, Felder Y, Kupiec M. Discovering local patterns of co-evolution: computational aspects and biological examples. BMC Bioinformatics 2010; 11:43. [PMID: 20096103 PMCID: PMC3224649 DOI: 10.1186/1471-2105-11-43] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 03/02/2009] [Accepted: 01/22/2010] [Indexed: 12/02/2022] Open
Abstract
Background Co-evolution is the process in which two (or more) sets of orthologs exhibit a similar or correlative pattern of evolution. Co-evolution is a powerful way to learn about the functional interdependencies between sets of genes and cellular functions and to predict physical interactions. More generally, it can be used for answering fundamental questions about the evolution of biological systems. Orthologs that exhibit a strong signal of co-evolution in a certain part of the evolutionary tree may show a mild signal of co-evolution in other branches of the tree. The major reasons for this phenomenon are noise in the biological input, genes that gain or lose functions, and the fact that some measures of co-evolution relate to rare events such as positive selection. Previous publications in the field dealt with the problem of finding sets of genes that co-evolved along an entire underlying phylogenetic tree, without considering the fact that often co-evolution is local. Results In this work, we describe a new set of biological problems that are related to finding patterns of local co-evolution. We discuss their computational complexity and design algorithms for solving them. These algorithms outperform other bi-clustering methods as they are designed specifically for solving the set of problems mentioned above. We use our approach to trace the co-evolution of fungal, eukaryotic, and mammalian genes at high resolution across the different parts of the corresponding phylogenetic trees. Specifically, we discover regions in the fungi tree that are enriched with positive evolution. We show that metabolic genes exhibit a remarkable level of co-evolution and different patterns of co-evolution in various biological datasets. In addition, we find that protein complexes that are related to gene expression exhibit non-homogenous levels of co-evolution across different parts of the fungi evolutionary line. 
In the case of mammalian evolution, signaling pathways that are related to neurotransmission exhibit a relatively higher level of co-evolution along the primate subtree. Conclusions We show that finding local patterns of co-evolution is a computationally challenging task and we offer novel algorithms that allow us to solve this problem, thus opening a new approach for analyzing the evolution of biological systems.
Affiliation(s)
- Tamir Tuller
- School of Computer Science, Tel Aviv University, Tel Aviv, Israel.
|
88
|
Ruano-Rubio V, Poch O, Thompson JD. Comparison of eukaryotic phylogenetic profiling approaches using species tree aware methods. BMC Bioinformatics 2009; 10:383. [PMID: 19930674 PMCID: PMC2787529 DOI: 10.1186/1471-2105-10-383] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 07/28/2009] [Accepted: 11/24/2009] [Indexed: 12/02/2022] Open
Abstract
Background Phylogenetic profiling encompasses an important set of methodologies for in silico high throughput inference of functional relationships between genes. The simplest profiles represent the distribution of gene presence-absence in a set of species as a sequence of 0's and 1's, and it is assumed that functionally related genes will have more similar profiles. The methodology has been successfully used in numerous studies of prokaryotic genomes, although its application in eukaryotes appears problematic, with reported low accuracy due to the complex genomic organization within this domain of life. Recently some groups have proposed an alternative approach based on the correlation of homologous gene group sizes, taking into account all potentially informative genetic events leading to a change in group size, regardless of whether they result in a de novo group gain or total gene group loss. Results We have compared the performance of classical presence-absence and group size based approaches using a large, diverse set of eukaryotic species. In contrast to most previous comparisons in Eukarya, we take into account the species phylogeny. We also compare the approaches using two different group categories, based on orthology and on domain-sharing. Our results confirm a limited overall performance of phylogenetic profiling in eukaryotes. Although group size based approaches initially showed an increase in performance for the domain-sharing based groups, this seems to be an overestimation due to a simplistic negative control dataset and the choice of null hypothesis rejection criteria. Conclusion Presence-absence profiling represents a more accurate classifier of related versus non-related profile pairs, when the profiles under consideration have enough information content. Group size based approaches provide a complementary means of detecting domain or family level co-evolution between groups that may be elusive to presence-absence profiling. 
Moreover, the positive correlation between co-evolution scores and functional links implies that these methods could be used to estimate functional distances between gene groups and to cluster them based on their functional relatedness. This study should have important implications for the future development and application of phylogenetic profiling methods, not only in eukaryotic, but also in prokaryotic datasets.
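The presence-absence profiling described in this abstract can be illustrated with a minimal sketch. It assumes toy profiles over eight species and uses two common scores, Jaccard similarity and a one-sided hypergeometric co-occurrence test; the gene names, the profiles, and the choice of scores are illustrative assumptions, not the authors' pipeline. Both scores are tree-ignorant, so they do not reproduce the species-tree-aware comparison that the study performs.

```python
from math import comb

def jaccard(p, q):
    """Jaccard similarity of two binary presence-absence profiles."""
    both = sum(1 for a, b in zip(p, q) if a and b)
    either = sum(1 for a, b in zip(p, q) if a or b)
    return both / either if either else 0.0

def cooccurrence_pvalue(p, q):
    """One-sided hypergeometric p-value: probability of observing at least
    this many shared presences if q's presences were placed at random."""
    n = len(p)
    k1, k2 = sum(p), sum(q)
    observed = sum(1 for a, b in zip(p, q) if a and b)
    tail = sum(
        comb(k1, x) * comb(n - k1, k2 - x)
        for x in range(observed, min(k1, k2) + 1)
    )
    return tail / comb(n, k2)

# Hypothetical profiles: 1 = gene present in that species, 0 = absent.
gene_a = [1, 1, 0, 1, 0, 0, 1, 0]
gene_b = [1, 1, 0, 1, 0, 0, 0, 0]
gene_c = [0, 0, 1, 0, 1, 1, 0, 1]

print(jaccard(gene_a, gene_b))              # similar profiles
print(jaccard(gene_a, gene_c))              # complementary profiles
print(cooccurrence_pvalue(gene_a, gene_b))  # smaller means stronger association
```

Because neither score corrects for shared ancestry, closely related species contribute correlated observations and can inflate apparent associations.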
Affiliation(s)
- Valentín Ruano-Rubio
- Laboratoire de Biologie et Génomique Intégrative, Département de Biologie et Génomique Structurales, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS/INSERM/UDS, Illkirch, France.
|
89
|
Abstract
Genome assemblies are now available for nine primate species, and large-scale sequencing projects are underway or approved for six others. An explicitly evolutionary and phylogenetic approach to comparative genomics, called phylogenomics, will be essential in unlocking the valuable information about evolutionary history and genomic function that is contained within these genomes. However, most phylogenomic analyses so far have ignored the effects of variation in ancestral populations on patterns of sequence divergence. These effects can be pronounced in the primates, owing to large ancestral effective population sizes relative to the intervals between speciation events. In particular, local genealogies can vary considerably across loci, which can produce biases and diminished power in many phylogenomic analyses of interest, including phylogeny reconstruction, the identification of functional elements, and the detection of natural selection. At the same time, this variation in genealogies can be exploited to gain insight into the nature of ancestral populations. In this Perspective, I explore this area of intersection between phylogenetics and population genetics, and its implications for primate phylogenomics. I begin by "lifting the hood" on the conventional tree-like representation of the phylogenetic relationships between species, to expose the population-genetic processes that operate along its branches. Next, I briefly review an emerging literature that makes use of the complex relationships among coalescence, recombination, and speciation to produce inferences about evolutionary histories, ancestral populations, and natural selection. Finally, I discuss remaining challenges and future prospects at this nexus of phylogenetics, population genetics, and genomics.
Affiliation(s)
- Adam Siepel
- Department of Biological Statistics and Computational Biology, Cornell Center for Comparative and Population Genomics, Cornell University, Ithaca, New York 14853, USA.
|
90
|
Stubben CJ, Duffield ML, Cooper IA, Ford DC, Gans JD, Karlyshev AV, Lingard B, Oyston PCF, de Rochefort A, Song J, Wren BW, Titball RW, Wolinsky M. Steps toward broad-spectrum therapeutics: discovering virulence-associated genes present in diverse human pathogens. BMC Genomics 2009; 10:501. [PMID: 19874620 PMCID: PMC2774872 DOI: 10.1186/1471-2164-10-501] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 04/25/2009] [Accepted: 10/29/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND New and improved antimicrobial countermeasures are urgently needed to counteract increased resistance to existing antimicrobial treatments and to combat currently untreatable or new emerging infectious diseases. We demonstrate that computational comparative genomics, together with experimental screening, can identify potential generic (i.e., conserved across multiple pathogen species) and novel virulence-associated genes that may serve as targets for broad-spectrum countermeasures. RESULTS Using phylogenetic profiles of protein clusters from completed microbial genome sequences, we identified seventeen protein candidates that are common to diverse human pathogens and absent or uncommon in non-pathogens. Mutants of 13 of these candidates were successfully generated in Yersinia pseudotuberculosis and the potential role of the proteins in virulence was assayed in an animal model. Six candidate proteins are suggested to be involved in the virulence of Y. pseudotuberculosis, none of which have previously been implicated in the virulence of Y. pseudotuberculosis and three of which have no record of involvement in the virulence of any bacteria. CONCLUSION This work demonstrates a strategy for the identification of potential virulence factors that are conserved across a number of human pathogenic bacterial species, confirming the usefulness of this tool.
Affiliation(s)
- Chris J Stubben
- Bioscience Division, Los Alamos National Laboratory, Los Alamos, NM, USA.
|
91
|
Gouret P, Thompson JD, Pontarotti P. PhyloPattern: regular expressions to identify complex patterns in phylogenetic trees. BMC Bioinformatics 2009; 10:298. [PMID: 19765311 PMCID: PMC2759962 DOI: 10.1186/1471-2105-10-298] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Received: 05/15/2009] [Accepted: 09/19/2009] [Indexed: 11/23/2022] Open
Abstract
Background To effectively apply evolutionary concepts in genome-scale studies, large numbers of phylogenetic trees have to be automatically analysed, at a level approaching human expertise. Complex architectures must be recognized within the trees, so that associated information can be extracted. Results Here, we present a new software library, PhyloPattern, for automating tree manipulations and analysis. PhyloPattern includes three main modules, which address essential tasks in high-throughput phylogenetic tree analysis: node annotation, pattern matching, and tree comparison. PhyloPattern thus allows the programmer to focus on: i) the use of predefined or user-defined annotation functions to perform immediate or deferred evaluation of node properties, ii) the search for user-defined patterns in large phylogenetic trees, iii) the pairwise comparison of trees by dynamically generating patterns from one tree and applying them to the other. Conclusion PhyloPattern greatly simplifies and accelerates the work of the computer scientist in the evolutionary biology field. The library has been used to automatically identify phylogenetic evidence for domain shuffling or gene loss events in the evolutionary histories of protein sequences. However, any workflow that relies on phylogenetic tree analysis could be automated with PhyloPattern.
Affiliation(s)
- Philippe Gouret
- UMR 6632, Evolutionary Biology and Modeling, University of Provence, Marseille, France.
|
92
|
Baussand J, Carbone A. A combinatorial approach to detect coevolved amino acid networks in protein families of variable divergence. PLoS Comput Biol 2009; 5:e1000488. [PMID: 19730672 PMCID: PMC2723916 DOI: 10.1371/journal.pcbi.1000488] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 07/25/2008] [Accepted: 07/27/2009] [Indexed: 11/17/2022] Open
Abstract
Communication between distant sites often defines the biological role of a protein: amino acid long-range interactions are as important in binding specificity, allosteric regulation and conformational change as residues directly contacting the substrate. The maintaining of functional and structural coupling of long-range interacting residues requires coevolution of these residues. Networks of interaction between coevolved residues can be reconstructed, and from the networks, one can possibly derive insights into functional mechanisms for the protein family. We propose a combinatorial method for mapping conserved networks of amino acid interactions in a protein which is based on the analysis of a set of aligned sequences, the associated distance tree and the combinatorics of its subtrees. The degree of coevolution of all pairs of coevolved residues is identified numerically, and networks are reconstructed with a dedicated clustering algorithm. The method drops the constraints on high sequence divergence limiting the range of applicability of the statistical approaches previously proposed. We apply the method to four protein families where we show an accurate detection of functional networks and the possibility to treat sets of protein sequences of variable divergence.
Affiliation(s)
- Julie Baussand
- Génomique Analytique, Université Pierre et Marie Curie, Paris, France
- Génomique des Microorganismes, CNRS, Paris, France
- Alessandra Carbone
- Génomique Analytique, Université Pierre et Marie Curie, Paris, France
- Génomique des Microorganismes, CNRS, Paris, France
|
93
|
Notebaart RA, Kensche PR, Huynen MA, Dutilh BE. Asymmetric relationships between proteins shape genome evolution. Genome Biol 2009; 10:R19. [PMID: 19216750 PMCID: PMC2688278 DOI: 10.1186/gb-2009-10-2-r19] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 11/20/2008] [Revised: 01/28/2009] [Accepted: 02/12/2009] [Indexed: 12/18/2022] Open
Abstract
An investigation of metabolic networks in E. coli and S. cerevisiae reveals that asymmetric protein interactions affect gene expression, the relative effect of gene knockouts and genome evolution. Background The relationships between proteins are often asymmetric: one protein (A) depends for its function on another protein (B), but the second protein does not depend on the first. In metabolic networks there are multiple pathways that converge into one central pathway. The enzymes in the converging pathways depend on the enzymes in the central pathway, but the enzymes in the latter do not depend on any specific enzyme in the converging pathways. Asymmetric relations are analogous to the “if->then” logical relation where A implies B, but B does not imply A (A->B). Results We show that the majority of relationships between enzymes in metabolic flux models of metabolism in Escherichia coli and Saccharomyces cerevisiae are asymmetric. We show furthermore that these asymmetric relationships are reflected in the expression of the genes encoding those enzymes, the effect of gene knockouts and the evolution of genomes. From the asymmetric relative dependency, one would expect that the gene that is relatively independent (B) can occur without the other dependent gene (A), but not the reverse. Indeed, when only one gene of an A->B pair is expressed, is essential, or is present in a genome after an evolutionary gain or loss, it tends to be the independent gene (B). This bias is strongest for genes encoding proteins whose asymmetric relationship is evolutionarily conserved. Conclusions The asymmetric relations between proteins that arise from the system properties of metabolic networks affect gene expression, the relative effect of gene knockouts and genome evolution in a predictable manner.
Affiliation(s)
- Richard A Notebaart
- Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences, Radboud University Nijmegen Medical Center, Geert Grooteplein 26-28, 6525 GA, Nijmegen, The Netherlands
|
94
|
Wang Z, Johnston PR, Yang ZL, Townsend JP. Evolution of reproductive morphology in leaf endophytes. PLoS One 2009; 4:e4246. [PMID: 19158947 PMCID: PMC2617777 DOI: 10.1371/journal.pone.0004246] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Received: 09/25/2008] [Accepted: 12/17/2008] [Indexed: 11/19/2022] Open
Abstract
The endophytic lifestyle has played an important role in the evolution of the morphology of reproductive structures (body) in one of the most problematic groups in fungal classification, the Leotiomycetes (Ascomycota). Mapping fungal morphologies to two groups in the Leotiomycetes, the Rhytismatales and Hemiphacidiaceae, reveals significant divergence in body size, shape and complexity. Mapping ecological roles to these taxa reveals that the groups include endophytic fungi living on leaves and saprobic fungi living on duff or dead wood. Finally, mapping of the morphologies to ecological roles reveals that leaf endophytes produce small, highly reduced fruiting bodies covered with fungal tissue or dead host tissue, while saprobic species produce large and intricate fruiting bodies. Intriguingly, resemblance between asexual conidiomata and sexual ascomata in some leotiomycetes implicates some common developmental pathways for sexual and asexual development in these fungi.
Affiliation(s)
- Zheng Wang
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, USA.
|
95
|
Caporaso JG, Smit S, Easton BC, Hunter L, Huttley GA, Knight R. Detecting coevolution without phylogenetic trees? Tree-ignorant metrics of coevolution perform as well as tree-aware metrics. BMC Evol Biol 2008; 8:327. [PMID: 19055758 PMCID: PMC2637866 DOI: 10.1186/1471-2148-8-327] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 04/08/2008] [Accepted: 12/03/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Identifying coevolving positions in protein sequences has myriad applications, ranging from understanding and predicting the structure of single molecules to generating proteome-wide predictions of interactions. Algorithms for detecting coevolving positions can be classified into two categories: tree-aware, which incorporate knowledge of phylogeny, and tree-ignorant, which do not. Tree-ignorant methods are frequently orders of magnitude faster, but are widely held to be insufficiently accurate because of a confounding of shared ancestry with coevolution. We conjectured that by using a null distribution that appropriately controls for the shared-ancestry signal, tree-ignorant methods would exhibit equivalent statistical power to tree-aware methods. Using a novel t-test transformation of coevolution metrics, we systematically compared four tree-aware and five tree-ignorant coevolution algorithms, applying them to myoglobin and myosin. We further considered the influence of sequence recoding using reduced-state amino acid alphabets, a common tactic employed in coevolutionary analyses to improve both statistical and computational performance. RESULTS Consistent with our conjecture, the transformed tree-ignorant metrics (particularly Mutual Information) often outperformed the tree-aware metrics. Our examination of the effect of recoding suggested that charge-based alphabets were generally superior for identifying the stabilizing interactions in alpha helices. Performance was not always improved by recoding, however, indicating that the choice of alphabet is critical. CONCLUSION The results suggest that t-test transformation of tree-ignorant metrics can be sufficient to control for patterns arising from shared ancestry.
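Mutual Information, the tree-ignorant metric singled out in this abstract, can be sketched for a pair of alignment columns as follows. This is a minimal illustration under stated assumptions: the column strings are invented toy data, the function name is hypothetical, and the code computes plain (untransformed) mutual information in bits. It does not implement the t-test transformation or the reduced-alphabet recoding that the study evaluates.

```python
from collections import Counter
from math import log2

def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two alignment columns,
    given as equal-length sequences of residues, one per aligned sequence."""
    if len(col_x) != len(col_y):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_x)
    count_x = Counter(col_x)
    count_y = Counter(col_y)
    count_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[a] * count_y[b]))
    return mi

# Hypothetical columns from an eight-sequence alignment.
col_1 = "AAAAKKKK"
col_2 = "DDDDEEEE"  # changes whenever col_1 changes
col_3 = "DEDEDEDE"  # varies independently of col_1

print(mutual_information(col_1, col_2))  # perfectly covarying pair
print(mutual_information(col_1, col_3))  # statistically independent pair
```

Raw mutual information is confounded by shared ancestry and by per-column entropy, which is the problem the null-distribution correction described in the abstract is meant to address.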
Affiliation(s)
- J Gregory Caporaso
- Department of Chemistry and Biochemistry, University of Colorado at Boulder, Boulder, CO, USA.
|
96
|
Karimpour-Fard A, Leach SM, Gill RT, Hunter LE. Predicting protein linkages in bacteria: which method is best depends on task. BMC Bioinformatics 2008; 9:397. [PMID: 18816389 PMCID: PMC2570368 DOI: 10.1186/1471-2105-9-397] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 02/14/2008] [Accepted: 09/24/2008] [Indexed: 01/06/2023] Open
Abstract
Background Applications of computational methods for predicting protein functional linkages are increasing. In recent years, several bacteria-specific methods for predicting linkages have been developed. The four major genomic context methods are: Gene cluster, Gene neighbor, Rosetta Stone, and Phylogenetic profiles. These methods have been shown to be powerful tools and this paper provides guidelines for when each method is appropriate by exploring different features of each method and potential improvements offered by their combination. We also review many previous treatments of these prediction methods, use the latest available annotations, and offer a number of new observations. Results Using Escherichia coli K12 and Bacillus subtilis, linkage predictions made by each of these methods were evaluated against three benchmarks: functional categories defined by COG and KEGG, known pathways listed in EcoCyc, and known operons listed in RegulonDB. Each evaluated method had strengths and weaknesses, with no one method dominating all aspects of predictive ability studied. For functional categories, as previous studies have shown, the Rosetta Stone method was individually best at detecting linkages and predicting functions among proteins with shared KEGG categories while the Phylogenetic profile method was best for linkage detection and function prediction among proteins with common COG functions. Differences in performance under COG versus KEGG may be attributable to the presence of paralogs. Better function prediction was observed when using a weighted combination of linkages based on reliability versus using a simple unweighted union of the linkage sets. For pathway reconstruction, 99 complete metabolic pathways in E. coli K12 (out of the 209 known, non-trivial pathways) and 193 pathways with 50% of their proteins were covered by linkages from at least one method. 
Gene neighbor was most effective individually on pathway reconstruction, with 48 complete pathways reconstructed. For operon prediction, Gene cluster completely predicted 59% of the known operons in E. coli K12 and 88% (333/418) in B. subtilis. Comparing two versions of the E. coli K12 operon database, many of the unannotated predictions in the earlier version were updated to true predictions in the later version. Using only linkages found by both Gene Cluster and Gene Neighbor improved the precision of operon predictions. Additionally, as previous studies have shown, combining features based on intergenic region and protein function improved the specificity of operon prediction. Conclusion A common problem for computational methods is the generation of a large number of false positives that might be caused by an incomplete source of validation. By comparing two versions of a database, we demonstrated the dramatic differences in reported results. We used several benchmarks on which we have shown the comparative effectiveness of each prediction method, as well as provided guidelines as to which method is most appropriate for a given prediction task.
Affiliation(s)
- Anis Karimpour-Fard
- Center for Computational Pharmacology, University of Colorado School of Medicine, Aurora, Colorado 80045, USA.
|
97
|
Liu Z, DeSantis TZ, Andersen GL, Knight R. Accurate taxonomy assignments from 16S rRNA sequences produced by highly parallel pyrosequencers. Nucleic Acids Res 2008; 36:e120. [PMID: 18723574 PMCID: PMC2566877 DOI: 10.1093/nar/gkn491] [Citation(s) in RCA: 386] [Impact Index Per Article: 24.1] [Indexed: 01/09/2023] Open
Abstract
The recent introduction of massively parallel pyrosequencers allows rapid, inexpensive analysis of microbial community composition using 16S ribosomal RNA (rRNA) sequences. However, a major challenge is to design a workflow so that taxonomic information can be accurately and rapidly assigned to each read, so that the composition of each community can be linked back to likely ecological roles played by members of each species, genus, family or phylum. Here, we use three large 16S rRNA datasets to test whether taxonomic information based on the full-length sequences can be recaptured by short reads that simulate the pyrosequencer outputs. We find that different taxonomic assignment methods vary radically in their ability to recapture the taxonomic information in full-length 16S rRNA sequences: most methods are sensitive to the region of the 16S rRNA gene that is targeted for sequencing, but many combinations of methods and rRNA regions produce consistent and accurate results. To process large datasets of partial 16S rRNA sequences obtained from surveys of various microbial communities, including those from human body habitats, we recommend the use of Greengenes or RDP classifier with fragments of at least 250 bases, starting from one of the primers R357, R534, R798, F343 or F517.
Affiliation(s)
- Zongzhi Liu
- Department of Chemistry and Biochemistry, UCB 215, University of Colorado at Boulder, Boulder, CO 80309-0215, USA
|
98
|
Jabbour F, Damerval C, Nadot S. Evolutionary trends in the flowers of Asteridae: is polyandry an alternative to zygomorphy? Ann Bot 2008; 102:153-65. [PMID: 18511411 PMCID: PMC2712368 DOI: 10.1093/aob/mcn082] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 02/16/2008] [Revised: 03/17/2008] [Accepted: 04/21/2008] [Indexed: 05/12/2023]
Abstract
BACKGROUND AND AIMS Floral symmetry presents two main states in angiosperms, actinomorphy (polysymmetry or radial symmetry) and zygomorphy (monosymmetry or bilateral symmetry). Transitions from actinomorphy to zygomorphy have occurred repeatedly among flowering plants, possibly in coadaptation with specialized pollinators. In this paper, the rules controlling the evolution of floral symmetry were investigated to determine in which architectural context zygomorphy can evolve. METHODS Floral traits potentially associated with perianth symmetry shifts in Asteridae, one of the major clades of the core eudicots, were selected: namely the perianth merism, the presence and number of spurs, and the androecium organ number. The evolution of these characters was optimized on a composite tree. Correlations between symmetry and the other morphological traits were then examined using a phylogenetic comparative method. KEY RESULTS The analyses reveal that the evolution of floral symmetry in Asteridae is conditioned by both androecium organ number and perianth merism and that zygomorphy is a prerequisite to the emergence of spurs. CONCLUSIONS The statistically significant correlation between perianth zygomorphy and oligandry suggests that the evolution of floral symmetry could be canalized by developmental or spatial constraint. Interestingly, the evolution of polyandry in an actinomorphic context appears as an alternative evolutionary pathway to zygomorphy in Asteridae. These results may be interpreted either in terms of plant-pollinator adaptation or in terms of developmental or physical constraints. The results are discussed in relation to current knowledge about the molecular bases underlying floral symmetry.
Affiliation(s)
- Florian Jabbour
- Université Paris-Sud, Laboratoire Ecologie, Systématique, Evolution, CNRS UMR 8079, AgroParisTech, Orsay, F-91405, France.
|
99
|
Gomez SM, Choi K, Wu Y. Prediction of protein-protein interaction networks. Curr Protoc Bioinformatics 2008; Chapter 8:8.2.1-8.2.14. [PMID: 18551416 DOI: 10.1002/0471250953.bi0802s22] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]
Abstract
This unit offers a general overview of several techniques that have been developed for inferring functional and/or protein-protein interaction networks. The majority of these use whole-genome sequences as their primary input source of data. In addition, a few methods that utilize both protein features and experimental protein-protein interaction data directly in the prediction of new interactions have recently been developed. While an exhaustive list of approaches is not presented, it is hoped that the reader will gain a sense of how these approaches are implemented and an idea of their relative strengths and weaknesses, and a broader perspective on the type of work being conducted in this highly active area of research.
Affiliation(s)
- Shawn M Gomez
- Joint Department of Biomedical Engineering, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
100
|
Karimpour-Fard A, Leach SM, Hunter LE, Gill RT. The topology of the bacterial co-conserved protein network and its implications for predicting protein function. BMC Genomics 2008; 9:313. [PMID: 18590549 PMCID: PMC2488357 DOI: 10.1186/1471-2164-9-313] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 04/23/2008] [Accepted: 06/30/2008] [Indexed: 11/12/2022] Open
Abstract
Background Protein-protein interaction networks are most often generated from physical protein-protein interaction data. Co-conservation, also known as phylogenetic profiles, is an alternative source of information for generating protein interaction networks. Co-conservation methods generate interaction networks among proteins that are gained or lost together through evolution. Co-conservation is a particularly useful technique in the compact bacteria genomes. Prior studies in yeast suggest that the topology of protein-protein interaction networks generated from physical interaction assays can offer important insight into protein function. Here, we hypothesize that in bacteria, the topology of protein interaction networks derived via co-conservation information could similarly improve methods for predicting protein function. Since the topology of bacteria co-conservation protein-protein interaction networks has not previously been studied in depth, we first perform such an analysis for co-conservation networks in E. coli K12. Next, we demonstrate one way in which network connectivity measures and global and local function distribution can be exploited to predict protein function for previously uncharacterized proteins. Results Our results showed that, like most biological networks, our bacteria co-conserved protein-protein interaction networks had scale-free topologies. Our results indicated that some properties of the physical yeast interaction network hold in our bacteria co-conservation networks, such as high connectivity for essential proteins. However, the high connectivity among protein complexes in the yeast physical network was not seen in the co-conservation network which uses all bacteria as the reference set. We found that the distribution of node connectivity varied by functional category and could be informative for function prediction. 
By integrating functional information from different annotation sources and using the network topology, we were able to infer function for uncharacterized proteins. Conclusion Interaction networks based on co-conservation can contain information distinct from networks based on physical or other interaction types. Our study has shown co-conservation based networks to exhibit a scale-free topology, as expected for biological networks. We also revealed ways that connectivity in our networks can be informative for the functional characterization of proteins.
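The co-conservation (phylogenetic profile) idea described in this abstract, linking proteins that are gained or lost together across genomes, can be sketched roughly as follows. This is not the authors' procedure: the Jaccard score, the 0.8 threshold, the function names, and the gene labels are all assumptions chosen for illustration.

```python
def profile_jaccard(profile_a, profile_b):
    """Jaccard similarity between two presence/absence profiles.

    Each profile is a sequence of 0/1 flags, one per reference genome.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return both / either if either else 0.0


def co_conserved_pairs(profiles, threshold=0.8):
    """Return (gene, gene, score) edges whose profiles are similar.

    `threshold` is an arbitrary illustrative cutoff, not a published value.
    """
    names = sorted(profiles)
    edges = []
    for i, g in enumerate(names):
        for h in names[i + 1:]:
            score = profile_jaccard(profiles[g], profiles[h])
            if score >= threshold:
                edges.append((g, h, score))
    return edges


# Hypothetical profiles over five genomes (1 = gene present).
profiles = {
    "geneA": [1, 1, 0, 1, 0],
    "geneB": [1, 1, 0, 1, 0],
    "geneC": [0, 0, 1, 0, 1],
}
edges = co_conserved_pairs(profiles)
```

Here `geneA` and `geneB` share an identical profile and form the only edge. Counting edges per gene in such a list gives the node connectivity that the abstract relates to functional category.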
Affiliation(s)
- Anis Karimpour-Fard
- Center for Computational Pharmacology, University of Colorado School of Medicine, Aurora, Colorado 80045, USA.
|