1
Cellier MFM. Nramp: Deprive and conquer? Front Cell Dev Biol 2022; 10:988866. [PMID: 36313567 PMCID: PMC9606685 DOI: 10.3389/fcell.2022.988866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Accepted: 09/20/2022] [Indexed: 11/13/2022]
Abstract
Solute carriers 11 (Slc11) evolved from bacterial permease (MntH) to eukaryotic antibacterial defense (Nramp) while continuously mediating proton (H+)-dependent manganese (Mn2+) import. Also, Nramp horizontal gene transfer (HGT) toward bacteria led to mntH polyphyly. Prior demonstration that evolutionary rate-shifts distinguishing Slc11 from outgroup carriers dictate catalytic specificity suggested that resolving the Slc11 family tree may provide a function-aware phylogenetic framework. Hence, MntH C (MC) subgroups resulted from HGTs of prototype Nramp (pNs) paralogs while archetype Nramp (aNs) correlated with phagocytosis. PHI-BLAST-based taxonomic profiling confirmed that the MntH B phylogroup is confined to anaerobic bacteria, in contrast to the broad distribution of MntH A (MA); suggested niche-related spread of MC subgroups; and established that the MA variant MH, which carries ‘eukaryotic signature’ marks, predominates in archaea. Slc11 phylogeny shows MH is sister to Nramp. Site-specific analysis of the Slc11 charge network known to interact with the protonmotive force demonstrates sequential rate-shifts that recapitulate Slc11 evolution. 3D mapping of similarly coevolved sites across the Slc11 hydrophobic core revealed successive targeting of discrete areas. The data imply that pN HGT could advantage recipient bacteria for H+-dependent Mn2+ acquisition, and AlphaFold 3D models suggest conformational divergence among MC subgroups. It is proposed that Slc11 originated as a bacterial stress resistance function allowing Mn2+-dependent persistence in conditions adverse for growth, and that archaeal MH could contribute to eukaryogenesis as a Mn2+-sequestering defense, perhaps favoring intracellular growth-competent bacteria.
2
Santos C, Mendes T, Antunes A. The genes from the pseudoautosomal region 1 (PAR1) of the mammalian sex chromosomes: Synteny, phylogeny and selection. Genomics 2022; 114:110419. [PMID: 35753589 DOI: 10.1016/j.ygeno.2022.110419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2021] [Revised: 06/10/2022] [Accepted: 06/20/2022] [Indexed: 11/04/2022]
Abstract
Sex chromosomes recombine only within their homologous area, the pseudoautosomal region (PAR), represented by PAR1 and PAR2, which behave like an autosome in both pairing and recombination. The PAR1, common to most of the eutherian mammals, is located at the terminus of the short arm of the sex chromosomes and exhibits recombination rates ~20 times higher than the autosomes. Here, we assessed the interspecific evolutionary genomic dynamics of 15 genes of the PAR1 across 41 mammalian genera (representing six orders). The strong negative selection detected in most of the assessed groups reinforces the presence of evolutionary constraints imposed by the important function of the PAR1 genes. Indeed, mutations in these genes are associated with various diseases in humans, including stature problems (Klinefelter Syndrome), leukemia and mental diseases. Yet, a few genes exhibiting positive selection (ω-value >1) were detected in Rodentia (ASMT and ZBED1) and Primates (CRLF2 and CSF2RA). Rodents have the smallest described PAR1, while that of simian primates/humans underwent a 3- to 5-fold size reduction. The assessment of the synteny of the PAR1 genes revealed differences among the mammalian species, especially in the order Rodentia, where chromosomal translocations from the sex chromosomes to the autosomes were observed. Such syntenic changes may be evidence of the rapid evolution of rodents, as previously reported in other papers and also reflected in their increased branch lengths in the phylogenetic analyses. In conclusion, we suggest that genome migration is an important factor influencing the evolution of mammals and may result in changes in the selective pressures operating on the genome.
Affiliation(s)
- Carla Santos: Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Avenida General Norton de Matos, s/n, 4450-208 Porto, Portugal; Institute of Biomedical Sciences Abel Salazar (ICBAS), University of Porto, Portugal
- Tito Mendes: Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Avenida General Norton de Matos, s/n, 4450-208 Porto, Portugal
- Agostinho Antunes: Interdisciplinary Centre of Marine and Environmental Research (CIIMAR/CIMAR), University of Porto, Avenida General Norton de Matos, s/n, 4450-208 Porto, Portugal; Department of Biology, Faculty of Sciences, University of Porto, Porto, Portugal
3
Structural and Evolutionary Adaptations of Nei-Like DNA Glycosylases Proteins Involved in Base Excision Repair of Oxidative DNA Damage in Vertebrates. Oxidative Medicine and Cellular Longevity 2022; 2022:1144387. [PMID: 35419164 PMCID: PMC9001079 DOI: 10.1155/2022/1144387] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/22/2022] [Accepted: 03/03/2022] [Indexed: 12/25/2022]
Abstract
Oxidative stress damages DNA and can arise from both endogenous and exogenous sources. Damage to DNA caused by oxidative stress can result in base modifications that promote replication errors and the formation of sites of base loss, which pose unique challenges to the preservation of genomic integrity. However, the adaptive evolution of the DNA repair mechanism is poorly understood in vertebrates. This research aimed to explore the evolutionary relationships, physicochemical characteristics, and comparative genomics of the Nei-like glycosylase gene family involved in DNA base repair in vertebrates. The genomic sequences of the NEIL1, NEIL2, and NEIL3 genes were aligned to assess selection constraints on the genes, which were relatively poorly conserved across vertebrate species. Positive selection signals were identified in these genes across the vertebrate lineages. We found that only about 2.7% of codons in these genes were subject to positive selection. We also found that positive selection pressure was increased in the Fapy-DNA-glyco and H2TH domains, which are involved in the base excision repair of DNA that has been damaged by oxidative stress. Gene structure, motif, and conserved domain analysis indicated that the Nei-like glycosylase genes in mammals and birds are evolutionarily less conserved than the glycosylase genes of other vertebrate species. This study revealed that adaptive selection played a critical role in the evolution of Nei-like glycosylases in vertebrate species. Systematic comparative genome analyses will give key insights into the links between DNA repair and the development of lifespan in various organisms as more diverse vertebrate genome sequences become accessible.
4
TwinCons: Conservation score for uncovering deep sequence similarity and divergence. PLoS Comput Biol 2021; 17:e1009541. [PMID: 34714829 PMCID: PMC8580257 DOI: 10.1371/journal.pcbi.1009541] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 04/16/2021] [Revised: 11/10/2021] [Accepted: 10/06/2021] [Indexed: 11/19/2022]
Abstract
We have developed the program TwinCons, to detect noisy signals of deep ancestry of proteins or nucleic acids. As input, the program uses a composite alignment containing pre-defined groups, and mathematically determines a 'cost' of transforming one group to the other at each position of the alignment. The output distinguishes conserved, variable and signature positions. A signature is conserved within groups but differs between groups. The method automatically detects continuous characteristic stretches (segments) within alignments. TwinCons provides a convenient representation of conserved, variable and signature positions as a single score, enabling the structural mapping and visualization of these characteristics. Structure is more conserved than sequence. TwinCons highlights alternative sequences of conserved structures. Using TwinCons, we detected highly similar segments between proteins from the translation and transcription systems. TwinCons detects conserved residues within regions of high functional importance for the ribosomal RNA (rRNA) and demonstrates that signatures are not confined to specific regions but are distributed across the rRNA structure. The ability to evaluate both nucleic acid and protein alignments allows TwinCons to be used in combined sequence and structural analysis of signatures and conservation in rRNA and in ribosomal proteins (rProteins). TwinCons detects a strong sequence conservation signal between bacterial and archaeal rProteins related by circular permutation. This conserved sequence is structurally colocalized with conserved rRNA, indicated by TwinCons scores of rRNA alignments of bacterial and archaeal groups. This combined analysis revealed deep co-evolution of rRNA and rProtein buried within the deepest branching points in the tree of life.
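As a rough illustration of the three position classes described in this abstract, the following minimal Python sketch labels each column of a two-group alignment as conserved, signature or variable. It is not the TwinCons scoring scheme, which computes a transformation cost between groups; the function name `classify_columns`, the toy sequences and the strict "all residues identical" rule are assumptions made purely for illustration.

```python
def classify_columns(group_a, group_b):
    """Label each alignment column as 'conserved', 'signature' or 'variable'.

    Hypothetical helper (not part of TwinCons): a column is 'conserved' if
    both groups share one residue, 'signature' if each group is internally
    invariant but the two groups differ, and 'variable' otherwise.
    """
    length = len(group_a[0])
    if any(len(seq) != length for seq in group_a + group_b):
        raise ValueError("all sequences must have the same aligned length")
    labels = []
    for i in range(length):
        residues_a = {seq[i] for seq in group_a}
        residues_b = {seq[i] for seq in group_b}
        if len(residues_a) == 1 and len(residues_b) == 1:
            labels.append("conserved" if residues_a == residues_b else "signature")
        else:
            labels.append("variable")
    return labels


# Toy alignment (invented sequences, not real data).
bacteria = ["MKLV", "MKIV", "MKLV"]
archaea = ["MRAV", "MRGV", "MRAV"]
print(classify_columns(bacteria, archaea))
```

A real score would be graded rather than categorical, so that near-conserved columns are not lumped together with fully variable ones.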
5
A computational analysis of molecular evolution for virulence genes of zoonotic novel coronavirus (COVID-19). Comput Biol Chem 2021; 93:107532. [PMID: 34171504 PMCID: PMC8213524 DOI: 10.1016/j.compbiolchem.2021.107532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2020] [Revised: 06/04/2021] [Accepted: 06/16/2021] [Indexed: 11/21/2022]
Abstract
Zoonotic novel coronavirus disease 2019 (COVID-19) is highly pathogenic and transmissible and is considered an emerging pandemic disease. The virus belongs to the large Coronaviridae family, affects the respiratory tract of animals and humans, likely originated in bats, and shows homology to SARS-CoV and MERS-CoV. The virus consists of single-stranded positive-sense genomic RNA coated by nucleocapsid protein. The rate of mutation in any virulence gene may influence the phenomenon of host radiation. We have studied the molecular evolution of selected virulence genes (HA, N, RdRP and S) of novel COVID-19. We used a site-specific comparison of synonymous (silent) and non-synonymous (amino acid altering) nucleotide substitutions. Maximum Likelihood genealogies based on differential gamma distribution rates were used for the analysis of the null and alternate hypotheses. The null hypothesis was found more suitable for the analysis using the Likelihood Ratio Test (LRT) method, confirming a higher rate of substitution. The analysis revealed that the RdRP gene had the fastest rate of evolution, followed by the HA gene. We have also reported new motifs for different virulence genes, which may be useful for designing new detection and diagnostic kits for COVID-19.
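The likelihood ratio test mentioned in this abstract compares nested models through the statistic 2(lnL_alt - lnL_null), which is approximately chi-squared distributed with degrees of freedom equal to the difference in the number of free parameters. A minimal sketch follows; the function name and the log-likelihood values are made up for illustration and are not taken from the paper, and the helper handles only 1 or 2 degrees of freedom, where the chi-squared tail has a closed form.

```python
import math


def lrt_p_value(lnl_null, lnl_alt, df):
    """Return (statistic, p-value) for a likelihood ratio test of nested models.

    Only df = 1 or df = 2 are supported, using closed-form chi-squared tails.
    """
    # The statistic cannot be negative for properly nested, converged models.
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    if df == 1:
        p = math.erfc(math.sqrt(stat / 2.0))
    elif df == 2:
        p = math.exp(-stat / 2.0)
    else:
        raise ValueError("this sketch supports df = 1 or 2 only")
    return stat, p


# Hypothetical log-likelihoods, not taken from the paper.
stat, p = lrt_p_value(lnl_null=-2350.4, lnl_alt=-2346.1, df=2)
print(f"2*dlnL = {stat:.2f}, p = {p:.4f}")
```

When the p-value exceeds the chosen threshold, the null model is not rejected and the simpler model is retained.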
6
Structural and Evolutionary Adaptation of NOD-Like Receptors in Birds. BioMed Research International 2021; 2021:5546170. [PMID: 33997004 PMCID: PMC8105094 DOI: 10.1155/2021/5546170] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/25/2021] [Revised: 04/07/2021] [Accepted: 04/20/2021] [Indexed: 11/17/2022]
Abstract
NOD-like receptors (NLRs) are intracellular sensors of the innate immune system that recognize intracellular pathogen-associated molecular patterns (PAMPs) and danger-associated molecular patterns (DAMPs). Little information exists regarding the incidence of positive selection in the evolution of NLRs of birds or the structural differences between bird and mammal NLRs. Evidence of positive selection was identified in four avian NLRs (NOD1, NLRC3, NLRC5, and NLRP3) using the maximum likelihood approach. These NLRs are under different selection pressures which is indicative of different evolution patterns. Analysis of these NLRs showed a lower percentage of codons under positive selection in the LRR domain than seen in the studies of Toll-like receptors (TLRs), suggesting that the LRR domain evolves differently between NLRs and TLRs. Modeling of human, chicken, mammalian, and avian ancestral NLRs revealed the existence of variable evolution patterns in protein structure that may be adaptively driven.
7
Site-Specific Evolutionary Rate Shifts in HIV-1 and SIV. Viruses 2020; 12:v12111312. [PMID: 33207801 PMCID: PMC7696578 DOI: 10.3390/v12111312] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/21/2020] [Revised: 11/12/2020] [Accepted: 11/13/2020] [Indexed: 12/28/2022]
Abstract
Site-specific evolutionary rate shifts are defined as protein sites where the rate of substitution has changed dramatically across the phylogeny. With respect to a given clade, sites may undergo either a rate acceleration or a rate deceleration, reflecting a site that was conserved and became variable, or vice versa, respectively. Sites displaying such a dramatic evolutionary change may point to a loss or gain of function at the protein site, reflecting adaptation, or they may indicate epistatic interactions among sites. Here, we analyzed full genomes of HIV-1 and SIV and identified 271 rate-shifting sites along the HIV-1/SIV phylogeny. The majority of rate shifts occurred at long branches, often corresponding to cross-species transmission branches. We noted that in most proteins, the number of rate accelerations and decelerations was equal, and we suggest that this reflects epistatic interactions among sites. However, several accessory proteins were enriched for either accelerations or decelerations, and we suggest that this may be a signature of adaptation to new hosts. Interestingly, the non-pandemic HIV-1 group O clade exhibited a substantially higher number of rate-shift events than the pandemic group M clade. We propose that this may be a reflection of the height of the species barrier between gorillas and humans versus chimpanzees and humans. Our results provide a genome-wide view of the constraints operating on proteins of HIV-1 and SIV.
8
Pan Z, Zhao M, Peng Y, Wang J. Functional divergence analysis of vertebrate neuronal nicotinic acetylcholine receptor subunits. J Biomol Struct Dyn 2018; 37:2938-2948. [PMID: 30044167 DOI: 10.1080/07391102.2018.1500945] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/28/2022]
Abstract
Nicotinic acetylcholine receptors (nAChRs) are pentamers formed by subunits from a large multigene family and are highly variable in kinetic, electrophysiological and pharmacological properties. Because nAChRs play essential roles in many physiological processes and are functionally diverse, identifying the function-related sites specific to each subunit is not only necessary to understand the properties of the receptors but also useful for designing potential therapeutic compounds that target these macromolecules to treat a series of central neuronal disorders. By conducting a detailed functional divergence analysis on nine neuronal nAChR subunits from representative vertebrate species, we revealed the existence of significant functional variation between most subunit pairs. Specifically, 44 unique residues were identified for the α7 subunit, while another 22 residues that were likely responsible for the specific features of other subunits were detected. By mapping these sites onto the 3D structure of the human α7 subunit, a structure-function relationship profile was revealed. Our results suggested that the functional divergence-related sites clustered in the ligand binding domain, the β2-β3 linker close to the N-terminal α-helix, the intracellular linkers between transmembrane domains, and the "transition zone" may have experienced altered evolutionary rates. The former two regions may be potential binding sites for the α7* subtype-specific allosteric modulators, while the latter region is likely to be involved in subtype-specific allosteric modulation of the heteropentameric descendants such as the α4β2* nAChRs. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Zhenhua Pan: School of Biomedical Engineering, Tianjin Medical University, Tianjin, China; Tianjin Key Laboratory of Lung Cancer Metastasis and Tumor Microenvironment, Tianjin Lung Cancer Institute, Tianjin Medical University General Hospital, Tianjin, China
- Mengwen Zhao: School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
- Yonglin Peng: School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
- Ju Wang: School of Biomedical Engineering, Tianjin Medical University, Tianjin, China
9
Antunes A, Ramos MJ. Gathering Computational Genomics and Proteomics to Unravel Adaptive Evolution. Evol Bioinform Online 2017. [DOI: 10.1177/117693430700300004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 11/16/2022]
Abstract
A recent editorial in PLoS Biology by MacCallum and Hill (2006) pointed out the inappropriateness of studies evaluating signatures of positive selection based solely on single-site analyses. The rising number of recently published articles claiming positive selection therefore raises the question of how to improve bioinformatics standards so that positive selection can be detected reliably. Deeper integrative efforts using state-of-the-art methodologies at the gene level and protein level are improving positive selection studies. Here we provide some computational guidelines to thoroughly document molecular adaptation.
Affiliation(s)
- Agostinho Antunes: REQUIMTE, Departamento de Química, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 687; 4169-007 Porto, Portugal
- Maria João Ramos: REQUIMTE, Departamento de Química, Faculdade de Ciências, Universidade do Porto, Rua do Campo Alegre, 687; 4169-007 Porto, Portugal
10
Effective estimation of the minimum number of amino acid residues required for functional divergence between duplicate genes. Mol Phylogenet Evol 2017; 113:126-138. [DOI: 10.1016/j.ympev.2017.05.010] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/14/2017] [Revised: 03/19/2017] [Accepted: 05/10/2017] [Indexed: 01/10/2023]
11
Sa Z, Zhou J, Zou Y, Su Z, Gu X. Paralog-divergent Features May Help Reduce Off-target Effects of Drugs: Hints from Glucagon Subfamily Analysis. Genomics Proteomics & Bioinformatics 2017. [PMID: 28642113 PMCID: PMC5582795 DOI: 10.1016/j.gpb.2017.03.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/25/2022]
Abstract
Side effects from targeted drugs remain a serious concern. One reason is the nonselective binding of a drug to unintended proteins such as paralogs of its target, which are highly homologous in sequence and have similar structures and drug-binding pockets. To identify targetable differences between paralogs, we analyzed two types (type-I and type-II) of functional divergence at the amino acid level between two paralogs in the G-protein-coupled receptor (GPCR) family, a known family of drug targets. The paralogous receptors of the glucagon-like subfamily, glucagon receptor (GCGR) and glucagon-like peptide-1 receptor (GLP-1R), exhibit divergence in ligands and are clinically validated drug targets for type 2 diabetes. Our data showed that type-II amino acids, which show a radical shift in physicochemical properties between GCGR and GLP-1R, were significantly enriched in the binding sites of the antagonist MK-0893 on GCGR. We also examined the role of type-I amino acids between GCGR and GLP-1R. The divergent features between the GCGR and GLP-1R paralogs may be helpful in their discrimination, thus enabling the identification of binding sites to reduce undesirable side effects and increase the target specificity of drugs.
Affiliation(s)
- Zhining Sa: State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200433, China
- Jingqi Zhou: State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200433, China
- Yangyun Zou: State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200433, China
- Zhixi Su: State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200433, China
- Xun Gu: State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai 200433, China; Department of Genetics, Development and Cell Biology, Program of Bioinformatics and Computational Biology, Iowa State University, Ames, IA 50011, USA
12
Thiltgen G, Dos Reis M, Goldstein RA. Finding Direction in the Search for Selection. J Mol Evol 2016; 84:39-50. [PMID: 27913840 PMCID: PMC5253163 DOI: 10.1007/s00239-016-9765-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 02/29/2016] [Accepted: 11/10/2016] [Indexed: 11/24/2022]
Abstract
Tests for positive selection have mostly been developed to look for diversifying selection where change away from the current amino acid is often favorable. However, in many cases we are interested in directional selection where there is a shift toward specific amino acids, resulting in increased fitness in the species. Recently, a few methods have been developed to detect and characterize directional selection on a molecular level. Using the results of evolutionary simulations as well as HIV drug resistance data as models of directional selection, we compare two such methods with each other, as well as against a standard method for detecting diversifying selection. We find that the method to detect diversifying selection also detects directional selection under certain conditions. One method developed for detecting directional selection is powerful and accurate for a wide range of conditions, while the other can generate an excessive number of false positives.
Affiliation(s)
- Grant Thiltgen: Institute of Child Health, University College London, London, UK
- Mario Dos Reis: The School of Biological and Chemical Sciences, Queen Mary University of London, London, UK
13
In-Silico Computing of the Most Deleterious nsSNPs in HBA1 Gene. PLoS One 2016; 11:e0147702. [PMID: 26824843 PMCID: PMC4733110 DOI: 10.1371/journal.pone.0147702] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 11/15/2015] [Accepted: 01/07/2016] [Indexed: 01/30/2023]
Abstract
Background: α-Thalassemia (α-thal) is a genetic disorder caused by single amino acid substitutions or large deletions in the HBA1 and/or HBA2 genes. Method: Modern bioinformatics tools were used in a systematic in-silico approach to predict deleterious SNPs in the HBA1 gene and their pathogenic impact on the function and structure of the HBA1 protein. Results and Discussion: A total of 389 SNPs in HBA1 were retrieved from the dbSNP database, including 201 non-synonymous SNPs (nsSNPs), 43 human active SNPs, 16 intronic SNPs, 11 mRNA 3′ UTR SNPs, 9 coding synonymous SNPs, 9 5′ UTR SNPs and other types. A structural homology-based method (PolyPhen), a sequence homology-based tool (SIFT), SNPs&Go, PROVEAN and PANTHER revealed that 2.4% of the nsSNPs are pathogenic. Conclusions: A total of 5 nsSNPs (G60V, K17M, K17T, L92F and W15R) were predicted to be responsible for structural and functional modifications of the HBA1 protein. The in-silico analysis indicates that two of these nsSNPs, G60V and W15R, are highly deleterious. These two pathogenic nsSNPs can be considered for confirmatory wet-lab analysis.
14
Nguyen Ba AN, Strome B, Hua JJ, Desmond J, Gagnon-Arsenault I, Weiss EL, Landry CR, Moses AM. Detecting functional divergence after gene duplication through evolutionary changes in posttranslational regulatory sequences. PLoS Comput Biol 2014; 10:e1003977. [PMID: 25474245 PMCID: PMC4256066 DOI: 10.1371/journal.pcbi.1003977] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 06/11/2014] [Accepted: 10/07/2014] [Indexed: 11/18/2022]
Abstract
Gene duplication is an important evolutionary mechanism that can result in functional divergence in paralogs due to neo-functionalization or sub-functionalization. Consistent with functional divergence after gene duplication, recent studies have shown accelerated evolution in retained paralogs. However, little is known in general about the impact of this accelerated evolution on the molecular functions of retained paralogs. For example, do new functions typically involve changes in enzymatic activities, or changes in protein regulation? Here we study the evolution of posttranslational regulation by examining the evolution of important regulatory sequences (short linear motifs) in retained duplicates created by the whole-genome duplication in budding yeast. To do so, we identified short linear motifs whose evolutionary constraint has relaxed after gene duplication with a likelihood-ratio test that can account for heterogeneity in the evolutionary process by using a non-central chi-squared null distribution. We find that short linear motifs are more likely to show changes in evolutionary constraints in retained duplicates compared to single-copy genes. We examine changes in constraints on known regulatory sequences and show that for the Rck1/Rck2, Fkh1/Fkh2, Ace2/Swi5 paralogs, they are associated with previously characterized differences in posttranslational regulation. Finally, we experimentally confirm our prediction that for the Ace2/Swi5 paralogs, Cbk1 regulated localization was lost along the lineage leading to SWI5 after gene duplication. Our analysis suggests that changes in posttranslational regulation mediated by short regulatory motifs systematically contribute to functional divergence after gene duplication. How a protein is controlled is intimately linked to its function. Therefore, evolution can drive the functional divergence of proteins by tweaking their regulation, even if enzymatic capacities are preserved. 
Changes in posttranslational regulation (protein phosphorylation, degradation, subcellular localization, etc.) could therefore represent key mechanisms in functional divergence and lead to different phenotypic outcomes. Since disordered protein regions contain sites of protein modification and interaction (known as short linear motifs) and evolve rapidly relative to domains encoding enzymatic functions, these regions are good candidates to harbour sequence changes that underlie changes in function. In this study, we develop a statistical framework to identify changes in rate of evolution specific to protein regulatory sequences and identify hundreds of short linear motifs in disordered regions that are likely to have diverged after the whole-genome duplication in budding yeast. We show that these divergent motifs are much more frequent in paralogs than in single-copy proteins, and that they are more frequent in duplicate pairs that have functionally diverged. Our analysis suggests that changes in short linear motifs in disordered protein regions could be important molecular mechanisms of functional divergence after gene duplication.
Affiliation(s)
- Alex N Nguyen Ba: Department of Cell & Systems Biology, University of Toronto, Toronto, Canada; Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Canada
- Bob Strome: Department of Cell & Systems Biology, University of Toronto, Toronto, Canada
- Jun Jie Hua: Department of Cell & Systems Biology, University of Toronto, Toronto, Canada
- Jonathan Desmond: Department of Cell & Systems Biology, University of Toronto, Toronto, Canada
- Isabelle Gagnon-Arsenault: Département de Biologie, IBIS and PROTEO, Pavillon Charles-Eugene-Marchand, Laval University, Québec City, Canada
- Eric L Weiss: Department of Molecular Biosciences, Northwestern University, Evanston, Illinois, United States of America
- Christian R Landry: Département de Biologie, IBIS and PROTEO, Pavillon Charles-Eugene-Marchand, Laval University, Québec City, Canada
- Alan M Moses: Department of Cell & Systems Biology, University of Toronto, Toronto, Canada; Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Canada
15
Huang YF, Golding GB. FuncPatch: a web server for the fast Bayesian inference of conserved functional patches in protein 3D structures. Bioinformatics 2014; 31:523-31. [PMID: 25322839 DOI: 10.1093/bioinformatics/btu673] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022]
Abstract
MOTIVATION: A number of statistical phylogenetic methods have been developed to infer conserved functional sites or regions in proteins. Many methods, e.g. Rate4Site, apply standard phylogenetic models to infer site-specific substitution rates and ignore the spatial correlation of substitution rates in protein tertiary structures, which may reduce their power to identify conserved functional patches when the sequences used in the analysis are highly similar. The 3D sliding window method has been proposed to infer conserved functional patches in protein tertiary structures, but the window size, which reflects the strength of the spatial correlation, must be predefined and is not inferred from data. We recently developed GP4Rate to solve these problems under the Bayesian framework. Unfortunately, GP4Rate is computationally slow. Here, we present an intuitive web server, FuncPatch, to perform a fast approximate Bayesian inference of conserved functional patches in protein tertiary structures. RESULTS: Both simulations and four case studies based on empirical data suggest that FuncPatch is a good approximation to GP4Rate. However, FuncPatch is orders of magnitude faster than GP4Rate. In addition, simulations suggest that FuncPatch is potentially a useful tool complementary to Rate4Site, but the 3D sliding window method is less powerful than FuncPatch and Rate4Site. The functional patches predicted by FuncPatch in the four case studies are supported by experimental evidence, which corroborates the usefulness of FuncPatch. AVAILABILITY AND IMPLEMENTATION: The software FuncPatch is freely available at the web site http://info.mcmaster.ca/yifei/FuncPatch. CONTACT: golding@mcmaster.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
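The 3D sliding window idea referred to in this abstract can be sketched as follows: each site's substitution rate is replaced by the mean rate of all sites whose coordinates lie within a fixed radius. This is a simplified illustration only, not the FuncPatch or GP4Rate model, which infer the strength of the spatial correlation from the data; the function name, coordinates, rates and radius below are all assumptions.

```python
import math


def window_average_rates(coords, rates, radius):
    """Average per-site rates over all sites within `radius` of each site.

    coords: list of (x, y, z) tuples, e.g. C-alpha positions in angstroms.
    rates:  list of per-site substitution rates, in the same order as coords.
    """
    if len(coords) != len(rates):
        raise ValueError("coords and rates must have the same length")
    smoothed = []
    for center in coords:
        neighbours = [
            rate
            for point, rate in zip(coords, rates)
            if math.dist(center, point) <= radius
        ]
        # The site itself is always inside the window, so the list is non-empty.
        smoothed.append(sum(neighbours) / len(neighbours))
    return smoothed


# Toy data (invented): three sites close together, one far away.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (30.0, 0.0, 0.0)]
rates = [0.1, 0.2, 0.3, 2.0]
print(window_average_rates(coords, rates, radius=6.0))
```

The fixed `radius` is exactly the predefined window size that the abstract identifies as the weakness of this approach.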
Affiliation(s)
- Yi-Fei Huang
- Department of Biology, McMaster University, Hamilton, ON L8S4K1, Canada
- G Brian Golding
- Department of Biology, McMaster University, Hamilton, ON L8S4K1, Canada
|
16
|
Chakraborty A, Chakrabarti S. A survey on prediction of specificity-determining sites in proteins. Brief Bioinform 2014; 16:71-88. [DOI: 10.1093/bib/bbt092] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Indexed: 11/13/2022]
|
17
|
Zhang ZH, Khoo AA, Mihalek I. Cube - an online tool for comparison and contrasting of protein sequences. PLoS One 2013; 8:e79480. [PMID: 24363790 PMCID: PMC3867285 DOI: 10.1371/journal.pone.0079480] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/08/2013] [Accepted: 09/23/2013] [Indexed: 01/10/2023]
Abstract
When comparing sequences of similar proteins, two kinds of questions can be asked, and the related two kinds of inference made. First, one may ask to what degree they are similar, and then, how they differ. In the first case one may tentatively conclude that the conserved elements common to all sequences are of central and common importance to the protein's function. In the latter case the regions of specialization may be discriminative of the function or binding partners across subfamilies of related proteins. Experimental efforts - mutagenesis or pharmacological intervention - can then be pointed in either direction, depending on the context of the study. Cube simplifies this process for users that already have their favorite sets of sequences, and helps them collate the information by visualization of the conservation and specialization scores on the sequence and on the structure, and by spreadsheet tabulation. All information can be visualized on the spot, or downloaded for reference and later inspection. Server homepage: http://eopsf.org/cube
Affiliation(s)
- Zong Hong Zhang
- Bioinformatics Institute, Agency for Science, Technology and Research, Singapore
- Aik Aun Khoo
- Bioinformatics Institute, Agency for Science, Technology and Research, Singapore
- Ivana Mihalek
- Bioinformatics Institute, Agency for Science, Technology and Research, Singapore
- * Corresponding author
|
18
|
Pavlopoulou A, Vlachakis D, Balatsos NAA, Kossida S. A comprehensive phylogenetic analysis of deadenylases. Evol Bioinform Online 2013; 9:491-7. [PMID: 24348009 PMCID: PMC3859875 DOI: 10.4137/ebo.s12746] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/30/2022]
Abstract
Deadenylases catalyze the shortening of the poly(A) tail at the messenger ribonucleic acid (mRNA) 3′-end in eukaryotes. Therefore, these enzymes influence mRNA decay, and constitute a major emerging group of promising anti-cancer pharmacological targets. Herein, we conducted full phylogenetic analyses of the deadenylase homologs in all available genomes in an effort to investigate evolutionary relationships between the deadenylase families and to identify invariant residues, which probably play key roles in the function of deadenylation across species. Our study includes both major Asp-Glu-Asp-Asp (DEDD) and exonuclease-endonuclease-phosphatase (EEP) deadenylase superfamilies. The phylogenetic analysis has provided us with important information regarding conserved and invariant deadenylase amino acids across species. Knowledge of the phylogenetic properties and evolution of the domain of deadenylases provides the foundation for the targeted drug design in the pharmaceutical industry and modern exonuclease anti-cancer scientific research.
Affiliation(s)
- Athanasia Pavlopoulou
- Bioinformatics and Medical Informatics Team, Biomedical Research Foundation, Academy of Athens, Soranou Efessiou 4, Athens 11527, Greece
- Dimitrios Vlachakis
- Bioinformatics and Medical Informatics Team, Biomedical Research Foundation, Academy of Athens, Soranou Efessiou 4, Athens 11527, Greece
- Nikolaos A A Balatsos
- Department of Biochemistry and Biotechnology, University of Thessaly, 26 Ploutonos st., 41 221 Larissa, Greece
- Sophia Kossida
- Bioinformatics and Medical Informatics Team, Biomedical Research Foundation, Academy of Athens, Soranou Efessiou 4, Athens 11527, Greece
|
19
|
Kamneva OK, Knight SJ, Liberles DA, Ward NL. Analysis of genome content evolution in PVC bacterial super-phylum: assessment of candidate genes associated with cellular organization and lifestyle. Genome Biol Evol 2013; 4:1375-90. [PMID: 23221607 PMCID: PMC3542564 DOI: 10.1093/gbe/evs113] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.1] [Indexed: 12/16/2022]
Abstract
The Planctomycetes, Verrucomicrobia, Chlamydiae (PVC) super-phylum contains bacteria with either complex cellular organization or simple cell structure; it also includes organisms of different lifestyles (pathogens, mutualists, commensal, and free-living). Genome content evolution of this group has not been studied in a systematic fashion, which would reveal genes underlying the emergence of PVC-specific phenotypes. Here, we analyzed the evolutionary dynamics of 26 PVC genomes and several outgroup species. We inferred HGT, duplications, and losses by reconciliation of 27,123 gene trees with the species phylogeny. We showed that genome expansion and contraction have driven evolution within Planctomycetes and Chlamydiae, respectively, and balanced each other in Verrucomicrobia and Lentisphaerae. We also found that for a large number of genes in PVC genomes the most similar sequences are present in Acidobacteria, suggesting past and/or current ecological interaction between organisms from these groups. We also found evidence of shared ancestry between carbohydrate degradation genes in the mucin-degrading human intestinal commensal Akkermansia muciniphila and sequences from Acidobacteria and Bacteroidetes, suggesting that glycoside hydrolases are transferred laterally between gut microbes and that the process of carbohydrate degradation is crucial for microbial survival within the human digestive system. Further, we identified a highly conserved genetic module preferentially present in compartmentalized PVC species and possibly associated with the complex cell plan in these organisms. This conserved machinery is likely to be membrane targeted and involved in electron transport, although its exact function is unknown. These genes represent good candidates for future functional studies.
Affiliation(s)
- Olga K Kamneva
- Department of Molecular Biology, University of Wyoming, WY, USA
|
20
|
Gu X, Zou Y, Su Z, Huang W, Zhou Z, Arendsee Z, Zeng Y. An update of DIVERGE software for functional divergence analysis of protein family. Mol Biol Evol 2013; 30:1713-9. [PMID: 23589455 DOI: 10.1093/molbev/mst069] [Citation(s) in RCA: 137] [Impact Index Per Article: 12.5] [Indexed: 12/19/2022]
Abstract
DIVERGE is a software system for phylogeny-based analyses of protein family evolution and functional divergence. It provides a suite of statistical tools for selection and prioritization of the amino acid sites that are responsible for the functional divergence of a gene family. The synergistic efforts of DIVERGE and other methods have convincingly demonstrated that the pattern of rate change at a particular amino acid site may contain insightful information about the underlying functional divergence following gene duplication. These predicted sites may be used as candidates for further experiments. We are now releasing an updated version of DIVERGE with the following improvements: 1) a feasible approach to examining functional divergence in nearly complete sequences by including deletions and insertions (indels); 2) the calculation of the false discovery rate of functionally diverging sites; 3) estimation of the effective number of functional divergence-related sites that is reliable and insensitive to cutoffs; 4) a statistical test for asymmetric functional divergence; and 5) a new method to infer functional divergence specific to a given duplicate cluster. In addition, we have made efforts to improve software design and produce a well-written software manual for the general user.
Affiliation(s)
- Xun Gu
- State Key Laboratory of Genetic Engineering and MOE Key Laboratory of Contemporary Anthropology, School of Life Sciences, Fudan University, Shanghai, China.
|
21
|
Inference of functional divergence among proteins when the evolutionary process is non-stationary. J Mol Evol 2013; 76:205-15. [PMID: 23443835 DOI: 10.1007/s00239-013-9549-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/07/2012] [Accepted: 02/09/2013] [Indexed: 10/27/2022]
Abstract
Functional shifts during protein evolution are expected to yield shifts in substitution rate, and statistical methods can test for this at both codon and amino acid levels. Although methods based on models of sequence evolution serve as powerful tools for studying evolutionary processes, violating underlying assumptions can lead to false biological conclusions. It is not unusual for functional shifts to be accompanied by changes in other aspects of the evolutionary process, such as codon or amino acid frequencies. However, models used to test for functional divergence assume these frequencies remain constant over time. We employed simulation to investigate the impact of non-stationary evolution on functional divergence inference. We investigated three likelihood ratio tests based on codon models and found varying degrees of sensitivity. Joint effects of shifts in frequencies and selection pressures can be large, leading to false signals for positive selection. Amino acid-based tests (FunDi and Bivar) were also compromised when several aspects of the substitution process were not adequately modeled. We applied the same tests to a core genome "scan" for functional divergence between light-adapted ecotypes of the cyanobacteria Prochlorococcus, and carried out gene-specific simulations for ten genes. Results of those simulations illustrated how the inference of functional divergence at the genomic level can be seriously impacted by model misspecification. Although computationally costly, simulations motivated by data in hand are warranted when several aspects of the substitution process are either misspecified or not included in the models upon which the statistical tests were built.
|
22
|
Convergent intron gains in hymenopteran elongation factor-1α. Mol Phylogenet Evol 2013; 67:266-76. [PMID: 23396205 DOI: 10.1016/j.ympev.2013.01.015] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 09/04/2012] [Revised: 01/17/2013] [Accepted: 01/29/2013] [Indexed: 11/23/2022]
Abstract
The eukaryotic translation elongation factor-1α gene (eEF1A) has been used extensively in higher level phylogenetics of insects and other groups, despite being present in two or more copies in several taxa. Orthology assessment has relied heavily on the position of introns, but the basic assumption of low rates of intron loss and absence of convergent intron gains has not been tested thoroughly. Here, we study the evolution of eEF1A based on a broad sample of taxa in the insect order Hymenoptera. The gene is universally present in two copies - F1 and F2 - both of which apparently originated before the emergence of the order. An elevated ratio of non-synonymous versus synonymous substitutions and differences in rates of amino acid replacements between the copies suggest that they evolve independently, and phylogenetic methods clearly cluster the copies separately. The F2 copy appears to be ancient; it is orthologous with the copy known as F1 in Diptera, and is likely present in most insect orders. The hymenopteran F1 copy, which may or may not be unique to this order, apparently originated through retroposition and was originally intron free. During the evolution of the Hymenoptera, it has successively accumulated introns, at least three of which have appeared at the same position as introns in the F2 copy or in eEF1A copies in other insects. The sites of convergent intron gain are characterized by highly conserved nucleotides that strongly resemble specific intron-associated sequence motifs, so-called proto-splice sites. The significant rate of convergent intron gain renders intron-exon structure unreliable as an indicator of orthology in eEF1A, and probably also in other protein-coding genes.
|
23
|
Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes. Biochem J 2013; 449:581-94. [DOI: 10.1042/bj20121221] [Citation(s) in RCA: 131] [Impact Index Per Article: 11.9] [Indexed: 12/29/2022]
Abstract
The present review focuses on the evolution of proteins and the impact of amino acid mutations on function from a structural perspective. Proteins evolve under the law of natural selection and undergo alternating periods of conservative evolution and of relatively rapid change. The likelihood of mutations being fixed in the genome depends on various factors, such as the fitness of the phenotype or the position of the residues in the three-dimensional structure. For example, co-evolution of residues located close together in three-dimensional space can occur to preserve global stability. Whereas point mutations can fine-tune the protein function, residue insertions and deletions (‘decorations’ at the structural level) can sometimes modify functional sites and protein interactions more dramatically. We discuss recent developments and tools to identify such episodic mutations, and examine their applications in medical research. Such tools have been tested on simulated data and applied to real data such as viruses or animal sequences. Traditionally, there has been little if any cross-talk between the fields of protein biophysics, protein structure–function and molecular evolution. However, the last several years have seen some exciting developments in combining these approaches to obtain an in-depth understanding of how proteins evolve. For example, a better understanding of how structural constraints affect protein evolution will greatly help us to optimize our models of sequence evolution. The present review explores this new synthesis of perspectives.
|
24
|
Zotti MJ, Christiaens O, Rougé P, Grutzmacher AD, Zimmer PD, Smagghe G. Structural changes under low evolutionary constraint may decrease the affinity of dibenzoylhydrazine insecticides for the ecdysone receptor in non-lepidopteran insects. Insect Mol Biol 2012; 21:488-501. [PMID: 22808992 DOI: 10.1111/j.1365-2583.2012.01154.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/01/2023]
Abstract
Understanding how variations in genetic sequences are conveyed into structural and biochemical properties is of increasing interest in the field of molecular evolution. In order to gain insight into this process, we studied the ecdysone receptor (EcR), a transcription factor that controls moulting and metamorphosis in arthropods. Using an in silico homology model, we identified a region in the lepidopteran EcR that has no direct interaction with the natural hormone but is under strong evolutionary constraint. This region causes a small indentation in the three-dimensional structure of the protein which facilitates the binding of tebufenozide. Non-Mecopterida are considered much older, evolutionarily, than Lepidoptera and they do not have this extended cavity. This location shows differences in evolutionary constraint between Lepidoptera and other insects, where a much lower constraint is observed compared with the Lepidoptera. It is possible that the higher flexibility seen in the EcR of Lepidoptera is an entirely new trait and the higher constraint could then be an indication that this region does have another important function. Finally, we suggest that Tyr123, which is evolutionarily constrained and is up to now exclusively present in Lepidoptera EcRs, could play a critical role in discriminating between steroidal and non-steroidal ligands.
Affiliation(s)
- M J Zotti
- Department of Crop Protection, Ghent University, Ghent, Belgium.
|
25
|
Lawton J, Brugat T, Yan YX, Reid AJ, Böhme U, Otto TD, Pain A, Jackson A, Berriman M, Cunningham D, Preiser P, Langhorne J. Characterization and gene expression analysis of the cir multi-gene family of Plasmodium chabaudi chabaudi (AS). BMC Genomics 2012; 13:125. [PMID: 22458863 PMCID: PMC3384456 DOI: 10.1186/1471-2164-13-125] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 11/04/2011] [Accepted: 03/29/2012] [Indexed: 11/13/2022]
Abstract
Background The pir genes comprise the largest multi-gene family in Plasmodium, with members found in P. vivax, P. knowlesi and the rodent malaria species. Despite comprising up to 5% of the genome, little is known about the functions of the proteins encoded by pir genes. P. chabaudi causes chronic infection in mice, which may be due to antigenic variation. In this model, pir genes are called cirs and may be involved in this mechanism, allowing evasion of host immune responses. In order to fully understand the role(s) of CIR proteins during P. chabaudi infection, a detailed characterization of the cir gene family was required. Results The cir repertoire was annotated and a detailed bioinformatic characterization of the encoded CIR proteins was performed. Two major sub-families were identified, which have been named A and B. Members of each sub-family displayed different amino acid motifs, and were thus predicted to have undergone functional divergence. In addition, the expression of the entire cir repertoire was analyzed via RNA sequencing and microarray. Up to 40% of the cir gene repertoire was expressed in the parasite population during infection, and dominant cir transcripts could be identified. In addition, some differences were observed in the pattern of expression between the cir subgroups at the peak of P. chabaudi infection. Finally, specific cir genes were expressed at different time points during asexual blood stages. Conclusions In conclusion, the large number of cir genes and their expression throughout the intraerythrocytic cycle of development indicates that CIR proteins are likely to be important for parasite survival. In particular, the detection of dominant cir transcripts at the peak of P. chabaudi infection supports the idea that CIR proteins are expressed, and could perform important functions in the biology of this parasite. 
Further application of the methodologies described here may allow the elucidation of CIR sub-family A and B protein functions, including their contribution to antigenic variation and immune evasion.
Affiliation(s)
- Jennifer Lawton
- Division of Parasitology, MRC National Institute for Medical Research, London, UK
|
26
|
Gotoh O. Evolution of Cytochrome P450 Genes from the Viewpoint of Genome Informatics. Biol Pharm Bull 2012; 35:812-7. [DOI: 10.1248/bpb.35.812] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 11/22/2022]
Affiliation(s)
- Osamu Gotoh
- Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University
- Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology
|
27
|
|
28
|
Zhang ZH, Bharatham K, Chee SMQ, Mihalek I. Cube-DB: detection of functional divergence in human protein families. Nucleic Acids Res 2012; 40:D490-4. [PMID: 22139934 PMCID: PMC3245124 DOI: 10.1093/nar/gkr1129] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/17/2011] [Revised: 11/08/2011] [Accepted: 11/08/2011] [Indexed: 12/11/2022]
Abstract
Cube-DB is a database of pre-evaluated results for detection of functional divergence in human/vertebrate protein families. The analysis is organized around the nomenclature associated with the human proteins, but based on all currently available vertebrate genomes. Using full genomes enables us, through a mutual-best-hit strategy, to construct comparable taxonomical samples for all paralogues under consideration. Functional specialization is scored on the residue level according to two models of behavior after divergence: heterotachy and homotachy. In the first case, the positions on the protein sequence are scored highly if they are conserved in the reference group of orthologs, and overlap poorly with the residue type choice in the paralogs groups (such positions will also be termed functional determinants). The second model additionally requires conservation within each group of paralogs (functional discriminants). The scoring functions are phylogeny independent, but sensitive to the residue type similarity. The results are presented as a table of per-residue scores, and mapped onto related structure (when available) via browser-embedded visualization tool. They can also be downloaded as a spreadsheet table, and sessions for two additional molecular visualization tools. The database interface is available at http://epsf.bmad.bii.a-star.edu.sg/cube/db/html/home.html.
Affiliation(s)
- Zong Hong Zhang
- Bioinformatics Institute 30 Biopolis Street, #07-01 Matrix, Singapore 138671 and School of Biological Sciences, Nanyang Technological University, 50 Nanyang Avenue, Singapore 63979
- Kavitha Bharatham
- Bioinformatics Institute 30 Biopolis Street, #07-01 Matrix, Singapore 138671 and School of Biological Sciences, Nanyang Technological University, 50 Nanyang Avenue, Singapore 63979
- Sharon M. Q. Chee
- Bioinformatics Institute 30 Biopolis Street, #07-01 Matrix, Singapore 138671 and School of Biological Sciences, Nanyang Technological University, 50 Nanyang Avenue, Singapore 63979
- Ivana Mihalek
- Bioinformatics Institute 30 Biopolis Street, #07-01 Matrix, Singapore 138671 and School of Biological Sciences, Nanyang Technological University, 50 Nanyang Avenue, Singapore 63979
|
29
|
Bay RA, Bielawski JP. Recombination Detection Under Evolutionary Scenarios Relevant to Functional Divergence. J Mol Evol 2012; 73:273-86. [DOI: 10.1007/s00239-011-9473-0] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 06/22/2011] [Accepted: 11/07/2011] [Indexed: 12/01/2022]
|
30
|
Huang YF, Golding GB. Inferring sequence regions under functional divergence in duplicate genes. Bioinformatics 2011; 28:176-83. [DOI: 10.1093/bioinformatics/btr635] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
|
31
|
Rooting phylogenies using gene duplications: An empirical example from the bees (Apoidea). Mol Phylogenet Evol 2011; 60:295-304. [DOI: 10.1016/j.ympev.2011.05.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 07/12/2010] [Revised: 04/26/2011] [Accepted: 05/03/2011] [Indexed: 12/23/2022]
|
32
|
Moury B, Simon V. dN/dS-based methods detect positive selection linked to trade-offs between different fitness traits in the coat protein of potato virus Y. Mol Biol Evol 2011; 28:2707-17. [PMID: 21498601 DOI: 10.1093/molbev/msr105] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Indexed: 12/19/2022]
Abstract
The dN/dS ratio between nonsynonymous and synonymous substitution rates has been used extensively to identify codon positions involved in adaptive processes. However, the accuracy of this approach has been questioned, and very few studies have attempted to validate experimentally its predictions. Using the coat protein (CP) of Potato virus Y (PVY; genus Potyvirus, family Potyviridae) as a case study, we identified several candidate positively selected codon positions that differed between clades. In the CP of the N clade of PVY, positive selection was detected at codon positions 25 and 68 by both the PAML and HyPhy software packages. We introduced nonsynonymous substitutions at these positions in an infectious cDNA clone of PVY and measured the effect of these mutations on virus accumulation in its two major cultivated hosts, tobacco and potato, and on its efficiency of transmission from plant to plant by aphid vectors. The mutation at codon position 25 significantly modified the virus accumulation in the two hosts, whereas the mutation at codon position 68 significantly modified the virus accumulation in one of its hosts and its transmissibility by aphids. Both mutations were involved in adaptive trade-offs. We suggest that our study was particularly favorable to the detection of adaptive mutations using dN/dS estimates because, as obligate parasites, viruses undergo a continuous and dynamic interaction with their hosts that favors the recurrent selection of adaptive mutations and because trade-offs between different fitness traits impede (or at least slow down) the fixation of these mutations and maintain polymorphism within populations.
Affiliation(s)
- Benoît Moury
- UR407 Pathologie Végétale, Institut National de la Recherche Agronomique, Montfavet, France.
|
33
|
Gaston D, Susko E, Roger AJ. A phylogenetic mixture model for the identification of functionally divergent protein residues. Bioinformatics 2011; 27:2655-63. [PMID: 21840876 DOI: 10.1093/bioinformatics/btr470] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 11/14/2022]
Abstract
MOTIVATION: To understand the evolution of molecular function within protein families, it is important to identify those amino acid residues responsible for functional divergence; i.e. those sites in a protein family that affect cofactor, protein or substrate binding preferences; affinity; catalysis; flexibility; or folding. Type I functional divergence (FD) results from changes in conservation (evolutionary rate) at a site between protein subfamilies, whereas type II FD occurs when there has been a shift in preferences for different amino acid chemical properties. A variety of methods have been developed for identifying both site types in protein subfamilies, both from phylogenetic and information-theoretic angles. However, evaluation of the performance of these methods has typically relied upon a handful of reasonably well-characterized biological datasets or analyses of a single biological example. While experimental validation of many truly functionally divergent sites (true positives) can be relatively straightforward, determining that particular sites do not contribute to functional divergence (i.e. false positives and true negatives) is much more difficult, resulting in noisy 'gold standard' examples. RESULTS: We describe a novel, phylogeny-based functional divergence classifier, FunDi. Unlike previous approaches, FunDi uses a unified mixture model-based approach to detect type I and type II FD. To assess FunDi's overall classification performance relative to other methods, we introduce two methods for simulating functionally divergent datasets. We find that the FunDi method performs better than several other predictors over a wide variety of simulation conditions. AVAILABILITY: http://rogerlab.biochem.dal.ca/Software. CONTACT: andrew.roger@dal.ca. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Daniel Gaston
- Centre for Comparative Genomics and Evolutionary Bioinformatics, Dalhousie University, Halifax, Canada, B3H 1X5
|
34
|
Wang HC, Susko E, Roger AJ. Fast statistical tests for detecting heterotachy in protein evolution. Mol Biol Evol 2011; 28:2305-15. [PMID: 21343603 DOI: 10.1093/molbev/msr050] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/20/2023]
Abstract
The w statistic introduced by Lockhart et al. (1998. A covariotide model explains apparent phylogenetic structure of oxygenic photosynthetic lineages. Mol Biol Evol. 15:1183-1188) is a simple and easily calculated statistic intended to detect heterotachy by comparing amino acid substitution patterns between two monophyletic groups of protein sequences. It is defined as the difference between the fraction of varied sites in both groups and the fraction of varied sites in each group. The w test has been used to distinguish a covarion process from equal rates and rates variation across sites processes. Using simulation we show that the w test is effective for small data sets and for data sets that have low substitution rates in the groups but can have difficulties when these conditions are not met. Using site entropy as a measure of variability of a sequence site, we modify the w statistic to a w' statistic by assigning as varied in one group those sites that are actually varied in both groups but have a large entropy difference. We show that the w' test has more power to detect two kinds of heterotachy processes (covarion and bivariate rate shifts) in large and variable data. We also show that a test of Pearson's correlation of the site entropies between two monophyletic groups can be used to detect heterotachy and has more power than the w' test. Furthermore, we demonstrate that there are settings where the correlation test as well as w and w' tests do not detect heterotachy signals in data simulated under a branch length mixture model. In such cases, it is sometimes possible to detect heterotachy through subselection of appropriate taxa. Finally, we discuss the abilities of the three statistical tests to detect a fourth mode of heterotachy: lineage-specific changes in proportion of variable sites.
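The entropy-correlation idea mentioned in this abstract can be illustrated with a minimal Python sketch. It computes only the descriptive quantity (the Pearson correlation between per-site Shannon entropies of two groups of aligned sequences) and does not reproduce the significance test described in the paper. The function names, the choice to ignore gap characters, and the toy alignments are all assumptions made for illustration.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    residues = [r for r in column if r != "-"]  # assumption: gaps are ignored
    if not residues:
        return 0.0
    total = len(residues)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(residues).values())


def group_entropies(sequences):
    """Per-site entropies for a list of equal-length aligned sequences."""
    return [site_entropy(col) for col in zip(*sequences)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when entropies are constant")
    return cov / math.sqrt(var_x * var_y)


# Toy alignments invented for illustration: site 3 varies only in group A,
# site 4 varies only in group B.
group_a = ["ACDA", "ACEA", "ACFA", "ACGA"]
group_b = ["ACDA", "ACDC", "ACDE", "ACDF"]

entropies_a = group_entropies(group_a)
entropies_b = group_entropies(group_b)
r = pearson(entropies_a, entropies_b)
print(f"entropy correlation between groups: {r:.3f}")
```

In this toy case the variable sites do not coincide between the groups, so the correlation is negative; when both groups share the same pattern of conserved and variable sites, the correlation would be expected to be high.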
Affiliation(s)
- Huai-Chun Wang
- Department of Mathematics and Statistics, Dalhousie University, Halifax, Nova Scotia, Canada.
|
35
|
Exploiting models of molecular evolution to efficiently direct protein engineering. J Mol Evol 2010; 72:193-203. [PMID: 21132281 DOI: 10.1007/s00239-010-9415-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Received: 07/13/2010] [Accepted: 11/19/2010] [Indexed: 10/18/2022]
Abstract
Directed evolution and protein engineering approaches used to generate novel or enhanced biomolecular function often use the evolutionary sequence diversity of protein homologs to rationally guide library design. To fully capture this sequence diversity, however, libraries containing millions of variants are often necessary. Screening libraries of this size is often undesirable due to inaccuracies of high-throughput assays, costs, and time constraints. The ability to effectively cull sequence diversity while still generating the functional diversity within a library thus holds considerable value. This is particularly relevant when high-throughput assays are not amenable to select/screen for certain biomolecular properties. Here, we summarize our recent attempts to develop an evolution-guided approach, Reconstructing Evolutionary Adaptive Paths (REAP), for directed evolution and protein engineering that exploits phylogenetic and sequence analyses to identify amino acid substitutions that are likely to alter or enhance function of a protein. To demonstrate the utility of this technique, we highlight our previous work with DNA polymerases in which a REAP-designed small library was used to identify a DNA polymerase capable of accepting non-standard nucleosides. We anticipate that the REAP approach will be used in the future to facilitate the engineering of biopolymers with expanded functions and will thus have a significant impact on the developing field of 'evolutionary synthetic biology'.
|
36
|
Kamneva OK, Liberles DA, Ward NL. Genome-wide influence of indel substitutions on evolution of bacteria of the PVC superphylum, revealed using a novel computational method. Genome Biol Evol 2010; 2:870-86. [PMID: 21048002 PMCID: PMC3000692 DOI: 10.1093/gbe/evq071] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Indexed: 12/03/2022] Open
Abstract
Whole-genome scans for positive Darwinian selection are widely used to detect evolution of genome novelty. Most approaches are based on evaluation of nonsynonymous to synonymous substitution rate ratio across evolutionary lineages. These methods are sensitive to saturation of synonymous sites and thus cannot be used to study evolution of distantly related organisms. In contrast, indels occur less frequently than amino acid replacements, accumulate more slowly, and can be employed to characterize evolution of diverged organisms. As indels are also subject to the forces of natural selection, they can generate functional changes through positive selection. Here, we present a new computational approach to detect selective constraints on indel substitutions at the whole-genome level for distantly related organisms. Our method is based on ancestral sequence reconstruction, takes into account the varying susceptibility of different types of secondary structure to indels, and according to simulation studies is conservative. We applied this newly developed framework to characterize the evolution of organisms of the Planctomycetes, Verrucomicrobia, Chlamydiae (PVC) bacterial superphylum. The superphylum contains organisms with unique cell biology, physiology, and diverse lifestyles. It includes bacteria with simple cell organization and more complex eukaryote-like compartmentalization. Lifestyles range from free-living organisms to obligate pathogens. In this study, we conducted a whole-genome level analysis of indel substitutions specific to evolutionary lineages of the PVC superphylum and found that indels evolved under positive selection on up to 12% of gene tree branches. We also analyzed possible functional consequences for several case studies of predicted indel events.
Affiliation(s)
- Naomi L. Ward
- Department of Molecular Biology, University of Wyoming
- Department of Botany, University of Wyoming
- Program in Ecology, University of Wyoming
|
37
|
Pavlopoulou A, Pampalakis G, Michalopoulos I, Sotiropoulou G. Evolutionary history of tissue kallikreins. PLoS One 2010; 5:e13781. [PMID: 21072173 PMCID: PMC2967472 DOI: 10.1371/journal.pone.0013781] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Received: 07/02/2010] [Accepted: 10/08/2010] [Indexed: 12/12/2022] Open
Abstract
The gene family of human kallikrein-related peptidases (KLKs) encodes proteins with diverse and pleiotropic functions in normal physiology as well as in disease states. Currently, the most widely known KLK is KLK3 or prostate-specific antigen (PSA) that has applications in clinical diagnosis and monitoring of prostate cancer. The KLK gene family encompasses the largest contiguous cluster of serine proteases in humans which is not interrupted by non-KLK genes. This exceptional and unique characteristic of KLKs makes them ideal for evolutionary studies aiming to infer the direction and timing of gene duplication events. Previous studies on the evolution of KLKs were restricted to mammals and the emergence of KLKs was suggested about 150 million years ago (mya). In order to elucidate the evolutionary history of KLKs, we performed comprehensive phylogenetic analyses of KLK homologous proteins in multiple genomes including those that have been completed recently. Interestingly, we were able to identify novel reptilian, avian and amphibian KLK members which allowed us to trace the emergence of KLKs 330 mya. We suggest that a series of duplication and mutation events gave rise to the KLK gene family. The prominent feature of the KLK family is that it consists of tandemly and uninterruptedly arrayed genes in all species under investigation. The chromosomal co-localization in a single cluster distinguishes KLKs from trypsin and other trypsin-like proteases which are spread in different genetic loci. All the defining features of the KLKs were further found to be conserved in the novel KLK protein sequences. The study of this unique family will further assist in selecting new model organisms for functional studies of proteolytic pathways involving KLKs.
Affiliation(s)
- Athanasia Pavlopoulou
- Department of Pharmacy, School of Health Sciences, University of Patras, Rion-Patras, Greece
- Georgios Pampalakis
- Department of Pharmacy, School of Health Sciences, University of Patras, Rion-Patras, Greece
- Georgia Sotiropoulou
- Department of Pharmacy, School of Health Sciences, University of Patras, Rion-Patras, Greece
|
38
|
Quental R, Moleirinho A, Azevedo L, Amorim A. Evolutionary History and Functional Diversification of Phosphomannomutase Genes. J Mol Evol 2010; 71:119-27. [DOI: 10.1007/s00239-010-9368-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 03/30/2010] [Accepted: 07/12/2010] [Indexed: 11/29/2022]
|
39
|
Tamuri AU, dos Reis M, Hay AJ, Goldstein RA. Identifying changes in selective constraints: host shifts in influenza. PLoS Comput Biol 2009; 5:e1000564. [PMID: 19911053 PMCID: PMC2770840 DOI: 10.1371/journal.pcbi.1000564] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Received: 05/27/2009] [Accepted: 10/15/2009] [Indexed: 11/19/2022] Open
Abstract
The natural reservoir of Influenza A is waterfowl. Normally, waterfowl viruses are not adapted to infect and spread in the human population. Sometimes, through reassortment or through whole host shift events, genetic material from waterfowl viruses is introduced into the human population causing worldwide pandemics. Identifying which mutations allow viruses from avian origin to spread successfully in the human population is of great importance in predicting and controlling influenza pandemics. Here we describe a novel approach to identify such mutations. We use a sitewise non-homogeneous phylogenetic model that explicitly takes into account differences in the equilibrium frequencies of amino acids in different hosts and locations. We identify 172 amino acid sites with strong support and 518 sites with moderate support of different selection constraints in human and avian viruses. The sites that we identify provide an invaluable resource to experimental virologists studying adaptation of avian flu viruses to the human host. Identification of the sequence changes necessary for host shifts would help us predict the pandemic potential of various strains. The method is of broad applicability to investigating changes in selective constraints when the timing of the changes is known. Influenza A's natural reservoir is waterfowl. Sometimes avian virus genomic segments are able to shift to a human host, either in toto or by combining with those that underwent a previous host shift event. Such host shift events can cause worldwide pandemics in their immunologically naive hosts. In order for these host shifts to establish a stable lineage, the virus has to adapt to the new host. Identifying the changes that have occurred in the past can provide important clues about how this process happens, and how surveillance for new influenza threats should be targeted. 
Unfortunately, it is difficult to determine whether an amino acid has changed due to adaptation to the new host or whether the change occurred through random drift. Here we describe a novel phylogenetic approach to identifying locations where the nature of the selective pressure exerted on the location has changed corresponding to the host shift event. We identify a set of locations on a number of the genomic segments. The approach we describe is of wide applicability when the timing of the change of selective constraints is known in advance.
Affiliation(s)
- Asif U. Tamuri
- National Institute for Medical Research, London, United Kingdom
- Mario dos Reis
- National Institute for Medical Research, London, United Kingdom
- Alan J. Hay
- National Institute for Medical Research, London, United Kingdom
|
40
|
Wang HC, Susko E, Roger AJ. PROCOV: maximum likelihood estimation of protein phylogeny under covarion models and site-specific covarion pattern analysis. BMC Evol Biol 2009; 9:225. [PMID: 19737395 PMCID: PMC2758850 DOI: 10.1186/1471-2148-9-225] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 03/24/2009] [Accepted: 09/08/2009] [Indexed: 11/12/2022] Open
Abstract
Background: The covarion hypothesis of molecular evolution holds that selective pressures on a given amino acid or nucleotide site are dependent on the identity of other sites in the molecule that change throughout time, resulting in changes of evolutionary rates of sites along the branches of a phylogenetic tree. At the sequence level, covarion-like evolution at a site manifests as conservation of nucleotide or amino acid states among some homologs where the states are not conserved in other homologs (or groups of homologs). Covarion-like evolution has been shown to relate to changes in functions at sites in different clades, and, if ignored, can adversely affect the accuracy of phylogenetic inference. Results: PROCOV (protein covarion analysis) is a software tool that implements a number of previously proposed covarion models of protein evolution for phylogenetic inference in a maximum likelihood framework. Several algorithmic and implementation improvements in this tool over previous versions make computationally expensive tree searches with covarion models more efficient and analyses of large phylogenomic data sets tractable. PROCOV can be used to identify covarion sites by comparing the site likelihoods under the covarion process to the corresponding site likelihoods under a rates-across-sites (RAS) process. Those sites with the greatest log-likelihood difference between a 'covarion' and an RAS process were found to be of functional or structural significance in a dataset of bacterial and eukaryotic elongation factors. Conclusion: Covarion models implemented in PROCOV may be especially useful for phylogenetic estimation when ancient divergences between sequences have occurred and rates of evolution at sites are likely to have changed over the tree. It can also be used to study lineage-specific functional shifts in protein families that result in changes in the patterns of site variability among subtrees.
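The site-ranking step described in this abstract (comparing per-site log-likelihoods under a covarion model with those under an RAS model) can be sketched as follows. This is only an illustration under stated assumptions: the function name `rank_sites_by_lnl_gain` is hypothetical, the input lists stand in for per-site log-likelihoods that would have to be exported from a phylogenetics program, and nothing here reflects PROCOV's actual file formats or interface.

```python
def rank_sites_by_lnl_gain(lnl_covarion, lnl_ras, top=5):
    """Rank alignment sites by (covarion lnL - RAS lnL), largest gain first.

    Both inputs are hypothetical per-site log-likelihood lists of equal length.
    Returns up to `top` tuples of (1-based site index, log-likelihood difference).
    """
    if len(lnl_covarion) != len(lnl_ras):
        raise ValueError("both models must report one log-likelihood per site")
    gains = [
        (site, cov - ras)
        for site, (cov, ras) in enumerate(zip(lnl_covarion, lnl_ras), start=1)
    ]
    # Sites that the covarion model explains much better than RAS come first.
    gains.sort(key=lambda item: item[1], reverse=True)
    return gains[:top]

# Made-up values for three sites, purely for illustration.
candidates = rank_sites_by_lnl_gain([-2.0, -1.0, -3.0], [-2.5, -3.0, -3.1], top=2)
```

Sites at the head of the returned list are the candidates that, by the abstract's reasoning, would merit inspection for functional or structural significance.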
Affiliation(s)
- Huai-Chun Wang
- Department of Mathematics and Statistics, Dalhousie University, Halifax, NS, Canada.
|
41
|
Mertz B, Gu X, Reilly PJ. Analysis of functional divergence within two structurally related glycoside hydrolase families. Biopolymers 2009; 91:478-95. [DOI: 10.1002/bip.21154] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]
|
42
|
Phylogeny, taxonomy, and evolution of the endothelin receptor gene family. Mol Phylogenet Evol 2009; 52:677-87. [PMID: 19410007 DOI: 10.1016/j.ympev.2009.04.015] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 10/08/2008] [Revised: 03/28/2009] [Accepted: 04/23/2009] [Indexed: 01/29/2023]
Abstract
A gene phylogeny provides the natural historical order to classify genes and to understand their functional, structural, and genomic diversity. The gene family of endothelin receptors (EDNR) is responsible for many key physiological and developmental processes of tetrapods and teleosts. This study provides a well-defined gene phylogeny for the EDNR family, which is used to classify its members and to assess their evolution. The EDNR phylogeny supports the recognition of the EDNRA, EDNRB, and EDNRC subfamilies, as well as more lineage-specific duplicates of teleosts and the African clawed frog. The duplications for these nominal genes are related to the various whole-genome amplifications of vertebrates, jawed vertebrates, fishes, and frog. The EDNR phylogeny also identifies several gene losses, including that of EDNRC from placental and marsupial (therian) mammals. When coupled with structural and biochemical information, site-specific analyses of evolutionary rate shifts reveal two distinct patterns of potential functional changes at the sequence level between therian versus non-therian EDNRA and EDNRB (i.e., between groups without and with EDNRC). An analysis of linkage maps and tetrapod synteny further suggests that the loss of therian EDNRC may be related to a chromosomal deletion in its common ancestor.
|
43
|
Macqueen DJ, Johnston IA. Evolution of the multifaceted eukaryotic akirin gene family. BMC Evol Biol 2009; 9:34. [PMID: 19200367 PMCID: PMC2660306 DOI: 10.1186/1471-2148-9-34] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.1] [Received: 09/12/2008] [Accepted: 02/06/2009] [Indexed: 11/10/2022] Open
Abstract
Background: Akirins are nuclear proteins that form part of an innate immune response pathway conserved in Drosophila and mice. This study's aim was to characterise the evolution of akirin gene structure and protein function in the eukaryotes. Results: akirin genes are present throughout the metazoa and arose before the separation of animal, plant and fungi lineages. Using comprehensive phylogenetic analysis, coupled with comparisons of conserved synteny and genomic organisation, we show that the intron-exon structure of metazoan akirin genes was established prior to the bilateria and that a single proto-orthologue duplicated in the vertebrates, before the gnathostome-agnathan separation, producing akirin1 and akirin2. Phylogenetic analyses of seven vertebrate gene families with members in chromosomal proximity to both akirin1 and akirin2 were compatible with a common duplication event affecting the genomic neighbourhood of the akirin proto-orthologue. A further duplication of akirins occurred in the teleost lineage and was followed by lineage-specific patterns of paralogue loss. Remarkably, akirins have been independently characterised by five research groups under different aliases and a comparison of the available literature revealed diverse functions, generally in regulating gene expression. For example, akirin was characterised in arthropods as subolesin, an important growth factor and in Drosophila as bhringi, which has an essential myogenic role. In vertebrates, akirin1 was named mighty in mice and was shown to regulate myogenesis, whereas akirin2 was characterised as FBI1 in rats and promoted carcinogenesis, acting as a transcriptional repressor when bound to a 14-3-3 protein. Both vertebrate Akirins have evolved under comparably strict constraints of purifying selection, although a likelihood ratio test predicted that functional divergence has occurred between paralogues.
Bayesian and maximum likelihood tests identified amino-acid positions where the rate of evolution had shifted significantly between paralogues. Interestingly, the highest scoring position was within a conserved, validated binding-site for 14-3-3 proteins. Conclusion: This work offers an evolutionary framework to facilitate future studies of eukaryotic akirins and provides insight into their multifaceted and conserved biochemical functions.
Affiliation(s)
- Daniel J Macqueen
- Gatty Marine Laboratory, School of Biology, University of St Andrews, Fife, UK.
|
44
|
Iwema T, Chaumot A, Studer RA, Robinson-Rechavi M, Billas IML, Moras D, Laudet V, Bonneton F. Structural and evolutionary innovation of the heterodimerization interface between USP and the ecdysone receptor ECR in insects. Mol Biol Evol 2009; 26:753-68. [PMID: 19126866 DOI: 10.1093/molbev/msn302] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022] Open
Abstract
Understanding how the variability of protein structure arises during evolution and leads to new structure-function relationships ultimately promoting evolutionary novelties is a major goal of molecular evolution and is critical for interpreting genome sequences. We addressed this issue using the ecdysone receptor (ECR), a major developmental factor that controls development and reproduction of arthropods. The functional ECR is a heterodimer of two nuclear receptors: ECR, which binds ecdysteroids, and its obligatory partner ultraspiracle (USP), which is orthologous to the retinoid X receptor of vertebrates. Both genes underwent a dramatic increase of evolutionary rate in Mecopterida, the major insect terminal group containing Diptera and Lepidoptera. We therefore questioned the implication of this event in terms of coevolution of their dimerization interface. A structural comparison revealed a 30% larger ligand-binding domain (LBD) heterodimerization surface in the Lepidoptera Heliothis when compared with basal insects, associated with a symmetrization of the interface, which is exceptional for nuclear receptors. Reconstruction of ancestral sequences and homology modeling of the ancestral Mecopterida ECR-USP reveal that this enlarged dimerization surface is a synapomorphy for Mecopterida. Furthermore, we show that the residues implicated in the new dimerization surface underwent specific evolutionary constraints in Mecopterida indicative of their new and conserved role in the dimerization interface. Most of all, the novel surface originates from a 15 degrees torsion of a subdomain of USP LBD toward its partner ECR, which is a long-range consequence of the peculiar position of a Mecopterida-specific insertion in loop L1-3, located outside of the interaction surface, in a less crucial domain of the partner protein. 
These results indicate that the coevolution between ECR and USP occurred through a novel mechanism of intramolecular epistasis that will undoubtedly be generalized for other molecules because it uses flexibility of a less-constrained region of a protein to modify the structure of another, critical part of the molecule.
Affiliation(s)
- Thomas Iwema
- Département de Biologie et de Génomique Structurales, IGBMC (Institut de Génétique et de Biologie Moléculaire et Cellulaire), Illkirch, France.
|
45
|
Penn O, Stern A, Rubinstein ND, Dutheil J, Bacharach E, Galtier N, Pupko T. Evolutionary modeling of rate shifts reveals specificity determinants in HIV-1 subtypes. PLoS Comput Biol 2008; 4:e1000214. [PMID: 18989394 PMCID: PMC2566816 DOI: 10.1371/journal.pcbi.1000214] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Received: 03/14/2008] [Accepted: 09/23/2008] [Indexed: 11/19/2022] Open
Abstract
A hallmark of the human immunodeficiency virus 1 (HIV-1) is its rapid rate of evolution within and among its various subtypes. Two complementary hypotheses are suggested to explain the sequence variability among HIV-1 subtypes. The first suggests that the functional constraints at each site remain the same across all subtypes, and the differences among subtypes are a direct reflection of random substitutions, which have occurred during the time elapsed since their divergence. The alternative hypothesis suggests that the functional constraints themselves have evolved, and thus sequence differences among subtypes in some sites reflect shifts in function. To determine the contribution of each of these two alternatives to HIV-1 subtype evolution, we have developed a novel Bayesian method for testing and detecting site-specific rate shifts. The RAte Shift EstimatoR (RASER) method determines whether or not site-specific functional shifts characterize the evolution of a protein and, if so, points to the specific sites and lineages in which these shifts have most likely occurred. Applying RASER to a dataset composed of large samples of HIV-1 sequences from different group M subtypes, we reveal rampant evolutionary shifts throughout the HIV-1 proteome. Most of these rate shifts have occurred during the divergence of the major subtypes, establishing that subtype divergence occurred together with functional diversification. We report further evidence for the emergence of a new sub-subtype, characterized by abundant rate-shifting sites. When focusing on the rate-shifting sites detected, we find that many are associated with known function relating to viral life cycle and drug resistance. Finally, we discuss mechanisms of covariation of rate-shifting sites. The AIDS epidemic, inflicted by the human immunodeficiency virus (HIV), has already claimed 25 million lives, thus posing a global threat. 
Since its discovery, several HIV subtypes have emerged, characterized by distinct genomic sequences and variable geographic locations. Here, we investigate the nature of the genetic differences among the subtypes. The neutral theory of evolution suggests that most genetic differences marginally affect the function of the encoded proteins (hence neutral) and thus occur randomly. Alternatively, changes in protein function are reflected by a pattern of nonrandom genetic differences. To address this issue, we developed a computational method, which studies the differences between sequences of different HIV subtypes, and estimates which of the explanations is more likely. Using a large sample of HIV protein sequences, we discovered that part of the variability among the subtypes is not random and possibly reflects different functional constraints imposed on the subtypes during the course of their evolution. An in-depth inspection of these nonrandom changes revealed a correlation with biological traits, such as drug resistance and mechanisms facilitating viral entry into the host cell. Interestingly, nonrandom changes are also characteristic of a viral strain that recently emerged in the former Soviet Union.
Affiliation(s)
- Osnat Penn
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Adi Stern
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Nimrod D. Rubinstein
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Julien Dutheil
- BiRC—Bioinformatics Research Center, University of Aarhus, Århus, Denmark
- Eran Bacharach
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Nicolas Galtier
- Institut des Sciences de l'Evolution, CC64, Centre National de la Recherche Scientifique, Université Montpellier 2, Montpellier, France
- Tal Pupko
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
|
46
|
Burlakoti RR, Ali S, Secor GA, Neate SM, McMullen MP, Adhikari TB. Comparative mycotoxin profiles of Gibberella zeae populations from barley, wheat, potatoes, and sugar beets. Appl Environ Microbiol 2008; 74:6513-20. [PMID: 18791024 PMCID: PMC2576685 DOI: 10.1128/aem.01580-08] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Received: 07/10/2008] [Accepted: 08/28/2008] [Indexed: 11/20/2022] Open
Abstract
Gibberella zeae is one of the most devastating pathogens of barley and wheat in the United States. The fungus also infects noncereal crops, such as potatoes and sugar beets, and the genetic relationships among barley, wheat, potato, and sugar beet isolates indicate high levels of similarity. However, little is known about the toxigenic potential of G. zeae isolates from potatoes and sugar beets. A total of 336 isolates of G. zeae from barley, wheat, potatoes, and sugar beets were collected and analyzed by TRI (trichothecene biosynthesis gene)-based PCR assays. To verify the TRI-based PCR detection of genetic markers by chemical analysis, 45 representative isolates were grown in rice cultures for 28 days and 15 trichothecenes and 2 zearalenone (ZEA) analogs were quantified using gas chromatography-mass spectrometry. TRI-based PCR assays revealed that all isolates had the deoxynivalenol (DON) marker. The frequencies of isolates with the 15-acetyl-deoxynivalenol (15-ADON) marker were higher than those of isolates with the 3-acetyl-deoxynivalenol (3-ADON) marker among isolates from all four crops. Fusarium head blight (FHB)-resistant wheat cultivars had little or no influence on the diversity of isolates associated with the 3-ADON and 15-ADON markers. However, the frequency of isolates with the 3-ADON marker among isolates from the Langdon, ND, sampling site was higher than those among isolates from the Carrington and Minot, ND, sites. In chemical analyses, DON, 3-ADON, 15-ADON, b-ZEA, and ZEA were detected. All isolates produced DON (1 to 782 µg/g) and ZEA (1 to 623 µg/g). These findings may be useful for monitoring mycotoxin contamination and for formulating FHB management strategies for these crops.
Affiliation(s)
- Rishi R Burlakoti
- Department of Plant Pathology, North Dakota State University, Fargo, ND 58105, USA
|
47
|
Freamat M, Sower SA. Glycoprotein Hormone Receptors in the Sea Lamprey Petromyzon marinus. Zoolog Sci 2008; 25:1037-44. [DOI: 10.2108/zsj.25.1037] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/17/2022]
|
48
|
Havird JC, Miyamoto MM, Choe KP, Evans DH. Gene Duplications and Losses within the Cyclooxygenase Family of Teleosts and Other Chordates. Mol Biol Evol 2008; 25:2349-59. [DOI: 10.1093/molbev/msn183] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Indexed: 12/26/2022] Open
|
49
|
Joannin N, Abhiman S, Sonnhammer EL, Wahlgren M. Sub-grouping and sub-functionalization of the RIFIN multi-copy protein family. BMC Genomics 2008; 9:19. [PMID: 18197962 PMCID: PMC2257938 DOI: 10.1186/1471-2164-9-19] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.2] [Received: 07/16/2007] [Accepted: 01/15/2008] [Indexed: 01/06/2023] Open
Abstract
Background: Parasitic protozoans possess many multicopy gene families which have central roles in parasite survival and virulence. The number and variability of members of these gene families often make it difficult to predict possible functions of the encoded proteins. The families of extra-cellular proteins that are exposed to a host immune response have been driven via immune selection to become antigenically variant, and thereby avoid immune recognition while maintaining protein function to establish a chronic infection. Results: We have combined phylogenetic and function shift analyses to study the evolution of the RIFIN proteins, which are antigenically variant and are encoded by the largest multicopy gene family in Plasmodium falciparum. We show that this family can be subdivided into two major groups that we named A- and B-RIFIN proteins. This suggested sub-grouping is supported by a recently published study that showed that, despite the presence of the Plasmodium export (PEXEL) motif in all RIFIN variants, proteins from each group have different cellular localizations during the intraerythrocytic life cycle of the parasite. In the present study we show that function shift analysis, a novel technique to predict functional divergence between sub-groups of a protein family, indicates that RIFINs have undergone neo- or sub-functionalization. Conclusion: These results question the general trend of clustering large antigenically variant protein groups into homogenous families. Assigning functions to protein families requires their subdivision into meaningful groups such as we have shown for the RIFIN protein family. Using phylogenetic and function shift analysis methods, we identify new directions for the investigation of this broad and complex group of proteins.
Affiliation(s)
- Nicolas Joannin
- Department of Microbiology, Tumor and Cell biology (MTC), Karolinska Institutet, SE-17177 Stockholm, Sweden and Swedish Institute for Infectious Diseases Control, SE-17182 Stockholm, Sweden.
|
50
|
Negrisolo E, Bargelloni L, Patarnello T, Ozouf-Costaz C, Pisano E, di Prisco G, Verde C. Comparative and evolutionary genomics of globin genes in fish. Methods Enzymol 2008; 436:511-38. [PMID: 18237652 DOI: 10.1016/s0076-6879(08)36029-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 02/07/2023]
Abstract
Sequencing genomes of model organisms is a great challenge for biological sciences. In the past decade, scientists have developed a large number of methods to align and compare sequenced genomes. The analysis of a given sequence provides much information on the genome structure but to a lesser extent on the function. Comparative genomics is a useful tool for functional and evolutionary annotation of genomes. In principle, comparison of genomic sequences may allow for identification of the evolutionary selection (negative or positive) that the functional sequences have been subjected to over time. Positively selected genome regions are the most important ones for evolution, because most changes are adaptive and often induce biological differences in organisms. The draft genomes of five fish species have recently become available. We herewith review and discuss some new insights into comparative genomics in fish globin genes. Special attention will be given to a complementary methodological approach to comparative genomics, fluorescence in situ hybridization (FISH). Internet resources for analyzing sequence alignments and annotations and new bioinformatic tools to address critical problems are thoroughly discussed.
Affiliation(s)
- Enrico Negrisolo
- Department of Public Health, Comparative Pathology, and Veterinary Hygiene, University of Padova, Legnaro, Italy
|