1
Zaman N, Parvaiz N, Gul F, Yousaf R, Gul K, Azam SS. Dynamics of water-mediated interaction effects on the stability and transmission of Omicron. Sci Rep 2023; 13:20894. [PMID: 38017052 PMCID: PMC10684572 DOI: 10.1038/s41598-023-48186-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2022] [Accepted: 11/23/2023] [Indexed: 11/30/2023]
Abstract
The SARS-CoV-2 Omicron variant and its highly transmissible sublineages, amidst news of emerging hybrid variants, strengthen the evidence of its ability to rapidly spread and evolve, giving rise to unprecedented future waves. Owing to the availability of isolated RBD, monomeric and trimeric Cryo-EM structures of the spike protein in complex with the ACE2 receptor, comparative analysis of Alpha, Beta, Gamma, Delta, and Omicron assists in a rational assessment of their probability to evolve as new or hybrid variants in future. This study proposes the role of hydration forces in mediating Omicron function and dynamics based on a stronger interplay between protein and solvent with each Covid wave. Mutations of multiple hydrophobic residues into hydrophilic residues underwent concerted interactions with water, leading to variations in charge distribution in Delta and Omicron during molecular dynamics simulations. Moreover, comparative analysis of interacting moieties characterized a large number of mutations lying at the RBD into constrained, homologous and low-affinity groups, referred to as mutational drivers, inferring that the probability of future mutations relies on their function. Furthermore, the computational findings reveal a significant difference in angular distances among variants of concern due to a 3-amino-acid insertion (EPE) in the Omicron variant that not only facilitates tight domain organization but also seems requisite for characterization of mutational processes. The outcome of this work signifies the possible relation between hydration forces, their impact on conformation and binding affinities, and viral fitness that will significantly aid in understanding the dynamics of drug targets for Covid-19 countermeasures. The emerging scenario is that hydration forces and hydrophobic interactions are crucial variables to probe in mutational analysis to explore the conformational landscape of macromolecules and reveal the molecular origins of protein behaviors.
Affiliation(s)
- Naila Zaman
- Computational Biology Lab, National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan
- Nousheen Parvaiz
- Computational Biology Lab, National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan
- Fouzia Gul
- Computational Biology Lab, National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan
- Rimsha Yousaf
- Computational Biology Lab, National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan
- Kainat Gul
- Computational Biology Lab, National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan
- Syed Sikander Azam
- Computational Biology Lab, National Center for Bioinformatics (NCB), Quaid-i-Azam University, Islamabad, 45320, Pakistan
2
Banerjee A, Bahar I. Structural Dynamics Predominantly Determine the Adaptability of Proteins to Amino Acid Deletions. Int J Mol Sci 2023; 24:8450. [PMID: 37176156 PMCID: PMC10179678 DOI: 10.3390/ijms24098450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 05/01/2023] [Accepted: 05/06/2023] [Indexed: 05/15/2023]
Abstract
The insertion or deletion (indel) of amino acids has a variety of effects on protein function, ranging from disease-forming changes to gaining new functions. Despite their importance, indels have not been systematically characterized towards protein engineering or modification goals. In the present work, we focus on deletions composed of multiple contiguous amino acids (mAA-dels) and their effects on the protein (mutant) folding ability. Our analysis reveals that the mutant retains the native fold when the mAA-del obeys well-defined structural dynamics properties: localization in intrinsically flexible regions, showing low resistance to mechanical stress, and separation from allosteric signaling paths. Motivated by the possibility of distinguishing the features that underlie the adaptability of proteins to mAA-dels, and by the rapid evaluation of these features using elastic network models, we developed a positive-unlabeled learning-based classifier that can be adopted for protein design purposes. Trained on a consolidated set of features, including those reflecting the intrinsic dynamics of the regions where the mAA-dels occur, the new classifier yields a high recall of 84.3% for identifying mAA-dels that are stably tolerated by the protein. The comparative examination of the relative contribution of different features to the prediction reveals the dominant role of structural dynamics in enabling the adaptation of the mutant to mAA-del without disrupting the native fold.
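The abstract above reports recall as its headline metric. In a positive-unlabeled setting this is a natural choice, because recall can be estimated from the labeled positives alone, whereas precision would require knowing which unlabeled examples are truly negative. The following is a minimal Python sketch of that calculation; the function name and the toy labels are hypothetical illustrations, not data or code from the paper.

```python
def recall_on_labeled_positives(labels, predictions):
    """Fraction of known positives that the classifier predicts as positive.

    labels:      1 = known positive (e.g. a tolerated deletion), 0 = unlabeled
    predictions: 1 = predicted positive, 0 = predicted negative
    Unlabeled examples are ignored, since their true class is unknown.
    """
    hits = [pred for label, pred in zip(labels, predictions) if label == 1]
    if not hits:
        raise ValueError("recall is undefined without labeled positives")
    return sum(hits) / len(hits)


# Hypothetical toy data: four known positives followed by four unlabeled examples.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 1, 0, 0, 1]
recall = recall_on_labeled_positives(labels, predictions)
print(f"recall on labeled positives: {recall:.2f}")
```

Predictions on the unlabeled examples do not enter the calculation at all, which is why a high recall by itself says nothing about how many unlabeled deletions are flagged.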
Affiliation(s)
- Anupam Banerjee
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Ivet Bahar
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794, USA
- Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, NY 11794, USA
3
Pillai AS, Hochberg GK, Thornton JW. Simple mechanisms for the evolution of protein complexity. Protein Sci 2022; 31:e4449. [PMID: 36107026 PMCID: PMC9601886 DOI: 10.1002/pro.4449] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/23/2022] [Revised: 09/01/2022] [Accepted: 09/10/2022] [Indexed: 01/26/2023]
Abstract
Proteins are tiny models of biological complexity: specific interactions among their many amino acids cause proteins to fold into elaborate structures, assemble with other proteins into higher-order complexes, and change their functions and structures upon binding other molecules. These complex features are classically thought to evolve via long and gradual trajectories driven by persistent natural selection. But a growing body of evidence from biochemistry, protein engineering, and molecular evolution shows that naturally occurring proteins often exist at or near the genetic edge of multimerization, allostery, and even new folds, so just one or a few mutations can trigger acquisition of these properties. These sudden transitions can occur because many of the physical properties that underlie these features are present in simpler proteins as fortuitous by-products of their architecture. Moreover, complex features of proteins can be encoded by huge arrays of sequences, so they are accessible from many different starting points via many possible paths. Because the bridges to these features are both short and numerous, random chance can join selection as a key factor in explaining the evolution of molecular complexity.
Affiliation(s)
- Arvind S. Pillai
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Georg K.A. Hochberg
- Max Planck Institute for Terrestrial Microbiology, Marburg, Germany
- Department of Chemistry, Center for Synthetic Microbiology, Philipps University Marburg, Marburg, Germany
- Joseph W. Thornton
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
- Departments of Human Genetics and Ecology and Evolution, University of Chicago, Chicago, Illinois, USA
4
Huded AKC, Jingade P, Mishra MK, Ercisli S, Ilhan G, Marc RA, Vodnar D. Comparative genomic analysis and phylogeny of NAC25 gene from cultivated and wild Coffea species. FRONTIERS IN PLANT SCIENCE 2022; 13:1009733. [PMID: 36186041 PMCID: PMC9523601 DOI: 10.3389/fpls.2022.1009733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 08/30/2022] [Indexed: 06/16/2023]
Abstract
Coffee is a high-value agricultural commodity grown in about 80 countries. Sustainable coffee cultivation is hampered by multiple biotic and abiotic stress conditions predominantly driven by climate change. The NAC proteins are plant-specific transcription factors associated with various physiological functions in plants, which include cell division, secondary wall formation, formation of shoot apical meristem, leaf senescence, flowering, embryo and seed development. Besides, they are also involved in biotic and abiotic stress regulation. Due to their ubiquitous influence, studies on NAC transcription factors have gained momentum in different crop plant species. In the present study, a NAC25-like transcription factor was isolated and characterized for the first time from two cultivated coffee species, Coffea arabica and Coffea canephora, and five Indian wild coffee species. The full-length NAC25 gene varied from 2,456 bp in Coffea jenkinsii to 2,493 bp in C. arabica. In all seven coffee species, sequencing of the NAC25 gene revealed 3 exons and 2 introns. The NAC25 gene is characterized by a highly conserved 377 bp NAM domain (N-terminus) and a highly variable C-terminal region. The sequence analysis revealed an average of one SNP per 40.92 bp in the coding region and one per 37.7 bp in the intronic region. Further, the non-synonymous SNPs are 8-11 fold higher compared to synonymous SNPs in the non-coding and coding region of the NAC25 gene, respectively. The expression of the NAC25 gene was studied in six different tissue types in C. canephora, and higher expression levels were observed in leaf and flower tissues. Further, the relative expression of NAC25 in comparison with the GAPDH gene revealed four-fold and eight-fold increases in expression levels in green fruit and ripe fruit, respectively. The evolutionary relationship revealed the independent evolution of the NAC25 gene in coffee.
Affiliation(s)
- Arun Kumar C. Huded
- Plant Biotechnology Division, Unit of Central Coffee Research Institute, Coffee Board, Mysore, Karnataka, India
- Pavankumar Jingade
- Plant Biotechnology Division, Unit of Central Coffee Research Institute, Coffee Board, Mysore, Karnataka, India
- Manoj Kumar Mishra
- Plant Biotechnology Division, Unit of Central Coffee Research Institute, Coffee Board, Mysore, Karnataka, India
- Sezai Ercisli
- Department of Horticulture, Faculty of Agriculture, Erzurum, Turkey
- Gulce Ilhan
- Department of Horticulture, Faculty of Agriculture, Erzurum, Turkey
- Romina Alina Marc
- Food Engineering Department, Faculty of Food Science and Technology, University of Agricultural Sciences and Veterinary Medicine, Cluj-Napoca, Romania
- Dan Vodnar
- Institute of Life Sciences, Faculty of Food Science and Technology, University of Agricultural Sciences and Veterinary Medicine, Cluj-Napoca, Romania
5
Houston DR, Hanna JG, Lathe JC, Hillier SG, Lathe R. Evidence that nuclear receptors are related to terpene synthases. J Mol Endocrinol 2022; 68:153-166. [PMID: 35112668 PMCID: PMC8942334 DOI: 10.1530/jme-21-0156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Accepted: 02/03/2022] [Indexed: 11/08/2022]
Abstract
Ligand-activated nuclear receptors (NRs) orchestrate development, growth, and reproduction across all animal lifeforms - the Metazoa - but how NRs evolved remains mysterious. Given that NR ligands, including steroids and retinoids, are predominantly terpenoids, we asked whether NRs might have evolved from enzymes that catalyze terpene synthesis and metabolism. We provide evidence suggesting that NRs may be related to the terpene synthase (TS) enzyme superfamily. Based on over 10,000 3D structural comparisons, we report that the NR ligand-binding domain and TS enzymes share a conserved core of seven α-helical segments. In addition, the 3D locations of the major ligand-contacting residues are conserved between the two protein classes. Primary sequence comparisons reveal suggestive similarities specifically between NRs and the subfamily of cis-isoprene transferases, notably with dehydrodolichyl pyrophosphate synthase and its obligate partner, NUS1/NOGOB receptor. Pharmacological overlaps between NRs and TS enzymes add weight to the contention that they share a distant evolutionary origin, and the combined data raise the possibility that a ligand-gated receptor may have arisen from an enzyme antecedent. However, our findings do not formally exclude other interpretations such as convergent evolution, and further analysis will be necessary to confirm the inferred relationship between the two protein classes.
Affiliation(s)
- Douglas R Houston
- Institute of Quantitative Biology, Biochemistry, and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Jane G Hanna
- Institute of Quantitative Biology, Biochemistry, and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Stephen G Hillier
- Medical Research Council Centre for Reproductive Health, University of Edinburgh, Edinburgh, UK
- Correspondence should be addressed to S G Hillier or R Lathe
- Richard Lathe
- Division of Infection Medicine, University of Edinburgh, Edinburgh, UK
6
Zhao VY, Rodrigues JV, Lozovsky ER, Hartl DL, Shakhnovich EI. Switching an active site helix in dihydrofolate reductase reveals limits to subdomain modularity. Biophys J 2021; 120:4738-4750. [PMID: 34571014 PMCID: PMC8595743 DOI: 10.1016/j.bpj.2021.09.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2021] [Revised: 09/14/2021] [Accepted: 09/22/2021] [Indexed: 11/23/2022]
Abstract
To what degree are individual structural elements within proteins modular such that similar structures from unrelated proteins can be interchanged? We study subdomain modularity by creating 20 chimeras of an enzyme, Escherichia coli dihydrofolate reductase (DHFR), in which a catalytically important, 10-residue α-helical sequence is replaced by α-helical sequences from a diverse set of proteins. The chimeras stably fold but have a range of diminished thermal stabilities and catalytic activities. Evolutionary coupling analysis indicates that the residues of this α-helix are under selection pressure to maintain catalytic activity in DHFR. Reversion to phenylalanine at key position 31 was found to partially restore catalytic activity, which could be explained by evolutionary coupling values. We performed molecular dynamics simulations using replica exchange with solute tempering. Chimeras with low catalytic activity exhibit nonhelical conformations that block the binding site and disrupt the positioning of the catalytically essential residue D27. Simulation observables and in vitro measurements of thermal stability and substrate-binding affinity are strongly correlated. Several E. coli strains with chromosomally integrated chimeric DHFRs can grow, with growth rates that follow predictions from a kinetic flux model that depends on the intracellular abundance and catalytic activity of DHFR. Our findings show that although α-helices are not universally substitutable, the molecular and fitness effects of modular segments can be predicted by the biophysical compatibility of the replacement segment.
Affiliation(s)
- Victor Y Zhao
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts
- João V Rodrigues
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts
- Elena R Lozovsky
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts
- Daniel L Hartl
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts
- Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts
7
Tay NW, Liu F, Wang C, Zhang H, Zhang P, Chen YZ. Protein music of enhanced musicality by music style guided exploration of diverse amino acid properties. Heliyon 2021; 7:e07933. [PMID: 34632134 PMCID: PMC8488493 DOI: 10.1016/j.heliyon.2021.e07933] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/22/2021] [Revised: 06/19/2021] [Accepted: 09/02/2021] [Indexed: 11/27/2022]
Abstract
Inspired by the traceable analogies between protein sequences and music notes, protein music has been composed from amino acid sequences for popularizing science and sourcing melodies. Despite the continuous development of protein-to-music algorithms, the musicality of protein music lags far behind human music. Musicality may be enhanced by fine-tuning the protein-to-music mapping to the features of a specific music style. We analyzed the features of a music style (Fantasy-Impromptu style), and used the quantized musical features to guide broad exploration of diverse amino acid properties (104 properties, sequence patterns and variations) for developing a novel protein-to-music algorithm of enhanced musicality. This algorithm was applied to 18 proteins of various biological functions. The derived music pieces consistently exhibited enhanced musicality with respect to existing protein music. Music style guided exploration of diverse amino acid properties enables protein music composition of enhanced musicality, which may be further developed and applied to a wider variety of music styles.
Affiliation(s)
- Nicole WanNi Tay
- Raffles Institution, 1 Raffles Institution Ln, 575954, Singapore
- Fanxi Liu
- Raffles Institution, 1 Raffles Institution Ln, 575954, Singapore
- Chaoxin Wang
- Department of Computer Science, Kansas State University, Manhattan, KS, 66506, USA
- Hui Zhang
- School of Arts, Minnan Normal University, Zhangzhou, 363000, China
- Peng Zhang
- Bioinformatics and Drug Design Group, Department of Pharmacy, and Center for Computational Science and Engineering, National University of Singapore, 117543, Singapore
- Yu Zong Chen
- Bioinformatics and Drug Design Group, Department of Pharmacy, and Center for Computational Science and Engineering, National University of Singapore, 117543, Singapore
- Qian Xuesen Collaborative Research Center of Astrochemistry and Space Life Sciences, Institute of Drug Discovery Technology, Ningbo University, Ningbo, 315211, China
8
Lipničanová S, Chmelová D, Ondrejovič M, Frecer V, Miertuš S. Diversity of sialidases found in the human body - A review. Int J Biol Macromol 2020; 148:857-868. [PMID: 31945439 DOI: 10.1016/j.ijbiomac.2020.01.123] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 11/25/2019] [Revised: 01/10/2020] [Accepted: 01/11/2020] [Indexed: 12/31/2022]
Abstract
Sialidases are enzymes essential for numerous organisms including humans. Hydrolytic sialidases (EC 3.2.1.18), trans-sialidases and anhydrosialidases (intramolecular trans-sialidases, EC 4.2.2.15) are glycoside hydrolase enzymes that cleave the glycosidic linkage and release sialic acid residues from sialyl substrates. The paper summarizes diverse sialidases present in the human body and their potential impact on the development of antiviral compounds - inhibitors of viral neuraminidases. It includes a brief overview of the catalytic mechanisms of action of sialidases and describes the origin of sialidases in the human body. This is followed by a description of the structure and function of sialidase families with a special focus on the GH33 and GH34 families. Various effects of sialidases on the human body are also briefly described. Modulation of sialidase activity may be considered a useful tool for effective treatment of various diseases. In some cases, it is desirable to completely suppress the activity of sialidases with suitable inhibitors. Specific sialidase inhibitors are useful for the treatment of influenza, epilepsy, Alzheimer's disease, diabetes, different types of cancer, or heart defects. Challenges and future directions are briefly outlined in the final part of the paper.
Affiliation(s)
- Sabina Lipničanová
- Department of Biotechnology, Faculty of Natural Sciences, University of Ss. Cyril and Methodius in Trnava, Nám. J. Herdu 2, SK-91701 Trnava, Slovakia
- Daniela Chmelová
- Department of Biotechnology, Faculty of Natural Sciences, University of Ss. Cyril and Methodius in Trnava, Nám. J. Herdu 2, SK-91701 Trnava, Slovakia
- Miroslav Ondrejovič
- Department of Biotechnology, Faculty of Natural Sciences, University of Ss. Cyril and Methodius in Trnava, Nám. J. Herdu 2, SK-91701 Trnava, Slovakia
- Vladimír Frecer
- Department of Physical Chemistry of Drugs, Faculty of Pharmacy, Comenius University in Bratislava, Odbojárov 10, SK-83232 Bratislava, Slovakia; ICARST n.o., Jamnického 19, SK-84101, Bratislava, Slovakia
- Stanislav Miertuš
- Department of Biotechnology, Faculty of Natural Sciences, University of Ss. Cyril and Methodius in Trnava, Nám. J. Herdu 2, SK-91701 Trnava, Slovakia; ICARST n.o., Jamnického 19, SK-84101, Bratislava, Slovakia
9
Effect of synthesis medium on structural and photocatalytic properties of ZnO/carbon xerogel composites for solar and visible light degradation of 4-chlorophenol and bisphenol A. Colloids Surf A Physicochem Eng Asp 2020. [DOI: 10.1016/j.colsurfa.2019.124034] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 11/15/2022]
10
Sol-gel Syntheses of Photocatalysts for the Removal of Pharmaceutical Products in Water. NANOMATERIALS 2019; 9:126. [PMID: 30669532 PMCID: PMC6358872 DOI: 10.3390/nano9010126] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 12/17/2018] [Revised: 01/13/2019] [Accepted: 01/17/2019] [Indexed: 11/17/2022]
Abstract
A screening study on seven photocatalysts was performed to identify the best candidate for pharmaceutical products degradation in water. Photocatalysts were deposited as thin films through a sol-gel process and subsequent dip-coating on glass slides. The efficiency of each photocatalyst was assessed through the degradation of methylene blue first, and then, through the degradation of 15 different pharmaceutical products. Two main types of synthesis methods were considered: aqueous syntheses, where the reaction takes place in water, and organic syntheses, where reactions take place in an organic solvent and only a stoichiometric amount of water is added to the reaction medium. Photocatalysts synthesized via aqueous sol-gel routes showed relatively lower degradation efficiencies; however, the organic route required a calcination step at high temperature to form the photoactive crystalline phase, while the aqueous route did not. The best performances for the degradation of pharmaceuticals arose when Evonik P25 and silver nanoparticles were added to TiO2, which was synthesized using an organic solvent. In the case of methylene blue degradation, TiO2 modified with Evonik P25 and TiO2 doped with MnO2 nanoparticles were the two best candidates.
11
Tiwari I, Mahanwar PA. Polyacrylate/silica hybrid materials: A step towards multifunctional properties. J DISPER SCI TECHNOL 2018. [DOI: 10.1080/01932691.2018.1489276] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 01/14/2023]
Affiliation(s)
- Ingita Tiwari
- Department of Polymer and Surface Engineering, Institute of Chemical Technology, Mumbai, India
- P. A. Mahanwar
- Department of Polymer and Surface Engineering, Institute of Chemical Technology, Mumbai, India
12
Hussain A, Calabria-Holley J, Jiang Y, Lawrence M. Modification of hemp shiv properties using water-repellent sol-gel coatings. JOURNAL OF SOL-GEL SCIENCE AND TECHNOLOGY 2018; 86:187-197. [PMID: 31258251 PMCID: PMC6560928 DOI: 10.1007/s10971-018-4621-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 10/04/2017] [Accepted: 02/24/2018] [Indexed: 05/31/2023]
Abstract
For the first time, the hydrophilicity of hemp shiv was modified without compromising its hygroscopic properties. This research focused on the use of the sol-gel method to prepare coatings on the natural plant material hemp shiv, which has growing potential in the construction industry as a thermal insulator. The sol-gel coatings were produced by cohydrolysis and polycondensation of tetraethyl orthosilicate (TEOS) using an acidic catalyst. Methyltriethoxysilane (MTES) was added as the hydrophobic precursor to provide water resistance to the bio-based material. Scanning electron microscopy (SEM) and focused ion beam (FIB) were used to determine the morphological changes on the surface as well as within the hemp shiv. It was found that the sol-gel coatings caused a reduction in water uptake but did not strongly influence the moisture sorption behaviour of hemp shiv. Fourier transform infrared (FTIR) spectroscopy shows that the coating layer on hemp shiv acts as a shield, thereby lowering peak intensity in the wavenumber range 1200-1800 cm⁻¹. The sol-gel coating affected the pore size distribution and cumulative pore volume of the shiv, resulting in tailored porosity. The overall porosity of shiv decreased with a refinement in diameter of the larger pores. Thermal analysis was performed using TGA, and the stability of coated and uncoated hemp shiv was evaluated. Hemp shiv modified with sol-gel coating can potentially be used to develop sustainable heat-insulating composites with better hygrothermal properties.
Affiliation(s)
- Atif Hussain
- BRE Centre for Innovative Construction Materials, Department of Architecture and Civil Engineering, University of Bath, Bath, BA2 7AY, UK
- Juliana Calabria-Holley
- BRE Centre for Innovative Construction Materials, Department of Architecture and Civil Engineering, University of Bath, Bath, BA2 7AY, UK
- Yunhong Jiang
- BRE Centre for Innovative Construction Materials, Department of Architecture and Civil Engineering, University of Bath, Bath, BA2 7AY, UK
- Mike Lawrence
- BRE Centre for Innovative Construction Materials, Department of Architecture and Civil Engineering, University of Bath, Bath, BA2 7AY, UK
13
Substrate-binding specificity of chitinase and chitosanase as revealed by active-site architecture analysis. Carbohydr Res 2015; 418:50-56. [PMID: 26545262 DOI: 10.1016/j.carres.2015.10.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 07/10/2015] [Revised: 10/03/2015] [Accepted: 10/06/2015] [Indexed: 11/21/2022]
Abstract
Chitinases and chitosanases, referred to as chitinolytic enzymes, are two important categories of glycoside hydrolases (GH) that play a key role in degrading chitin and chitosan, two naturally abundant polysaccharides. Here, we investigate the active site architecture of the major chitosanase (GH8, GH46) and chitinase families (GH18, GH19). Both charged (Glu, His, Arg, Asp) and aromatic amino acids (Tyr, Trp, Phe) are observed with higher frequency within chitinolytic active sites as compared to elsewhere in the enzyme structure, indicating significant roles related to enzyme function. Hydrogen bonds between chitinolytic enzymes and the substrate C2 functional groups, i.e. amino groups and N-acetyl groups, drive substrate recognition, while non-specific CH-π interactions between aromatic residues and substrate mainly contribute to tighter binding and enhanced processivity evident in GH8 and GH18 enzymes. For different families of chitinolytic enzymes, the number, type, and position of substrate atoms bound in the active site vary, resulting in different substrate-binding specificities. The data presented here explain the synergistic action of multiple enzyme families at a molecular level and provide a more reasonable method for functional annotation, which can be further applied toward the practical engineering of chitinases and chitosanases.
14
Orizio F, Damiati E, Giacopuzzi E, Benaglia G, Pianta S, Schauer R, Schwartz-Albiez R, Borsani G, Bresciani R, Monti E. Human sialic acid acetyl esterase: Towards a better understanding of a puzzling enzyme. Glycobiology 2015; 25:992-1006. [DOI: 10.1093/glycob/cwv034] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 10/10/2014] [Accepted: 05/17/2015] [Indexed: 01/09/2023]
15
Al-Shatnawi M, Ahmad MO, Swamy MNS. Prediction of Indel flanking regions in protein sequences using a variable-order Markov model. Bioinformatics 2015; 31:40-7. [PMID: 25178462 DOI: 10.1093/bioinformatics/btu556] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Insertion/deletion (indel) and amino acid substitution are two common events that lead to the evolution of and variations in protein sequences. Further, many human diseases and much of the functional divergence between homologous proteins are more related to indel mutations, even though they occur less often than substitution mutations do. A reliable identification of indels and their flanking regions is a major challenge in research related to protein evolution, structures and functions. RESULTS: In this article, we propose a novel scheme to predict indel flanking regions in a protein sequence for a given protein fold, based on a variable-order Markov model. The proposed indel flanking region (IndelFR) predictors are designed based on prediction by partial match (PPM) and probabilistic suffix tree (PST), which are referred to as the PPM IndelFR and PST IndelFR predictors, respectively. The overall performance evaluation results show that the proposed predictors are able to predict IndelFRs in the protein sequences with a high accuracy and F1 measure. In addition, the results show that if one is interested only in predicting IndelFRs in protein sequences, it would be preferable to use the proposed predictors instead of HMMER 3.0 in view of the substantially superior performance of the former.
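The idea behind a variable-order Markov model, as named in the abstract above, is that the amount of preceding sequence used to predict the next symbol is not fixed: the model uses the longest context it has seen in training and falls back to shorter ones otherwise. The sketch below illustrates this with plain longest-context backoff. It is a deliberate simplification and not the authors' PPM or PST IndelFR predictor (PPM proper blends context orders through escape probabilities); the function names and toy sequences are hypothetical.

```python
from collections import defaultdict


def train(sequences, max_order=2):
    """Count next-symbol occurrences for every context of length 0..max_order."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i, symbol in enumerate(seq):
            for k in range(max_order + 1):
                if i - k < 0:
                    break  # not enough preceding symbols for a context this long
                counts[seq[i - k:i]][symbol] += 1
    return counts


def next_symbol_prob(counts, history, symbol, max_order=2):
    """Estimate P(symbol | history) from the longest context seen in training."""
    for k in range(min(max_order, len(history)), -1, -1):
        context = history[len(history) - k:]
        if context in counts:
            total = sum(counts[context].values())
            return counts[context].get(symbol, 0) / total
    return 0.0  # reached only if the model was trained on no data


# Hypothetical toy training set over a three-letter alphabet.
model = train(["ABAB", "ABAC"])
print(next_symbol_prob(model, "ABA", "C"))  # uses the order-2 context "BA"
print(next_symbol_prob(model, "CC", "A"))   # falls back to the empty context
```

Because an unseen long context simply defers to a shorter one, the model can use long-range information where the training data supports it without requiring every long context to have been observed.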
Affiliation(s)
- Mufleh Al-Shatnawi
- Department of Electrical and Computer Engineering, Concordia University, QC H3G 2W1, Canada
- M Omair Ahmad
- Department of Electrical and Computer Engineering, Concordia University, QC H3G 2W1, Canada
- M N S Swamy
- Department of Electrical and Computer Engineering, Concordia University, QC H3G 2W1, Canada
|
16
|
Hsieh SY, Tsai IP, Hung HC, Chen YC, Chou HH, Lee CW. An Enhanced Algorithm for Reconstructing a Phylogenetic Tree Based on the Tree Rearrangement and Maximum Likelihood Method. INTELLIGENT COMPUTING THEORIES AND METHODOLOGIES 2015. [DOI: 10.1007/978-3-319-22186-1_53] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/10/2023]
|
17
|
Glasauer SMK, Neuhauss SCF. Whole-genome duplication in teleost fishes and its evolutionary consequences. Mol Genet Genomics 2014; 289:1045-60. [PMID: 25092473 DOI: 10.1007/s00438-014-0889-2] [Citation(s) in RCA: 511] [Impact Index Per Article: 51.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/23/2013] [Accepted: 07/15/2014] [Indexed: 12/18/2022]
Abstract
Whole-genome duplication (WGD) events have shaped the history of many evolutionary lineages. One such duplication has been implicated in the evolution of teleost fishes, by far the most species-rich vertebrate clade. After initial controversy, there is now solid evidence that such an event took place in the common ancestor of all extant teleosts. It is termed teleost-specific (TS) WGD. After WGD, duplicate genes have different fates. The most likely outcome is non-functionalization of one duplicate gene due to the lack of selective constraint on preserving both. Mechanisms that act on preservation of duplicates are subfunctionalization (partitioning of ancestral gene functions on the duplicates), neofunctionalization (assigning a novel function to one of the duplicates) and dosage selection (preserving genes to maintain dosage balance between interconnected components). Since the frequency of these mechanisms is influenced by the genes' properties, there are over-retained classes of genes, such as highly expressed ones and genes involved in neural function. The consequences of the TS-WGD, especially its impact on the massive radiation of teleosts, have been a matter of controversial debate. It is evident that gene duplications are crucial for generating complexity and that WGDs provide large amounts of raw material for evolutionary adaptation and innovation. However, it is less clear whether the TS-WGD is directly linked to the evolutionary success of teleosts and their radiation. Recent studies let us conclude that TS-WGD has been important in generating teleost complexity, but that more recent ecological adaptations only marginally related to TS-WGD might have even contributed more to diversification. It is likely, however, that TS-WGD provided teleosts with diversification potential that can become effective much later, such as during phases of environmental change.
Affiliation(s)
- Stella M K Glasauer
- Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, 8057, Zurich, Switzerland
|
18
|
Mutt E, Mathew OK, Sowdhamini R. LenVarDB: database of length-variant protein domains. Nucleic Acids Res 2013; 42:D246-50. [PMID: 24194591 PMCID: PMC3964994 DOI: 10.1093/nar/gkt1014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions–deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
Affiliation(s)
- Eshita Mutt
- International Institute of Information Technology-Hyderabad, Gachibowli, Hyderabad 500032, Andhra Pradesh, India, National Centre for Biological Sciences (TIFR), UAS-GKVK Campus, Bellary Road, Bangalore 560065, Karnataka, India and SASTRA University, Tirumalaisamudram, Thanjavur 613401, Tamil Nadu, India
|
19
|
Bigi A, Tringali C, Forcella M, Mozzi A, Venerando B, Monti E, Fusi P. A proline-rich loop mediates specific functions of human sialidase NEU4 in SK-N-BE neuronal differentiation. Glycobiology 2013; 23:1499-509. [PMID: 24030392 DOI: 10.1093/glycob/cwt078] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022] Open
Abstract
Human sialidase NEU4 long (N4L) is a membrane-associated enzyme that has been shown to be localized in the outer mitochondrial membrane. A role in different cellular processes has been suggested for this enzyme, such as apoptosis, neuronal differentiation and tumorigenesis. However, the molecular bases for these roles, not found in any of the other highly similar human sialidases, are not understood. We have found that a proline-rich sequence of 81 amino acids, unique to NEU4 sequence, contains potential Akt and Erk1 kinase motifs. Molecular modeling, based on the experimentally determined three-dimensional structure of cytosolic human NEU2, showed that the proline-rich sequence is accommodated in a loop, thus preserving the typical beta-barrel structure of sialidases. In order to investigate the role of this loop in neuronal differentiation, we obtained SK-N-BE neuroblastoma cells stably overexpressing either human wild-type N4L or a deletion mutant lacking the proline-rich loop. Our results demonstrate that the proline-rich region can also enhance cell proliferation and retinoic acid (RA)-induced neuronal differentiation and it is also involved in NEU4 interaction with Akt, as well as in substrate recognition, modifying directly or through the interaction with other protein(s) the enzyme specificity toward sialylated glycoprotein(s). On the whole, our results suggest that N4L could be a downstream component of the PI3K/Akt signaling pathway required for RA-induced differentiation of neuroblastoma SK-N-BE cells.
Affiliation(s)
- Alessandra Bigi
- Department of Biotechnologies and Biosciences, University of Milan-Bicocca, 20126 Milan, Italy
|
20
|
Stewart KL, Nelson MR, Eaton KV, Anderson WJ, Cordes MHJ. A role for indels in the evolution of Cro protein folds. Proteins 2013; 81:1988-96. [PMID: 23843258 DOI: 10.1002/prot.24358] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2013] [Revised: 05/30/2013] [Accepted: 06/10/2013] [Indexed: 11/06/2022]
Abstract
Insertions and deletions in protein sequences, or indels, can disrupt structure and may result in changes in protein folds during evolution or in association with alternative splicing. Pfl 6 and Xfaso 1 are two proteins in the Cro family that share a common ancestor but have different folds. Sequence alignments of the two proteins show two gaps, one at the N terminus, where the sequence of Xfaso 1 is two residues shorter, and one near the center of the sequence, where the sequence of Pfl 6 is five residues shorter. To test the potential importance of indels in Cro protein evolution, we generated hybrid variants of Pfl 6 and Xfaso 1 with indels in one or both regions, chosen according to several plausible sequence alignments. All but one deletion variant completely unfolded both proteins, showing that a longer N-terminal sequence was critical for Pfl 6 folding and a longer central region sequence was critical for Xfaso 1 folding. By contrast, Xfaso 1 tolerated a longer N-terminal sequence with little destabilization, and Pfl 6 tolerated central region insertions, albeit with substantial effects on thermal stability and some perturbation of the surrounding structure. None of the mutations appeared to convert one stable fold into the other. On the basis of this two-protein comparison, short insertion and deletion mutations probably played a role in evolutionary fold change in the Cro family, but were also not the only factors.
Affiliation(s)
- Katie L Stewart
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona, 85721-0088
|
21
|
Residue mutations and their impact on protein structure and function: detecting beneficial and pathogenic changes. Biochem J 2013; 449:581-94. [DOI: 10.1042/bj20121221] [Citation(s) in RCA: 131] [Impact Index Per Article: 11.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]
Abstract
The present review focuses on the evolution of proteins and the impact of amino acid mutations on function from a structural perspective. Proteins evolve under the law of natural selection and undergo alternating periods of conservative evolution and of relatively rapid change. The likelihood of mutations being fixed in the genome depends on various factors, such as the fitness of the phenotype or the position of the residues in the three-dimensional structure. For example, co-evolution of residues located close together in three-dimensional space can occur to preserve global stability. Whereas point mutations can fine-tune the protein function, residue insertions and deletions (‘decorations’ at the structural level) can sometimes modify functional sites and protein interactions more dramatically. We discuss recent developments and tools to identify such episodic mutations, and examine their applications in medical research. Such tools have been tested on simulated data and applied to real data such as viruses or animal sequences. Traditionally, there has been little if any cross-talk between the fields of protein biophysics, protein structure–function and molecular evolution. However, the last several years have seen some exciting developments in combining these approaches to obtain an in-depth understanding of how proteins evolve. For example, a better understanding of how structural constraints affect protein evolution will greatly help us to optimize our models of sequence evolution. The present review explores this new synthesis of perspectives.
|
22
|
Giacopuzzi E, Bresciani R, Schauer R, Monti E, Borsani G. New insights on the sialidase protein family revealed by a phylogenetic analysis in metazoa. PLoS One 2012; 7:e44193. [PMID: 22952925 PMCID: PMC3431349 DOI: 10.1371/journal.pone.0044193] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2012] [Accepted: 07/30/2012] [Indexed: 11/19/2022] Open
Abstract
Sialidases are glycohydrolytic enzymes present from viruses to mammals that remove sialic acid from oligosaccharide chains. Four different sialidase forms are known in vertebrates: the lysosomal NEU1, the cytosolic NEU2 and the membrane-associated NEU3 and NEU4. These enzymes modulate the cell sialic acid content and are involved in several cellular processes and pathological conditions. Molecular defects in NEU1 are responsible for sialidosis, an inherited disease characterized by lysosomal storage disorder and neurodegeneration. The studies on the biology of sialic acids and sialyltransferases, the anabolic counterparts of sialidases, have revealed a complex picture with more than 50 sialic acid variants selectively present in the different branches of the tree of life. The gain/loss of specific sialoconjugates has been proposed as a key event in the evolution of deuterostomes and Homo sapiens, as well as in host-pathogen interactions. To date, less attention has been paid to the evolution of sialidases. Thus we have conducted a survey on the state of the sialidase family in metazoans. Using an in silico approach, we identified and characterized sialidase orthologs from 21 different organisms distributed among the evolutionary tree: Metazoa relative (Monosiga brevicollis), early Deuterostomia, precursor of Chordata and Vertebrata (teleost fishes, amphibians, reptiles, avians and early and recent mammals). We were able to reconstruct the evolution of the sialidase protein family from the ancestral sialidase NEU1 and identify a new form of the enzyme, NEU5, representing an intermediate step in the evolution leading to the modern NEU3, NEU4 and NEU2. Our study provides new insights on the mechanisms that shaped the substrate specificity and other peculiar properties of the modern mammalian sialidases. Moreover, we further confirm findings on the catalytic residues and identify enzyme loop portions that behave as rapidly diverging regions and may be involved in the evolution of specific properties of sialidases.
Affiliation(s)
- Edoardo Giacopuzzi
- Department of Biomedical Sciences and Biotechnology, Unit of Biology and Genetics, University of Brescia, Brescia, Italy
- Roberto Bresciani
- Department of Biomedical Sciences and Biotechnology, Unit of Biochemistry and Clinical Chemistry, University of Brescia, Brescia, Italy
- Roland Schauer
- Institute of Biochemistry, Christian-Albrechts University, Kiel, Germany
- Eugenio Monti
- Department of Biomedical Sciences and Biotechnology, Unit of Biochemistry and Clinical Chemistry, University of Brescia, Brescia, Italy
- Giuseppe Borsani
- Department of Biomedical Sciences and Biotechnology, Unit of Biology and Genetics, University of Brescia, Brescia, Italy
|
23
|
Wang Z, Zarlenga D, Martin J, Abubucker S, Mitreva M. Exploring metazoan evolution through dynamic and holistic changes in protein families and domains. BMC Evol Biol 2012; 12:138. [PMID: 22862991 PMCID: PMC3483195 DOI: 10.1186/1471-2148-12-138] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2012] [Accepted: 07/19/2012] [Indexed: 11/18/2022] Open
Abstract
Background Proteins convey the majority of biochemical and cellular activities in organisms. Over the course of evolution, proteins undergo normal sequence mutations as well as large scale mutations involving domain duplication and/or domain shuffling. These events result in the generation of new proteins and protein families. Processes that affect proteome evolution drive species diversity and adaptation. Herein, change over the course of metazoan evolution, as defined by birth/death and duplication/deletion events within protein families and domains, was examined using the proteomes of 9 metazoan and two outgroup species. Results In studying members of the three major metazoan groups, the vertebrates, arthropods, and nematodes, we found that the number of protein families increased at the majority of lineages over the course of metazoan evolution where the magnitude of these increases was greatest at the lineages leading to mammals. In contrast, the number of protein domains decreased at most lineages and at all terminal lineages. This resulted in a weak correlation between protein family birth and domain birth; however, the correlation between domain birth and domain member duplication was quite strong. These data suggest that domain birth and protein family birth occur via different mechanisms, and that domain shuffling plays a role in the formation of protein families. The ratio of protein family birth to protein domain birth (domain shuffling index) suggests that shuffling had a more demonstrable effect on protein families in nematodes and arthropods than in vertebrates. Through the contrast of high and low domain shuffling indices at the lineages of Trichinella spiralis and Gallus gallus, we propose a link between protein redundancy and evolutionary changes controlled by domain shuffling; however, the speed of adaptation among the different lineages was relatively invariant. Evaluating the functions of protein families that appeared or disappeared at the last common ancestors (LCAs) of the three metazoan clades supports a correlation with organism adaptation. Furthermore, bursts of new protein families and domains in the LCAs of metazoans and vertebrates are consistent with whole genome duplications. Conclusion Metazoan speciation and adaptation were explored by birth/death and duplication/deletion events among protein families and domains. Our results provide insights into protein evolution and its bearing on metazoan evolution.
Affiliation(s)
- Zhengyuan Wang
- The Genome Institute, Washington University School of Medicine, St. Louis, MO 63108, USA
|
24
|
Joseph AP, Valadié H, Srinivasan N, de Brevern AG. Local structural differences in homologous proteins: specificities in different SCOP classes. PLoS One 2012; 7:e38805. [PMID: 22745680 PMCID: PMC3382195 DOI: 10.1371/journal.pone.0038805] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2011] [Accepted: 05/10/2012] [Indexed: 11/19/2022] Open
Abstract
The constant increase in the number of solved protein structures is of great help in understanding the basic principles behind protein folding and evolution. 3-D structural knowledge is valuable in designing and developing methods for comparison, modelling and prediction of protein structures. These approaches for structure analysis can be directly implicated in studying protein function and for drug design. The backbone of a protein structure favours certain local conformations which include α-helices, β-strands and turns. Libraries of a limited number of local conformations (Structural Alphabets) were developed in the past to obtain a useful categorization of backbone conformation. Protein Block (PB) is one such Structural Alphabet that gave a reasonable structure approximation of 0.42 Å. In this study, we use PB description of local structures to analyse conformations that are preferred sites for structural variations and insertions, among groups of related folds. This knowledge can be utilized in improving tools for structure comparison that work by analysing local structure similarities. Conformational differences between homologous proteins are known to occur often in the regions comprising turns and loops. Interestingly, these differences are found to have specific preferences depending upon the structural classes of proteins. Such class-specific preferences are mainly seen in the all-β class with changes involving short helical conformations and hairpin turns. A test carried out on a benchmark dataset also indicates that the use of knowledge on the class specific variations can improve the performance of a PB based structure comparison approach. The preference for the indel sites also seems to be confined to a few backbone conformations involving β-turns and helix C-caps. These are mainly associated with short loops joining the regular secondary structures that mediate a reversal in the chain direction. Rare β-turns of type I’ and II’ are also identified as preferred sites for insertions.
Affiliation(s)
- Agnel Praveen Joseph
- INSERM, UMR-S 665, Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, UMR 665, Paris, France
- Institut National de la Transfusion Sanguine (INTS), Paris, France
- Hélène Valadié
- INSERM UMR-S 726, DSIMB, Université Paris Diderot - Paris 7, Paris, France
- Alexandre G. de Brevern
- INSERM, UMR-S 665, Dynamique des Structures et Interactions des Macromolécules Biologiques (DSIMB), Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, UMR 665, Paris, France
- Institut National de la Transfusion Sanguine (INTS), Paris, France
|
25
|
Guo B, Zou M, Wagner A. Pervasive indels and their evolutionary dynamics after the fish-specific genome duplication. Mol Biol Evol 2012; 29:3005-22. [PMID: 22490820 DOI: 10.1093/molbev/mss108] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023] Open
Abstract
Insertions and deletions (indels) in protein-coding genes are important sources of genetic variation. Their role in creating new proteins may be especially important after gene duplication. However, little is known about how indels affect the divergence of duplicate genes. We here study thousands of duplicate genes in five fish (teleost) species with completely sequenced genomes. The ancestor of these species has been subject to a fish-specific genome duplication (FSGD) event that occurred approximately 350 Ma. We find that duplicate genes contain at least 25% more indels than single-copy genes. These indels accumulated preferentially in the first 40 my after the FSGD. A lack of widespread asymmetric indel accumulation indicates that both members of a duplicate gene pair typically experience relaxed selection. Strikingly, we observe a 30-80% excess of deletions over insertions that is consistent for indels of various lengths and across the five genomes. We also find that indels preferentially accumulate inside loop regions of protein secondary structure and in regions where amino acids are exposed to solvent. We show that duplicate genes with high indel density also show high DNA sequence divergence. Indel density, but not amino acid divergence, can explain a large proportion of the tertiary structure divergence between proteins encoded by duplicate genes. Our observations are consistent across all five fish species. Taken together, they suggest a general pattern of duplicate gene evolution in which indels are important driving forces of evolutionary change.
Affiliation(s)
- Baocheng Guo
- Institute of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, Switzerland
|
26
|
Zhang Z, Xing C, Wang L, Gong B, Liu H. IndelFR: a database of indels in protein structures and their flanking regions. Nucleic Acids Res 2011; 40:D512-8. [PMID: 22127860 PMCID: PMC3245007 DOI: 10.1093/nar/gkr1107] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
Insertion/deletion (indel) is one of the most common forms of protein sequence variation. Recent studies showed that indels could affect their flanking regions and that they are important for protein function and evolution. Here, we describe the Indel Flanking Region Database (IndelFR, http://indel.bioinfo.sdu.edu.cn), which provides sequence and structure information about indels and their flanking regions in known protein domains. The indels were obtained through the pairwise alignment of homologous structures in SCOP superfamilies. The IndelFR database contains 2,925,017 indels with flanking regions extracted from 373,402 structural alignment pairs of 12,573 non-redundant domains from 1053 superfamilies. IndelFR provides access to information about indels and their flanking regions, including amino acid sequences, lengths, locations, secondary structure constitutions, hydrophilicity/hydrophobicity, domain information, 3D structures and so on. IndelFR has already been used for molecular evolution studies and may help to promote future functional studies of indels and their flanking regions.
Affiliation(s)
- Zheng Zhang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, China
|
27
|
Giacopuzzi E, Barlati S, Preti A, Venerando B, Monti E, Borsani G, Bresciani R. Gallus gallus NEU3 sialidase as model to study protein evolution mechanism based on rapid evolving loops. BMC BIOCHEMISTRY 2011; 12:45. [PMID: 21861893 PMCID: PMC3179935 DOI: 10.1186/1471-2091-12-45] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/27/2011] [Accepted: 08/23/2011] [Indexed: 11/10/2022]
Abstract
BACKGROUND Large surface loops contained within compact protein structures and not involved in catalytic process have been proposed as preferred regions for protein family evolution. These loops are subjected to lower sequence constraints and can evolve rapidly into novel structural variants. A good model to study this hypothesis is represented by sialidase enzymes. Indeed, the structure of sialidases is a β-propeller composed of anti-parallel β-sheets connected by loops that suit well with the rapid evolving loop hypothesis. These features prompted us to extend our studies on this protein family in birds, to get insights on the evolution of this class of glycohydrolases. RESULTS Gallus gallus (Gg) genome contains one NEU3 gene encoding a protein with a unique 188 amino acid sequence mainly constituted by a peptide motif repeated six times in tandem with no homology with any other known protein sequence. The repeat region is located at the same position as the roughly 80 amino acid loop characteristic of mammalian NEU4. Based on molecular modeling, all these sequences represent a connecting loop between the first two highly conserved β-strands of the fifth blade of the sialidase β-propeller. Moreover this loop is highly variable in sequence and size in NEU3 sialidases from other vertebrates. Finally, we found that the general enzymatic properties and subcellular localization of Gg NEU3 are not influenced by the deletion of the repeat sequence. CONCLUSION In this study we demonstrated that sialidase protein structure contains a surface loop, highly variable both in sequence and size, connecting two conserved β-sheets and emerging on the opposite side of the catalytic crevice. These data confirm that sialidase family can serve as suitable model for the study of the evolutionary process based on rapid evolving loops, which may have occurred in sialidases. Given the peculiar organization of the loop region identified in Gg NEU3, this protein can be considered of particular interest in such evolutionary studies and to get deeper insights in sialidase evolution.
Affiliation(s)
- Edoardo Giacopuzzi
- Department of Biomedical Sciences and Biotechnology, Unit of Biology and Genetics, University of Brescia, viale Europa 11, Brescia 25123, Italy
- Sergio Barlati
- Department of Biomedical Sciences and Biotechnology, Unit of Biology and Genetics, University of Brescia, viale Europa 11, Brescia 25123, Italy
- Augusto Preti
- Department of Biomedical Sciences and Biotechnology, Unit of Biochemistry and Clinical Chemistry, University of Brescia, viale Europa 11, Brescia 25123, Italy
- Bruno Venerando
- Department of Medical Chemistry, Biochemistry and Biotechnology, L.I.T.A., University of Milano, Via F.lli Cervi 93, Segrate 20090, Italy
- Eugenio Monti
- Department of Biomedical Sciences and Biotechnology, Unit of Biochemistry and Clinical Chemistry, University of Brescia, viale Europa 11, Brescia 25123, Italy
- Giuseppe Borsani
- Department of Biomedical Sciences and Biotechnology, Unit of Biology and Genetics, University of Brescia, viale Europa 11, Brescia 25123, Italy
- Roberto Bresciani
- Department of Biomedical Sciences and Biotechnology, Unit of Biochemistry and Clinical Chemistry, University of Brescia, viale Europa 11, Brescia 25123, Italy
|
28
|
Dessailly BH, Redfern OC, Cuff AL, Orengo CA. Detailed analysis of function divergence in a large and diverse domain superfamily: toward a refined protocol of function classification. Structure 2011; 18:1522-35. [PMID: 21070951 DOI: 10.1016/j.str.2010.08.017] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/19/2010] [Revised: 08/06/2010] [Accepted: 08/13/2010] [Indexed: 10/18/2022]
Abstract
Some superfamilies contain large numbers of protein domains with very different functions. The ability to refine the functional classification of domains within these superfamilies is necessary for better understanding the evolution of functions and to guide function prediction of new relatives. To achieve this, a suitable starting point is the detailed analysis of functional divisions and mechanisms of functional divergence in a single superfamily. Here, we present such a detailed analysis in the superfamily of HUP domains. A biologically meaningful functional classification of HUP domains is obtained manually. Mechanisms of function diversification are investigated in detail using this classification. We observe that structural motifs play an important role in shaping broad functional divergence, whereas residue-level changes shape diversity at a more specific level. In parallel we examine the ability of an automated protocol to capture the biologically meaningful classification, with a view to automatically extending this classification in the future.
Affiliation(s)
- Benoit H Dessailly
- Department of Structural and Molecular Biology, University College of London, Gower Street, London WC1E6BT, UK.
|
29
|
Zhang Z, Wang Y, Wang L, Gao P. The combined effects of amino acid substitutions and indels on the evolution of structure within protein families. PLoS One 2010; 5:e14316. [PMID: 21179197 PMCID: PMC3001449 DOI: 10.1371/journal.pone.0014316] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/22/2010] [Accepted: 11/16/2010] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND In the process of protein evolution, sequence variations within protein families can cause changes in protein structures and functions. However, structures tend to be more conserved than sequences and functions. This leads to an intriguing question: what is the evolutionary mechanism by which sequence variations produce structural changes? To investigate this question, we focused on the most common types of sequence variations: amino acid substitutions and insertions/deletions (indels). Here their combined effects on protein structure evolution within protein families are studied. RESULTS Sequence-structure correlation analysis on 75 homologous structure families (from SCOP) that contain 20 or more non-redundant structures shows that in most of these families there is, statistically, a bilinear correlation between the amount of substitutions and indels and the degree of structure variation. Bilinear regression of percent sequence non-identity (PNI) and standardized number of gaps (SNG) versus RMSD was performed. The coefficients from the regression analysis could be used to estimate the structure changes caused by each unit of substitution (structural substitution sensitivity, SSS) and by each unit of indel (structural indel sensitivity, SIDS). An analysis of 52 families with high bilinear fitting multiple correlation coefficients and statistically significant regression coefficients showed that SSS is mainly constrained by disulfide bonds, which have almost no effect on SIDS. CONCLUSIONS Structural changes in homologous protein families could be rationally explained by a bilinear model combining amino acid substitutions and indels. These results may further improve our understanding of the evolutionary mechanisms of protein structures.
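The bilinear regression described in this abstract can be illustrated with a minimal sketch. It assumes a model of the form RMSD ≈ SSS · PNI + SIDS · SNG + intercept; the intercept term, the function names `fit_bilinear` and `solve_linear`, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
def solve_linear(a, b):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_bilinear(pni, sng, rmsd):
    """Least-squares fit of rmsd ~ sss * pni + sids * sng + intercept (assumed model form)."""
    rows = [(p, g, 1.0) for p, g in zip(pni, sng)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, rmsd)) for i in range(3)]
    return solve_linear(xtx, xty)


# Synthetic (hypothetical) data generated from known coefficients, so the fit can be checked.
pni = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]   # percent sequence non-identity
sng = [0.0, 2.0, 1.0, 4.0, 3.0, 6.0]         # standardized number of gaps
rmsd = [0.02 * p + 0.1 * g + 0.3 for p, g in zip(pni, sng)]

sss, sids, intercept = fit_bilinear(pni, sng, rmsd)
```

In this sketch the two fitted slopes play the role of the per-unit sensitivities (SSS and SIDS) that the abstract mentions; real family data would also need the significance tests the authors describe.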
Affiliation(s)
- Zheng Zhang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong, China
- Yuxiao Wang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong, China
- Division of Basic Science, UT Southwestern, Dallas, Texas, United States of America
- Lushan Wang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong, China
- * E-mail: (LW); (PG)
- Peiji Gao
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, Shandong, China
- * E-mail: (LW); (PG)
30
|
Mechanisms of protein oligomerization, the critical role of insertions and deletions in maintaining different oligomeric states. Proc Natl Acad Sci U S A 2010; 107:20352-7. [PMID: 21048085 DOI: 10.1073/pnas.1012999107] [Citation(s) in RCA: 140] [Impact Index Per Article: 10.0] [Indexed: 11/18/2022] Open
Abstract
The main principles of protein-protein recognition are elucidated by the studies of homooligomers which in turn mediate and regulate gene expression, activity of enzymes, ion channels, receptors, and cell-cell adhesion processes. Here we explore oligomeric states of homologous proteins in various organisms to better understand the functional roles and evolutionary mechanisms of homooligomerization. We observe a great diversity in mechanisms controlling oligomerization and focus in our study on insertions and deletions in homologous proteins and how they enable or disable complex formation. We show that insertions and deletions which differentiate monomers and dimers have a significant tendency to be located on the interaction interfaces and about a quarter of all proteins studied and forty percent of enzymes have regions which mediate or disrupt the formation of oligomers. We suggest that relatively small insertions or deletions may have a profound effect on complex stability and/or specificity. Indeed removal of complex enabling regions from protein structures in many cases resulted in the complete or partial loss of stability. Moreover, we find that insertions and deletions modulating oligomerization have a lower aggregation propensity and contain a larger fraction of polar, charged residues, glycine and proline compared to conventional interfaces and protein surface. Most likely, these regions may mediate specific interactions, prevent nonspecific dysfunctional aggregation and preclude undesired interactions between close paralogs therefore separating their functional pathways. Last, we show how the presence or absence of insertions and deletions on interfaces might be of practical value in annotating protein oligomeric states.
31
|
Zhang Z, Huang J, Wang Z, Wang L, Gao P. Impact of indels on the flanking regions in structural domains. Mol Biol Evol 2010; 28:291-301. [PMID: 20671041 DOI: 10.1093/molbev/msq196] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Indexed: 11/14/2022] Open
Abstract
Amino acid substitution and insertions/deletions (indels) are two common events in protein evolution; however, current knowledge on indels is limited. In this study, we investigated the effects of indels on the flanking regions in protein structure superfamilies. Comprehensive analysis of structural classification of proteins superfamilies revealed that indels lead to a series of changes in the flanking regions, including the following: 1) structural shift in the tertiary structure, with a first-order exponential decay relation between structural shift and the distance to indels, 2) instability of the secondary structure elements in which parts of the α helix and β sheet are destroyed, and 3) an increase in the amino acid substitution rate of the primary structure and the nonsimilar amino acid substitution rate. In general, these quality changes are due to the combined effects of the "regional-inherent effect," "indel-accompanied effect," and "indel-following effect." Furthermore, these quality changes reflect changes in selective pressure. Indels are more likely to be preserved in regions with low selective pressure, and indels can further reduce the selective pressure on the flanking regions. These findings improve our understanding of the role of indels in protein evolution.
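The first-order exponential decay relation mentioned in this abstract can be sketched as a simple curve fit. The example below assumes the form shift(d) ≈ A · exp(−d / τ) with no constant offset, and fits it by linear regression on the logarithm of the shift; the function name `fit_exponential_decay`, the parameter names, and the synthetic data are hypothetical and do not come from the paper.

```python
import math


def fit_exponential_decay(distances, shifts):
    """Fit shift ~ amplitude * exp(-distance / tau) via least squares on log(shift).

    Assumes all shifts are strictly positive and that there is no constant offset.
    """
    logs = [math.log(s) for s in shifts]
    n = len(distances)
    mean_d = sum(distances) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    slope = sxy / sxx                      # equals -1 / tau
    amplitude = math.exp(mean_l - slope * mean_d)
    tau = -1.0 / slope
    return amplitude, tau


# Synthetic (hypothetical) data: structural shift versus distance to the indel, in residues.
distances = [1, 2, 3, 4, 5, 6, 8, 10]
shifts = [2.5 * math.exp(-d / 3.0) for d in distances]

amplitude, tau = fit_exponential_decay(distances, shifts)
```

A log-linear fit is used here only because it needs no external libraries; with noisy measurements or a non-zero baseline, a nonlinear least-squares fit would likely be more appropriate.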
Affiliation(s)
- Zheng Zhang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
32
|
Tyagi M, Bornot A, Offmann B, de Brevern AG. Analysis of loop boundaries using different local structure assignment methods. Protein Sci 2009; 18:1869-81. [PMID: 19606500 DOI: 10.1002/pro.198] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Indexed: 11/09/2022]
Abstract
Loops connect regular secondary structures. In many instances, they are known to play important biological roles. Analysis and prediction of loop conformations depend directly on the definition of repetitive structures. Nonetheless, the secondary structure assignment methods (SSAMs) often lead to divergent assignments. In this study, we analyzed, from both the structure and sequence points of view, how the divergence between different SSAMs affects boundary definitions of loops connecting regular secondary structures. The analysis of SSAMs underlines that no clear consensus between the different SSAMs can be easily found. Because the latter greatly influence the loop boundary definitions, important variations are indeed observed, that is, capping positions are shifted between different SSAMs. On the other hand, our results show that the sequence information in these capping regions is more stable than expected, and classical and equivalent sequence patterns were found for most of the SSAMs. This is, to our knowledge, the most exhaustive survey in this field as (i) various databanks have been used, leading to similar results without implication of protein redundancy, and (ii) this is the first time various SSAMs have been used. This work hence gives new insights into the difficult question of assignment of repetitive structures and addresses the issue of loop boundary definition. Although SSAMs give very different local structure assignments, capping sequence patterns remain efficiently stable.
Affiliation(s)
- Manoj Tyagi
- Laboratoire de Biochimie et Génétique Moléculaire, Université de La Réunion, BP 7151, 15 avenue René Cassin, 97715 Saint Denis Messag Cedex 09, La Réunion, France
33
|
Shortridge MD, Powers R. Structural and functional similarity between the bacterial type III secretion system needle protein PrgI and the eukaryotic apoptosis Bcl-2 proteins. PLoS One 2009; 4:e7442. [PMID: 19823588 PMCID: PMC2757720 DOI: 10.1371/journal.pone.0007442] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 05/22/2009] [Accepted: 09/15/2009] [Indexed: 11/25/2022] Open
Abstract
Background Functional similarity is challenging to identify when global sequence and structure similarity is low. Active-sites or functionally relevant regions are evolutionarily more stable relative to the remainder of a protein structure and provide an alternative means to identify potential functional similarity between proteins. We recently developed the FAST-NMR methodology to discover biochemical functions or functional hypotheses of proteins of unknown function by experimentally identifying ligand binding sites. FAST-NMR utilizes our CPASS software and database to assign a function based on a similarity in the structure and sequence of ligand binding sites between proteins of known and unknown function. Methodology/Principal Findings The PrgI protein from Salmonella typhimurium forms the needle complex in the type III secretion system (T3SS). A FAST-NMR screen identified a similarity between the ligand binding sites of PrgI and the Bcl-2 apoptosis protein Bcl-xL. These ligand binding sites correlate with known protein-protein binding interfaces required for oligomerization. Both proteins form membrane pores through this oligomerization to release effector proteins to stimulate cell death. Structural analysis indicates an overlap between the PrgI structure and the pore forming motif of Bcl-xL. A sequence alignment indicates conservation between the PrgI and Bcl-xL ligand binding sites and pore formation regions. This active-site similarity was then used to verify that chelerythrine, a known Bcl-xL inhibitor, also binds PrgI. Conclusions/Significance A structural and functional relationship between the bacterial T3SS and eukaryotic apoptosis was identified using our FAST-NMR ligand affinity screen in combination with a bioinformatic analysis based on our CPASS program. A similarity between PrgI and Bcl-xL is not readily apparent using traditional global sequence and structure analysis, but was only identified because of conservation in ligand binding sites. 
These results demonstrate the unique opportunity that ligand-binding sites provide for the identification of functional relationships when global sequence and structural information is limited.
Affiliation(s)
- Matthew D. Shortridge
- Department of Chemistry, University of Nebraska-Lincoln, Lincoln, Nebraska, United States of America
- Robert Powers
- Department of Chemistry, University of Nebraska-Lincoln, Lincoln, Nebraska, United States of America
34
|
Sandhya S, Rani SS, Pankaj B, Govind MK, Offmann B, Srinivasan N, Sowdhamini R. Length variations amongst protein domain superfamilies and consequences on structure and function. PLoS One 2009; 4:e4981. [PMID: 19333395 PMCID: PMC2659687 DOI: 10.1371/journal.pone.0004981] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 12/14/2008] [Accepted: 02/26/2009] [Indexed: 11/24/2022] Open
Abstract
Background Related protein domains of a superfamily can be specified by proteins of diverse lengths. The structural and functional implications of indels in a domain scaffold have been examined. Methodology In this study, domain superfamilies with large length variations (more than 30% difference from average domain size, referred to as ‘length-deviant’ superfamilies) and ‘length-rigid’ domain superfamilies (<10% length difference from average domain size) were analyzed for the functional impact of such structural differences. Our delineated dataset, derived from an objective algorithm, enables us to address indel roles in the presence of peculiar structural repeats, functional variation, protein-protein interactions and to examine ‘domain contexts’ of proteins tolerant to large length variations. Amongst the top-10 length-deviant superfamilies analyzed, we found that 80% of length-deviant superfamilies possess distant internal structural repeats and nearly half of them acquired diverse biological functions. In general, length-deviant superfamilies have a higher chance than length-rigid superfamilies of being engaged in internal structural repeats. We also found that ∼40% of length-deviant domains exist as multi-domain proteins involving interactions with domains from the same or other superfamilies. Indels, in diverse domain superfamilies, were found to participate in the accretion of structural and functional features amongst related domains. With specific examples, we discuss how indels are involved directly or indirectly in the generation of oligomerization interfaces, introduction of substrate specificity, regulation of protein function and stability. Conclusions Our data suggest a multitude of roles for indels that are specialized for domain members of different domain superfamilies. These specialist roles that we observe and trends in the extent of length variation could influence decision making in modeling of new superfamily members. 
Likewise, the observed limits of length variation, specific for each domain superfamily would be particularly relevant in the choice of alignment length search filters commonly applied in protein sequence analysis.
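The length-based grouping described in this abstract can be illustrated with a short sketch. It assumes that a superfamily is judged by the largest relative deviation of any member domain length from the superfamily's average length, with cutoffs of 30% and 10%; the abstract does not spell out the exact criterion, so this rule, the function name `classify_superfamily`, the "intermediate" label, and the example lengths are all assumptions.

```python
def classify_superfamily(domain_lengths, deviant_cutoff=0.30, rigid_cutoff=0.10):
    """Label a superfamily from its member domain lengths (in residues).

    Assumed rule: use the maximum relative deviation from the mean length.
    """
    mean_len = sum(domain_lengths) / len(domain_lengths)
    max_dev = max(abs(length - mean_len) / mean_len for length in domain_lengths)
    if max_dev > deviant_cutoff:
        return "length-deviant"
    if max_dev < rigid_cutoff:
        return "length-rigid"
    return "intermediate"  # hypothetical label for superfamilies between the two cutoffs


# Hypothetical domain lengths for three superfamilies.
rigid = classify_superfamily([100, 105, 98, 102])
deviant = classify_superfamily([80, 100, 200, 120])
middle = classify_superfamily([100, 115, 90, 95])
```

A per-superfamily summary like this is also the kind of quantity that could inform the alignment-length filters mentioned at the end of the abstract, although the authors' own procedure may differ.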
Affiliation(s)
- Sankaran Sandhya
- National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore, India
- Saane Sudha Rani
- National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore, India
- Barah Pankaj
- National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore, India
- Bernard Offmann
- Laboratoire de Biochimie et Génétique Moléculaire BP 7151, Université de La Réunion, La Réunion, France
- Ramanathan Sowdhamini
- National Centre for Biological Sciences (TIFR), GKVK Campus, Bangalore, India
35
|
Pascual-García A, Abia D, Ortiz ÁR, Bastolla U. Cross-over between discrete and continuous protein structure space: insights into automatic classification and networks of protein structures. PLoS Comput Biol 2009; 5:e1000331. [PMID: 19325884 PMCID: PMC2654728 DOI: 10.1371/journal.pcbi.1000331] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Received: 08/20/2008] [Accepted: 02/11/2009] [Indexed: 11/19/2022] Open
Abstract
Structural classifications of proteins assume the existence of the fold, which is an intrinsic equivalence class of protein domains. Here, we test in which conditions such an equivalence class is compatible with objective similarity measures. We base our analysis on the transitive property of the equivalence relationship, requiring that similarity of A with B and B with C implies that A and C are also similar. Divergent gene evolution leads us to expect that the transitive property should approximately hold. However, if protein domains are a combination of recurrent short polypeptide fragments, as proposed by several authors, then similarity of partial fragments may violate the transitive property, favouring the continuous view of the protein structure space. We propose a measure to quantify the violations of the transitive property when a clustering algorithm joins elements into clusters, and we find out that such violations present a well defined and detectable cross-over point, from an approximately transitive regime at high structure similarity to a regime with large transitivity violations and large differences in length at low similarity. We argue that protein structure space is discrete and hierarchic classification is justified up to this cross-over point, whereas at lower similarities the structure space is continuous and it should be represented as a network. We have tested the qualitative behaviour of this measure, varying all the choices involved in the automatic classification procedure, i.e., domain decomposition, alignment algorithm, similarity score, and clustering algorithm, and we have found out that this behaviour is quite robust. The final classification depends on the chosen algorithms. We used the values of the clustering coefficient and the transitivity violations to select the optimal choices among those that we tested. Interestingly, this criterion also favours the agreement between automatic and expert classifications. 
As a domain set, we have selected a consensus set of 2,890 domains decomposed very similarly in SCOP and CATH. As an alignment algorithm, we used a global version of MAMMOTH developed in our group, which is both rapid and accurate. As a similarity measure, we used the size-normalized contact overlap, and as a clustering algorithm, we used average linkage. The resulting automatic classification at the cross-over point was more consistent than expert ones with respect to the structure similarity measure, with 86% of the clusters corresponding to subsets of either SCOP or CATH superfamilies and fewer than 5% containing domains in distinct folds according to both SCOP and CATH. Almost 15% of SCOP superfamilies and 10% of CATH superfamilies were split, consistent with the notion of fold change in protein evolution. These results were qualitatively robust for all choices that we tested, although we did not try to use alignment algorithms developed by other groups. Folds defined in SCOP and CATH would be completely joined in the regime of large transitivity violations where clustering is more arbitrary. Consistently, the agreement between SCOP and CATH at fold level was lower than their agreement with the automatic classification obtained using as a clustering algorithm, respectively, average linkage (for SCOP) or single linkage (for CATH). The networks representing significant evolutionary and structural relationships between clusters beyond the cross-over point may allow us to perform evolutionary, structural, or functional analyses beyond the limits of classification schemes. These networks and the underlying clusters are available at http://ub.cbm.uam.es/research/ProtNet.php

Making order of the fast-growing information on proteins is essential for gaining evolutionary and functional knowledge. 
The most successful approaches to this task are based on classifications of protein structures, such as SCOP and CATH, which assume a discrete view of the protein structure space as a collection of separated equivalence classes (folds). However, several authors proposed that protein domains should be regarded as assemblies of polypeptide fragments, which implies that the protein–structure space is continuous. Here, we assess these views of domain space through the concept of transitivity; i.e., we test whether structure similarity of A with B and B with C implies that A and C are similar, as required for consistent classification. We find that the domain space is approximately transitive and discrete at high similarity and continuous at low similarity, where transitivity is severely violated. Comparing our classification at the cross-over similarity with CATH and SCOP, we find that they join proteins at low similarity where classification is inconsistent. Part of this discrepancy is due to structural divergence of homologous domains, which are forced to be in a single cluster in CATH and SCOP. Structural and evolutionary relationships between consistent clusters are represented as a network in our approach, going beyond current protein classification schemes. We conjecture that our results are related to a change of evolutionary regime, from uniparental divergent evolution for highly related domains to assembly of large fragments for which the classical tree representation is unsuitable.
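The transitivity test described in this abstract (similarity of A with B and B with C should imply similarity of A with C) can be illustrated with a simplified sketch. The code below counts, for a given similarity threshold, the fraction of chained triples whose end points are not similar. This triple-counting score is a simplification chosen for illustration and is not the cluster-based violation measure the authors propose; the function name, the threshold values, and the similarity matrix are hypothetical.

```python
from itertools import combinations


def transitivity_violation_fraction(sim, threshold):
    """Fraction of chains A~B, B~C (similarity >= threshold) where A and C are not similar."""
    n = len(sim)

    def similar(i, j):
        return sim[i][j] >= threshold

    chained = 0
    violated = 0
    for i, j, k in combinations(range(n), 3):
        # Try each of the three elements as the middle element B of the chain.
        for a, b, c in ((i, j, k), (j, i, k), (i, k, j)):
            if similar(a, b) and similar(b, c):
                chained += 1
                if not similar(a, c):
                    violated += 1
    return violated / chained if chained else 0.0


# Hypothetical symmetric similarity matrix for four domains A, B, C, D.
sim = [
    [1.0, 0.9, 0.3, 0.1],
    [0.9, 1.0, 0.8, 0.1],
    [0.3, 0.8, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]

strict = transitivity_violation_fraction(sim, 0.5)   # A~B and B~C hold, but A~C does not
loose = transitivity_violation_fraction(sim, 0.2)    # A, B and C are all mutually similar
```

In this toy matrix the violation appears only at the stricter threshold, which is an artifact of the chosen numbers. Locating a cross-over of the kind the abstract reports would require scanning thresholds on real structure-similarity data.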
Affiliation(s)
- David Abia
- Centro de Biología Molecular ‘Severo Ochoa’ (CSIC-UAM), Cantoblanco, Madrid, Spain
- Ángel R. Ortiz
- Centro de Biología Molecular ‘Severo Ochoa’ (CSIC-UAM), Cantoblanco, Madrid, Spain
- Ugo Bastolla
- Centro de Biología Molecular ‘Severo Ochoa’ (CSIC-UAM), Cantoblanco, Madrid, Spain
36
|
Wang Z, Martin J, Abubucker S, Yin Y, Gasser RB, Mitreva M. Systematic analysis of insertions and deletions specific to nematode proteins and their proposed functional and evolutionary relevance. BMC Evol Biol 2009; 9:23. [PMID: 19175938 PMCID: PMC2644674 DOI: 10.1186/1471-2148-9-23] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 09/24/2008] [Accepted: 01/28/2009] [Indexed: 11/25/2022] Open
Abstract
Background Amino acid insertions and deletions in proteins are considered relatively rare events, and their associations with the evolution and adaptation of organisms are not yet understood. In this study, we undertook a systematic analysis of over 214,000 polypeptides from 32 nematode species and identified insertions and deletions unique to nematode proteins in more than 1000 families and provided indirect evidence that these alterations are linked to the evolution and adaptation of nematodes. Results Amino acid alterations in sequences of nematodes were identified by comparison with homologous sequences from a wide range of eukaryotic (metazoan) organisms. This comparison revealed that the proteins inferred from transcriptomic datasets for nematodes contained more deletions than insertions, and that the deletions tended to be longer than the insertions, indicating a decreased size of the transcriptome of nematodes compared with other organisms. The present findings showed that this reduction is more pronounced in parasitic nematodes compared with the free-living nematodes of the genus Caenorhabditis. Consistent with a requirement for conservation in proteins involved in the processing of genetic information, fewer insertions and deletions were detected in such proteins. On the other hand, more insertions and deletions were recorded for proteins inferred to be involved in the endocrine and immune systems, suggesting a link with adaptation. Similarly, proteins involved in multiple cellular pathways tended to display more deletions and insertions than those involved in a single pathway. The number of insertions and deletions shared by a range of plant parasitic nematodes was higher for proteins involved in lipid metabolism and electron transport compared with other nematodes, suggesting an association between metabolic adaptation and parasitism in plant hosts. 
We also identified three sizable deletions from proteins found to be specific to and shared by parasitic nematodes, which, given their uniqueness, might serve as target candidates for drug design. Conclusion This study illustrates the significance of using comparative genomics approaches to identify molecular elements unique to parasitic nematodes, which have adapted to a particular host organism and mode of existence during evolution. While the focus of this study was on nematodes, the approach has applicability to a wide range of other groups of organisms.
Affiliation(s)
- Zhengyuan Wang
- The Genome Center, Department of Genetics, Washington University School of Medicine, St Louis, MO 63110, USA.
37
|
Yang H, Wu Y, Feng J, Yang S, Tian D. Evolutionary pattern of protein architecture in mammal and fruit fly genomes. Genomics 2008; 93:90-7. [PMID: 18929639 DOI: 10.1016/j.ygeno.2008.09.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 05/09/2008] [Revised: 09/12/2008] [Accepted: 09/13/2008] [Indexed: 11/17/2022]
Abstract
Mutations, which can alter amino acid constitution, contribute greatly to protein evolution. However, little has been reported on their pattern during protein structural evolution. We investigated the distribution of non-synonymous single nucleotide polymorphisms (nsSNPs) and insertions/deletions (indels) along mammal and fruit fly proteins. We found that nsSNPs (and d(N)) and indels increased in protein boundary regions, and that this pattern is inversely correlated with the distribution of protein domain density. Additionally, synonymous substitutions (and d(S)) are reduced in the 5' and 3' regions, indicating more variable protein boundaries compared with the central interior. All evidence suggests that the inner part of coding sequences (CDSs) is comparatively conserved, whereas the 5' and 3' regions, with higher evolution rates, are more variable. We assumed that, due to greater frequencies of nsSNPs and indels in adaptive regions of CDSs, it could be easier to ultimately alter, gain, or lose amino acids there, with these regions thus becoming the front line of protein evolution.
Affiliation(s)
- Haiwang Yang
- State Key Laboratory of Pharmaceutical Biotechnology, Department of Biology, Nanjing University, Nanjing 210093, China
38
|
Redfern OC, Dessailly B, Orengo CA. Exploring the structure and function paradigm. Curr Opin Struct Biol 2008; 18:394-402. [PMID: 18554899 DOI: 10.1016/j.sbi.2008.05.007] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.3] [Received: 03/18/2008] [Revised: 04/16/2008] [Accepted: 05/07/2008] [Indexed: 11/29/2022]
Abstract
Advances in protein structure determination, led by the structural genomics initiatives, have increased the proportion of novel folds deposited in the Protein Data Bank. However, these structures are often not accompanied by functional annotations with experimental confirmation. In this review, we reassess the meaning of structural novelty and examine its relevance to the complexity of the structure-function paradigm. Recent advances in the prediction of protein function from structure are discussed, as well as new sequence-based methods for partitioning large, diverse superfamilies into biologically meaningful clusters. Obtaining structural data for these functionally coherent groups of proteins will allow us to better understand the relationship between structure and function.
Affiliation(s)
- Oliver C Redfern
- Department of Structural and Molecular Biology, University College London, London WC1E 6BT, United Kingdom