1
Coban A, Bornberg-Bauer E, Kemena C. Tracing the paths of modular evolution by quantifying rearrangement events of protein domains. BMC Ecol Evol 2025; 25:6. [PMID: 39773110 PMCID: PMC11707847 DOI: 10.1186/s12862-024-02347-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2024] [Accepted: 12/27/2024] [Indexed: 01/11/2025] Open
Abstract
BACKGROUND Protein evolution is central to molecular adaptation and largely characterized by modular rearrangements of domains, the evolutionary and structural building blocks of proteins. Genetic events underlying protein rearrangements are relatively rare compared to changes of amino acids. Therefore, these events can be used to characterize and reconstruct major events of molecular adaptation by comparing large data sets of proteomes. RESULTS Here we determine, at unprecedented completeness, the rates of fusion, fission, emergence and loss of domains in five eukaryotic clades (monocots, eudicots, fungi, insects, vertebrates). By characterizing rearrangements that were previously considered "ambiguous" or "complex" we raise the fraction of resolved rearrangement events from previously ca. 60% to around 92%. We exemplify our method by analyzing the evolutionary histories of protein rearrangements in (i) the extracellular matrix, (ii) innate immunity across Eukaryota, Metazoa, and Vertebrata, and (iii) Toll-Like-Receptors in the innate immune system of Eukaryota. In all three cases we can find hot-spots of rearrangement events in their phylogeny which (i) can be related to major events of adaptation and (ii) follow the emergence of new domains which become integrated into existing arrangements. CONCLUSION Our results demonstrate that, akin to the change at the level of amino acids, domain rearrangements follow a clock-like dynamic which can be well quantified and supports the concept of evolutionary tinkering. While many novel domain emergence events are ancient, emerged domains are quickly incorporated into a great number of proteins. In parallel, the observed rates of emergence of new domains are becoming smaller over time.
Affiliation(s)
- Abdulbaki Coban
  - Institute for Evolution and Biodiversity, University of Münster, Münster, 48159, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, Münster, 48159, Germany
  - Department of Protein Evolution, Max Planck Institute for Biology Tübingen, Tübingen, 72076, Germany
- Carsten Kemena
  - Institute for Evolution and Biodiversity, University of Münster, Münster, 48159, Germany
2
Dallinger R, Pedrini‐Martha V, Burdisso ML, Capdevila M, Palacios O, Albalat R. Experimental recombining of repetitive motifs leads to large functional metallothioneins and demonstrates their modular evolvability potential. Protein Sci 2025; 34:e5247. [PMID: 39673460 PMCID: PMC11645667 DOI: 10.1002/pro.5247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2024] [Revised: 11/04/2024] [Accepted: 11/23/2024] [Indexed: 12/16/2024]
Abstract
Protein modularity is acknowledged for promoting the emergence of new protein variants via domain rearrangements. Metallothioneins (MTs) offer an excellent model system for experimentally examining the consequences of domain rearrangements, because the functional properties of native and artificially created variants can be assessed using spectroscopic methods and metal tolerance assays. In this study, we have investigated the functional properties of AbiMT4 from the snail Alinda biplicata (Gastropoda, Mollusca), a large MT comprising 10 putative β domains (β3₉β1), alongside four artificially designed variants differing in domain number, type, or order. Our findings reveal that AbiMT4 is a cadmium-selective protein with a high metal-binding capacity, characterized by structurally and functionally independent domains repeated in tandem along the protein. Our results indicate that due to its modular organization, AbiMT4 remains functional even when the number, type, and order of the domains are significantly altered. Furthermore, we demonstrate that the metal-binding properties of AbiMT4 are not dictated by the overall architecture of the protein but primarily arise from the properties of each individual domain. Using MTs as an example, this work provides empirical evidence that domain rearrangements are an effective strategy for exploring new viable sequences and creating novel protein variants subject to adaptive selection. Thus, our study highlights the importance of the modular structure of proteins, as increasing their functional flexibility enhances their evolvability. Additionally, our work demonstrates a simple way to design and model new proteins for predefined functions.
Affiliation(s)
- Reinhard Dallinger
  - Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, Innsbruck, Austria
- Veronika Pedrini‐Martha
  - Institute of Zoology and Center of Molecular Biosciences Innsbruck, University of Innsbruck, Innsbruck, Austria
- Maria Lucia Burdisso
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona (UB), Barcelona, Spain
  - Centro de Estudios Fotosintéticos y Bioquímicos (CEFOBI‐CONICET), Universidad Nacional de Rosario, Rosario, Argentina
- Mercè Capdevila
  - Departament de Química, Facultat de Ciències, Universitat Autònoma de Barcelona (UAB), Cerdanyola del Vallès, Spain
- Oscar Palacios
  - Departament de Química, Facultat de Ciències, Universitat Autònoma de Barcelona (UAB), Cerdanyola del Vallès, Spain
- Ricard Albalat
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona (UB), Barcelona, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona (UB), Barcelona, Spain
3
Mikhailova AA, Dohmen E, Harrison MC. Major changes in domain arrangements are associated with the evolution of termites. J Evol Biol 2024; 37:758-769. [PMID: 38630634 DOI: 10.1093/jeb/voae047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 12/18/2023] [Accepted: 04/12/2024] [Indexed: 04/19/2024]
Abstract
Domains as functional protein units and their rearrangements along the phylogeny can shed light on the functional changes of proteomes associated with the evolution of complex traits like eusociality. This complex trait is associated with sterile soldiers and workers, and long-lived, highly fecund reproductives. Unlike in Hymenoptera (ants, bees, and wasps), the evolution of eusociality within Blattodea, where termites evolved from within cockroaches, was accompanied by a reduction in proteome size, raising the question of whether functional novelty was achieved with existing rather than novel proteins. To address this, we investigated the role of domain rearrangements during the evolution of termite eusociality. Analysing domain rearrangements in the proteomes of three solitary cockroaches and five eusocial termites, we inferred more than 5,000 rearrangements over the phylogeny of Blattodea. The 90 novel domain arrangements that emerged at the origin of termites were enriched for several functions related to longevity, such as protein homeostasis, DNA repair, mitochondrial activity, and nutrient sensing. Many domain rearrangements were related to changes in developmental pathways, important for the emergence of novel castes. Along with the elaboration of social complexity, including permanently sterile workers and larger, foraging colonies, we found 110 further domain arrangements with functions related to protein glycosylation and ion transport. We found an enrichment of caste-biased expression and splicing within rearranged genes, highlighting their importance for the evolution of castes. Furthermore, we found increased levels of DNA methylation among rearranged compared to non-rearranged genes suggesting fundamental differences in their regulation. Our findings indicate the importance of domain rearrangements in the generation of functional novelty necessary for termite eusociality to evolve.
Affiliation(s)
- Alina A Mikhailova
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Elias Dohmen
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Mark C Harrison
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
4
Ahmed MH, Samia NSN, Singh G, Gupta V, Mishal MFM, Hossain A, Suman KH, Raza A, Dutta AK, Labony MA, Sultana J, Faysal EH, Alnasser SM, Alam P, Azam F. An immuno-informatics approach for annotation of hypothetical proteins and multi-epitope vaccine designed against the Mpox virus. J Biomol Struct Dyn 2024; 42:5288-5307. [PMID: 37519185 DOI: 10.1080/07391102.2023.2239921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Accepted: 06/09/2023] [Indexed: 08/01/2023]
Abstract
A worrying new outbreak of Monkeypox (Mpox) in humans is caused by the Mpox virus (MpoxV). The pathogen has roughly 28 hypothetical proteins of unknown structure, function, and pathogenicity. Using reliable bioinformatics tools, we attempted to analyze the MpoxV genome, identify the role of hypothetical proteins (HPs), and design a potential candidate vaccine. Out of 28, we identified seven hypothetical proteins using multi-server validation with high confidence for the occurrence of conserved domains. Their physical, chemical, and functional characterizations, including molecular weight, theoretical isoelectric point, 3D structures, GRAVY value, subcellular localization, functional motifs, antigenicity, and virulence factors, were performed. We predicted possible cytotoxic T cell (CTL), helper T cell (HTL) and linear and conformational B cell epitopes, which were combined in a 219 amino acid multiepitope vaccine with human β defensin as a linker. This multi-epitopic vaccine was structurally modelled and docked with toll-like receptor-3 (TLR-3). The vaccine-TLR-3 docked complexes exhibited stable interactions based on RMSD and RMSF tests. Additionally, the modelled vaccine was cloned in silico in an E. coli host to check the appropriate expression of the final vaccine construct. Our results point to a potentially immunogenic and safe vaccine, which would require further experimental validation. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Md Hridoy Ahmed
  - Department of Genetic Engineering and Biotechnology, University of Chittagong, Chittagong, Bangladesh
- Nure Sharaf Nower Samia
  - Department of Life Sciences (DLS), School of Environment and Life Sciences (SELS), Independent University, Dhaka, Bangladesh
- Gagandeep Singh
  - Kusuma School of Biological Sciences, Indian Institute of Technology, New Delhi, India
  - Section of Microbiology, Central Ayurveda Research Institute, Jhansi CCRAS, Ministry of Ayush, India
- Vandana Gupta
  - Department of Microbiology, Ram Lal Anand College, University of Delhi, New Delhi, India
- Alomgir Hossain
  - Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, Bangladesh
- Adnan Raza
  - Bioscience department, COMSATS University of Islamabad, Islamabad, Pakistan
- Amit Kumar Dutta
  - Department of Microbiology, University of Rajshahi, Rajshahi, Bangladesh
- Moriom Akhter Labony
  - Department of Genetic Engineering and Biotechnology, University of Chittagong, Chittagong, Bangladesh
- Jakia Sultana
  - Department of Botany, University of Rajshahi, Rajshahi, Bangladesh
- Sulaiman Mohammed Alnasser
  - Department of Pharmacology and Toxicology, Unaizah College of Pharmacy, Qassim University, Buraydah, Saudi Arabia
- Prawez Alam
  - Department of Pharmacognosy, College of Pharmacy, Prince Sattam Bin Abdulaziz University, Al Kharj, Saudi Arabia
- Faizul Azam
  - Department of Pharmaceutical Chemistry and Pharmacognosy, Unaizah College of Pharmacy, Qassim University, Buraydah, Saudi Arabia
5
Machulin AV, Deryusheva EI, Galzitskaya OV. Variation in base composition, structure-function relationships, and origins of structural repetition in bacterial rpsA gene. Biosystems 2024; 238:105196. [PMID: 38537772 DOI: 10.1016/j.biosystems.2024.105196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 03/22/2024] [Accepted: 03/22/2024] [Indexed: 04/12/2024]
Abstract
Protein domain repeats are known to arise due to tandem duplications of internal genes. However, the understanding of the underlying mechanisms of this process is incomplete. The goal of this work was to investigate the mechanism of occurrence of repeat expansion based on studying the sequences of 1324 rpsA genes of bacterial S1 ribosomal proteins containing different numbers of S1 structural domains. The rpsA gene encodes ribosomal S1 protein, which is essential for cell viability as it interacts with both mRNA and proteins. Gene ontology (GO) analysis of S1 domains in ribosomal S1 proteins revealed that bacterial protein sequences in S1 mainly have 3 types of molecular functions: RNA binding activity, nucleic acid activity, and ribosome structural component. Our results show that the maximum value of rpsA gene identity for full-length proteins was found for S1 proteins containing six structural domains (58%). Analysis of consensus sequences showed that parts of the rpsA gene encoding separate S1 domains have no strictly repetitive structure between groups containing different numbers of S1 domains. At the same time, gene regions encoding some conserved residues that form the RNA-binding site remain conserved. The detected phylogenetic similarity suggests that the proposed fold of the rpsA translation initiation region of Escherichia coli has functional value and is important for translational control of rpsA gene expression in other bacterial phyla, not only in Gammaproteobacteria.
Affiliation(s)
- Andrey V Machulin
  - Skryabin Institute of Biochemistry and Physiology of Microorganisms, Russian Academy of Sciences, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", 142290, Pushchino, Moscow Region, Russia
- Evgeniya I Deryusheva
  - Institute for Biological Instrumentation, Federal Research Center "Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences", 142290, Pushchino, Moscow Region, Russia
- Oxana V Galzitskaya
  - Institute of Protein Research, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
  - Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, 142290, Pushchino, Moscow Region, Russia
6
Garber ME, Frank V, Kazakov AE, Incha MR, Nava AA, Zhang H, Valencia LE, Keasling JD, Rajeev L, Mukhopadhyay A. REC protein family expansion by the emergence of a new signaling pathway. mBio 2023; 14:e0262223. [PMID: 37991384 PMCID: PMC10746176 DOI: 10.1128/mbio.02622-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 10/20/2023] [Indexed: 11/23/2023] Open
Abstract
IMPORTANCE We explore when and why large classes of proteins expand into new sequence space. We used an unsupervised machine learning approach to observe the sequence landscape of REC domains of bacterial response regulator proteins. We find that within-gene recombination can switch effector domains and, consequently, change the regulatory context of the duplicated protein.
Affiliation(s)
- Megan E. Garber
  - Department of Comparative Biochemistry, University of California, Berkeley, California, USA
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Vered Frank
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Alexey E. Kazakov
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Matthew R. Incha
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, California, USA
- Alberto A. Nava
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California, USA
- Hanqiao Zhang
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
  - Department of Bioengineering, University of California, Berkeley, California, USA
- Luis E. Valencia
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
  - Department of Bioengineering, University of California, Berkeley, California, USA
- Jay D. Keasling
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, California, USA
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California, USA
  - Department of Bioengineering, University of California, Berkeley, California, USA
  - Center for Biosustainability, Danish Technical University, Lyngby, Denmark
  - Center for Synthetic Biochemistry, Shenzhen Institutes for Advanced Technologies, Shenzhen, China
- Lara Rajeev
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Aindrila Mukhopadhyay
  - Department of Comparative Biochemistry, University of California, Berkeley, California, USA
  - Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
7
Zhu HT, Xia YH, Zhang GJ. E2EDA: Protein Domain Assembly Based on End-to-End Deep Learning. J Chem Inf Model 2023; 63:6451-6461. [PMID: 37788318 DOI: 10.1021/acs.jcim.3c01387] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 10/05/2023]
Abstract
With the development of deep learning, almost all single-domain proteins can be predicted at experimental resolution. However, the structure prediction of multi-domain proteins remains a challenge. Achieving end-to-end protein domain assembly and further improving the accuracy of the full-chain modeling by accurately predicting inter-domain orientation while improving the assembly efficiency will provide significant insights into structure-based drug discovery. In this work, we propose an End-to-End Domain Assembly method based on deep learning, named E2EDA. We first develop RMNet, an EfficientNetV2-based deep learning model that fuses multiple features using an attention mechanism to predict inter-domain rigid motion. Then, the predicted rigid motions are transformed into inter-domain spatial transformations to directly assemble the full-chain model. Finally, the scoring strategy RMscore is designed to select the best model from multiple assembled models. The experimental results show that the average TM-score of the model assembled by E2EDA on the benchmark set (282) is 0.827, which is better than those of other domain assembly methods SADA (0.792) and DEMO (0.730). Meanwhile, on our constructed multi-domain data set from AlphaFold DB, the model reassembled by E2EDA is 7.0% higher in TM-score compared to the full-chain model predicted by AlphaFold2, indicating that E2EDA can capture more accurate inter-domain orientations to improve the quality of the model predicted by AlphaFold2. Furthermore, compared to SADA and AlphaFold2, E2EDA reduced the average runtime on the benchmark by 64.7% and 19.2%, respectively, indicating that E2EDA can significantly improve assembly efficiency through an end-to-end approach. The online server is available at http://zhanglab-bioinf.com/E2EDA.
Affiliation(s)
- Hai-Tao Zhu
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
- Yu-Hao Xia
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
- Gui-Jun Zhang
  - College of Information Engineering, Zhejiang University of Technology, Hangzhou, 310023, China
8
Gollapalli P, Rudrappa S, Kumar V, Santosh Kumar HS. Domain Architecture Based Methods for Comparative Functional Genomics Toward Therapeutic Drug Target Discovery. J Mol Evol 2023; 91:598-615. [PMID: 37626222 DOI: 10.1007/s00239-023-10129-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 08/06/2023] [Indexed: 08/27/2023]
Abstract
New genes arise when existing genes duplicate, mutate, recombine, fuse, or undergo fission, or when genes form de novo, and novel functions emerge during evolution as a result. Researchers have tried to quantify the causes of these molecular diversification processes to understand how genes increase molecular complexity over time, for instance in protein domain organization. In contrast to global sequence similarity, protein domain architectures can capture key structural and functional characteristics, making them better proxies for describing functional equivalence. In both prokaryotes and eukaryotes, domain architectures have been shown to be retained over significant evolutionary distances. Protein domain architectures are now being utilized to categorize and distinguish evolutionarily related proteins and to find homologs among species that are evolutionarily distant from one another. Additionally, structural information stored in domain structures has accelerated homology identification and sequence search methods. Tools for functional protein annotation have been developed to determine protein domain content, domain order, domain recurrence, and domain position, as all of these contribute to the accuracy of protein function prediction. In this review, an attempt is made to summarise facts and speculations regarding the use of protein domain architecture and modularity to identify possible therapeutic targets among cellular activities, based on an understanding of their linked biological processes.
Affiliation(s)
- Pavan Gollapalli
  - Center for Bioinformatics and Biostatistics, Nitte (Deemed to be University), Mangalore, Karnataka, 575018, India
- Sushmitha Rudrappa
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
- Vadlapudi Kumar
  - Department of Biochemistry, Davangere University, Shivagangothri, Davangere, Karnataka, 577007, India
- Hulikal Shivashankara Santosh Kumar
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
9
Deryusheva EI, Machulin AV, Galzitskaya OV. Diversity and features of proteins with structural repeats. Biophys Rev 2023; 15:1159-1169. [PMID: 37974986 PMCID: PMC10643770 DOI: 10.1007/s12551-023-01130-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/13/2023] [Accepted: 08/28/2023] [Indexed: 11/19/2023] Open
Abstract
The review provides information on proteins with structural repeats, including their classification, characteristics, functions, and relevance in disease development. It explores methods for identifying structural repeats and specialized databases. The review also highlights the potential use of repeat proteins as drug design scaffolds and discusses their evolutionary mechanisms.
Affiliation(s)
- Evgeniya I. Deryusheva
  - Institute for Biological Instrumentation, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, Pushchino, Russia
- Andrey V. Machulin
  - Skryabin Institute of Biochemistry and Physiology of Microorganisms, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, Pushchino, Russia
- Oxana V. Galzitskaya
  - Institute of Protein Research of the Russian Academy of Sciences, Pushchino, Russia
  - Institute of Theoretical and Experimental Biophysics of the Russian Academy of Sciences, Pushchino, Russia
10
Li Z, Hu Y, Ma X, Da L, She J, Liu Y, Yi X, Cao Y, Xu W, Jiao Y, Su Z. WheatCENet: A Database for Comparative Co-expression Networks Analysis of Allohexaploid Wheat and Its Progenitors. Genomics, Proteomics & Bioinformatics 2023; 21:324-336. [PMID: 35660007 PMCID: PMC10626052 DOI: 10.1016/j.gpb.2022.04.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/12/2021] [Revised: 03/16/2022] [Accepted: 05/08/2022] [Indexed: 06/15/2023]
Abstract
Genetic and epigenetic changes after polyploidization events could result in variable gene expression and modified regulatory networks. Here, using large-scale transcriptome data, we constructed co-expression networks for diploid, tetraploid, and hexaploid wheat species, and built a platform for comparing co-expression networks of allohexaploid wheat and its progenitors, named WheatCENet. WheatCENet is a platform for searching and comparing specific functional co-expression networks, as well as identifying the related functions of the genes clustered therein. Functional annotations like pathways, gene families, protein-protein interactions, microRNAs (miRNAs), and several lines of epigenome data are integrated into this platform, and Gene Ontology (GO) annotation, gene set enrichment analysis (GSEA), motif identification, and other useful tools are also included. Using WheatCENet, we found that the network of WHEAT ABERRANT PANICLE ORGANIZATION 1 (WAPO1) has more co-expressed genes related to spike development in hexaploid wheat than its progenitors. We also found a novel motif of CCWWWWWWGG (CArG) specifically in the promoter region of WAPO-A1, suggesting that neofunctionalization of the WAPO-A1 gene affects spikelet development in hexaploid wheat. WheatCENet is useful for investigating co-expression networks and conducting other analyses, and thus facilitates comparative and functional genomic studies in wheat. WheatCENet is freely available at http://bioinformatics.cpolar.cn/WheatCENet and http://bioinformatics.cau.edu.cn/WheatCENet.
Affiliation(s)
- Zhongqiu Li
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Yiheng Hu
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Xuelian Ma
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Lingling Da
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Jiajie She
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Yue Liu
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Xin Yi
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
- Yaxin Cao
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Wenying Xu
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Yuannian Jiao
  - State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Zhen Su
  - State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China
11
Deori NM, Infant T, Thummer RP, Nagotu S. Characterization of the Multiple Domains of Pex30 Involved in Subcellular Localization of the Protein and Regulation of Peroxisome Number. Cell Biochem Biophys 2023; 81:39-47. [PMID: 36462131 DOI: 10.1007/s12013-022-01122-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2022] [Accepted: 11/22/2022] [Indexed: 12/05/2022]
Abstract
Pex30 is a peroxisomal protein whose role in peroxisome biogenesis via the endoplasmic reticulum has been established. It is a 58 kDa multi-domain protein that facilitates contact site formation between various organelles. The present study aimed to investigate the role of various domains of the protein in its sub-cellular localization and regulation of peroxisome number. For this, we created six truncations of the protein (1-87, 1-250, 1-352, 88-523, 251-523 and 353-523) and tagged GFP at the C-terminus. Biochemical methods and fluorescence microscopy were used to characterize the effect of truncation on expression and localization of the protein. Quantitative analysis was performed to determine the effect of truncation on peroxisome number in these cells. Expression of the truncated variants in cells lacking PEX30 had no effect on cell growth. Interestingly, variable expression and localization of the truncated variants was observed in both peroxisome-inducing and non-inducing medium. Truncated variants showed different distribution patterns such as punctate, reticulate and cytosolic fluorescence. Interestingly, lack of the complete dysferlin domain or C-Dysf resulted in increased peroxisome number, similar to that reported for cells lacking Pex30. This domain was also found not to contribute to the reticulate distribution of the proteins. Our results show an interesting role for the various domains of Pex30 in localization and regulation of peroxisome number.
Affiliation(s)
- Nayan Moni Deori
  - Organelle Biology and Cellular Ageing Lab, Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, 781039, India
- Terence Infant
  - Organelle Biology and Cellular Ageing Lab, Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, 781039, India
- Rajkumar P Thummer
  - Laboratory for Stem Cell Engineering and Regenerative Medicine, Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, 781039, India
- Shirisha Nagotu
  - Organelle Biology and Cellular Ageing Lab, Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, 781039, India
12
|
Calatayud S, Garcia-Risco M, Pedrini-Martha V, Niederwanger M, Dallinger R, Palacios Ò, Capdevila M, Albalat R. The Modular Architecture of Metallothioneins Facilitates Domain Rearrangements and Contributes to Their Evolvability in Metal-Accumulating Mollusks. Int J Mol Sci 2022; 23:15824. [PMID: 36555472 PMCID: PMC9781358 DOI: 10.3390/ijms232415824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 12/05/2022] [Accepted: 12/10/2022] [Indexed: 12/15/2022] Open
Abstract
Protein domains are independent structural and functional modules that can rearrange to create new proteins. While the evolution of multidomain proteins through the shuffling of different preexisting domains has been well documented, the evolution of domain repeat proteins and the origin of new domains are less understood. Metallothioneins (MTs) provide a good case study considering that they consist of metal-binding domain repeats, some of them with a likely de novo origin. In mollusks, for instance, most MTs are bidomain proteins that arose by lineage-specific rearrangements between six putative domains: α, β1, β2, β3, γ and δ. Some domains have been characterized in bivalves and gastropods, but nothing is known about the MTs and their domains of other Mollusca classes. To fill this gap, we investigated the metal-binding features of NpoMT1 of Nautilus pompilius (Cephalopoda class) and FcaMT1 of Falcidens caudatus (Caudofoveata class). Interestingly, whereas NpoMT1 consists of α and β1 domains and has a prototypical Cd²⁺ preference, FcaMT1 has a singular preference for Zn²⁺ ions and a distinct domain composition, including a new Caudofoveata-specific δ domain. Overall, our results suggest that the modular architecture of MTs has contributed to MT evolution during mollusk diversification, and exemplify how modularity increases MT evolvability.
Affiliation(s)
- Sara Calatayud
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, E-08028 Barcelona, Spain
- Mario Garcia-Risco
  - Departament de Química, Facultat de Ciències, Universitat Autònoma de Barcelona, E-08193 Cerdanyola del Vallès, Spain
- Veronika Pedrini-Martha
  - Center for Molecular Biosciences Innsbruck (CMBI), Department of Zoology, University of Innsbruck, A-6020 Innsbruck, Austria
- Michael Niederwanger
  - Center for Molecular Biosciences Innsbruck (CMBI), Department of Zoology, University of Innsbruck, A-6020 Innsbruck, Austria
- Reinhard Dallinger
  - Center for Molecular Biosciences Innsbruck (CMBI), Department of Zoology, University of Innsbruck, A-6020 Innsbruck, Austria
- Òscar Palacios
  - Departament de Química, Facultat de Ciències, Universitat Autònoma de Barcelona, E-08193 Cerdanyola del Vallès, Spain
- Mercè Capdevila
  - Departament de Química, Facultat de Ciències, Universitat Autònoma de Barcelona, E-08193 Cerdanyola del Vallès, Spain
- Ricard Albalat
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Universitat de Barcelona, E-08028 Barcelona, Spain
  - Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, E-08028 Barcelona, Spain
|
13
|
Fong SL, Capra JA. Function and Constraint in Enhancer Sequences with Multiple Evolutionary Origins. Genome Biol Evol 2022; 14:evac159. [PMID: 36314566 PMCID: PMC9673499 DOI: 10.1093/gbe/evac159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/22/2022] [Indexed: 11/04/2022] Open
Abstract
Thousands of human gene regulatory enhancers are composed of sequences with multiple evolutionary origins. These evolutionarily "complex" enhancers consist of older "core" sequences and younger "derived" sequences. However, the functional relationship between the sequences of different evolutionary origins within complex enhancers is poorly understood. We evaluated the function, selective pressures, and sequence variation across core and derived components of human complex enhancers. We find that both components are older than expected from the genomic background, and complex enhancers are enriched for core and derived sequences of similar evolutionary ages. Both components show strong evidence of biochemical activity in massively parallel reporter assays. However, core and derived sequences have distinct transcription factor (TF)-binding preferences that are largely similar across evolutionary origins. As expected, given these signatures of function, both core and derived sequences have substantial evidence of purifying selection. Nonetheless, derived sequences exhibit weaker purifying selection than adjacent cores. Derived sequences also tolerate more common genetic variation and are enriched compared with cores for expression quantitative trait loci associated with gene expression variability in human populations. In conclusion, both core and derived sequences have strong evidence of gene regulatory function, but derived sequences have distinct constraint profiles, TF-binding preferences, and tolerance to variation compared with cores. We propose that the step-wise integration of younger derived with older core sequences has generated regulatory substrates with robust activity and the potential for functional variation. Our analyses demonstrate that synthesizing the study of enhancer evolution and function can aid interpretation of regulatory sequence activity and functional variation across human populations.
Affiliation(s)
- Sarah L Fong
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, Tennessee
- John A Capra
  - Department of Biological Sciences, Vanderbilt University, Nashville, Tennessee
  - Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco
|
14
|
Martyn JE, Gomez-Valero L, Buchrieser C. The evolution and role of eukaryotic-like domains in environmental intracellular bacteria: the battle with a eukaryotic cell. FEMS Microbiol Rev 2022; 46:6529235. [DOI: 10.1093/femsre/fuac012] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/01/2021] [Revised: 02/09/2022] [Accepted: 02/14/2022] [Indexed: 11/14/2022] Open
Abstract
Intracellular pathogens that are able to thrive in different environments, such as Legionella spp., which preferentially live in protozoa in aquatic environments, or environmental Chlamydiae, which replicate either within protozoa or a range of animals, possess a plethora of cellular biology tools to influence their eukaryotic host. The host manipulation tools that evolved in the interaction with protozoa confer on these bacteria the capacity to also infect phylogenetically distinct eukaryotic cells, such as macrophages, and thus they can also be human pathogens. To manipulate the host cell, bacteria use protein secretion systems and molecular effectors. Although these molecular effectors are encoded in bacteria, they are expressed and function in a eukaryotic context, often mimicking or inhibiting eukaryotic proteins. Indeed, many of these effectors have eukaryotic-like domains. In this review we propose that the main pathways environmental intracellular bacteria need to subvert in order to establish the host eukaryotic cell as a replication niche are chromatin remodelling, ubiquitination signalling, and modulation of protein-protein interactions via tandem repeat domains. We then provide mechanistic insight into how these proteins might have evolved as molecular weapons. Finally, we highlight that in environmental intracellular bacteria the number of eukaryotic-like domains and proteins is considerably higher than in intracellular bacteria specialised to an isolated niche, such as obligate intracellular human pathogens. As mimics of eukaryotic proteins are critical components of host-pathogen interactions, this distribution of eukaryotic-like domains suggests that the environment has selected them.
Affiliation(s)
- Jessica E Martyn
  - Institut Pasteur, Biologie des Bactéries Intracellulaires and CNRS UMR 3525, Paris, France
- Laura Gomez-Valero
  - Institut Pasteur, Biologie des Bactéries Intracellulaires and CNRS UMR 3525, Paris, France
- Carmen Buchrieser
  - Institut Pasteur, Biologie des Bactéries Intracellulaires and CNRS UMR 3525, Paris, France
|
15
|
Rahman ASMZ, Timmerman L, Gallardo F, Cardona ST. Identification of putative essential protein domains from high-density transposon insertion sequencing. Sci Rep 2022; 12:962. [PMID: 35046497 PMCID: PMC8770471 DOI: 10.1038/s41598-022-05028-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/03/2021] [Accepted: 12/29/2021] [Indexed: 12/24/2022] Open
Abstract
A first clue to gene function can be obtained by examining whether a gene is required for life in certain standard conditions, that is, whether a gene is essential. In bacteria, essential genes are usually identified by high-density transposon mutagenesis followed by sequencing of insertion sites (Tn-seq). These studies assign the term "essential" to whole genes rather than the protein domain sequences that encode the essential functions. However, genes can code for multiple protein domains that evolve their functions independently. Therefore, when essential genes code for more than one protein domain, only one of them could be essential. In this study, we defined this subset of genes as "essential domain-containing" (EDC) genes. Using a Tn-seq data set built in Burkholderia cenocepacia K56-2, we developed an in silico pipeline to identify EDC genes and the essential protein domains they encode. We found forty candidate EDC genes and demonstrated growth defect phenotypes using CRISPR interference (CRISPRi). This analysis included two knockdowns of genes encoding the protein domains of unknown function DUF2213 and DUF4148. These putative essential domains are conserved in more than two hundred bacterial species, including human and plant pathogens. Together, our study suggests that essentiality should be assigned to individual protein domains rather than genes, contributing to a first functional characterization of protein domains of unknown function.
Affiliation(s)
- Lukas Timmerman
  - Department of Computer Science, University of Manitoba, Winnipeg, MB, Canada
- Flyn Gallardo
  - Department of Microbiology, University of Manitoba, Winnipeg, MB, Canada
- Silvia T Cardona
  - Department of Microbiology, University of Manitoba, Winnipeg, MB, Canada
  - Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Canada
|
16
|
Caetano-Anollés G, Aziz MF, Mughal F, Caetano-Anollés D. Tracing protein and proteome history with chronologies and networks: folding recapitulates evolution. Expert Rev Proteomics 2021; 18:863-880. [PMID: 34628994 DOI: 10.1080/14789450.2021.1992277] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/24/2023]
Abstract
INTRODUCTION While the origin and evolution of proteins remain mysterious, advances in evolutionary genomics and systems biology are facilitating the historical exploration of the structure, function and organization of proteins and proteomes. Molecular chronologies are series of time events describing the history of biological systems and subsystems and the rise of biological innovations. Together with time-varying networks, these chronologies provide a window into the past. AREAS COVERED Here, we review molecular chronologies and networks built with modern methods of phylogeny reconstruction. We discuss how chronologies of structural domain families uncover the explosive emergence of metabolism, the late rise of translation, the co-evolution of ribosomal proteins and rRNA, and the late development of the ribosomal exit tunnel; events that coincided with a tendency to shorten folding time. Evolving networks described the early emergence of domains and a late 'big bang' of domain combinations. EXPERT OPINION Two processes, folding and recruitment, appear central to the evolutionary progression. The former increases protein persistence. The latter fosters diversity. Chronologically, protein evolution mirrors folding by combining supersecondary structures into domains, developing translation machinery to facilitate folding speed and stability, and enhancing structural complexity by establishing long-distance interactions in novel structural and architectural designs.
Affiliation(s)
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
  - C. R. Woese Institute for Genomic Biology, University of Illinois, Urbana, Illinois, USA
- M Fayez Aziz
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
- Fizza Mughal
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, Illinois, USA
- Derek Caetano-Anollés
  - Data Science Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
|
17
|
Yang K, Xie D, Lin W, Xiang P, Peng C. Adipose mesenchymal stem cells and gingival mesenchymal stem cells have a comparable effect in endothelium repair. Exp Ther Med 2021; 22:1415. [PMID: 34676008 PMCID: PMC8524764 DOI: 10.3892/etm.2021.10851] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/08/2020] [Accepted: 07/14/2021] [Indexed: 12/14/2022] Open
Abstract
Restenosis is the major factor influencing the long-term success rate of angioplasty and stent implantation, and effective strategies to prevent restenosis remain limited. Mesenchymal stem cells (MSCs) are pluripotent stem cells capable of self-renewal and multidirectional differentiation, which may be able to promote endothelium repair, thereby reducing restenosis. The present study aimed to evaluate the effects of adipose MSCs (AMSCs) and gingival MSCs (GMSCs) on endothelium repair. MSCs were isolated from two human tissue types, namely adipose tissue and gingival tissue, and the effects of AMSCs and GMSCs in ex vivo endothelium repair and on vascular smooth muscle cell (SMC) growth were examined. To compare the feasibility of using AMSCs and GMSCs for the repair of endothelium damage in endothelial cell (EC) damage and vasoproliferative disorders, an ex vivo model of endothelium repair in a co-culture system was developed. It was indicated that AMSCs and GMSCs expressed characteristic MSC markers (CD105 and CD166). ³H-thymidine incorporation in the co-culture group of AMSCs and SMCs in the presence of ECs was lower compared with that in the GMSC and SMC co-culture group. The protein expression level of proliferating cell nuclear antigen in the co-culture group of AMSCs and SMCs in the presence of ECs was lower compared with that in the GMSC and SMC co-culture group. After co-culture with ECs for 5 days, 25.71±3.08% of AMSCs and 20.06±2.09% of GMSCs began to express CD31 protein. Furthermore, anti-VEGF antibody was able to inhibit MSC differentiation. Collectively, the present results suggested that seeding of AMSCs inhibited the proliferation and migration of SMCs more strongly than seeding of GMSCs.
Affiliation(s)
- Ke Yang
  - Department of Cardiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, P.R. China
- Dongmei Xie
  - Department of Cardiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, P.R. China
- Wanwen Lin
  - Department of Cardiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, P.R. China
- Peng Xiang
  - Key Laboratory for Stem Cells and Tissue Engineering, Center for Stem Cell Biology and Tissue Engineering, Ministry of Education, Sun Yat-sen University, Guangzhou, Guangdong 510600, P.R. China
- Chaoquan Peng
  - Department of Cardiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, Guangdong 510630, P.R. China
|
18
|
Deryusheva EI, Machulin AV, Galzitskaya OV. Structural, Functional, and Evolutionary Characteristics of Proteins with Repeats. Mol Biol 2021. [DOI: 10.1134/s0026893321040038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
|
19
|
Lakshmanan Mangalath D, Hassan Mohammed SA. Ligand Binding Domain of Estrogen Receptor Alpha Preserve a Conserved Structural Architecture Similar to Bacterial Taxis Receptors. Front Ecol Evol 2021. [DOI: 10.3389/fevo.2021.681913] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/19/2022] Open
Abstract
It remains a mystery why estrogen hormone receptors (ERs), which are highly specific toward their endogenous hormones, are responsive to chemically distinct exogenous agents. Does it indicate that ERs are environmentally regulated? Here, we speculate that ERs would have some common structural features with prokaryotic taxis receptors responsive toward environmental signals. This study addresses the low specificity and high responsiveness of ERs toward chemically distinct exogenous substances from an evolutionary point of view. We compared the ligand binding domain (LBD) of ER alpha (α) with the LBDs of prokaryotic taxis receptors to check whether the LBDs share any structural similarity. Interestingly, a high degree of similarity in the domain structural fold architecture of ERα and bacterial taxis receptors was observed. Pharmacophore modeling focused on ligand molecules of both receptors suggests that these ligands share common pharmacophore features. The molecular docking studies suggest that the natural ligands of bacterial chemotaxis receptors exhibit strong interaction with human ER as well. Phylogenetic analysis proved that these proteins are unrelated and would have evolved independently, suggesting a possibility of convergent molecular evolution. Nevertheless, a remarkable sequence divergence was seen between these proteins even when they shared common domain structural folds and common ligand-based pharmacophore features, suggesting that the protein architecture remains conserved within the structure for a specific function irrespective of sequence identity.
|
20
|
Dutta A, Chandravanshi M, Kanaujia SP. Conserved features of the MlaD domain aid the trafficking of hydrophobic molecules. Proteins 2021; 89:1473-1488. [PMID: 34196044 DOI: 10.1002/prot.26168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2021] [Revised: 06/18/2021] [Accepted: 06/25/2021] [Indexed: 11/09/2022]
Abstract
In Gram-negative bacteria, the maintenance of lipid asymmetry (Mla) system is involved in the transport of phospholipids between the inner (IM) and outer membrane. The Mla system utilizes a unique IM-associated periplasmic solute-binding protein, MlaD, which possesses a conserved domain, the MlaD domain. While proteins carrying the MlaD domain are known to be primarily involved in the trafficking of hydrophobic molecules, not much is known about this domain itself. Thus, in this study, the characterization of the MlaD domain employing bioinformatics analysis is reported. The profiling of the MlaD domain of different architectures reveals the abundance of glycine and hydrophobic residues and the lack of cysteine residues. The domain possesses a conserved N-terminal region and a well-preserved glycine residue that constitutes a consensus motif across different architectures. Phylogenetic analysis shows that the MlaD domain archetypes are evolutionarily closer and marked by the conservation of a functionally crucial pore loop located at the C-terminal region. The study also establishes the critical role of the domain-associated permeases and the driving forces governing the transport of hydrophobic molecules. This sheds light on the structure-function-evolutionary relationship of the MlaD domain. The hexameric interface analysis reveals that the MlaD domain itself is not the sole player in the oligomerization of the proteins. Further, an operonic and interactome map analysis reveals that the Mla and the Mce systems are dependent on the structural homologs of the nuclear transport factor 2 superfamily.
Affiliation(s)
- Angshu Dutta
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India
- Monika Chandravanshi
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India
- Shankar Prasad Kanaujia
  - Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India
|
21
|
Abstract
Domains are the structural, functional and evolutionary units of proteins. They combine to form multidomain proteins. The evolutionary history of this molecular combinatorics has been studied with phylogenomic methods. Here, we construct networks of domain organization and explore their evolution. A time series of networks revealed two ancient waves of structural novelty arising from ancient 'p-loop' and 'winged helix' domains and a massive 'big bang' of domain organization. The evolutionary recruitment of domains was highly modular, hierarchical and ongoing. Domain rearrangements elicited non-random and scale-free network structure. Comparative analyses of preferential attachment, randomness and modularity showed yin-and-yang complementary transition and biphasic patterns along the structural chronology. Remarkably, the evolving networks highlighted a central evolutionary role of cofactor-supporting structures of non-ribosomal peptide synthesis pathways, likely crucial to the early development of the genetic code. Some highly modular domains featured dual response regulation in two-component signal transduction systems with DNA-binding activity linked to transcriptional regulation of responses to environmental change. Interestingly, hub domains across the evolving networks shared the historical role of DNA binding and editing, an ancient protein function in molecular evolution. Our investigation unfolds historical source-sink patterns of evolutionary recruitment that further our understanding of protein architectures and functions.
|
22
|
Kolodny R, Nepomnyachiy S, Tawfik DS, Ben-Tal N. Bridging Themes: Short Protein Segments Found in Different Architectures. Mol Biol Evol 2021; 38:2191-2208. [PMID: 33502503 PMCID: PMC8136508 DOI: 10.1093/molbev/msab017] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 12/20/2022] Open
Abstract
The vast majority of theoretically possible polypeptide chains do not fold, let alone confer function. Hence, protein evolution from preexisting building blocks has clear potential advantages over ab initio emergence from random sequences. In support of this view, sequence similarity between different proteins is generally indicative of common ancestry, and we collectively refer to such homologous sequences as "themes." At the domain level, sequence homology is routinely detected. However, short themes, which are segments or fragments of intact domains, are particularly interesting because they may provide hints about the emergence of domains, as opposed to divergence of preexisting domains, or their mixing-and-matching to form multi-domain proteins. Here we identified 525 representative short themes, comprising 20-80 residues, that are unexpectedly shared between domains considered to have emerged independently. Among these "bridging themes" are ones shared between the most ancient domains, for example, Rossmann, P-loop NTPase, TIM-barrel, flavodoxin, and ferredoxin-like. We elaborate on several particularly interesting cases, where the bridging themes mediate ligand binding. Ligand binding may have contributed to the stability and the plasticity of these building blocks, and to their ability to invade preexisting domains or serve as starting points for completely new domains.
Affiliation(s)
- Rachel Kolodny
  - Department of Computer Science, University of Haifa, Haifa, Israel
- Dan S Tawfik
  - Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Nir Ben-Tal
  - George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
|
23
|
Wang CK, Craik DJ. Linking molecular evolution to molecular grafting. J Biol Chem 2021; 296:100425. [PMID: 33600801 PMCID: PMC8005815 DOI: 10.1016/j.jbc.2021.100425] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/27/2020] [Revised: 02/09/2021] [Accepted: 02/13/2021] [Indexed: 12/01/2022] Open
Abstract
Molecular grafting is a strategy for the engineering of molecular scaffolds into new functional agents, such as next-generation therapeutics. Despite its wide use, studies so far have focused almost exclusively on demonstrating its utility rather than understanding the factors that lead to either poor or successful grafting outcomes. Here, we examine protein evolution and identify parallels between the natural process of protein functional diversification and the artificial process of molecular grafting. We discuss features of natural proteins that are correlated to innovability (the capacity to acquire new functions) and describe their implications to molecular grafting scaffolds. Disulfide-rich peptides are used as exemplars because they are particularly promising scaffolds onto which new functions can be grafted. This article provides a perspective on why some scaffolds are more suitable for grafting than others, identifying opportunities on how molecular grafting might be improved.
Affiliation(s)
- Conan K Wang
  - Institute for Molecular Bioscience and Australian Research Council Centre of Excellence for Innovations in Peptide and Protein Science, The University of Queensland, Brisbane, Queensland, Australia
- David J Craik
  - Institute for Molecular Bioscience and Australian Research Council Centre of Excellence for Innovations in Peptide and Protein Science, The University of Queensland, Brisbane, Queensland, Australia
|
24
|
Cisneros-Martínez AM, Becerra A, Lazcano A. Ancient gene duplications in RNA viruses revealed by protein tertiary structure comparisons. Virus Evol 2021; 7:veab019. [PMID: 33758672 PMCID: PMC7967035 DOI: 10.1093/ve/veab019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/24/2022] Open
Abstract
To date only a handful of duplicated genes have been described in RNA viruses. This shortage can be attributed to different factors, including the high mutation rate of RNA viruses, which would make a large genome more prone to acquiring deleterious mutations. This may explain why sequence-based approaches have only found duplications in their most recent evolutionary history. To detect earlier duplications, we performed protein tertiary structure comparisons for every RNA virus family represented in the Protein Data Bank. We present a list of thirty pairs of possible paralogs with <30 per cent sequence identity. It is argued that these pairs are the outcome of six duplication events. These include the α and β subunits of the fungal toxin KP6 present in the dsRNA Ustilago maydis virus (family Totiviridae), the SARS-CoV (Coronaviridae) nsp3 domains SUD-N, SUD-M and X-domain, the Picornavirales (families Picornaviridae, Dicistroviridae, Iflaviridae and Secoviridae) capsid proteins VP1, VP2 and VP3, and the Enterovirus (family Picornaviridae) 3C and 2A cysteine-proteases. Protein tertiary structure comparisons may reveal more duplication events as more three-dimensional protein structures are determined, and they suggest that, although still rare, gene duplications may be more frequent in RNA viruses than previously thought. Keywords: gene duplications; RNA viruses.
Affiliation(s)
- Arturo Becerra
  - Facultad de Ciencias, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Antonio Lazcano
  - Facultad de Ciencias, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - El Colegio Nacional, Donceles 104, Centro Histórico, Mexico City, Mexico
|
25
|
Gumerov VM, Zhulin IB. TREND: a platform for exploring protein function in prokaryotes based on phylogenetic, domain architecture and gene neighborhood analyses. Nucleic Acids Res 2020; 48:W72-W76. [PMID: 32282909 PMCID: PMC7319448 DOI: 10.1093/nar/gkaa243] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 02/12/2020] [Revised: 03/16/2020] [Accepted: 04/01/2020] [Indexed: 01/16/2023] Open
Abstract
Key steps in a computational study of protein function involve analysis of (i) relationships between homologous proteins, (ii) protein domain architecture and (iii) the gene neighborhoods in which the corresponding proteins are encoded. Each of these steps requires a separate computational task and set of tools. Currently, in order to relate protein features and gene neighborhood information to phylogeny, researchers need to prepare all the necessary data and combine them by hand, which is time-consuming and error-prone. Here, we present a new platform, TREND (tree-based exploration of neighborhoods and domains), which can perform all the necessary steps in automated fashion and put the derived information into phylogenomic context, thus making evolution-based protein function analysis more efficient. A rich set of adjustable components allows users to run the computational steps specific to their task. TREND is freely available at http://trend.zhulinlab.org.
Affiliation(s)
- Vadim M Gumerov
  - Department of Microbiology and Translational Data Analytics Institute, The Ohio State University, Columbus, OH, USA
- Igor B Zhulin
  - Department of Microbiology and Translational Data Analytics Institute, The Ohio State University, Columbus, OH, USA
|