For: Bashton M, Chothia C. The generation of new protein functions by the combination of domains. Structure 2007;15:85-99. [PMID: 17223535 DOI: 10.1016/j.str.2006.11.009] [Citation(s) in RCA: 128] [Impact Index Per Article: 7.5] [Received: 10/06/2006] [Revised: 11/21/2006] [Accepted: 11/21/2006] [Indexed: 11/21/2022]

Cited by Other Article(s)

Ulusoy E, Doğan T. Mutual annotation-based prediction of protein domain functions with Domain2GO. Protein Sci 2024;33:e4988. [PMID: 38757367 PMCID: PMC11099699 DOI: 10.1002/pro.4988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 02/25/2024] [Accepted: 03/30/2024] [Indexed: 05/18/2024]

Abstract

Identifying unknown functional properties of proteins is essential for understanding their roles in both health and disease states. The domain composition of a protein can reveal critical information in this context, as domains are structural and functional units that dictate how the protein should act at the molecular level. The expensive and time-consuming nature of wet-lab experimental approaches prompted researchers to develop computational strategies for predicting the functions of proteins. In this study, we proposed a new method called Domain2GO that infers associations between protein domains and function-defining gene ontology (GO) terms, thus redefining the problem as domain function prediction. Domain2GO uses documented protein-level GO annotations together with proteins' domain annotations. Co-annotation patterns of domains and GO terms in the same proteins are examined using statistical resampling to obtain reliable associations. As a use-case study, we evaluated the biological relevance of examples selected from the Domain2GO-generated domain-GO term mappings via literature review. Then, we applied Domain2GO to predict unknown protein functions by propagating domain-associated GO terms to proteins annotated with these domains. For function prediction performance evaluation and comparison against other methods, we employed Critical Assessment of Function Annotation 3 (CAFA3) challenge datasets. The results demonstrated the high potential of Domain2GO, particularly for predicting molecular function and biological process terms, along with advantages such as producing interpretable results and having an exceptionally low computational cost. The approach presented here can be extended to other ontologies and biological entities to investigate unknown relationships in complex and large-scale biological data. The source code, datasets, results, and user instructions for Domain2GO are available at https://github.com/HUBioDataLab/Domain2GO. 
Additionally, we offer a user-friendly online tool at https://huggingface.co/spaces/HUBioDataLab/Domain2GO, which simplifies the prediction of functions of previously unannotated proteins solely using amino acid sequences.

Zimmerman L, Alon N, Levin I, Koganitsky A, Shpigel N, Brestel C, Lapidoth GD. Context-dependent design of induced-fit enzymes using deep learning generates well-expressed, thermally stable and active enzymes. Proc Natl Acad Sci U S A 2024;121:e2313809121. [PMID: 38437538 PMCID: PMC10945820 DOI: 10.1073/pnas.2313809121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 02/09/2024] [Indexed: 03/06/2024] Open

García-Paz FDM, Del Moral S, Morales-Arrieta S, Ayala M, Treviño-Quintanilla LG, Olvera-Carranza C. Multidomain chimeric enzymes as a promising alternative for biocatalysts improvement: a minireview. Mol Biol Rep 2024;51:410. [PMID: 38466518 DOI: 10.1007/s11033-024-09332-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 02/07/2024] [Indexed: 03/13/2024]

Bonello J, Orengo C. FunPredCATH: An ensemble method for predicting protein function using CATH. BIOCHIMICA ET BIOPHYSICA ACTA. PROTEINS AND PROTEOMICS 2024;1872:140985. [PMID: 38122964 DOI: 10.1016/j.bbapap.2023.140985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 12/05/2023] [Accepted: 12/06/2023] [Indexed: 12/23/2023]

Abstract

MOTIVATION

The growth of unannotated proteins in UniProt increases at a very high rate every year due to more efficient sequencing methods. However, the experimental annotation of proteins is a lengthy and expensive process. Using computational techniques to narrow the search can speed up the process by providing highly specific Gene Ontology (GO) terms.

METHODOLOGY

We propose an ensemble approach that combines three generic base predictors that predict Gene Ontology (BP, CC and MF) terms from sequences across different species. We train our models on UniProtGOA annotation data and use the CATH domain resources to identify the protein families. We then calculate a score based on the prevalence of individual GO terms in the functional families that is then used as an indicator of confidence when assigning the GO term to an uncharacterised protein.

METHODS

In the ensemble, we use a statistics-based method that scores the occurrence of GO terms in a CATH FunFam against a background set of proteins annotated by the same GO term. We also developed a set-based method that uses Set Intersection and Set Union to score the occurrence of GO terms within the same CATH FunFam. Finally, we also use FunFams-Plus, a predictor method developed by the Orengo Group at UCL to predict GO terms for uncharacterised proteins in the CAFA3 challenge.

EVALUATION

We evaluated the methods against the CAFA3 benchmark and DomFun. We used the Precision, Recall and Fmax metrics and the benchmark datasets that are used in CAFA3 to evaluate our models and compare them to the CAFA3 results. Our results show that FunPredCATH compares well with top CAFA methods in the different ontologies and benchmarks.

CONTRIBUTIONS

FunPredCATH compares well with other prediction methods on CAFA3, and the ensemble approach outperforms the base methods. We show that non-IEA models obtain higher Fmax scores than the IEA counterparts, while the models including IEA annotations have higher coverage at the expense of a lower Fmax score.
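The Fmax metric named in the Evaluation paragraph above can be illustrated with a minimal sketch. This assumes the protein-centric definition commonly used in CAFA: the maximum, over score thresholds, of the harmonic mean of average precision and average recall. The function name `fmax`, the dictionary layout, and the toy GO term labels are illustrative assumptions and are not taken from FunPredCATH.

```python
from typing import Dict, Set


def fmax(predictions: Dict[str, Dict[str, float]],
         truth: Dict[str, Set[str]],
         steps: int = 100) -> float:
    """Protein-centric Fmax: the best F1 score over a grid of thresholds.

    predictions maps protein -> {term: score in [0, 1]}.
    truth maps protein -> non-empty set of true terms (assumed non-empty).
    """
    best = 0.0
    for i in range(1, steps + 1):
        t = i / steps
        precisions = []   # only proteins with at least one prediction at t
        recall_sum = 0.0  # summed over all benchmark proteins
        for protein, true_terms in truth.items():
            scored = predictions.get(protein, {})
            predicted = {term for term, s in scored.items() if s >= t}
            hits = len(predicted & true_terms)
            if predicted:
                precisions.append(hits / len(predicted))
            recall_sum += hits / len(true_terms)
        if not precisions:
            continue  # nothing predicted at this threshold
        precision = sum(precisions) / len(precisions)
        recall = recall_sum / len(truth)
        if precision + recall > 0:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy data with hypothetical term labels.
predictions = {"P1": {"GO:A": 0.9, "GO:B": 0.4}, "P2": {"GO:C": 0.8}}
truth = {"P1": {"GO:A"}, "P2": {"GO:C", "GO:D"}}
print(round(fmax(predictions, truth), 3))  # 0.857
```

In the toy data, thresholds between 0.4 and 0.8 drop the false positive for P1, giving precision 1.0 and recall 0.75, hence F1 = 6/7.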

Ribeiro AJM, Riziotis IG, Borkakoti N, Thornton JM. Enzyme function and evolution through the lens of bioinformatics. Biochem J 2023;480:1845-1863. [PMID: 37991346 PMCID: PMC10754289 DOI: 10.1042/bcj20220405] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 11/09/2023] [Accepted: 11/14/2023] [Indexed: 11/23/2023]

Dias RVR, Pedro RP, Sanches MN, Moreira GC, Leite VBP, Caruso IP, de Melo FA, de Oliveira LC. Unveiling Metastable Ensembles of GRB2 and the Relevance of Interdomain Communication during Folding. J Chem Inf Model 2023;63:6344-6353. [PMID: 37824286 DOI: 10.1021/acs.jcim.3c00955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2023]

Abstract

The folding process of multidomain proteins is a highly intricate phenomenon involving the assembly of distinct domains into a functional three-dimensional structure. During this process, each domain may fold independently while interacting with others. The folding of multidomain proteins can be influenced by various factors, including their composition, the structure of each domain, or the presence of disordered regions, as well as the surrounding environment. Misfolding of multidomain proteins can lead to the formation of nonfunctional structures associated with a range of diseases, including cancers or neurodegenerative disorders. Understanding this process is an important step for many biophysical analyses such as stability, interaction, malfunctioning, and rational drug design. One such multidomain protein is growth factor receptor-bound protein 2 (GRB2), an adaptor protein that is essential in regulating cell survival. GRB2 consists of one central Src homology 2 (SH2) domain flanked by two Src homology 3 (SH3) domains. The SH2 domain interacts with phosphotyrosine regions in other proteins, while the SH3 domains recognize proline-rich regions on protein partners during cell signaling. Here, we combined computational and experimental techniques to investigate the folding process of GRB2. Through computational simulations, we sampled the conformational space and mapped the mechanisms involved by the free energy profiles, which may indicate possible intermediate states. From the molecular dynamics trajectories, we used the energy landscape visualization method (ELViM), which allowed us to visualize a three-dimensional (3D) representation of the overall energy surface. We identified two possible parallel folding routes that cannot be seen in a one-dimensional analysis, with one occurring more frequently during folding. 
Supporting these results, we used differential scanning calorimetry (DSC) and fluorescence spectroscopy techniques to confirm these intermediate states in vitro. Finally, we analyzed the deletion of domains to compare our model outputs to previously published results, supporting the presence of interdomain modulation. Overall, our study highlights the significance of interdomain communication within the GRB2 protein and its impact on the formation, stability, and structural plasticity of the protein, which are crucial for its interaction with other proteins in key signaling pathways.

Affiliation(s)

Raphael V R Dias: Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities, and Exact Sciences, São José do Rio Preto, SP 15054-000, Brazil; Multiuser Center for Biomolecular Innovation (CMIB), São Paulo State University (UNESP), São José do Rio Preto, SP 15054-000, Brazil
Renan P Pedro: Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities, and Exact Sciences, São José do Rio Preto, SP 15054-000, Brazil; Multiuser Center for Biomolecular Innovation (CMIB), São Paulo State University (UNESP), São José do Rio Preto, SP 15054-000, Brazil
Murilo N Sanches: Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities, and Exact Sciences, São José do Rio Preto, SP 15054-000, Brazil
Giovana C Moreira: Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities, and Exact Sciences, São José do Rio Preto, SP 15054-000, Brazil; Multiuser Center for Biomolecular Innovation (CMIB), São Paulo State University (UNESP), São José do Rio Preto, SP 15054-000, Brazil
Vitor B P Leite: Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities, and Exact Sciences, São José do Rio Preto, SP 15054-000, Brazil
Icaro P Caruso: Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities, and Exact Sciences, São José do Rio Preto, SP 15054-000, Brazil; Multiuser Center for Biomolecular Innovation (CMIB), São Paulo State University (UNESP), São José do Rio Preto, SP 15054-000, Brazil
Fernando A de Melo: Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities, and Exact Sciences, São José do Rio Preto, SP 15054-000, Brazil; Multiuser Center for Biomolecular Innovation (CMIB), São Paulo State University (UNESP), São José do Rio Preto, SP 15054-000, Brazil
Leandro C de Oliveira: Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities, and Exact Sciences, São José do Rio Preto, SP 15054-000, Brazil

da Silva Dambroz CM, Aono AH, de Andrade Silva EM, Pereira WA. Genome-wide analysis and characterization of the LRR-RLK gene family provides insights into anthracnose resistance in common bean. Sci Rep 2023;13:13455. [PMID: 37596307 PMCID: PMC10439169 DOI: 10.1038/s41598-023-40054-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2022] [Accepted: 08/03/2023] [Indexed: 08/20/2023] Open

Abstract

Anthracnose, caused by the hemibiotrophic fungus Colletotrichum lindemuthianum, is a damaging disease of common beans that can drastically reduce crop yield. The most effective strategy to manage anthracnose is the use of resistant cultivars. There are many resistance loci that have been identified, mapped and associated with markers in common bean chromosomes. The Leucine-rich repeat kinase receptor protein (LRR-RLK) family is a diverse group of transmembrane receptors, which potentially recognizes pathogen-associated molecular patterns and activates an immune response. In this study, we performed in silico analyses to identify, classify, and characterize common bean LRR-RLKs, also evaluating their expression profile in response to the infection by C. lindemuthianum. By analyzing the entire genome of Phaseolus vulgaris, we could identify and classify 230 LRR-RLKs into 15 different subfamilies. The analyses of gene structures, conserved domains and motifs suggest that LRR-RLKs from the same subfamily are consistent in their exon/intron organization and composition. LRR-RLK genes were found along the 11 chromosomes of the species, including regions of proximity with anthracnose resistance markers. By investigating the duplication events within the LRR-RLK family, we associated the importance of such a family with an expansion resulting from a strong stabilizing selection. Promoter analysis was also performed, highlighting cis-elements associated with the plant response to biotic stress. With regard to the expression pattern of LRR-RLKs in response to the infection by C. lindemuthianum, we could point out several differentially expressed genes in this subfamily, which were associated to specific molecular patterns of LRR-RLKs. Our work provides a broad analysis of the LRR-RLK family in P. vulgaris, allowing an in-depth structural and functional characterization of genes and proteins of this family. 
From specific expression patterns related to anthracnose response, we could infer a direct participation of RLK-LRR genes in the mechanisms of resistance to anthracnose, highlighting important subfamilies for further investigations.

Maatouk M, Merhej V, Pontarotti P, Ibrahim A, Rolain JM, Bittar F. Metallo-Beta-Lactamase-like Encoding Genes in Candidate Phyla Radiation: Widespread and Highly Divergent Proteins with Potential Multifunctionality. Microorganisms 2023;11:1933. [PMID: 37630493 PMCID: PMC10459063 DOI: 10.3390/microorganisms11081933] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Revised: 07/22/2023] [Accepted: 07/27/2023] [Indexed: 08/27/2023] Open

Affiliation(s)

Mohamad Maatouk: Microbes, Evolution, Phylogénie et Infection (MEPHI), Institut de Recherche pour le Développement (IRD), Assistance Publique-Hôpitaux de Marseille (AP-HM), Aix-Marseille University, 13005 Marseille, France; Institut Hospitalo-Universitaire (IHU) Méditerranée Infection, 13005 Marseille, France
Vicky Merhej: Microbes, Evolution, Phylogénie et Infection (MEPHI), Institut de Recherche pour le Développement (IRD), Assistance Publique-Hôpitaux de Marseille (AP-HM), Aix-Marseille University, 13005 Marseille, France; Institut Hospitalo-Universitaire (IHU) Méditerranée Infection, 13005 Marseille, France
Pierre Pontarotti: Microbes, Evolution, Phylogénie et Infection (MEPHI), Institut de Recherche pour le Développement (IRD), Assistance Publique-Hôpitaux de Marseille (AP-HM), Aix-Marseille University, 13005 Marseille, France; Institut Hospitalo-Universitaire (IHU) Méditerranée Infection, 13005 Marseille, France; Centre National de la Recherche Scientifique (CNRS-SNC5039), 13009 Marseille, France
Ahmad Ibrahim: Microbes, Evolution, Phylogénie et Infection (MEPHI), Institut de Recherche pour le Développement (IRD), Assistance Publique-Hôpitaux de Marseille (AP-HM), Aix-Marseille University, 13005 Marseille, France; Institut Hospitalo-Universitaire (IHU) Méditerranée Infection, 13005 Marseille, France
Jean-Marc Rolain: Microbes, Evolution, Phylogénie et Infection (MEPHI), Institut de Recherche pour le Développement (IRD), Assistance Publique-Hôpitaux de Marseille (AP-HM), Aix-Marseille University, 13005 Marseille, France; Institut Hospitalo-Universitaire (IHU) Méditerranée Infection, 13005 Marseille, France
Fadi Bittar: Microbes, Evolution, Phylogénie et Infection (MEPHI), Institut de Recherche pour le Développement (IRD), Assistance Publique-Hôpitaux de Marseille (AP-HM), Aix-Marseille University, 13005 Marseille, France; Institut Hospitalo-Universitaire (IHU) Méditerranée Infection, 13005 Marseille, France

Iruegas R, Pfefferle K, Göttig S, Averhoff B, Ebersberger I. Feature architecture aware phylogenetic profiling indicates a functional diversification of type IVa pili in the nosocomial pathogen Acinetobacter baumannii. PLoS Genet 2023;19:e1010646. [PMID: 37498819 PMCID: PMC10374093 DOI: 10.1371/journal.pgen.1010646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Accepted: 06/06/2023] [Indexed: 07/29/2023] Open

Dosch J, Bergmann H, Tran V, Ebersberger I. FAS: assessing the similarity between proteins using multi-layered feature architectures. Bioinformatics 2023;39:btad226. [PMID: 37084276 PMCID: PMC10185405 DOI: 10.1093/bioinformatics/btad226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 02/23/2023] [Accepted: 04/13/2023] [Indexed: 04/23/2023] Open

Saco A, Suárez H, Novoa B, Figueras A. A Genomic and Transcriptomic Analysis of the C-Type Lectin Gene Family Reveals Highly Expanded and Diversified Repertoires in Bivalves. Mar Drugs 2023;21:254. [PMID: 37103393 PMCID: PMC10140915 DOI: 10.3390/md21040254] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/30/2023] [Revised: 04/17/2023] [Accepted: 04/18/2023] [Indexed: 04/28/2023] Open

Diaz-Parga P, Gould A, de Alba E. Natural and engineered inflammasome adapter proteins reveal optimum linker length for self-assembly. J Biol Chem 2022;298:102501. [PMID: 36116550 PMCID: PMC9640978 DOI: 10.1016/j.jbc.2022.102501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Revised: 08/31/2022] [Accepted: 09/13/2022] [Indexed: 11/16/2022] Open

Gilchrist CLM, Chooi YH. Synthaser: a CD-Search enabled Python toolkit for analysing domain architecture of fungal secondary metabolite megasynth(et)ases. Fungal Biol Biotechnol 2021;8:13. [PMID: 34763725 PMCID: PMC8582187 DOI: 10.1186/s40694-021-00120-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/30/2021] [Accepted: 10/29/2021] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Fungi are prolific producers of secondary metabolites (SMs), which are bioactive small molecules with important applications in medicine, agriculture and other industries. The backbones of a large proportion of fungal SMs are generated through the action of large, multi-domain megasynth(et)ases such as polyketide synthases (PKSs) and nonribosomal peptide synthetases (NRPSs). The structure of these backbones is determined by the domain architecture of the corresponding megasynth(et)ase, and thus accurate annotation and classification of these architectures is an important step in linking SMs to their biosynthetic origins in the genome.

RESULTS

Here we report synthaser, a Python package leveraging the NCBI's conserved domain search tool for remote prediction and classification of fungal megasynth(et)ase domain architectures. Synthaser is capable of batch sequence analysis, and produces rich textual output and interactive visualisations which allow for quick assessment of the megasynth(et)ase diversity of a fungal genome. Synthaser uses a hierarchical rule-based classification system, which can be extensively customised by the user through a web application ( http://gamcil.github.io/synthaser ). We show that synthaser provides more accurate domain architecture predictions than comparable tools which rely on curated profile hidden Markov model (pHMM)-based approaches; the utilisation of the NCBI conserved domain database also allows for significantly greater flexibility compared to pHMM approaches. In addition, we demonstrate how synthaser can be applied to large scale genome mining pipelines through the construction of an Aspergillus PKS similarity network.

CONCLUSIONS

Synthaser is an easy to use tool that represents a significant upgrade to previous domain architecture analysis tools. It is freely available under a MIT license from PyPI ( https://pypi.org/project/synthaser ) and GitHub ( https://github.com/gamcil/synthaser ).

Zhao VY, Rodrigues JV, Lozovsky ER, Hartl DL, Shakhnovich EI. Switching an active site helix in dihydrofolate reductase reveals limits to subdomain modularity. Biophys J 2021;120:4738-4750. [PMID: 34571014 PMCID: PMC8595743 DOI: 10.1016/j.bpj.2021.09.032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2021] [Revised: 09/14/2021] [Accepted: 09/22/2021] [Indexed: 11/23/2022] Open

Álvarez-Lugo A, Becerra A. The Role of Gene Duplication in the Divergence of Enzyme Function: A Comparative Approach. Front Genet 2021;12:641817. [PMID: 34335678 PMCID: PMC8318041 DOI: 10.3389/fgene.2021.641817] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/15/2020] [Accepted: 06/21/2021] [Indexed: 11/13/2022] Open

Caetano-Anollés G. The Compressed Vocabulary of Microbial Life. Front Microbiol 2021;12:655990. [PMID: 34305827 PMCID: PMC8292947 DOI: 10.3389/fmicb.2021.655990] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/19/2021] [Accepted: 04/27/2021] [Indexed: 12/22/2022] Open

Abstract

Communication is an undisputed central activity of life that requires an evolving molecular language. It conveys meaning through messages and vocabularies. Here, I explore the existence of a growing vocabulary in the molecules and molecular functions of the microbial world. There are clear correspondences between the lexicon, syntax, semantics, and pragmatics of language organization and the module, structure, function, and fitness paradigms of molecular biology. These correspondences are constrained by universal laws and engineering principles. Macromolecular structure, for example, follows quantitative linguistic patterns arising from statistical laws that are likely universal, including the Zipf's law, a special case of the scale-free distribution, the Heaps' law describing sublinear growth typical of economies of scales, and the Menzerath-Altmann's law, which imposes size-dependent patterns of decreasing returns. Trade-off solutions between principles of economy, flexibility, and robustness define a "triangle of persistence" describing the impact of the environment on a biological system. The pragmatic landscape of the triangle interfaces with the syntax and semantics of molecular languages, which together with comparative and evolutionary genomic data can explain global patterns of diversification of cellular life. The vocabularies of proteins (proteomes) and functions (functionomes) revealed a significant universal lexical core supporting a universal common ancestor, an ancestral evolutionary link between Bacteria and Eukarya, and distinct reductive evolutionary strategies of language compression in Archaea and Bacteria. A "causal" word cloud strategy inspired by the dependency grammar paradigm used in catenae unfolded the evolution of lexical units associated with Gene Ontology terms at different levels of ontological abstraction. 
While Archaea holds the smallest, oldest, and most homogeneous vocabulary of all superkingdoms, Bacteria heterogeneously apportions a more complex vocabulary, and Eukarya pushes functional innovation through mechanisms of flexibility and robustness.

Rauer C, Sen N, Waman VP, Abbasian M, Orengo CA. Computational approaches to predict protein functional families and functional sites. Curr Opin Struct Biol 2021;70:108-122. [PMID: 34225010 DOI: 10.1016/j.sbi.2021.05.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/29/2021] [Revised: 05/13/2021] [Accepted: 05/25/2021] [Indexed: 01/06/2023]

de Rond T, Asay JE, Moore BS. Co-occurrence of enzyme domains guides the discovery of an oxazolone synthetase. Nat Chem Biol 2021;17:794-799. [PMID: 34099916 PMCID: PMC8238888 DOI: 10.1038/s41589-021-00808-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/27/2020] [Accepted: 04/29/2021] [Indexed: 02/04/2023]

Das S, Scholes HM, Sen N, Orengo C. CATH functional families predict functional sites in proteins. Bioinformatics 2021;37:1099-1106. [PMID: 33135053 PMCID: PMC8150129 DOI: 10.1093/bioinformatics/btaa937] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/05/2020] [Revised: 09/30/2020] [Accepted: 10/27/2020] [Indexed: 01/12/2023] Open

Bordin N, Sillitoe I, Lees JG, Orengo C. Tracing Evolution Through Protein Structures: Nature Captured in a Few Thousand Folds. Front Mol Biosci 2021;8:668184. [PMID: 34041266 PMCID: PMC8141709 DOI: 10.3389/fmolb.2021.668184] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/15/2021] [Accepted: 04/27/2021] [Indexed: 11/13/2022] Open

One of Nature’s Basic Laws: Combination-Sharing. HUMAN ARENAS 2021. [DOI: 10.1007/s42087-021-00215-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]

Allosteric cooperation in a de novo-designed two-domain protein. Proc Natl Acad Sci U S A 2020;117:33246-33253. [PMID: 33318174 PMCID: PMC7776816 DOI: 10.1073/pnas.2017062117] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 11/18/2022] Open

Aledo JC, Aledo P. Susceptibility of Protein Methionine Oxidation in Response to Hydrogen Peroxide Treatment-Ex Vivo Versus In Vitro: A Computational Insight. Antioxidants (Basel) 2020;9:987. [PMID: 33066324 PMCID: PMC7602125 DOI: 10.3390/antiox9100987] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/16/2020] [Revised: 10/08/2020] [Accepted: 10/09/2020] [Indexed: 11/25/2022] Open

Wen Z, He J, Huang SY. Topology-independent and global protein structure alignment through an FFT-based algorithm. Bioinformatics 2020;36:478-486. [PMID: 31384919 DOI: 10.1093/bioinformatics/btz609] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/18/2019] [Revised: 07/22/2019] [Accepted: 08/02/2019] [Indexed: 12/12/2022] Open

Czubat B, Minias A, Brzostek A, Żaczek A, Struś K, Zakrzewska-Czerwińska J, Dziadek J. Functional Disassociation Between the Protein Domains of MSMEG_4305 of Mycolicibacterium smegmatis (Mycobacterium smegmatis) in vivo. Front Microbiol 2020;11:2008. [PMID: 32973726 PMCID: PMC7466739 DOI: 10.3389/fmicb.2020.02008] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/06/2020] [Accepted: 07/29/2020] [Indexed: 12/02/2022] Open

Harish A, Morrison D. The deep(er) roots of Eukaryotes and Akaryotes. F1000Res 2020;9:112. [PMID: 32685134 PMCID: PMC7336049 DOI: 10.12688/f1000research.22338.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 06/16/2020] [Indexed: 02/05/2023] Open

Abstract

Background: Locating the root node of the "tree of life" (ToL) is one of the hardest problems in phylogenetics, given the time depth. The root-node, or the universal common ancestor (UCA), groups descendants into organismal clades/domains. Two notable variants of the two-domains ToL (2D-ToL) have gained support recently. One 2D-ToL posits that eukaryotes (organisms with nuclei) and akaryotes (organisms without nuclei) are sister clades that diverged from the UCA, and that Asgard archaea are sister to other archaea. The other 2D-ToL proposes that eukaryotes emerged from within archaea and places Asgard archaea as sister to eukaryotes. Williams et al. ( Nature Ecol. Evol. 4: 138-147; 2020) re-evaluated the data and methods that support the competing two-domains proposals and concluded that eukaryotes are the closest relatives of Asgard archaea. Critique: The poor resolution of the archaea in their analysis, despite employing amino acid alignments from thousands of proteins and the best-fitting substitution models, contradicts their conclusions. We argue that they overlooked important aspects of estimating evolutionary relatedness and assessing phylogenetic signal in empirical data. Which 2D-ToL is better supported depends on which kind of molecular features are better for resolving common ancestors at the roots of clades - protein-domains or their component amino acids. We focus on phylogenetic character reconstructions necessary to describe the UCA or its closest descendants in the absence of reliable fossils. Clarifications: It is well known that different character types present different perspectives on evolutionary history that relate to different phylogenetic depths. We show that protein structural-domains support more reliable phylogenetic reconstructions of deep-diverging clades in the ToL. Accordingly, Eukaryotes and Akaryotes are better supported clades in a 2D-ToL.

Carrillo-Campos J. Estructura y función de las oxigenasas tipo Rieske/mononuclear [Structure and function of Rieske/mononuclear-type oxygenases]. TIP Revista Especializada en Ciencias Químico-Biológicas 2019. [DOI: 10.22201/fesz.23958723e.2019.0.196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/13/2022]

Mechanisms of noncanonical binding dynamics in multivalent protein-protein interactions. Proc Natl Acad Sci U S A 2019;116:25659-25667. [PMID: 31776263 DOI: 10.1073/pnas.1902909116] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 11/18/2022]

Abstract

Protein multivalency can provide increased affinity and specificity relative to monovalent counterparts, but these emergent biochemical properties and their mechanistic underpinnings are difficult to predict as a function of the biophysical properties of the multivalent binding partners. Here, we present a mathematical model that accurately simulates binding kinetics and equilibria of multivalent protein-protein interactions as a function of the kinetics of monomer-monomer binding, the structure and topology of the multidomain interacting partners, and the valency of each partner. These properties are all experimentally or computationally estimated a priori, including approximating topology with a worm-like chain model applicable to a variety of structurally disparate systems, thus making the model predictive without parameter fitting. We conceptualize multivalent binding as a protein-protein interaction network: ligand and receptor valencies determine the number of interacting species in the network, with monomer kinetics and structural properties dictating the dynamics of each species. As predicted by the model and validated by surface plasmon resonance experiments, multivalent interactions can generate several noncanonical macroscopic binding dynamics, including a transient burst of high-energy configurations during association, biphasic equilibria resulting from interligand competition at high concentrations, and multiexponential dissociation arising from differential lifetimes of distinct network species. The transient burst was only uncovered when extending our analysis to trivalent interactions due to the significantly larger network, and we were able to predictably tune burst magnitude by altering linker rigidity. This study elucidates mechanisms of multivalent binding and establishes a framework for model-guided analysis and engineering of such interactions.

Pascarella S. Computational classification of MocR transcriptional regulators into subgroups as a support for experimental and functional characterization. Bioinformation 2019;15:151-159. [PMID: 31435161 PMCID: PMC6677901 DOI: 10.6026/97320630015151] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/15/2019] [Accepted: 02/03/2019] [Indexed: 11/23/2022]

Baker EP, Hittinger CT. Evolution of a novel chimeric maltotriose transporter in Saccharomyces eubayanus from parent proteins unable to perform this function. PLoS Genet 2019;15:e1007786. [PMID: 30946740 PMCID: PMC6448821 DOI: 10.1371/journal.pgen.1007786] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 07/13/2018] [Accepted: 10/25/2018] [Indexed: 11/23/2022]

Abstract

At the molecular level, the evolution of new traits can be broadly divided between changes in gene expression and changes in protein-coding sequence. For proteins, the evolution of novel functions is generally thought to proceed through sequential point mutations or recombination of whole functional units. In Saccharomyces, the uptake of the sugar maltotriose into the cell is the primary limiting factor in its utilization, but maltotriose transporters are relatively rare, except in brewing strains. No known wild strains of Saccharomyces eubayanus, the cold-tolerant parent of hybrid lager-brewing yeasts (Saccharomyces cerevisiae x S. eubayanus), are able to consume maltotriose, which limits their ability to fully ferment malt extract. In one strain of S. eubayanus, we found a gene closely related to a known maltotriose transporter and were able to confer maltotriose consumption by overexpressing this gene or by passaging the strain on maltose. Even so, most wild strains of S. eubayanus lack native maltotriose transporters. To determine how this rare trait could evolve in naive genetic backgrounds, we performed an adaptive evolution experiment for maltotriose consumption, which yielded a single strain of S. eubayanus able to grow on maltotriose. We mapped the causative locus to a gene encoding a novel chimeric transporter that was formed by an ectopic recombination event between two genes encoding transporters that are unable to import maltotriose. In contrast to classic models of the evolution of novel protein functions, the recombination breakpoints occurred within a single functional domain. Thus, the ability of the new protein to carry maltotriose was likely acquired through epistatic interactions between independently evolved substitutions. By acquiring multiple mutations at once, the transporter rapidly gained a novel function, while bypassing potentially deleterious intermediate steps. 
This study provides an illuminating example of how recombination between paralogs can establish novel interactions among substitutions to create adaptive functions.

Hybrids of the yeasts Saccharomyces cerevisiae and Saccharomyces eubayanus (lager-brewing yeasts) dominate the modern brewing industry. S. cerevisiae, also known as baker’s yeast, is well-known for its role in industry and scientific research. Less well recognized is S. eubayanus, which was only discovered as a pure species in 2011. While most lager-brewing yeasts rapidly and completely utilize the important brewing sugar maltotriose, no strain of S. eubayanus isolated to date is known to do so. Despite being unable to consume maltotriose, we identified one strain of S. eubayanus carrying a gene for a functional maltotriose transporter, although most strains lack this gene. During an adaptive evolution experiment, a strain of S. eubayanus without native maltotriose transporters evolved the ability to grow on maltotriose. Maltotriose consumption in the evolved strain resulted from a chimeric transporter that arose by shuffling genes encoding parent proteins that were unable to transport maltotriose. Traditionally, functional chimeric proteins are thought to evolve by shuffling discrete functional domains or modules, but the breakpoints in the chimera studied here occurred within the single functional module of the protein. These results support the less well-recognized role of shuffling duplicate gene sequences to generate novel proteins with adaptive functions.

Debiec KT, Whitley MJ, Koharudin LMI, Chong LT, Gronenborn AM. Integrating NMR, SAXS, and Atomistic Simulations: Structure and Dynamics of a Two-Domain Protein. Biophys J 2018;114:839-855. [PMID: 29490245 DOI: 10.1016/j.bpj.2018.01.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 08/22/2017] [Revised: 12/19/2017] [Accepted: 01/02/2018] [Indexed: 12/21/2022]

Byrne R, Schneider G. In Silico Target Prediction for Small Molecules. Methods Mol Biol 2019;1888:273-309. [PMID: 30519953 DOI: 10.1007/978-1-4939-8891-4_16] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 06/09/2023]

Swaroop Srivastava S, Raman R, Kiran U, Garg R, Chadalawada S, Pawar AD, Sankaranarayanan R, Sharma Y. Interface interactions between βγ-crystallin domain and Ig-like domain render Ca²⁺-binding site inoperative in abundant perithecial protein of Neurospora crassa. Mol Microbiol 2018;110:955-972. [PMID: 30216631 DOI: 10.1111/mmi.14130] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Accepted: 09/10/2018] [Indexed: 11/30/2022]

Jakubec D, Kratochvíl M, Vymětal J, Vondrášek J. Widespread evolutionary crosstalk among protein domains in the context of multi-domain proteins. PLoS One 2018;13:e0203085. [PMID: 30169546 PMCID: PMC6118372 DOI: 10.1371/journal.pone.0203085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2018] [Accepted: 08/14/2018] [Indexed: 11/20/2022]

Mehrotra P, Ami VKG, Srinivasan N. Clustering of multi-domain protein sequences. Proteins 2018;86:759-776. [PMID: 29675880 DOI: 10.1002/prot.25510] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/03/2017] [Revised: 04/09/2018] [Accepted: 04/16/2018] [Indexed: 11/06/2022]

Exploring modular allostery via interchangeable regulatory domains. Proc Natl Acad Sci U S A 2018;115:3006-3011. [PMID: 29507215 DOI: 10.1073/pnas.1717621115] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 01/25/2023]

Blacklock KM, Yang L, Mulligan VK, Khare SD. A computational method for the design of nested proteins by loop-directed domain insertion. Proteins 2018;86:354-369. [PMID: 29250820 DOI: 10.1002/prot.25445] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/24/2017] [Revised: 12/04/2017] [Accepted: 12/15/2017] [Indexed: 12/23/2022]

Sasnauskas G, Tamulaitienė G, Tamulaitis G, Čalyševa J, Laime M, Rimšelienė R, Lubys A, Siksnys V. UbaLAI is a monomeric Type IIE restriction enzyme. Nucleic Acids Res 2017;45:9583-9594. [PMID: 28934493 PMCID: PMC5766183 DOI: 10.1093/nar/gkx634] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/08/2017] [Revised: 07/08/2017] [Accepted: 07/11/2017] [Indexed: 01/11/2023]

Adaptive evolution by spontaneous domain fusion and protein relocalization. Nat Ecol Evol 2017;1:1562-1568. [PMID: 29185504 DOI: 10.1038/s41559-017-0283-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 12/14/2016] [Accepted: 07/18/2017] [Indexed: 11/08/2022]

Lechno-Yossef S, Melnicki MR, Bao H, Montgomery BL, Kerfeld CA. Synthetic OCP heterodimers are photoactive and recapitulate the fusion of two primitive carotenoproteins in the evolution of cyanobacterial photoprotection. Plant J 2017;91:646-656. [PMID: 28503830 DOI: 10.1111/tpj.13593] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 03/04/2017] [Revised: 04/25/2017] [Accepted: 05/03/2017] [Indexed: 06/07/2023]

Esch L, Schaffrath U. An Update on Jacalin-Like Lectins and Their Role in Plant Defense. Int J Mol Sci 2017;18:1592. [PMID: 28737678 PMCID: PMC5536079 DOI: 10.3390/ijms18071592] [Citation(s) in RCA: 49] [Impact Index Per Article: 7.0] [Received: 06/30/2017] [Revised: 07/17/2017] [Accepted: 07/20/2017] [Indexed: 12/11/2022]

Yang F, Sun S, Tan G, Costanzo M, Hill DE, Vidal M, Andrews BJ, Boone C, Roth FP. Identifying pathogenicity of human variants via paralog-based yeast complementation. PLoS Genet 2017;13:e1006779. [PMID: 28542158 PMCID: PMC5466341 DOI: 10.1371/journal.pgen.1006779] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/25/2016] [Revised: 06/09/2017] [Accepted: 04/25/2017] [Indexed: 11/21/2022]

Abstract

To better understand the health implications of personal genomes, we now face a largely unmet challenge to identify functional variants within disease-associated genes. Functional variants can be identified by trans-species complementation, e.g., by failure to rescue a yeast strain bearing a mutation in an orthologous human gene. Although orthologous complementation assays are powerful predictors of pathogenic variation, they are available for only a few percent of human disease genes. Here we systematically examine the question of whether complementation assays based on paralogy relationships can expand the number of human disease genes with functional variant detection assays. We tested over 1,000 paralogous human-yeast gene pairs for complementation, yielding 34 complementation relationships, of which 33 (97%) were novel. We found that paralog-based assays identified disease variants with success on par with that of orthology-based assays. Combining all homology-based assay results, we found that complementation can often identify pathogenic variants outside the homologous sequence region, presumably because of global effects on protein folding or stability. Within our search space, paralogy-based complementation more than doubled the number of human disease genes with a yeast-based complementation assay for disease variation.

Functional complementation assays of human disease-associated gene variants can reveal many more human disease variants at high confidence than current computational approaches, even using highly-diverged model organisms. However, this has generally only been possible for a minority of human disease genes for which orthologous complementation is known in the relevant model organism, so that alternative assays are urgently needed. Here we show that complementation relationships can be found for many additional human disease genes by exploiting paralogous human-yeast gene relationships, and that disease variant identification using paralogy-based assays performs on par with orthology-based assays.

Affiliation(s)

Fan Yang: Donnelly Centre, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada
Song Sun: Donnelly Centre, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada; Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden
Guihong Tan: Donnelly Centre, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
Michael Costanzo: Donnelly Centre, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
David E. Hill: Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America; Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
Marc Vidal: Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America; Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
Brenda J. Andrews: Donnelly Centre, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada
Charles Boone: Donnelly Centre, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Canadian Institute for Advanced Research, Toronto, Ontario, Canada
Frederick P. Roth: Donnelly Centre, Toronto, Ontario, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario, Canada; Center for Cancer Systems Biology (CCSB), Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America; Canadian Institute for Advanced Research, Toronto, Ontario, Canada

Van Holle S, De Schutter K, Eggermont L, Tsaneva M, Dang L, Van Damme EJM. Comparative Study of Lectin Domains in Model Species: New Insights into Evolutionary Dynamics. Int J Mol Sci 2017;18:1136. [PMID: 28587095 PMCID: PMC5485960 DOI: 10.3390/ijms18061136] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/06/2017] [Revised: 05/20/2017] [Accepted: 05/22/2017] [Indexed: 01/07/2023]

Burkhart BJ, Schwalen CJ, Mann G, Naismith JH, Mitchell DA. YcaO-Dependent Posttranslational Amide Activation: Biosynthesis, Structure, and Function. Chem Rev 2017;117:5389-5456. [PMID: 28256131 DOI: 10.1021/acs.chemrev.6b00623] [Citation(s) in RCA: 138] [Impact Index Per Article: 19.7] [Indexed: 01/18/2023]

Van Holle S, Rougé P, Van Damme EJM. Evolution and structural diversification of Nictaba-like lectin genes in food crops with a focus on soybean (Glycine max). Ann Bot 2017;119:901-914. [PMID: 28087663 PMCID: PMC5379587 DOI: 10.1093/aob/mcw259] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/02/2016] [Revised: 10/24/2016] [Accepted: 11/17/2016] [Indexed: 05/10/2023]

Abstract

Background and Aims

The Nictaba family groups all proteins that show homology to Nictaba, the tobacco lectin. So far, Nictaba and an Arabidopsis thaliana homologue have been shown to be implicated in the plant stress response. The availability of more than 50 sequenced plant genomes provided the opportunity for a genome-wide identification of Nictaba-like genes in 15 species, representing members of the Fabaceae, Poaceae, Solanaceae, Musaceae, Arecaceae, Malvaceae and Rubiaceae. Additionally, phylogenetic relationships between the different species were explored. Furthermore, this study included domain organization analysis, searching for orthologous genes in the legume family and transcript profiling of the Nictaba-like lectin genes in soybean.

Methods

Using a combination of BLASTp, InterPro analysis and hidden Markov models, the genomes of Medicago truncatula, Cicer arietinum, Lotus japonicus, Glycine max, Cajanus cajan, Phaseolus vulgaris, Theobroma cacao, Solanum lycopersicum, Solanum tuberosum, Coffea canephora, Oryza sativa, Zea mays, Sorghum bicolor, Musa acuminata and Elaeis guineensis were searched for Nictaba-like genes. Phylogenetic analysis was performed using RAxML and additional protein domains in the Nictaba-like sequences were identified using InterPro. Expression analysis of the soybean Nictaba-like genes was investigated using microarray data.

Key Results

Nictaba-like genes were identified in all studied species and analysis of the duplication events demonstrated that both tandem and segmental duplication contributed to the expansion of the Nictaba gene family in angiosperms. The single-domain Nictaba protein and the multi-domain F-box Nictaba architectures are ubiquitous among all analysed species and microarray analysis revealed differential expression patterns for all soybean Nictaba-like genes.

Conclusions

Taken together, the comparative genomics data contribute to our understanding of the Nictaba-like gene family in species for which the occurrence of Nictaba domains had not yet been investigated. Given the ubiquitous nature of these genes, they have probably acquired new functions over time and are expected to take on various roles in plant development and defence.

Moar WJ, Evans AJ, Kessenich CR, Baum JA, Bowen DJ, Edrington TC, Haas JA, Kouadio JLK, Roberts JK, Silvanovich A, Yin Y, Weiner BE, Glenn KC, Odegaard ML. The sequence, structural, and functional diversity within a protein family and implications for specificity and safety: The case for ETX_MTX2 insecticidal proteins. J Invertebr Pathol 2017;142:50-59. [DOI: 10.1016/j.jip.2016.05.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 03/25/2016] [Revised: 05/20/2016] [Accepted: 05/24/2016] [Indexed: 11/26/2022]

Cao H, Yang X, Jin L, Han W, Zhang Y. Module recombination and functional integration of oligosaccharide-producing multifunctional amylase. J Mol Catal B Enzym 2016. [DOI: 10.1016/j.molcatb.2016.08.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/31/2022]

List JM, Pathmanathan JS, Lopez P, Bapteste E. Unity and disunity in evolutionary sciences: process-based analogies open common research avenues for biology and linguistics. Biol Direct 2016;11:39. [PMID: 27544206 PMCID: PMC4992195 DOI: 10.1186/s13062-016-0145-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 05/22/2016] [Accepted: 08/06/2016] [Indexed: 11/13/2022]

Abstract

Background

For a long time biologists and linguists have been noticing surprising similarities between the evolution of life forms and languages. Most of the proposed analogies have been rejected. Some, however, have persisted, and some even turned out to be fruitful, inspiring the transfer of methods and models between biology and linguistics up to today. Most proposed analogies were based on a comparison of the research objects rather than the processes that shaped their evolution. Focusing on process-based analogies, however, has the advantage of minimizing the risk of overstating similarities, while at the same time reflecting the common strategy to use processes to explain the evolution of complexity in both fields.

Results

We compared important evolutionary processes in biology and linguistics and identified processes specific to only one of the two disciplines as well as processes which seem to be analogous, potentially reflecting core evolutionary processes. These new process-based analogies support novel methodological transfer, expanding the application range of biological methods to the field of historical linguistics. We illustrate this by showing (i) how methods dealing with incomplete lineage sorting offer an introgression-free framework to analyze highly mosaic word distributions across languages; (ii) how sequence similarity networks can be used to identify composite and borrowed words across different languages; (iii) how research on partial homology can inspire new methods and models in both fields; and (iv) how constructive neutral evolution provides an original framework for analyzing convergent evolution in languages resulting from common descent (Sapir’s drift).

Conclusions

Apart from new analogies between evolutionary processes, we also identified processes which are specific to either biology or linguistics. This shows that general evolution cannot be studied from within one discipline alone. In order to get a full picture of evolution, biologists and linguists need to complement their studies, trying to identify cross-disciplinary and discipline-specific evolutionary processes. The fact that we found many process-based analogies favoring transfer from biology to linguistics further shows that certain biological methods and models have a broader scope than previously recognized. This opens fruitful paths for collaboration between the two disciplines.

Reviewers

This article was reviewed by W. Ford Doolittle and Eugene V. Koonin.

Electronic supplementary material

The online version of this article (doi:10.1186/s13062-016-0145-2) contains supplementary material, which is available to authorized users.

Doğan T, MacDougall A, Saidi R, Poggioli D, Bateman A, O'Donovan C, Martin MJ. UniProt-DAAC: domain architecture alignment and classification, a new method for automatic functional annotation in UniProtKB. Bioinformatics 2016;32:2264-71. [PMID: 27153729 PMCID: PMC4965628 DOI: 10.1093/bioinformatics/btw114] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 08/29/2015] [Revised: 01/22/2016] [Accepted: 02/25/2016] [Indexed: 11/17/2022]

Jacobs TM, Williams B, Williams T, Xu X, Eletsky A, Federizon JF, Szyperski T, Kuhlman B. Design of structurally distinct proteins using strategies inspired by evolution. Science 2016;352:687-90. [PMID: 27151863 DOI: 10.1126/science.aad8036] [Citation(s) in RCA: 105] [Impact Index Per Article: 13.1] [Received: 11/05/2015] [Accepted: 03/14/2016] [Indexed: 12/25/2022]