1. Hourihane E, Hixon KR. Nanoparticles as Drug Delivery Vehicles for People with Cystic Fibrosis. Biomimetics (Basel) 2024; 9:574. [PMID: 39329596 PMCID: PMC11430251 DOI: 10.3390/biomimetics9090574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2024] [Revised: 08/29/2024] [Accepted: 09/11/2024] [Indexed: 09/28/2024] Open
Abstract
Cystic Fibrosis (CF) is a life-shortening, genetic disease that affects approximately 145,000 people worldwide. CF causes a dehydrated mucus layer in the lungs, leading to damaging infection and inflammation that eventually result in death. Nanoparticles (NPs), drug delivery vehicles intended for inhalation, have become a recent source of interest for treating CF and CF-related conditions, and many formulations have been created thus far. This paper is intended to provide an overview of CF and the effect it has on the lungs, the barriers in using NP drug delivery vehicles for treatment, and three common material class choices for these NP formulations: metals, polymers, and lipids. The materials to be discussed include gold, silver, and iron oxide metallic NPs; polyethylene glycol, chitosan, poly lactic-co-glycolic acid, and alginate polymeric NPs; and lipid-based NPs. The novelty of this review comes from a less specific focus on nanoparticle examples, with the focus instead being on the general theory behind material function, why or how a material might be used, and how it may be preferable to other materials used in treating CF. Finally, this paper ends with a short discussion of the two FDA-approved NPs for treatment of CF-related conditions and a recommendation for the future usage of NPs in people with Cystic Fibrosis (pwCF).
Affiliation(s)
- Eoin Hourihane
  - Thayer School of Engineering, Dartmouth College, Hanover, NH 03755, USA
- Katherine R. Hixon
  - Thayer School of Engineering, Dartmouth College, Hanover, NH 03755, USA
  - Geisel School of Medicine, Dartmouth College, Hanover, NH 03755, USA
2. Chen Y, Zhang Y, He Y. Enhancing Vaxign-DL for Vaccine Candidate Prediction with added ESM-Generated Features. bioRxiv 2024:2024.09.04.611295. [PMID: 39282385 PMCID: PMC11398487 DOI: 10.1101/2024.09.04.611295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/20/2024]
Abstract
Many vaccine design programs have been developed, including our own machine learning approaches Vaxign-ML and Vaxign-DL. Using deep learning techniques, Vaxign-DL predicts bacterial protective antigens by calculating 509 biological and biomedical features from protein sequences. In this study, we first used the protein folding ESM program to calculate a set of 1,280 features from individual protein sequences, and then utilized the new set of features separately or in combination with the traditional set of 509 features to predict protective antigens. Our results showed that the ESM-derived features alone were able to accurately predict vaccine antigens with a performance similar to the original Vaxign-DL prediction method, and that the combined ESM-derived and original Vaxign-DL features significantly improved the prediction performance according to a set of seven scores including specificity, sensitivity, and AUROC. To further evaluate the updated methods, we conducted a Leave-One-Pathogen-Out Validation (LOPOV) study, and found that the usage of ESM-derived features significantly improved the prediction of vaccine antigens from 10 bacterial pathogens. This research is the first reported study demonstrating the added value of protein folding features for vaccine antigen prediction.
Affiliation(s)
- Yichao Chen
  - University of Michigan, Ann Arbor, MI 48109, USA
  - Penn State University, State College, PA 16803, USA
- Yuhan Zhang
  - University of Michigan, Ann Arbor, MI 48109, USA
- Yongqun He
  - University of Michigan, Ann Arbor, MI 48109, USA
3. Brom JA, Petrikis RG, Nieukirk GE, Bourque J, Pielak GJ. Protecting Lyophilized Escherichia coli Adenylate Kinase. Mol Pharm 2024; 21:3634-3642. [PMID: 38805365 DOI: 10.1021/acs.molpharmaceut.4c00356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2024]
Abstract
Drying protein-based drugs, usually via lyophilization, can facilitate storage at ambient temperature and improve accessibility but many proteins cannot withstand drying and must be formulated with protective additives called excipients. However, mechanisms of protection are poorly understood, precluding rational formulation design. To better understand dry proteins and their protection, we examine Escherichia coli adenylate kinase (AdK) lyophilized alone and with the additives trehalose, maltose, bovine serum albumin, cytosolic abundant heat soluble protein D, histidine, and arginine. We apply liquid-observed vapor exchange NMR to interrogate the residue-level structure in the presence and absence of additives. We pair these observations with differential scanning calorimetry data of lyophilized samples and AdK activity assays with and without heating. We show that the amino acids do not preserve the native structure as well as sugars or proteins and that after heating the most stable additives protect activity best.
Affiliation(s)
- Julia A Brom
  - Department of Chemistry, University of North Carolina at Chapel Hill (UNC-CH), 3250 Genome Sciences Building, Chapel Hill, North Carolina 27599-3290, United States
- Ruta G Petrikis
  - Department of Chemistry, University of North Carolina at Chapel Hill (UNC-CH), 3250 Genome Sciences Building, Chapel Hill, North Carolina 27599-3290, United States
- Grace E Nieukirk
  - Department of Chemistry, University of North Carolina at Chapel Hill (UNC-CH), 3250 Genome Sciences Building, Chapel Hill, North Carolina 27599-3290, United States
- Joshua Bourque
  - Department of Chemistry, University of North Carolina at Chapel Hill (UNC-CH), 3250 Genome Sciences Building, Chapel Hill, North Carolina 27599-3290, United States
- Gary J Pielak
  - Department of Chemistry, University of North Carolina at Chapel Hill (UNC-CH), 3250 Genome Sciences Building, Chapel Hill, North Carolina 27599-3290, United States
  - Department of Biochemistry & Biophysics, UNC-CH, Chapel Hill, North Carolina 27599, United States
  - Lineberger Cancer Center, UNC-CH, Chapel Hill, North Carolina 27599, United States
  - Integrative Program for Biological and Genome Sciences, UNC-CH, Chapel Hill, North Carolina 27599, United States
4. Schnettler JD, Wang MS, Gantz M, Bunzel HA, Karas C, Hollfelder F, Hecht MH. Selection of a promiscuous minimalist cAMP phosphodiesterase from a library of de novo designed proteins. Nat Chem 2024; 16:1200-1208. [PMID: 38702405 PMCID: PMC11230910 DOI: 10.1038/s41557-024-01490-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 02/27/2024] [Indexed: 05/06/2024]
Abstract
The ability of unevolved amino acid sequences to become biological catalysts was key to the emergence of life on Earth. However, billions of years of evolution separate complex modern enzymes from their simpler early ancestors. To probe how unevolved sequences can develop new functions, we use ultrahigh-throughput droplet microfluidics to screen for phosphoesterase activity amidst a library of more than one million sequences based on a de novo designed 4-helix bundle. Characterization of hits revealed that acquisition of function involved a large jump in sequence space enriching for truncations that removed >40% of the protein chain. Biophysical characterization of a catalytically active truncated protein revealed that it dimerizes into an α-helical structure, with the gain of function accompanied by increased structural dynamics. The identified phosphodiesterase is a manganese-dependent metalloenzyme that hydrolyses a range of phosphodiesters. It is most active towards cyclic AMP, with a rate acceleration of ~10⁹ and a catalytic proficiency of >10¹⁴ M⁻¹, comparable to larger enzymes shaped by billions of years of evolution.
Affiliation(s)
- Michael S Wang
  - Department of Chemistry, Princeton University, Princeton, USA
- Maximilian Gantz
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
- H Adrian Bunzel
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Christina Karas
  - Department of Molecular Biology, Princeton University, Princeton, USA
- Michael H Hecht
  - Department of Chemistry, Princeton University, Princeton, USA
5. Fang T, Mohseni A, Lonardi S, Ben Mamoun C. Properties and predicted functions of large genes and proteins of apicomplexan parasites. NAR Genom Bioinform 2024; 6:lqae032. [PMID: 38584870 PMCID: PMC10993292 DOI: 10.1093/nargab/lqae032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/02/2024] [Revised: 02/23/2024] [Accepted: 03/20/2024] [Indexed: 04/09/2024] Open
Abstract
Evolutionary constraints greatly favor compact genomes that efficiently encode proteins. However, several eukaryotic organisms, including apicomplexan parasites such as Toxoplasma gondii, Plasmodium falciparum and Babesia duncani, the causative agents of toxoplasmosis, malaria and babesiosis, respectively, encode very large proteins, exceeding 20 times their average protein size. Although these large proteins represent <1% of the total protein pool and are generally expressed at low levels, their persistence throughout evolution raises important questions about their functions and possible evolutionary pressures to maintain them. In this study, we examined the trends in gene and protein size, function and expression patterns within seven apicomplexan pathogens. Our analysis revealed that certain large proteins in apicomplexan parasites harbor domains potentially important for functions such as antigenic variation, erythrocyte invasion and immune evasion. However, these domains are not limited to or strictly conserved within large proteins. While some of these proteins are predicted to engage in conventional metabolic pathways within these parasites, others fulfill specialized functions for pathogen-host interactions, nutrient acquisition and overall survival.
Affiliation(s)
- Tiffany Fang
  - Department of Internal Medicine, Section of Infectious Diseases, Department of Microbial Pathogenesis and Department of Pathology, Yale School of Medicine, New Haven, CT, 06520 USA
- Amir Mohseni
  - Department of Computer Science and Engineering, University of California, Riverside, CA, 92521 USA
- Stefano Lonardi
  - Department of Computer Science and Engineering, University of California, Riverside, CA, 92521 USA
- Choukri Ben Mamoun
  - Department of Internal Medicine, Section of Infectious Diseases, Department of Microbial Pathogenesis and Department of Pathology, Yale School of Medicine, New Haven, CT, 06520 USA
6. Salgado JCS, Alnoch RC, Polizeli MDLTDM, Ward RJ. Microenzymes: Is There Anybody Out There? Protein J 2024; 43:393-404. [PMID: 38507106 DOI: 10.1007/s10930-024-10193-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/08/2024] [Indexed: 03/22/2024]
Abstract
Biological macromolecules are found in different shapes and sizes. Among these, enzymes catalyze biochemical reactions and are essential in all organisms, but is there a size limit for them to function properly? Large enzymes such as catalases have molecular weights of hundreds of kDa and are formed by multiple subunits, whereas most enzymes are smaller, with molecular weights of 20-60 kDa. Enzymes smaller than 10 kDa could be called microenzymes, and the present literature review brings together evidence of their occurrence in nature. Additionally, bioactive peptides could be a natural source of novel microenzymes hidden in larger peptides, and molecular downsizing could be useful for engineering artificial enzymes with low molecular weight, improving their stability and heterologous expression. An integrative approach is crucial to discover and determine the amino acid sequences of novel microenzymes, together with their genomic identification and their biochemical, biological, and evolutionary functions.
Affiliation(s)
- Jose Carlos Santos Salgado
  - Department of Chemistry, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-900, São Paulo, Brazil
  - Department of Biology, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-901, São Paulo, Brazil
- Robson Carlos Alnoch
  - Department of Biology, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-901, São Paulo, Brazil
  - Department of Biochemistry and Immunology, Faculdade de Medicina de Ribeirão Preto (FMRP), University of São Paulo, Ribeirão Preto, 14049-900, São Paulo, Brazil
- Maria de Lourdes Teixeira de Moraes Polizeli
  - Department of Biology, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-901, São Paulo, Brazil
  - Department of Biochemistry and Immunology, Faculdade de Medicina de Ribeirão Preto (FMRP), University of São Paulo, Ribeirão Preto, 14049-900, São Paulo, Brazil
- Richard John Ward
  - Department of Chemistry, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto (FFCLRP), University of São Paulo, Ribeirão Preto, 14040-900, São Paulo, Brazil
  - Department of Biochemistry and Immunology, Faculdade de Medicina de Ribeirão Preto (FMRP), University of São Paulo, Ribeirão Preto, 14049-900, São Paulo, Brazil
7. Jones AA, Snow CD. Porous protein crystals: synthesis and applications. Chem Commun (Camb) 2024; 60:5790-5803. [PMID: 38756076 DOI: 10.1039/d4cc00183d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2024]
Abstract
Large-pore protein crystals (LPCs) are an emerging class of biomaterials. The inherent diversity of proteins translates to a diversity of crystal lattice structures, many of which display large pores and solvent channels. These pores can, in turn, be functionalized via directed evolution and rational redesign based on the known crystal structures. LPCs possess extremely high solvent content, as well as extremely high surface area to volume ratios. Because of these characteristics, LPCs continue to be explored in diverse applications including catalysis, targeted therapeutic delivery, templating of nanostructures, and structural biology. This Feature review article will describe several of the existing platforms in detail, with particular focus on LPC synthesis approaches and reported applications.
Affiliation(s)
- Alec Arthur Jones
  - School of Biomedical Engineering, Colorado State University, Fort Collins, CO 80523-1301, USA
- Christopher D Snow
  - School of Biomedical Engineering, Colorado State University, Fort Collins, CO 80523-1301, USA
  - Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, CO 80523-1301, USA
8. Ortega-Arzola E, Higgins PM, Cockell CS. The minimum energy required to build a cell. Sci Rep 2024; 14:5267. [PMID: 38438463 PMCID: PMC11306549 DOI: 10.1038/s41598-024-54303-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Accepted: 02/11/2024] [Indexed: 03/06/2024] Open
Abstract
Understanding the energy requirements for cell synthesis accurately and comprehensively has been a longstanding challenge. We introduce a computational model that estimates the minimum energy necessary to build any cell from its constituent parts. This method combines omics and internal cell compositions from various sources to calculate the Gibbs Free Energy of biosynthesis independently of specific metabolic pathways. Our public tool, Synercell, can be used with other models for minimum species-specific energy estimations in any well-sequenced species. The energies for synthesising the genome, transcriptome, proteome, and lipid bilayer of four cell types (Escherichia coli, Saccharomyces cerevisiae, an average mammalian cell and JCVI-syn3A) were estimated. Their modelled minimum synthesis energies at 298 K were 9.54 × 10⁻¹¹ J/cell, 4.99 × 10⁻⁹ J/cell, 3.71 × 10⁻⁷ J/cell and 3.69 × 10⁻¹² J/cell, respectively. Gram-for-gram synthesis of lipid bilayers requires the most energy, followed by the proteome, genome, and transcriptome. The average per gram cost of biomass synthesis is in the 300s of J/g for all four cells. Implications for the generalisability of cell construction and applications to biogeosciences, cellular biology, biotechnology, and astrobiology are discussed.
Affiliation(s)
- Edwin Ortega-Arzola
  - UK Centre for Astrobiology, School of Physics and Astronomy, University of Edinburgh, Edinburgh, UK
- Peter M Higgins
  - UK Centre for Astrobiology, School of Physics and Astronomy, University of Edinburgh, Edinburgh, UK
  - Department of Earth Sciences, University of Toronto, Toronto, ON, Canada
- Charles S Cockell
  - UK Centre for Astrobiology, School of Physics and Astronomy, University of Edinburgh, Edinburgh, UK
9. Vieira MFM, Hernandez G, Zhong Q, Arbesú M, Veloso T, Gomes T, Martins ML, Monteiro H, Frazão C, Frankel G, Zanzoni A, Cordeiro TN. The pathogen-encoded signalling receptor Tir exploits host-like intrinsic disorder for infection. Commun Biol 2024; 7:179. [PMID: 38351154 PMCID: PMC10864410 DOI: 10.1038/s42003-024-05856-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Accepted: 01/26/2024] [Indexed: 02/16/2024] Open
Abstract
The translocated intimin receptor (Tir) is an essential type III secretion system (T3SS) effector of attaching and effacing pathogens contributing to the global foodborne disease burden. Tir acts as a cell-surface receptor in host cells, rewiring intracellular processes by targeting multiple host proteins. We investigated the molecular basis for Tir's binding diversity in signalling, finding that Tir is a disordered protein with host-like binding motifs. Unexpectedly, so are several other T3SS effectors. By an integrative approach, we reveal that Tir dimerises via an antiparallel OB-fold within a highly disordered N-terminal cytosolic domain. Also, it has a long disordered C-terminal cytosolic domain partially structured at host-like motifs that bind lipids. Membrane affinity depends on lipid composition and phosphorylation, highlighting a previously unrecognised host interaction impacting Tir-induced actin polymerisation and cell death. Furthermore, multi-site tyrosine phosphorylation enables Tir to engage host SH2 domains in a multivalent fuzzy complex, consistent with Tir's scaffolding role and binding promiscuity. Our findings provide insights into the intracellular Tir domains, highlighting the ability of T3SS effectors to exploit host-like protein disorder as a strategy for host evasion.
Affiliation(s)
- Marta F M Vieira
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Guillem Hernandez
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Qiyun Zhong
  - Department of Life Sciences, Imperial College London, South Kensington Campus, London, UK
- Miguel Arbesú
  - Department of NMR-supported Structural Biology, Leibniz-Forschungsinstitut für Molekulare Pharmakologie, Berlin, Germany
  - InstaDeep Ltd, 5 Merchant Square, London, UK
- Tiago Veloso
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Tiago Gomes
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Maria L Martins
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Hugo Monteiro
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Carlos Frazão
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
- Gad Frankel
  - Department of Life Sciences, Imperial College London, South Kensington Campus, London, UK
- Andreas Zanzoni
  - Aix-Marseille Université, Inserm, TAGC, UMR_S1090, Marseille, France
- Tiago N Cordeiro
  - Instituto de Tecnologia Química e Biológica António Xavier, Universidade Nova de Lisboa, Av. da República, Oeiras, Portugal
10. Kuder KJ. Docking Foundations: From Rigid to Flexible Docking. Methods Mol Biol 2024; 2780:3-14. [PMID: 38987460 DOI: 10.1007/978-1-0716-3985-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Despite the development of methods for the experimental determination of protein structures, the dissonance between the number of known sequences and their solved structures is still enormous. This is particularly evident in protein-protein complexes. To fill this gap, diverse technologies have been developed to study protein-protein interactions (PPIs) in a cellular context including a range of biological and computational methods. The latter derive from techniques originally published and applied almost half a century ago and are based on interdisciplinary knowledge from the nexus of the fields of biology, chemistry, and physics about protein sequences, structures, and their folding. Protein-protein docking, the main protagonist of this chapter, is routinely treated as an integral part of protein research. Herein, we describe the basic foundations of the whole process in general terms, but step by step from protein representations through docking methods and evaluation of complexes to their final validation.
Affiliation(s)
- Kamil J Kuder
  - Department of Technology and Biotechnology of Drugs, Faculty of Pharmacy, Jagiellonian University Medical College, Kraków, Poland
11. Hwang E, Lim YB. Self-Assembled Protein Nanostructures via Irreversible Peptide Assembly. ACS Macro Lett 2023; 12:1679-1684. [PMID: 38035369 DOI: 10.1021/acsmacrolett.3c00550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]
Abstract
The quaternary structure of proteins extends the functionality of monomeric proteins. Similarly, self-assembled protein nanostructures (SPrNs) have great potential to improve the functionality and complexity of proteins; however, the difficulty associated with the fabrication of SPrNs is far greater than that associated with the fabrication of self-assembled peptides or polymers and often requires sophisticated computational design. To make the process of SPrN formation simpler and more intuitive, herein, we devise a strategy to adopt an irreversible self-assembled peptide nanostructure (SPeN) process en route to the formation of SPrNs. The strategy employs three sequential steps: first, the formation of SPeNs (an equilibrium process); second, covalent capture of SPeNs (an irreversible process); third, the final assembly of SPrNs via protein-peptide interactions (an equilibrium process). This strategy allowed us to fabricate SPrNs in which the size of the protein was approximately 9 times larger than that of the self-assembling peptide. Furthermore, we demonstrated that the irreversible SPeN could be used as a primary building block for assembly into superstructures. Overall, this strategy is conceptually as simple as SPeN fabrication and is potentially applicable to any soluble protein.
Affiliation(s)
- Euimin Hwang
  - Department of Materials Science and Engineering, Yonsei University, Seoul 03722, Republic of Korea
- Yong-Beom Lim
  - Department of Materials Science and Engineering, Yonsei University, Seoul 03722, Republic of Korea
12. Sharon I, Hilvert D, Schmeing TM. Cyanophycin and its biosynthesis: not hot but very cool. Nat Prod Rep 2023; 40:1479-1497. [PMID: 37231979 DOI: 10.1039/d2np00092j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract
Covering: 1878 to early 2023. Cyanophycin is a biopolymer consisting of a poly-aspartate backbone with arginines linked to each Asp sidechain through isopeptide bonds. Cyanophycin is made by cyanophycin synthetase 1 or 2 through ATP-dependent polymerization of Asp and Arg, or β-Asp-Arg, respectively. It is degraded into dipeptides by exo-cyanophycinases, and these dipeptides are hydrolyzed into free amino acids by general or dedicated isodipeptidase enzymes. When synthesized, chains of cyanophycin coalesce into large, inert, membrane-less granules. Although discovered in cyanobacteria, cyanophycin is made by species throughout the bacterial kingdom, and cyanophycin metabolism provides advantages for toxic bloom forming algae and some human pathogens. Some bacteria have developed dedicated schemes for cyanophycin accumulation and use, which include fine temporal and spatial regulation. Cyanophycin has also been heterologously produced in a variety of host organisms to a remarkable level, over 50% of the host's dry mass, and has potential for a variety of green industrial applications. In this review, we summarize the progression of cyanophycin research, with an emphasis on recent structural studies of enzymes in the cyanophycin biosynthetic pathway. These include several unexpected revelations that show cyanophycin synthetase to be a very cool, multi-functional macromolecular machine.
Affiliation(s)
- Itai Sharon
  - Department of Biochemistry and Centre de Recherche en Biologie Structurale, McGill University, Montréal, QC, Canada, H3G 0B1
- Donald Hilvert
  - Laboratory of Organic Chemistry, ETH Zürich, CH-8093 Zürich, Switzerland
- T Martin Schmeing
  - Department of Biochemistry and Centre de Recherche en Biologie Structurale, McGill University, Montréal, QC, Canada, H3G 0B1
13. Lobinska G, Pilpel Y, Ram Y. Phenotype switching of the mutation rate facilitates adaptive evolution. Genetics 2023; 225:iyad111. [PMID: 37293818 PMCID: PMC10471227 DOI: 10.1093/genetics/iyad111] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 02/05/2023] [Accepted: 05/25/2023] [Indexed: 06/10/2023] Open
Abstract
The mutation rate plays an important role in adaptive evolution. It can be modified by mutator and anti-mutator alleles. Recent empirical evidence hints that the mutation rate may vary among genetically identical individuals: evidence from bacteria suggests that the mutation rate can be affected by expression noise of a DNA repair protein and potentially also by translation errors in various proteins. Importantly, this non-genetic variation may be heritable via a transgenerational epigenetic mode of inheritance, giving rise to a mutator phenotype that is independent from mutator alleles. Here, we investigate mathematically how the rate of adaptive evolution is affected by the rate of mutation rate phenotype switching. We model an asexual population with two mutation rate phenotypes, non-mutator and mutator. An offspring may switch from its parental phenotype to the other phenotype. We find that switching rates that correspond to so-far empirically described non-genetic systems of inheritance of the mutation rate lead to higher rates of adaptation on both artificial and natural fitness landscapes. These switching rates can maintain within the same individuals both a mutator phenotype and intermediary mutations, a combination that facilitates adaptation. Moreover, non-genetic inheritance increases the proportion of mutators in the population, which in turn increases the probability of hitchhiking of the mutator phenotype with adaptive mutations. This in turn facilitates the acquisition of additional adaptive mutations. Our results rationalize recently observed noise in the expression of proteins that affect the mutation rate and suggest that non-genetic inheritance of this phenotype may facilitate evolutionary adaptive processes.
Affiliation(s)
- Gabriela Lobinska
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Yitzhak Pilpel
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Yoav Ram
  - School of Zoology, Faculty of Life Sciences, Tel Aviv University, Tel Aviv 6997801, Israel
14. Nikolsky KS, Kulikova LI, Petrovskiy DV, Rudnev VR, Butkova TV, Malsagova KA, Kopylov AT, Kaysheva AL. Three-helix bundle and SH3-type barrels: autonomously stable structural motifs in small and large proteins. J Biomol Struct Dyn 2023:1-15. [PMID: 37640007 DOI: 10.1080/07391102.2023.2250450] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2023] [Accepted: 08/12/2023] [Indexed: 08/31/2023]
Abstract
In this study, we investigated two variants of a three-helix bundle and SH3-type barrel, compact in space, present in small and large proteins of various living organisms. Using a neural graph network, proteins with three-helix bundle (n = 1377) and SH3-type barrels (n = 1914) spatial folds were selected. Molecular experiments were performed for small proteins with these folds, and motifs were studied autonomously outside the protein environment at 300, 340, and 370 K. A comparative analysis of the main parameters of the structures in the course of the experiment was performed, including gyration radius, area accessible to the solvent, number of hydrophobic and hydrogen bonds, and root-mean-square deviation of atomic positions (RMSD). We exhibited an autonomous stability of the studied folds outside the protein environment in an aquatic medium. We aimed to demonstrate the possibility of analyzing three-helix bundle and SH3-type barrels autonomously outside the protein globule, thereby reducing the computational time and increasing performance without significant loss of information. Communicated by Ramaswamy H. Sarma.
|
15
|
Ahmadi H, Sheikh-Assadi M, Fatahi R, Zamani Z, Shokrpour M. Optimizing an efficient ensemble approach for high-quality de novo transcriptome assembly of Thymus daenensis. Sci Rep 2023; 13:12415. [PMID: 37524806 PMCID: PMC10390528 DOI: 10.1038/s41598-023-39620-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Accepted: 07/27/2023] [Indexed: 08/02/2023] Open
Abstract
Non-erroneous and well-optimized transcriptome assembly is a crucial prerequisite for authentic downstream analyses. Each de novo assembler has its own algorithm-dependent pros and cons to handle the assembly issues and should be specifically tested for each dataset. Here, we examined efficiency of seven state-of-art assemblers on ~ 30 Gb data obtained from mRNA-sequencing of Thymus daenensis. In an ensemble workflow, combining the outputs of different assemblers associated with an additional redundancy-reducing step could generate an optimized outcome in terms of completeness, annotatability, and ORF richness. Based on the normalized scores of 16 benchmarking metrics, EvidentialGene, BinPacker, Trinity, rnaSPAdes, CAP3, IDBA-trans, and Velvet-Oases performed better, respectively. EvidentialGene, as the best assembler, totally produced 316,786 transcripts, of which 235,730 (74%) were predicted to have a unique protein hit (on uniref100), and also half of its transcripts contained an ORF. The total number of unique BLAST hits for EvidentialGene was approximately three times greater than that of the worst assembler (Velvet-Oases). EvidentialGene could even capture 17% and 7% more average BLAST hits than BinPacker and Trinity. Although BinPacker and CAP3 produced longer transcripts, the EvidentialGene showed a higher collinearity between transcript size and ORF length. Compared with the other programs, EvidentialGene yielded a higher number of optimal transcript sets, further full-length transcripts, and lower possible misassemblies. Our finding corroborates that in non-model species, relying on a single assembler may not give an entirely satisfactory result. Therefore, this study proposes an ensemble approach of accompanying EvidentialGene pipelines to acquire a superior assembly for T. daenensis.
Affiliation(s)
- Hosein Ahmadi
- Department of Horticulture Science, Faculty of Agriculture and Natural Sciences, University of Tehran, Karaj, Iran
| | - Morteza Sheikh-Assadi
- Department of Horticulture Science, Faculty of Agriculture and Natural Sciences, University of Tehran, Karaj, Iran
| | - Reza Fatahi
- Department of Horticulture Science, Faculty of Agriculture and Natural Sciences, University of Tehran, Karaj, Iran.
| | - Zabihollah Zamani
- Department of Horticulture Science, Faculty of Agriculture and Natural Sciences, University of Tehran, Karaj, Iran
| | - Majid Shokrpour
- Department of Horticulture Science, Faculty of Agriculture and Natural Sciences, University of Tehran, Karaj, Iran
|
16
|
Nevers Y, Glover NM, Dessimoz C, Lecompte O. Protein length distribution is remarkably uniform across the tree of life. Genome Biol 2023; 24:135. [PMID: 37291671 PMCID: PMC10251718 DOI: 10.1186/s13059-023-02973-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/03/2022] [Accepted: 05/16/2023] [Indexed: 06/10/2023] Open
Abstract
BACKGROUND In every living species, the function of a protein depends on its organization of structural domains, and the length of a protein is a direct reflection of this. Because every species evolved under different evolutionary pressures, the protein length distribution, much like other genomic features, is expected to vary across species but has so far been scarcely studied. RESULTS Here we evaluate this diversity by comparing protein length distribution across 2326 species (1688 bacteria, 153 archaea, and 485 eukaryotes). We find that proteins tend to be on average slightly longer in eukaryotes than in bacteria or archaea, but that the variation of length distribution across species is low, especially compared to the variation of other genomic features (genome size, number of proteins, gene length, GC content, isoelectric points of proteins). Moreover, most cases of atypical protein length distribution appear to be due to artifactual gene annotation, suggesting the actual variation of protein length distribution across species is even smaller. CONCLUSIONS These results open the way for developing a genome annotation quality metric based on protein length distribution to complement conventional quality measures. Overall, our findings show that protein length distribution between living species is more uniform than previously thought. Furthermore, we also provide evidence for a universal selection on protein length, yet its mechanism and fitness effect remain intriguing open questions.
Affiliation(s)
- Yannis Nevers
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland.
- Swiss Institute for Bioinformatics, University of Lausanne, Lausanne, Switzerland.
| | - Natasha M Glover
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute for Bioinformatics, University of Lausanne, Lausanne, Switzerland
| | - Christophe Dessimoz
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Swiss Institute for Bioinformatics, University of Lausanne, Lausanne, Switzerland
- Department of Computer Science, University College London, London, UK
- Centre for Life's Origins and Evolution, Department of Genetics, Evolution and Environment, University College London, London, UK
| | - Odile Lecompte
- Department of Computer Science, Centre de Recherche en Biomédecine de Strasbourg, ICube, UMR 7357, University of Strasbourg, CNRS, Strasbourg, France
|
17
|
Loan Young T, Chang Wang K, James Varley A, Li B. Clinical Delivery of Circular RNA: Lessons Learned from RNA Drug Development. Adv Drug Deliv Rev 2023; 197:114826. [PMID: 37088404 DOI: 10.1016/j.addr.2023.114826] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 01/29/2023] [Revised: 03/28/2023] [Accepted: 04/11/2023] [Indexed: 04/25/2023]
Abstract
Circular RNAs (circRNA) represent a distinct class of covalently closed-loop RNA molecules, which play diverse roles in regulating biological processes and disease states. The enhanced stability of synthetic circRNAs compared to their linear counterparts has recently garnered considerable research interest, paving the way for new therapeutic applications. While clinical circRNA technology is still in its early stages, significant advancements in mRNA technology offer valuable insights into its potential future applications. Two primary obstacles that must be addressed are the development of efficient production methods and the optimization of delivery systems. To expedite progress in this area, this review aims to provide an overview of the current state of knowledge on circRNA structure and function, outline recent techniques for synthesizing circRNAs, highlight key delivery strategies and applications, and discuss the current challenges and future prospects in the field of circRNA-based therapeutics.
Affiliation(s)
- Tiana Loan Young
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
| | - Kevin Chang Wang
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
| | - Andrew James Varley
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada
| | - Bowen Li
- Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON M5S 3M2, Canada; Institute of Biomedical Engineering, University of Toronto, Toronto, ON M5S 3M2, Canada; Princess Margaret Cancer Center, University Health Network, Toronto, ON M5G 2C1, Canada.
|
18
|
Watanabe T, Kure A, Horiike T. OrthoPhy: A Program to Construct Ortholog Data Sets Using Taxonomic Information. Genome Biol Evol 2023; 15:7044703. [PMID: 36799928 PMCID: PMC9991595 DOI: 10.1093/gbe/evad026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 01/30/2023] [Accepted: 02/13/2023] [Indexed: 02/18/2023] Open
Abstract
Species phylogenetic trees represent the evolutionary processes of organisms, and they are fundamental in evolutionary research. Therefore, new methods have been developed to obtain more reliable species phylogenetic trees. A highly reliable method is the construction of an ortholog data set based on sequence information of genes, which is then used to infer the species phylogenetic tree. However, although methods for constructing an ortholog data set for species phylogenetic analysis have been developed, they cannot remove some paralogs, which is necessary for reliable species phylogenetic inference. To address the limitations of current methods, we developed OrthoPhy, a program that excludes paralogs and constructs highly accurate ortholog data sets using taxonomic information dividing analyzed species into monophyletic groups. OrthoPhy can remove paralogs, detecting inconsistencies between taxonomic information and phylogenetic trees of candidate ortholog groups clustered by sequence similarity. Performance tests using evolutionary simulated sequences and real sequences of 40 bacteria revealed that the precision of ortholog inference by OrthoPhy is higher than that of existing programs. Additionally, the phylogenetic analysis of species was more accurate when performed using ortholog data sets constructed by OrthoPhy than that performed using data sets constructed by existing programs. Furthermore, we performed a benchmark test of the Quest for Orthologs using real sequence data and found that the concordance rate between the phylogenetic trees of orthologs inferred by OrthoPhy and those of species was higher than the rates obtained by other ortholog inference programs. 
Therefore, ortholog data sets constructed using OrthoPhy enabled a more accurate phylogenetic analysis of species than those constructed using the existing programs, and OrthoPhy can be used for the phylogenetic analysis of species even for distantly related species that have experienced many evolutionary events.
Affiliation(s)
- Tomoaki Watanabe
- United Graduate School of Agricultural Science, Gifu University, Gifu, Japan
| | - Akinori Kure
- Graduate School of Integrated Science and Technology, Shizuoka University, Shizuoka, Japan
| | - Tokumasa Horiike
- Department of Bioresource Sciences, Shizuoka University, Shizuoka, Japan
|
19
|
The Structure of Evolutionary Model Space for Proteins across the Tree of Life. Biology 2023; 12:biology12020282. [PMID: 36829559 PMCID: PMC9952988 DOI: 10.3390/biology12020282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 02/04/2023] [Accepted: 02/08/2023] [Indexed: 02/12/2023]
Abstract
The factors that determine the relative rates of amino acid substitution during protein evolution are complex and known to vary among taxa. We estimated relative exchangeabilities for pairs of amino acids from clades spread across the tree of life and assessed the historical signal in the distances among these clade-specific models. We separately trained these models on collections of arbitrarily selected protein alignments and on ribosomal protein alignments. In both cases, we found a clear separation between the models trained using multiple sequence alignments from bacterial clades and the models trained on archaeal and eukaryotic data. We assessed the predictive power of our novel clade-specific models of sequence evolution by asking whether fit to the models could be used to identify the source of multiple sequence alignments. Model fit was generally able to correctly classify protein alignments at the level of domain (bacterial versus archaeal), but the accuracy of classification at finer scales was much lower. The only exceptions to this were the relatively high classification accuracy for two archaeal lineages: Halobacteriaceae and Thermoprotei. Genomic GC content had a modest impact on relative exchangeabilities despite having a large impact on amino acid frequencies. Relative exchangeabilities involving aromatic residues exhibited the largest differences among models. There were a small number of exchangeabilities that exhibited large differences in comparisons among major clades and between generalized models and ribosomal protein models. Taken as a whole, these results reveal that a small number of relative exchangeabilities are responsible for much of the structure of the "model space" for protein sequence evolution. The clade-specific models we generated may be useful tools for protein phylogenetics, and the structure of evolutionary model space that they revealed has implications for phylogenomic inference across the tree of life.
|
20
|
Satoh S, Tanaka R, Yokono M, Endoh D, Yabuki T, Tanaka A. Phylogeny analysis of whole protein-coding genes in metagenomic data detected an environmental gradient for the microbiota. PLoS One 2023; 18:e0281288. [PMID: 36730456 PMCID: PMC9894459 DOI: 10.1371/journal.pone.0281288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 01/20/2023] [Indexed: 02/04/2023] Open
Abstract
Environmental factors affect the growth of microorganisms and therefore alter the composition of microbiota. Correlative analysis of the relationship between metagenomic composition and the environmental gradient can help elucidate key environmental factors and establishment principles for microbial communities. However, a reasonable method to quantitatively compare whole metagenomic data and identify the primary environmental factors for the establishment of microbiota has not been reported so far. In this study, we developed a method to compare whole proteomes deduced from metagenomic shotgun sequencing data, and quantitatively display their phylogenetic relationships as metagenomic trees. We called this method Metagenomic Phylogeny by Average Sequence Similarity (MPASS). We also compared one of the metagenomic trees with dendrograms of environmental factors using a comparison tool for phylogenetic trees. The MPASS method correctly constructed metagenomic trees of simulated metagenomes and soil and water samples. The topology of the metagenomic tree of samples from the Kirishima hot springs area in Japan was highly similar to that of the dendrograms based on previously reported environmental factors for this area. The topology of the metagenomic tree also reflected the dynamics of microbiota at the taxonomic and functional levels. Our results strongly suggest that MPASS can successfully classify metagenomic shotgun sequencing data based on the similarity of whole protein-coding sequences, and will be useful for the identification of principal environmental factors for the establishment of microbial communities. A custom Perl script for the MPASS pipeline is available at https://github.com/s0sat/MPASS.
Affiliation(s)
- Soichirou Satoh
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto, Japan
- Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto, Japan
| | - Rei Tanaka
- Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto, Japan
| | - Makio Yokono
- Division of Environmental Photobiology, National Institute for Basic Biology, Okazaki, Japan
| | - Daiji Endoh
- Department of Radiation Biology, School of Veterinary Medicine, Rakuno Gakuen University, Ebetsu, Japan
| | - Tetsuo Yabuki
- General Education Department, Hokusei Gakuen University, Sapporo, Japan
| | - Ayumi Tanaka
- Institute of Low Temperature Science, Hokkaido University, Sapporo, Japan
|
21
|
Macho Rendón J, Rebollido-Ríos R, Torrent Burgas M. HPIPred: Host-pathogen interactome prediction with phenotypic scoring. Comput Struct Biotechnol J 2022; 20:6534-6542. [PMID: 36514317 PMCID: PMC9718936 DOI: 10.1016/j.csbj.2022.11.026] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/15/2022] [Revised: 11/09/2022] [Accepted: 11/10/2022] [Indexed: 11/22/2022] Open
Abstract
Protein-protein interactions (PPIs) are involved in most cellular processes. Unfortunately, current knowledge of host-pathogen interactomes is still very limited. Experimental methods used to detect PPIs have several limitations, including increasing complexity and economic cost in large-scale screenings. Hence, computational methods are commonly used to support experimental data, although they generally suffer from high false-positive rates. To address this issue, we have created HPIPred, a host-pathogen PPI prediction tool based on numerical encoding of physicochemical properties. Unlike other available methods, HPIPred integrates phenotypic data to prioritize biologically meaningful results. We used HPIPred to screen the entire Homo sapiens and Pseudomonas aeruginosa PAO1 proteomes to generate a host-pathogen interactome with 763 interactions displaying a highly connected network topology. Our predictive model can be used to prioritize protein-protein interactions as potential targets for antibacterial drug development. Available at: https://github.com/SysBioUAB/hpi_predictor.
|
22
|
Shey RA, Ghogomu SM, Nebangwa DN, Shintouo CM, Yaah NE, Yengo BN, Nkemngo FN, Esoh KK, Tchatchoua NMT, Mbachick TT, Dede AF, Lemoge AA, Ngwese RA, Asa BF, Ayong L, Njemini R, Vanhamme L, Souopgui J. Rational design of a novel multi-epitope peptide-based vaccine against Onchocerca volvulus using transmembrane proteins. Frontiers in Tropical Diseases 2022. [DOI: 10.3389/fitd.2022.1046522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
Abstract
Almost a decade ago, it was recognized that the global elimination of onchocerciasis by 2030 will not be feasible without, at least, an effective prophylactic and/or therapeutic vaccine to complement chemotherapy and vector control strategies. Recent advances in computational immunology (immunoinformatics) have seen the design of novel multi-epitope onchocerciasis vaccine candidates which are however yet to be evaluated in clinical settings. Still, continued research to increase the pool of vaccine candidates, and therefore the chance of success in a clinical trial remains imperative. Here, we designed a multi-epitope vaccine candidate by assembling peptides from 14 O. volvulus (Ov) proteins using an immunoinformatics approach. An initial 126 Ov proteins, retrieved from the Wormbase database, and at least 90% similar to orthologs in related nematode species of economic importance, were screened for localization, presence of transmembrane domain, and antigenicity using different web servers. From the 14 proteins retained after the screening, 26 MHC-1 and MHC-II (T-cell) epitopes, and linear B-lymphocytes epitopes were predicted and merged using suitable linkers. The Mycobacterium tuberculosis Resuscitation-promoting factor E (RPFE_MYCTU), which is an agonist of TLR4, was then added to the N-terminal of the vaccine candidate as a built-in adjuvant. Immune simulation analyses predicted strong B-cell and IFN-γ based immune responses which are necessary for protection against O. volvulus infection. Protein-protein docking and molecular dynamic simulation predicted stable interactions between the 3D structure of the vaccine candidate and human TLR4. These results show that the designed vaccine candidate has the potential to stimulate both humoral and cellular immune responses and should therefore be subject to further laboratory investigation.
|
23
|
Guvench O. Atomic-Resolution Experimental Structural Biology and Molecular Dynamics Simulations of Hyaluronan and Its Complexes. Molecules 2022; 27:7276. [PMID: 36364098 PMCID: PMC9658939 DOI: 10.3390/molecules27217276] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/26/2022] [Revised: 10/20/2022] [Accepted: 10/21/2022] [Indexed: 11/28/2023] Open
Abstract
This review summarizes the atomic-resolution structural biology of hyaluronan and its complexes available in the Protein Data Bank, as well as published studies of atomic-resolution explicit-solvent molecular dynamics simulations on these and other hyaluronan and hyaluronan-containing systems. Advances in accurate molecular mechanics force fields, simulation methods and software, and computer hardware have supported a recent flourish in such simulations, such that the simulation publications now outnumber the structural biology publications by an order of magnitude. In addition to supplementing the experimental structural biology with computed dynamic and thermodynamic information, the molecular dynamics studies provide a wealth of atomic-resolution information on hyaluronan-containing systems for which there is no atomic-resolution structural biology either available or possible. Examples of these summarized in this review include hyaluronan pairing with other hyaluronan molecules and glycosaminoglycans, with ions, with proteins and peptides, with lipids, and with drugs and drug-like molecules. Despite limitations imposed by present-day computing resources on system size and simulation timescale, atomic-resolution explicit-solvent molecular dynamics simulations have been able to contribute significant insight into hyaluronan's flexibility and capacity for intra- and intermolecular non-covalent interactions.
Affiliation(s)
- Olgun Guvench
- Department of Pharmaceutical Sciences and Administration, School of Pharmacy, Westbrook College of Health Professions, University of New England, 716 Stevens Avenue, Portland, ME 04103, USA
|
24
|
Virtual 2D map of cyanobacterial proteomes. PLoS One 2022; 17:e0275148. [PMID: 36190972 PMCID: PMC9529120 DOI: 10.1371/journal.pone.0275148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Accepted: 09/12/2022] [Indexed: 11/05/2022] Open
Abstract
Cyanobacteria are prokaryotic Gram-negative organisms prevalent in nearly all habitats. A detailed proteomics study of Cyanobacteria has not been conducted despite extensive study of their genome sequences. Therefore, we conducted a proteome-wide analysis of the Cyanobacteria proteome and found Calothrix desertica as the largest (680331.825 kDa) and Candidatus synechococcus spongiarum as the smallest (42726.77 kDa) proteome of the cyanobacterial kingdom. A Cyanobacterial proteome encodes 312.018 amino acids per protein, with a molecular weight of 182173.1324 kDa per proteome. The isoelectric point (pI) of the Cyanobacterial proteome ranges from 2.13 to 13.32. It was found that the Cyanobacterial proteome encodes a greater number of acidic-pI proteins, and their average pI is 6.437. The proteins with higher pI are likely to contain repetitive amino acids. A virtual 2D map of Cyanobacterial proteome showed a bimodal distribution of molecular weight and pI. Several proteins within the Cyanobacterial proteome were found to encode Selenocysteine (Sec) amino acid, while Pyrrolysine amino acids were not detected. The study can enable us to generate a high-resolution cell map to monitor proteomic dynamics. Through this computational analysis, we can gain a better understanding of the bias in codon usage by analyzing the amino acid composition of the Cyanobacterial proteome.
|
25
|
Schmidt H, Mauer K, Glaser M, Dezfuli BS, Hellmann SL, Silva Gomes AL, Butter F, Wade RC, Hankeln T, Herlyn H. Identification of antiparasitic drug targets using a multi-omics workflow in the acanthocephalan model. BMC Genomics 2022; 23:677. [PMID: 36180835 PMCID: PMC9523657 DOI: 10.1186/s12864-022-08882-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Accepted: 09/12/2022] [Indexed: 08/30/2023] Open
Abstract
Background With the expansion of animal production, parasitic helminths are gaining increasing economic importance. However, application of several established deworming agents can harm treated hosts and environment due to their low specificity. Furthermore, the number of parasite strains showing resistance is growing, while hardly any new anthelminthics are being developed. Here, we present a bioinformatics workflow designed to reduce the time and cost in the development of new strategies against parasites. The workflow includes quantitative transcriptomics and proteomics, 3D structure modeling, binding site prediction, and virtual ligand screening. Its use is demonstrated for Acanthocephala (thorny-headed worms) which are an emerging pest in fish aquaculture. We included three acanthocephalans (Pomphorhynchus laevis, Neoechinorhynchus agilis, Neoechinorhynchus buttnerae) from four fish species (common barbel, European eel, thinlip mullet, tambaqui). Results The workflow led to eleven highly specific candidate targets in acanthocephalans. The candidate targets showed constant and elevated transcript abundances across definitive and accidental hosts, suggestive of constitutive expression and functional importance. Hence, the impairment of the corresponding proteins should enable specific and effective killing of acanthocephalans. Candidate targets were also highly abundant in the acanthocephalan body wall, through which these gutless parasites take up nutrients. Thus, the candidate targets are likely to be accessible to compounds that are orally administered to fish. Virtual ligand screening led to ten compounds, of which five appeared to be especially promising according to ADMET, GHS, and RO5 criteria: tadalafil, pranazepide, piketoprofen, heliomycin, and the nematicide derquantel. 
Conclusions The combination of genomics, transcriptomics, and proteomics led to a broadly applicable procedure for the cost- and time-saving identification of candidate target proteins in parasites. The ligands predicted to bind can now be further evaluated for their suitability in the control of acanthocephalans. The workflow has been deposited at the Galaxy workflow server under the URL tinyurl.com/yx72rda7. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08882-1.
Affiliation(s)
- Hanno Schmidt
- Institute of Organismic and Molecular Evolution (iomE), Anthropology, Johannes Gutenberg University Mainz, Mainz, Germany; Present address: Institute for Virology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany.
| | - Katharina Mauer
- Institute of Organismic and Molecular Evolution (iomE), Anthropology, Johannes Gutenberg University Mainz, Mainz, Germany
| | - Manuel Glaser
- Molecular and Cellular Modeling, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
| | | | - Sören Lukas Hellmann
- Institute of Organismic and Molecular Evolution (iomE), Molecular Genetics and Genomic Analysis, Johannes Gutenberg University Mainz, Mainz, Germany; Present address: Nucleic Acids Core Facility, Johannes Gutenberg University Mainz, Mainz, Germany
| | | | - Falk Butter
- Quantitative Proteomics, Institute of Molecular Biology (IMB), Mainz, Germany
| | - Rebecca C Wade
- Molecular and Cellular Modeling, Heidelberg Institute for Theoretical Studies, Heidelberg, Germany; Center for Molecular Biology (ZMBH) and Interdisciplinary Center for Scientific Computing (IWR), Heidelberg University, Heidelberg, Germany
| | - Thomas Hankeln
- Institute of Organismic and Molecular Evolution (iomE), Molecular Genetics and Genomic Analysis, Johannes Gutenberg University Mainz, Mainz, Germany
| | - Holger Herlyn
- Institute of Organismic and Molecular Evolution (iomE), Anthropology, Johannes Gutenberg University Mainz, Mainz, Germany.
|
26
|
Castro E, Godavarthi A, Rubinfien J, Givechian K, Bhaskar D, Krishnaswamy S. Transformer-based protein generation with regularized latent space optimization. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00532-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
27
|
Ahn SY, Kim M, Bae JE, Bang IS, Lee SW. Reliability of the In Silico Prediction Approach to In Vitro Evaluation of Bacterial Toxicity. Sensors (Basel, Switzerland) 2022; 22:6557. [PMID: 36081016 PMCID: PMC9459819 DOI: 10.3390/s22176557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 08/26/2022] [Accepted: 08/26/2022] [Indexed: 06/15/2023]
Abstract
Several pathogens that spread through the air are highly contagious, and related infectious diseases are more easily transmitted through airborne transmission under indoor conditions, as observed during the COVID-19 pandemic. Indoor air contaminated by microorganisms, including viruses, bacteria, and fungi, or by derived pathogenic substances, can endanger human health. Thus, identifying and analyzing the potential pathogens residing in the air are crucial to preventing disease and maintaining indoor air quality. Here, we applied deep learning technology to analyze and predict the toxicity of bacteria in indoor air. We trained the ProtBert model on toxic bacterial and virulence factor proteins and applied them to predict the potential toxicity of some bacterial species by analyzing their protein sequences. The results reflect the results of the in vitro analysis of their toxicity in human cells. The in silico-based simulation and the obtained results demonstrated that it is plausible to find possible toxic sequences in unknown protein sequences.
Affiliation(s)
- Sung-Yoon Ahn
- Pattern Recognition and Machine Learning Lab, Department of AI Software, Gachon University, Seongnam 13557, Korea
| | - Mira Kim
- Department of Microbiology and Immunology, Chosun University School of Dentistry, Gwangju 61452, Korea
| | - Ji-Eun Bae
- Department of Microbiology and Immunology, Chosun University School of Dentistry, Gwangju 61452, Korea
| | - Iel-Soo Bang
- Department of Microbiology and Immunology, Chosun University School of Dentistry, Gwangju 61452, Korea
| | - Sang-Woong Lee
- Pattern Recognition and Machine Learning Lab, Department of AI Software, Gachon University, Seongnam 13557, Korea
|
28
|
Zendrini A, Guerra G, Sagini K, Vagner T, Di Vizio D, Bergese P. On the surface-to-bulk partition of proteins in extracellular vesicles. Colloids Surf B Biointerfaces 2022; 218:112728. [DOI: 10.1016/j.colsurfb.2022.112728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Revised: 07/20/2022] [Accepted: 07/24/2022] [Indexed: 10/16/2022]
|
29
|
Gurgul A, Szmatoła T, Ocłoń E, Jasielczuk I, Semik-Gurgul E, Finno CJ, Petersen JL, Bellone R, Hales EN, Ząbek T, Arent Z, Kotula-Balak M, Bugno-Poniewierska M. Another lesson from unmapped reads: in-depth analysis of RNA-Seq reads from various horse tissues. J Appl Genet 2022; 63:571-581. [PMID: 35670911 DOI: 10.1007/s13353-022-00705-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Revised: 04/27/2022] [Accepted: 05/31/2022] [Indexed: 11/25/2022]
Abstract
In recent years, a vast amount of sequencing data has been generated and large improvements have been made to reference genome sequences. Despite these advances, significant portions of reads still do not map to reference genomes and these reads have been considered as junk or artificial sequences. Recent studies have shown that these reads can be useful, e.g., for refining reference genomes or detecting contaminating microorganisms present in the analyzed biological samples. A special case of this is RNA sequencing (RNA-Seq) reads that come from tissue transcriptomes. Unmapped reads from RNA-Seq have received much less attention than those from whole-genome sequencing. In particular, in the horse, an analysis of unmapped RNA reads has not been performed yet. Thus, in this study, we analyzed the unmapped reads originating from the RNA-Seq performed through the Functional Annotation of Animal Genomes (FAANG) project in the horse, using eight different tissues from two mares. We demonstrated that unmapped reads from RNA-Seq could be easily assembled into transcripts relating to many important genes present in the sequences of other mammals. Large portions of these transcripts did not have coding potential and, thus, can be considered as non-coding RNA. Moreover, reads that were not mapped to the reference genome but aligned to the entries in the NCBI database of horse proteins were enriched for biological processes that largely correspond to the functions of the organ from which the RNA was isolated and thus are presumably true transcripts of genes associated with cell metabolism in those tissues. In addition, a portion of reads aligned to the common pathogenic or neutral microbiota, of which the most common was Brucella spp. These data suggest that unmapped reads can be an important target for in-depth analysis that may substantially enrich the results of initial RNA-Seq experiments for various tissues and organs.
Affiliation(s)
- Artur Gurgul
- Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland.
| | - Tomasz Szmatoła
- Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
| | - Ewa Ocłoń
- Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
| | - Igor Jasielczuk
- Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
| | - Ewelina Semik-Gurgul
- Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
| | - Carrie J Finno
- Department of Population Health and Reproduction, University of California Davis School of Veterinary Medicine, Davis, CA, USA
| | - Jessica L Petersen
- Department of Animal Science, University of Nebraska Lincoln, Lincoln, NB, USA
| | - Rebecca Bellone
- Department of Population Health and Reproduction, University of California Davis School of Veterinary Medicine, Davis, CA, USA
- Veterinary Genetics Laboratory, University of California Davis School of Veterinary Medicine, Davis, CA, USA
| | - Erin N Hales
- Department of Population Health and Reproduction, University of California Davis School of Veterinary Medicine, Davis, CA, USA
| | - Tomasz Ząbek
- Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
| | - Zbigniew Arent
- Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
| | - Małgorzata Kotula-Balak
- University Centre of Veterinary Medicine, University of Agriculture in Krakow, Mickiewicza 24/28, 30-059, Krakow, Poland
| | - Monika Bugno-Poniewierska
- Department of Animal Reproduction, Anatomy and Genomics, University of Agriculture in Kraków, al. Mickiewicza 24/28, 30-059, Kraków, Poland
|
30
|
Miras M, Pottier M, Schladt TM, Ejike JO, Redzich L, Frommer WB, Kim JY. Plasmodesmata and their role in assimilate translocation. Journal of Plant Physiology 2022; 270:153633. [PMID: 35151953 DOI: 10.1016/j.jplph.2022.153633] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/11/2021] [Revised: 01/26/2022] [Accepted: 01/26/2022] [Indexed: 06/14/2023]
Abstract
During multicellularization, plants evolved unique cell-cell connections, the plasmodesmata (PD). PD of angiosperms are complex cellular domains, embedded in the cell wall and consisting of multiple membranes and a large number of proteins. From the beginning, it had been assumed that PD provide passage for a wide range of molecules, from ions to metabolites and hormones, to RNAs and even proteins. In the context of assimilate allocation, it has been hypothesized that sucrose produced in mesophyll cells is transported via PD from cell to cell down a concentration gradient towards the phloem. Entry into the sieve element companion cell complex (SECCC) is then mediated via one of three potential routes, depending on the species and conditions: via diffusion across PD, via PD after conversion to raffinose using a polymer trap mechanism, or via a set of transporters that secrete sucrose from one cell, followed by secondary active uptake into the SECCC. Multiple loading mechanisms can likely coexist. We here review the current knowledge regarding photoassimilate transport across PD between cells as a prerequisite for translocation from leaves to recipient organs, in particular roots and developing seeds. We summarize the state-of-the-art in protein composition, structure, transport mechanism and regulation of PD to apprehend their functions in carbohydrate allocation. Since many aspects of PD biology remain elusive, we highlight areas that require new approaches and technologies to advance our understanding of these enigmatic and important cell-cell connections.
Affiliation(s)
- Manuel Miras
- Institute for Molecular Physiology, Heinrich-Heine-University Düsseldorf, Düsseldorf, 40225, Germany
| | - Mathieu Pottier
- Institute for Molecular Physiology, Heinrich-Heine-University Düsseldorf, Düsseldorf, 40225, Germany
| | - T Moritz Schladt
- Institute for Molecular Physiology, Heinrich-Heine-University Düsseldorf, Düsseldorf, 40225, Germany
| | - J Obinna Ejike
- Institute for Molecular Physiology and Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich-Heine-University Düsseldorf, Düsseldorf, 40225, Germany
| | - Laura Redzich
- Institute for Molecular Physiology and Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich-Heine-University Düsseldorf, Düsseldorf, 40225, Germany
| | - Wolf B Frommer
- Institute for Molecular Physiology and Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich-Heine-University Düsseldorf, Düsseldorf, 40225, Germany; Institute of Transformative Bio-Molecules (WPI-ITbM), Nagoya University, Chikusa, Nagoya, 464-8601, Japan.
| | - Ji-Yun Kim
- Institute for Molecular Physiology and Cluster of Excellence on Plant Sciences (CEPLAS), Heinrich-Heine-University Düsseldorf, Düsseldorf, 40225, Germany
|
31
|
Sorokina I, Mushegian AR, Koonin EV. Is Protein Folding a Thermodynamically Unfavorable, Active, Energy-Dependent Process? Int J Mol Sci 2022; 23:521. [PMID: 35008947 PMCID: PMC8745595 DOI: 10.3390/ijms23010521] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 12/06/2021] [Revised: 12/30/2021] [Accepted: 12/31/2021] [Indexed: 02/04/2023] Open
Abstract
The prevailing current view of protein folding is the thermodynamic hypothesis, under which the native folded conformation of a protein corresponds to the global minimum of Gibbs free energy G. We question this concept and show that the empirical evidence behind the thermodynamic hypothesis of folding is far from strong. Furthermore, physical theory-based approaches to the prediction of protein folds and their folding pathways so far have invariably failed except for some very small proteins, despite decades of intensive theory development and the enormous increase of computer power. The recent spectacular successes in protein structure prediction owe to evolutionary modeling of amino acid sequence substitutions enhanced by deep learning methods, but even these breakthroughs provide no information on the protein folding mechanisms and pathways. We discuss an alternative view of protein folding, under which the native state of most proteins does not occupy the global free energy minimum, but rather, a local minimum on a fluctuating free energy landscape. We further argue that ΔG of folding is likely to be positive for the majority of proteins, which therefore fold into their native conformations only through interactions with the energy-dependent molecular machinery of living cells, in particular, the translation system and chaperones. Accordingly, protein folding should be modeled as it occurs in vivo, that is, as a non-equilibrium, active, energy-dependent process.
Affiliation(s)
| | - Arcady R. Mushegian
- Division of Molecular and Cellular Biosciences, National Science Foundation, Alexandria, VA 22314, USA;
- Clare Hall College, University of Cambridge, Cambridge CB3 9AL, UK
| | - Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
32
|
Forman-Kay JD, Ditlev JA, Nosella ML, Lee HO. What are the distinguishing features and size requirements of biomolecular condensates and their implications for RNA-containing condensates? RNA (New York, N.Y.) 2022; 28:36-47. [PMID: 34772786 PMCID: PMC8675286 DOI: 10.1261/rna.079026.121] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 05/08/2023]
Abstract
Exciting recent work has highlighted that numerous cellular compartments lack encapsulating lipid bilayers (often called "membraneless organelles"), and that their structure and function are central to the regulation of key biological processes, including transcription, RNA splicing, translation, and more. These structures have been described as "biomolecular condensates" to underscore that biomolecules can be significantly concentrated in them. Many condensates, including RNA granules and processing bodies, are enriched in proteins and nucleic acids. Biomolecular condensates exhibit a range of material states from liquid- to gel-like, with the physical process of liquid-liquid phase separation implicated in driving or contributing to their formation. To date, in vitro studies of phase separation have provided mechanistic insights into the formation and function of condensates. However, the link between the often micron-sized in vitro condensates and their nanometer-sized cellular correlates has not been well established. Consequently, questions have arisen as to whether cellular structures below the optical resolution limit can be considered biomolecular condensates. Similarly, the distinction between condensates and discrete dynamic hub complexes is debated. Here we discuss the key features that define biomolecular condensates to help understand behaviors of structures containing and generating RNA.
Affiliation(s)
- Julie D Forman-Kay
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
| | - Jonathon A Ditlev
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Cell Biology Program, The Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
| | - Michael L Nosella
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
| | - Hyun O Lee
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
|
33
|
Williams BAP, Williams TA, Trew J. Comparative Genomics of Microsporidia. Experientia Supplementum (2012) 2022; 114:43-69. [PMID: 35543998 DOI: 10.1007/978-3-030-93306-7_2] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/15/2023]
Abstract
The microsporidia are a phylum of intracellular parasites that represent the eukaryotic cell in a state of extreme reduction, with genomes and metabolic capabilities embodying eukaryotic cells in arguably their most streamlined state. Over the past 20 years, microsporidian genomics has become a rapidly expanding field, starting with sequencing of the genome of Encephalitozoon cuniculi, one of the first ever sequenced eukaryotes, and reaching the current situation where we have access to the data from over 30 genomes across 20+ genera. Reaching back further in evolutionary history, to the point where microsporidia diverged from other eukaryotic lineages, we now also have genomic data for some of the closest known relatives of the microsporidia such as Rozella allomycis, Metchnikovella spp. and Amphiamblys sp. Data for these organisms allow us to better understand the genomic processes that shaped the emergence of the microsporidia as a group. These intensive genomic efforts have revealed some of the processes that have shaped microsporidian cells and genomes, including patterns of genome expansions and contractions through gene gain and loss, whole genome duplication, and differential patterns of invasion and purging of transposable elements. All these processes have been shown to occur across short and longer time scales to give rise to a phylum of parasites with dynamic genomes with a diversity of sizes and organisations.
Affiliation(s)
| | - Tom A Williams
- School of Biological Sciences, University of Bristol, Bristol, UK
| | - Jahcub Trew
- School of Biosciences, University of Exeter, Exeter, UK
| |
Collapse
|
34
|
van den Bent I, Makrodimitris S, Reinders M. The Power of Universal Contextualized Protein Embeddings in Cross-species Protein Function Prediction. Evol Bioinform Online 2021; 17:11769343211062608. [PMID: 34880594 PMCID: PMC8647222 DOI: 10.1177/11769343211062608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2021] [Accepted: 11/03/2021] [Indexed: 11/16/2022] Open
Abstract
Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labeled protein training data. Unsupervised protein embeddings partly circumvent this limitation by learning a universal protein representation from many unlabeled sequences. Such embeddings incorporate contextual information of amino acids, thereby modeling the underlying principles of protein sequences insensitive to the context of species. We used an existing pre-trained protein embedding method and subjected its molecular function prediction performance to detailed characterization, first to advance the understanding of protein language models, and second to determine areas of improvement. Then, we applied the model in a transfer learning task by training a function predictor based on the embeddings of annotated protein sequences of one training species and making predictions on the proteins of several test species with varying evolutionary distance. We show that this approach successfully generalizes knowledge about protein function from one eukaryotic species to various other species, outperforming both an alignment-based and a supervised-learning-based baseline. This implies that such a method could be effective for molecular function prediction in inadequately annotated species from understudied taxonomic kingdoms.
Affiliation(s)
- Irene van den Bent
- Delft Bioinformatics Lab, Delft University of Technology, Delft, the Netherlands
| | - Stavros Makrodimitris
- Delft Bioinformatics Lab, Delft University of Technology, Delft, the Netherlands
- Keygene N.V., Wageningen, the Netherlands
| | - Marcel Reinders
- Delft Bioinformatics Lab, Delft University of Technology, Delft, the Netherlands
|
35
|
Adiguzel Y. Information-theoretic approach in allometric scaling relations of DNA and proteins. Chem Biol Drug Des 2021; 99:331-343. [PMID: 34855304 DOI: 10.1111/cbdd.13988] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2021] [Revised: 10/06/2021] [Accepted: 11/14/2021] [Indexed: 11/28/2022]
Abstract
Allometric scaling relations can be observed between molecular parameters. Hence, we looked for the presence of such a relation between the sizes (i.e., lengths) of proteins and genes. Protein lengths exist in the literature as the number of amino acids. They can also be derived from the mRNA lengths. Here, we looked for an allometric scaling relation by using such data and, simultaneously, the data were compared with the sizes of genes and proteins that were obtained from our modified information-theoretic approach. Results implied the presence of a scaling relation in the calculated results. This was expected due to the implemented modification in the information-theoretic calculation. The relation in the literature-based data lacked a high goodness-of-fit value. This could be due to physical factors and selective pressures, which resulted in deviations of the literature-sourced values from those in the model. Genome size is correlated with cell size. Intracellular volume, which is related to the DNA size, would require a certain number of proteins, the sizes of which can therefore be correlated with the protein sizes. Cell sizes, genome sizes, and average protein and gene sizes, along with the number of proteins, namely the expression levels of the genes, are the physical factors, and the molecular factors influence those physical factors. The selective pressures on those can act through the connection between those physical factors and limit the dynamic ranges. Biological measures could be prone to such forces and are likely to deviate from expected models, regardless of the validity of assumptions, unless those are also implemented in the models. Yet, the present discrepancies could be pointing at the need for model improvement, data imperfection, invalid assumptions, etc. Still, the current work highlights the possible use of an information-theoretic approach in studies of allometric scaling relations.
Affiliation(s)
- Yekbun Adiguzel
- Department of Medical Biology, School of Medicine, Atilim University, Ankara, Turkey
|
36
|
Fan C, Deng Q, Zhu TF. Bioorthogonal information storage in L-DNA with a high-fidelity mirror-image Pfu DNA polymerase. Nat Biotechnol 2021; 39:1548-1555. [PMID: 34326549 DOI: 10.1038/s41587-021-00969-6] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 11/23/2020] [Accepted: 05/31/2021] [Indexed: 02/07/2023]
Abstract
Natural DNA is exquisitely evolved to store genetic information. The chirally inverted L-DNA, possessing the same informational capacity but resistant to biodegradation, may serve as a robust, bioorthogonal information repository. Here we chemically synthesize a 90-kDa high-fidelity mirror-image Pfu DNA polymerase that enables accurate assembly of a kilobase-sized mirror-image gene. We use the polymerase to encode in L-DNA an 1860 paragraph by Louis Pasteur that first proposed a mirror-image world of biology. We realize chiral steganography by embedding a chimeric D-DNA/L-DNA key molecule in a D-DNA storage library, which conveys a false or secret message depending on the chirality of reading. Furthermore, we show that a trace amount of an L-DNA barcode preserved in water from a local pond remains amplifiable and sequenceable for 1 year, whereas a D-DNA barcode under the same conditions could not be amplified after 1 day. These next-generation mirror-image molecular tools may transform the development of advanced mirror-image biology systems and pave the way for the realization of the mirror-image central dogma and exploration of their applications.
Affiliation(s)
- Chuyao Fan
- School of Life Sciences, Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, Beijing Advanced Innovation Center for Structural Biology, Center for Synthetic and Systems Biology, Ministry of Education Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology, Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing, China
| | - Qiang Deng
- School of Life Sciences, Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, Beijing Advanced Innovation Center for Structural Biology, Center for Synthetic and Systems Biology, Ministry of Education Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology, Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing, China
| | - Ting F Zhu
- School of Life Sciences, Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, Beijing Advanced Innovation Center for Structural Biology, Center for Synthetic and Systems Biology, Ministry of Education Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology, Ministry of Education Key Laboratory of Bioinformatics, Tsinghua University, Beijing, China.
|
37
|
MacKinnon SS, Madani Tonekaboni SA, Windemuth A. Proteome-Scale Drug-Target Interaction Predictions: Approaches and Applications. Curr Protoc 2021; 1:e302. [PMID: 34794211 DOI: 10.1002/cpz1.302] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022]
Abstract
Drug-Target interaction predictions are an important cornerstone of computer-aided drug discovery. While predictive methods around individual targets have a long history, the application of proteome-scale models is relatively recent. In this overview, we will provide the context required to understand advances in this emerging field within computational drug discovery, evaluate emerging technologies for suitability to given tasks, and provide guidelines for the design and implementation of new drug-target interaction prediction models. We will discuss the validation approaches used, and propose a set of key criteria that should be applied to evaluate their validity. We note that we find widespread deficiencies in the existing literature, making it difficult to judge the practical effectiveness of some of the techniques proposed from their publications alone. We hope that this review may help remedy this situation and increase awareness of several sources of bias that may enter into commonly used cross-validation methods. © 2021 Cyclica Inc. Current Protocols published by Wiley Periodicals LLC.
|
38
|
Zamyatnin AA, Belozerskaya TA, Zamyatnin AA. Taxonomy of Mitochondrial Cytochrome B Proteins of the Same Amino Acid Sequence Length. ScientificWorldJournal 2021; 2021:1041818. [PMID: 34803523 PMCID: PMC8601843 DOI: 10.1155/2021/1041818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2021] [Accepted: 10/26/2021] [Indexed: 11/17/2022] Open
Abstract
Prior to this study, we discovered a protein characterized by many different amino acid sequences with the same number of amino acid residues. This turned out to be a unique cytochrome b, in which 1048 molecules out of 1689 contain 379 amino acid residues. A detailed study of the occurrence of this protein in living organisms at different taxonomic levels (from biological domains to biological orders of animals) has been carried out in the work presented here. We found that the main part of all b cytochromes is present in eukaryotes (99.2%), in biological kingdoms (95.9% in animals), in biological phyla (97.5% in chordates), and in biological classes (79.7% in mammals). Moreover, this protein, containing 379 amino acid residues and characterized by many different amino acid sequences, is found only in eukaryotes (100%), only in animals (100%) and mainly in mammals (81.1%). Thus, a representative that has cytochrome b with a corresponding number of amino acid residues has not yet been identified among archaea and prokaryotes, while it is common in representatives of different biological types, classes, and orders of animals. It is believed that the structural diversity of a given protein within the same length and its one function of participation in the process of electron transfer relate to the physicochemical features of the extra- and intramembrane fragments of the polypeptide chain of this protein.
Affiliation(s)
- Alexander A. Zamyatnin
- A.N. Bach Institute of Biochemistry, Federal Research Center of Biotechnology, Russian Academy of Sciences, Moscow 119071, Russia
| | - Tatiana A. Belozerskaya
- A.N. Bach Institute of Biochemistry, Federal Research Center of Biotechnology, Russian Academy of Sciences, Moscow 119071, Russia
| | - Andrey A. Zamyatnin
- Institute of Molecular Medicine, Sechenov First Moscow State Medical University, Moscow, Russia
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
- Department of Biotechnology, Sirius University of Science and Technology, Sochi, Russia
|
39
|
Conquer by cryo-EM without physically dividing. Biochem Soc Trans 2021; 49:2287-2298. [PMID: 34709401 DOI: 10.1042/bst20210360] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/06/2021] [Revised: 09/29/2021] [Accepted: 10/05/2021] [Indexed: 12/15/2022]
Abstract
This mini-review provides an update regarding the substantial progress that has been made in using single-particle cryo-EM to obtain high-resolution structures for proteins and other macromolecules whose particle sizes are smaller than 100 kDa. We point out that establishing the limits of what can be accomplished, both in terms of particle size and attainable resolution, serves as a guide for what might be expected when attempting to improve the resolution of small flexible portions of a larger structure using focused refinement approaches. These approaches, which involve computationally ignoring all but a specific, targeted region of interest on the macromolecules, are known as 'masking and refining,' and they are thus the computational equivalent of the 'divide and conquer' approach that has been used so successfully in X-ray crystallography. The benefit of masked refinement, however, is that one is able to determine structures in their native architectural context, without physically separating them from the biological connections that they require for their function. This mini-review also compares where experimental achievements currently stand relative to various theoretical estimates for the smallest particle size that can be successfully reconstructed to high resolution. Since it is clear that a substantial gap still remains between the two, we briefly recap the areas in which further improvement seems possible, both in equipment and in methods.
|
40
|
Ahnert M, Schalk T, Brückner H, Effenberger J, Kuehn V, Krebs P. Organic matter parameters in WWTP - a critical review and recommendations for application in activated sludge modelling. Water Science and Technology: A Journal of the International Association on Water Pollution Research 2021; 84:2093-2112. [PMID: 34810300 DOI: 10.2166/wst.2021.419] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/13/2023]
Abstract
This paper includes a comprehensive literature review of sludge composition data from wastewater treatment plants. 722 data sets from 249 sources were used to establish typical ratios between COD and solids-based parameters and to verify rule-of-thumb values, respectively. Confirmation of these typical ratios can also be accomplished by using biochemical composition data. It is shown that a correlation between data from proteins, lipids and carbohydrates analysis can be related to COD/VSS ratios. Finally, using the findings from the literature review, the organic and inorganic conversion factors of COD fractions in activated sludge models are adjusted to solids-based parameters. It was shown that with the adjustments of the factors and a partition of the particulate inert fraction into a fraction assigned to the influent and a fraction assigned to the endogenous products, a better agreement with the ratios of COD/VSS in the individual sludge streams can be established.
Affiliation(s)
- Markus Ahnert
- Technische Universität Dresden, Institute of Urban and Industrial Water Management, 01062 Dresden, Germany
| | - Thomas Schalk
- Technische Universität Dresden, Institute of Urban and Industrial Water Management, 01062 Dresden, Germany
| | - Heike Brückner
- Technische Universität Dresden, Institute of Urban and Industrial Water Management, 01062 Dresden, Germany
| | | | - Volker Kuehn
- Stadtentwässerung Dresden GmbH, Scharfenberger Str. 152, 01139 Dresden, Germany
| | - Peter Krebs
- Technische Universität Dresden, Institute of Urban and Industrial Water Management, 01062 Dresden, Germany
|
41
|
A general approach to protein folding using thermostable exoshells. Nat Commun 2021; 12:5720. [PMID: 34588451 PMCID: PMC8481291 DOI: 10.1038/s41467-021-25996-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/28/2021] [Accepted: 09/07/2021] [Indexed: 02/08/2023] Open
Abstract
In vitro protein folding is a complex process which often results in protein aggregation, low yields and low specific activity. Here we report the use of nanoscale exoshells (tES) to provide complementary nanoenvironments for the folding and release of 12 highly diverse protein substrates ranging from small protein toxins to human albumin, a dimeric protein (alkaline phosphatase), a trimeric ion channel (Omp2a) and the tetrameric tumor suppressor, p53. These proteins represent a unique diversity in size, volume, disulfide linkages, isoelectric point and multi versus monomeric nature of their functional units. Protein encapsulation within tES increased crude soluble yield (3-fold to >100-fold), functional yield (2-fold to >100-fold) and specific activity (3-fold to >100-fold) for all the proteins tested. The average soluble yield was 6.5 mg/100 mg of tES with charge complementation between the tES internal cavity and the protein substrate being the primary determinant of functional folding. Our results confirm the importance of nanoscale electrostatic effects and provide a solution for folding proteins in vitro.
|
42
|
Tsaban T, Stupp D, Sherill-Rofe D, Bloch I, Sharon E, Schueler-Furman O, Wiener R, Tabach Y. CladeOScope: functional interactions through the prism of clade-wise co-evolution. NAR Genom Bioinform 2021; 3:lqab024. [PMID: 33928243 PMCID: PMC8057497 DOI: 10.1093/nargab/lqab024] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/10/2020] [Revised: 03/12/2021] [Accepted: 03/18/2021] [Indexed: 12/11/2022] Open
Abstract
Mapping co-evolved genes via phylogenetic profiling (PP) is a powerful approach to uncover functional interactions between genes and to associate them with pathways. Despite many successful endeavors, the understanding of co-evolutionary signals in eukaryotes remains partial. Our hypothesis is that 'Clades', branches of the tree of life (e.g. primates and mammals), encompass signals that cannot be detected by PP using all eukaryotes. As such, integrating information from different clades should reveal local co-evolution signals and improve function prediction. Accordingly, we analyzed 1028 genomes in 66 clades and demonstrated that the co-evolutionary signal was scattered across clades. We showed that functionally related genes are frequently co-evolved in only parts of the eukaryotic tree and that clades are complementary in detecting functional interactions within pathways. We examined the non-homologous end joining pathway and the UFM1 ubiquitin-like protein pathway and showed that both displayed distinct co-evolution patterns in specific clades. Our research offers a different way to look at co-evolution across eukaryotes and points to the importance of modular co-evolution analysis. We developed the 'CladeOScope' PP method to integrate information from 16 clades across over 1000 eukaryotic genomes; it is accessible via an easy-to-use web server at http://cladeoscope.cs.huji.ac.il.
Affiliation(s)
- Tomer Tsaban
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada and Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Doron Stupp
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada and Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Dana Sherill-Rofe
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada and Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Idit Bloch
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada and Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Elad Sharon
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada and Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Ora Schueler-Furman
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada and Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Reuven Wiener
- Department of Biochemistry and Molecular Biology, Institute for Medical Research Israel-Canada and Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| | - Yuval Tabach
- Department of Developmental Biology and Cancer Research, Institute for Medical Research Israel-Canada and Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 9112001, Israel
| |
|
43
|
Holzheu P, Krebs M, Larasati C, Schumacher K, Kummer U. An integrative view on vacuolar pH homeostasis in Arabidopsis thaliana: Combining mathematical modeling and experimentation. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2021; 106:1541-1556. [PMID: 33780094 DOI: 10.1111/tpj.15251] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/08/2020] [Revised: 02/27/2021] [Accepted: 03/10/2021] [Indexed: 06/12/2023]
Abstract
The acidification of plant vacuoles is of great importance for various physiological processes, as a multitude of secondary active transporters utilize the proton gradient established across the vacuolar membrane. Vacuolar-type H+-translocating ATPases and a pyrophosphatase are thought to enable vacuoles to accumulate protons against their electrochemical potential. However, recent studies suggested that the ATPase located at the trans-Golgi network/early endosome (TGN/EE) contributes to vacuolar acidification in a manner that is not yet understood. Here, we combined experimental data and computational modeling to test different hypotheses for vacuolar acidification mechanisms. For this, we analyzed different models with respect to their ability to describe existing experimental data. To better differentiate between alternative acidification mechanisms, new experimental data were generated. By fitting the models to the experimental data, we were able to prioritize the hypothesis in which vesicular trafficking of Ca2+/H+-antiporters from the TGN/EE to the vacuolar membrane and the activity of ATP-dependent Ca2+-pumps at the tonoplast might explain the residual acidification observed in Arabidopsis mutants defective in vacuolar proton pump activity. The presented modeling approach provides an integrative perspective on vacuolar pH regulation in Arabidopsis and holds potential to guide further experimental work.
Affiliation(s)
- Pascal Holzheu
- Department of Modeling of Biological Processes, COS Heidelberg/Bioquant, Heidelberg University, Im Neuenheimer Feld 267, Heidelberg, 69120, Germany
| | - Melanie Krebs
- Department of Cell Biology, COS Heidelberg, Heidelberg University, Im Neuenheimer Feld 230, Heidelberg, 69120, Germany
| | - Catharina Larasati
- Department of Cell Biology, COS Heidelberg, Heidelberg University, Im Neuenheimer Feld 230, Heidelberg, 69120, Germany
| | - Karin Schumacher
- Department of Cell Biology, COS Heidelberg, Heidelberg University, Im Neuenheimer Feld 230, Heidelberg, 69120, Germany
| | - Ursula Kummer
- Department of Modeling of Biological Processes, COS Heidelberg/Bioquant, Heidelberg University, Im Neuenheimer Feld 267, Heidelberg, 69120, Germany
| |
|
44
|
What's in a mass? Biochem Soc Trans 2021; 49:1027-1037. [PMID: 33929513 DOI: 10.1042/bst20210288] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 02/19/2021] [Revised: 03/27/2021] [Accepted: 03/30/2021] [Indexed: 02/03/2023]
Abstract
This short essay aims to make the reader reflect on the concept of biological mass and on the added value that the determination of this molecular property of a protein brings to the interpretation of evolutionary and translational snake venomics research. Starting from the premise that the amino acid sequence is the most distinctive primary molecular characteristic of any protein, the thesis underlying the first part of this essay is that the isotopic distribution of a protein's molecular mass serves to unambiguously differentiate it from any other protein in an organism's proteome. In the second part of the essay, we discuss examples of collaborative projects among our laboratories, where mass profiling of snake venom PLA2 across conspecific populations played a key role in revealing dispersal routes that determined the current phylogeographic pattern of the species.
|
45
|
Mohanta TK, Mishra AK, Khan A, Hashem A, Abd-Allah EF, Al-Harrasi A. Virtual 2-D map of the fungal proteome. Sci Rep 2021; 11:6676. [PMID: 33758316 PMCID: PMC7988114 DOI: 10.1038/s41598-021-86201-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/15/2020] [Accepted: 03/03/2021] [Indexed: 02/08/2023] Open
Abstract
The molecular weight and isoelectric point (pI) of proteins play an important role in the cell. Depending upon their shape, size, and charge, proteins perform their functional roles in different parts of the cell. Knowledge of their molecular weight and charge (pI) is therefore very important. We conducted a proteome-wide analysis of the protein sequences of 689 fungal species (7.15 million protein sequences) and constructed a virtual 2-D map of the fungal proteome. The analysis of the constructed map revealed a bimodal distribution of fungal proteomes. The molecular mass of individual fungal proteins ranged from 0.202 to 2546.166 kDa and the predicted isoelectric point (pI) ranged from 1.85 to 13.759, while the average molecular weight of the fungal proteome was 50.98 kDa. A non-ribosomal peptide synthase (RFU80400.1) found in Trichoderma arundinaceum was identified as the largest protein in the fungal kingdom. The collective fungal proteome is dominated by acidic rather than basic pI proteins; Leu is the most abundant amino acid and Cys the least abundant. Aspergillus ustus encodes the highest percentage (76.62%) of acidic pI proteins while Nosema ceranae was found to encode the highest percentage (66.15%) of basic pI proteins. Selenocysteine and pyrrolysine were not found in any of the analysed fungal proteomes. Although the molecular weight and pI of proteins are of enormous importance for understanding their functional roles, the amino acid composition of fungal proteins will enable us to understand synonymous codon usage in the fungal kingdom. The small peptides identified during the study may have additional biotechnological implications.
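The two quantities at the centre of this abstract, molecular weight and amino acid composition, can be computed directly from a sequence. The following is a minimal sketch, not the authors' pipeline: the function names `molecular_weight_kda` and `composition` are hypothetical, and the masses are the commonly tabulated average residue masses, with one water added per chain. pI prediction is left out because it depends on a chosen pKa set.

```python
from collections import Counter

# Average residue masses in daltons (commonly tabulated values).
# One water molecule is added per chain for the free N- and C-termini.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528


def molecular_weight_kda(seq):
    """Average molecular weight of a protein sequence, in kDa."""
    seq = seq.upper()
    return (sum(RESIDUE_MASS[aa] for aa in seq) + WATER) / 1000.0


def composition(seqs):
    """Fraction of each amino acid across a collection of sequences."""
    counts = Counter()
    for s in seqs:
        counts.update(s.upper())
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


if __name__ == "__main__":
    toy_proteome = ["MKLLV", "ACLLE"]  # made-up sequences, for illustration only
    for s in toy_proteome:
        print(s, round(molecular_weight_kda(s), 4), "kDa")
    print(composition(toy_proteome))
```

Plotting molecular weight against a predicted pI for every sequence would give the kind of virtual 2-D map the abstract describes; residues outside the 20 standard amino acids raise a `KeyError` in this sketch and would need explicit handling on real data.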
Affiliation(s)
- Tapan Kumar Mohanta
- Natural and Medical Sciences Research Center, University of Nizwa, Nizwa, 616, Oman.
| | - Awdhesh Kumar Mishra
- Department of Biotechnology, Yeungnam University, Gyeongsan, Gyeongsangbuk-do, 38541, Republic of Korea
| | - Adil Khan
- Natural and Medical Sciences Research Center, University of Nizwa, Nizwa, 616, Oman
| | - Abeer Hashem
- Botany and Microbiology Department, College of Science, King Saud University, P.O. Box. 2460, Riyadh, 11451, Saudi Arabia
- Mycology and Plant Disease Survey Department, Plant Pathology Research Institute, ARC, Giza, 12511, Egypt
| | - Elsayed Fathi Abd-Allah
- Plant Production Department, College of Food and Agricultural Sciences, King Saud University, P.O. Box. 2460, Riyadh, 11451, Saudi Arabia
| | - Ahmed Al-Harrasi
- Natural and Medical Sciences Research Center, University of Nizwa, Nizwa, 616, Oman.
| |
|
46
|
Padgitt-Cobb LK, Kingan SB, Wells J, Elser J, Kronmiller B, Moore D, Concepcion G, Peluso P, Rank D, Jaiswal P, Henning J, Hendrix DA. A draft phased assembly of the diploid Cascade hop (Humulus lupulus) genome. THE PLANT GENOME 2021; 14:e20072. [PMID: 33605092 DOI: 10.1002/tpg2.20072] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/01/2020] [Accepted: 10/03/2020] [Indexed: 05/25/2023]
Abstract
Hop (Humulus lupulus L. var Lupulus) is a diploid, dioecious plant with a history of cultivation spanning more than one thousand years. Hop cones are valued for their use in brewing and contain compounds of therapeutic interest including xanthohumol. Efforts to determine how biochemical pathways responsible for desirable traits are regulated have been challenged by the large (2.8 Gb), repetitive, and heterozygous genome of hop. We present a draft haplotype-phased assembly of the Cascade cultivar genome. Our draft assembly and annotation of the Cascade genome is the most extensive representation of the hop genome to date. PacBio long-read sequences from hop were assembled with FALCON and partially phased with FALCON-Unzip. Comparative analysis of haplotype sequences provides insight into selective pressures that have driven evolution in hop. We discovered genes with greater sequence divergence enriched for stress-response, growth, and flowering functions in the draft phased assembly. With improved resolution of long terminal retrotransposons (LTRs) due to long-read sequencing, we found that hop is over 70% repetitive. We identified a homolog of cannabidiolic acid synthase (CBDAS) that is expressed in multiple tissues. The approaches we developed to analyze the draft phased assembly serve to deepen our understanding of the genomic landscape of hop and may have broader applicability to the study of other large, complex genomes.
Affiliation(s)
- Lillian K Padgitt-Cobb
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR, 97331, USA
| | - Sarah B Kingan
- Pacific Biosciences of California, Menlo Park, CA, 94025, USA
| | - Jackson Wells
- Center for Genome Research and Biocomputing, Oregon State University, Corvallis, OR, 97331, USA
| | - Justin Elser
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, 97331, USA
| | - Brent Kronmiller
- Center for Genome Research and Biocomputing, Oregon State University, Corvallis, OR, 97331, USA
| | | | | | - Paul Peluso
- Pacific Biosciences of California, Menlo Park, CA, 94025, USA
| | - David Rank
- Pacific Biosciences of California, Menlo Park, CA, 94025, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, 97331, USA
| | | | - David A Hendrix
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis, OR, 97331, USA
- School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, 97331, USA
| |
|
47
|
Pavlova YS, Paez-Espino D, Morozov AY, Belalov IS. Searching for fat tails in CRISPR-Cas systems: Data analysis and mathematical modeling. PLoS Comput Biol 2021; 17:e1008841. [PMID: 33770071 PMCID: PMC8026048 DOI: 10.1371/journal.pcbi.1008841] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/21/2020] [Revised: 04/07/2021] [Accepted: 03/01/2021] [Indexed: 12/28/2022] Open
Abstract
Understanding CRISPR-Cas systems, the adaptive defence mechanism that about half of bacterial species and most archaea use to neutralise viral attacks, is important for explaining the biodiversity observed in the microbial world as well as for editing animal and plant genomes effectively. The CRISPR-Cas system learns from previous viral infections and integrates small pieces from phage genomes called spacers into the microbial genome. The resulting library of spacers collected in CRISPR arrays is then compared with the DNA of potential invaders. One of the most intriguing and least well understood questions about CRISPR-Cas systems is the distribution of spacers across the microbial population. Here, using empirical data, we show that the global distribution of spacer numbers in CRISPR arrays across multiple biomes worldwide typically exhibits scale-invariant power law behaviour, and the standard deviation is greater than the sample mean. We develop a mathematical model of spacer loss and acquisition dynamics which fits observed data from almost four thousand metagenomes well. In analogy to the classical 'rich-get-richer' mechanism of power law emergence, the rate of spacer acquisition is proportional to the CRISPR array size, which allows a small proportion of CRISPRs within the population to possess a significant number of spacers. Our study provides an alternative explanation for the rarity of all-resistant super microbes in nature and why proliferation of phages can be highly successful despite the effectiveness of CRISPR-Cas systems.
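The 'rich-get-richer' idea in this abstract, acquisition proportional to array size combined with spacer loss, can be illustrated with a toy simulation. This is a sketch under stated assumptions and not the authors' model: the function name `simulate_spacer_counts`, the discrete time steps, the constant per-step loss probability, and all parameter values are chosen here purely for illustration.

```python
import random
from statistics import mean, pstdev


def simulate_spacer_counts(n_arrays=2000, steps=200,
                           gain_rate=0.01, loss_rate=0.005, seed=0):
    """Toy size-proportional gain / constant loss process for CRISPR arrays.

    Every array starts with one spacer. At each step an array of size s
    gains a spacer with probability min(1, gain_rate * s) and, if it holds
    more than one spacer, loses one with probability loss_rate.
    All rates are illustrative assumptions, not fitted values.
    """
    rng = random.Random(seed)
    sizes = [1] * n_arrays
    for _ in range(steps):
        for i, s in enumerate(sizes):
            if rng.random() < min(1.0, gain_rate * s):
                s += 1  # acquisition, more likely for larger arrays
            if s > 1 and rng.random() < loss_rate:
                s -= 1  # loss, independent of array size in this sketch
            sizes[i] = s
    return sizes


if __name__ == "__main__":
    sizes = simulate_spacer_counts()
    print("mean spacers per array:", mean(sizes))
    print("standard deviation:", pstdev(sizes))
    print("largest array:", max(sizes))
```

Comparing the printed mean and standard deviation for different rate choices gives a feel for how size-proportional acquisition stretches the upper tail of the distribution; whether a given parameter set reproduces the heavy tails reported in the abstract has to be checked against data rather than assumed.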
Affiliation(s)
- Yekaterina S. Pavlova
- Mathematics Department, Palomar College, San Marcos, California, United States of America
| | - David Paez-Espino
- Department of Energy, Joint Genome Institute, Walnut Creek, California, United States of America
- Mammoth BioSciences, South San Francisco, California, United States of America
| | - Andrew Yu. Morozov
- School of Mathematics and Actuarial Science, University of Leicester, Leicester, United Kingdom
- Institute of Ecology and Evolution, Russian Academy of Sciences, Moscow, Russia
| | - Ilya S. Belalov
- Laboratory of Microbial Viruses, Winogradsky Institute of Microbiology, Research Center of Biotechnology RAS, Moscow, Russia
| |
|
48
|
Leeb S, Yang F, Oliveberg M, Danielsson J. Connecting Longitudinal and Transverse Relaxation Rates in Live-Cell NMR. J Phys Chem B 2020; 124:10698-10707. [PMID: 33179918 PMCID: PMC7735724 DOI: 10.1021/acs.jpcb.0c08274] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 09/11/2020] [Revised: 10/22/2020] [Indexed: 12/26/2022]
Abstract
In the cytosolic environment, protein crowding and Brownian motions result in numerous transient encounters. Each such encounter event increases the apparent size of the interacting molecules, leading to slower rotational tumbling. The extent of transient protein complexes formed in live cells can conveniently be quantified by an apparent viscosity, based on NMR-detected spin-relaxation measurements, that is, the longitudinal (T1) and transverse (T2) relaxation. From combined analysis of three different proteins and surface mutations thereof, we find that T2 implies significantly higher apparent viscosity than T1. At first sight, the effect on T1 and T2 seems thus nonunifiable, consistent with previous reports on other proteins. We show here that the T1 and T2 deviation is actually not an inconsistency but an expected feature of a system with fast exchange between free monomers and transient complexes. In this case, the deviation is basically reconciled by a model with fast exchange between the free-tumbling reporter protein and a transient complex with a uniform 143 kDa partner. The analysis is then taken one step further by accounting for the fact that the cytosolic content is by no means uniform but comprises a wide range of molecular sizes. Integrating over the complete size distribution of the cytosolic interaction ensemble enables us to predict both T1 and T2 from a single binding model. The result yields a bound population for each protein variant and provides a quantification of the transient interactions. We finally extend the approach to obtain a correction term for the shape of a database-derived mass distribution of the interactome in the mammalian cytosol, in good accord with the existing data of the cellular composition.
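The fast-exchange picture in this abstract can be written compactly. As a sketch, assuming a simple two-state model (the symbols $p_B$, $R_i^{\mathrm{free}}$, $R_i^{\mathrm{bound}}$ and $P(M)$ are introduced here for illustration and are not taken from the paper), the observed relaxation rates are population-weighted averages of the free and bound states, and the extension to a non-uniform cytosol replaces the single bound-state rate with an average over a partner mass distribution:

```latex
% Two-state fast exchange: free reporter vs. transient complex
R_i^{\mathrm{obs}} = (1 - p_B)\, R_i^{\mathrm{free}} + p_B\, R_i^{\mathrm{bound}},
\qquad R_i = \frac{1}{T_i}, \quad i \in \{1, 2\}

% Assumed generalisation to a distribution P(M) of partner masses M
R_i^{\mathrm{obs}} = (1 - p_B)\, R_i^{\mathrm{free}}
  + p_B \int P(M)\, R_i^{\mathrm{bound}}(M)\, \mathrm{d}M
```

For slowly tumbling species $R_2$ keeps increasing with rotational correlation time, whereas $R_1$ passes through a maximum and then declines. The same bound population therefore shifts the two averages by different amounts, which is why apparent viscosities inferred separately from T1 and T2 need not agree.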
Affiliation(s)
- Sarah Leeb
- Department of Biochemistry and Biophysics, Arrhenius Laboratories of Natural Sciences, Stockholm University, Stockholm 106 91, Sweden
| | - Fan Yang
- Department of Biochemistry and Biophysics, Arrhenius Laboratories of Natural Sciences, Stockholm University, Stockholm 106 91, Sweden
| | - Mikael Oliveberg
- Department of Biochemistry and Biophysics, Arrhenius Laboratories of Natural Sciences, Stockholm University, Stockholm 106 91, Sweden
| | - Jens Danielsson
- Department of Biochemistry and Biophysics, Arrhenius Laboratories of Natural Sciences, Stockholm University, Stockholm 106 91, Sweden
| |
|
49
|
Massange-Sánchez JA, Casados-Vázquez LE, Juarez-Colunga S, Sawers RJH, Tiessen A. The Phosphoglycerate Kinase (PGK) Gene Family of Maize (Zea mays var. B73). PLANTS (BASEL, SWITZERLAND) 2020; 9:plants9121639. [PMID: 33255472 PMCID: PMC7761438 DOI: 10.3390/plants9121639] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/16/2020] [Revised: 08/27/2020] [Accepted: 09/08/2020] [Indexed: 05/17/2023]
Abstract
Phosphoglycerate kinase (PGK, E.C. 2.7.2.3) interconverts ADP + 1,3-bisphosphoglycerate (1,3-bPGA) and ATP + 3-phosphoglycerate (3PGA). While most bacteria have a single pgk gene and mammals possess two copies, plant genomes contain three or more PGK genes. In this study, we identified five Pgk genes in the Zea mays var. B73 genome, predicted to encode proteins targeted to different subcellular compartments: ZmPgk1, ZmPgk2, and ZmPgk4 (chloroplast), ZmPgk3 (cytosol), and ZmPgk5 (nucleus). The expression of ZmPgk3 was highest in non-photosynthetic tissues (roots and cobs), where PGK activity was also greatest, consistent with a function in glycolysis. Green tissues (leaf blade and husk leaf) showed intermediate levels of PGK activity, and predominantly expressed ZmPgk1 and ZmPgk2, suggesting involvement in photosynthetic metabolism. ZmPgk5 was weakly expressed and ZmPgk4 was not detected in any tissue. Phylogenetic analysis showed that the photosynthetic and glycolytic isozymes of plants clustered together, but were distinct from PGKs of animals, fungi, protozoa, and bacteria, indicating that photosynthetic and glycolytic isozymes of plants diversified after the divergence of the plant lineage from other groups. These results show the distinct role of each PGK in maize and provide the basis for future studies into the regulation and function of this key enzyme.
Affiliation(s)
- Julio A. Massange-Sánchez
- Departamento de Ingeniería Genética, CINVESTAV Unidad Irapuato, Irapuato 36821, Mexico; (L.E.C.-V.); (S.J.-C.); (A.T.)
- Unidad de Biotecnología Vegetal, Centro de Investigación y Asistencia en Tecnología y Diseño del Estado de Jalisco A.C. (CIATEJ) Subsede Zapopan, Guadalajara 44270, Mexico
- Correspondence: Tel.: +52-(33)-3345-5200 (ext. 1700)
| | - Luz E. Casados-Vázquez
- Departamento de Ingeniería Genética, CINVESTAV Unidad Irapuato, Irapuato 36821, Mexico; (L.E.C.-V.); (S.J.-C.); (A.T.)
- Life Science Division Food Department, University of Guanajuato Campus Irapuato-Salamanca, Irapuato, Guanajuato 36500, Mexico
| | - Sheila Juarez-Colunga
- Departamento de Ingeniería Genética, CINVESTAV Unidad Irapuato, Irapuato 36821, Mexico; (L.E.C.-V.); (S.J.-C.); (A.T.)
| | - Ruairidh J. H. Sawers
- Department of Plant Science, The Pennsylvania State University, State College, PA 16801, USA;
| | - Axel Tiessen
- Departamento de Ingeniería Genética, CINVESTAV Unidad Irapuato, Irapuato 36821, Mexico; (L.E.C.-V.); (S.J.-C.); (A.T.)
- Laboratorio Nacional PlanTECC, Ciudad de México C.P. 06020, Mexico
| |
|
50
|
Arginine-Rich Small Proteins with a Domain of Unknown Function, DUF1127, Play a Role in Phosphate and Carbon Metabolism of Agrobacterium tumefaciens. J Bacteriol 2020; 202:JB.00309-20. [PMID: 33093235 DOI: 10.1128/jb.00309-20] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 05/22/2020] [Accepted: 07/21/2020] [Indexed: 02/06/2023] Open
Abstract
In any given organism, approximately one-third of all proteins have a yet-unknown function. A widely distributed domain of unknown function is DUF1127. Approximately 17,000 proteins with such an arginine-rich domain are found in 4,000 bacteria. Most of them are single-domain proteins, and a large fraction qualifies as small proteins with fewer than 50 amino acids. We systematically identified and characterized the seven DUF1127 members of the plant pathogen Agrobacterium tumefaciens. They all give rise to authentic proteins and are differentially expressed as shown at the RNA and protein levels. The seven proteins fall into two subclasses on the basis of their length, sequence, and reciprocal regulation by the LysR-type transcription factor LsrB. The absence of all three short DUF1127 proteins caused a striking phenotype in later growth phases and increased cell aggregation and biofilm formation. Protein profiling and transcriptome sequencing (RNA-seq) analysis of the wild type and triple mutant revealed a large number of differentially regulated genes in late exponential and stationary growth. The most affected genes are involved in phosphate uptake, glycine/serine homeostasis, and nitrate respiration. The results suggest a redundant function of the small DUF1127 paralogs in nutrient acquisition and central carbon metabolism of A. tumefaciens. They may be required for diauxic switching between carbon sources when sugar from the medium is depleted. We end by discussing how DUF1127 might confer such a global impact on cell physiology and gene expression.
IMPORTANCE Despite being prevalent in numerous ecologically and clinically relevant bacterial species, the biological role of proteins with a domain of unknown function, DUF1127, is unclear. Experimental models are needed to approach their elusive function. We used the phytopathogen Agrobacterium tumefaciens, a natural genetic engineer that causes crown gall disease, and focused on its three small DUF1127 proteins. They have redundant and pervasive roles in nutrient acquisition, cellular metabolism, and biofilm formation. The study shows that small proteins have important previously missed biological functions. How small basic proteins can have such a broad impact is a fascinating prospect of future research.
|