1
Structural genomics and the Protein Data Bank. J Biol Chem 2021; 296:100747. [PMID: 33957120 PMCID: PMC8166929 DOI: 10.1016/j.jbc.2021.100747] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/17/2021] [Revised: 04/16/2021] [Accepted: 04/30/2021] [Indexed: 12/14/2022] Open
Abstract
The field of Structural Genomics arose over the last 3 decades to address a large and rapidly growing divergence between microbial genomic, functional, and structural data. Several international programs took advantage of the vast genomic sequence information and evaluated the feasibility of structure determination for expanded and newly discovered protein families. As a consequence, structural genomics has developed structure-determination pipelines and applied them to a wide range of novel, uncharacterized proteins, often from “microbial dark matter,” and later to proteins from human pathogens. Advances were especially needed in protein production and rapid de novo structure solution. The experimental three-dimensional models were promptly made public, facilitating structure determination of other members of the family and helping to understand their molecular and biochemical functions. Improvements in experimental methods and databases resulted in fast progress in molecular and structural biology. The Protein Data Bank structure repository played a central role in the coordination of structural genomics efforts and the structural biology community as a whole. It facilitated development of standards and validation tools essential for maintaining high quality of deposited structural data.
2
Ageorges V, Monteiro R, Leroy S, Burgess CM, Pizza M, Chaucheyras-Durand F, Desvaux M. Molecular determinants of surface colonisation in diarrhoeagenic Escherichia coli (DEC): from bacterial adhesion to biofilm formation. FEMS Microbiol Rev 2021; 44:314-350. [PMID: 32239203 DOI: 10.1093/femsre/fuaa008] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 11/15/2019] [Accepted: 03/31/2020] [Indexed: 12/11/2022] Open
Abstract
Escherichia coli is primarily known as a commensal colonising the gastrointestinal tract of infants very early in life, but some strains are responsible for diarrhoea, which can be especially severe in young children. Intestinal pathogenic E. coli include six pathotypes of diarrhoeagenic E. coli (DEC), namely, the (i) enterotoxigenic E. coli, (ii) enteroaggregative E. coli, (iii) enteropathogenic E. coli, (iv) enterohaemorrhagic E. coli, (v) enteroinvasive E. coli and (vi) diffusely adherent E. coli. Prior to human infection, DEC can be found in natural environments, animal reservoirs, food processing environments and contaminated food matrices. From an ecophysiological point of view, DEC thus deal with very different biotopes and biocoenoses all along the food chain. In this context, this review focuses on the wide range of surface molecular determinants acting as surface colonisation factors (SCFs) in DEC. In the first instance, SCFs can be broadly discriminated into (i) extracellular polysaccharides, (ii) extracellular DNA and (iii) surface proteins. Surface proteins constitute the most diverse group of SCFs broadly discriminated into (i) monomeric SCFs, such as autotransporter (AT) adhesins, inverted ATs, heat-resistant agglutinins or some moonlighting proteins, (ii) oligomeric SCFs, namely, the trimeric ATs and (iii) supramolecular SCFs, including flagella and numerous pili, e.g. the injectisome, type 4 pili, curli chaperone-usher pili or conjugative pili. This review also details the gene regulatory network of these numerous SCFs at the various stages as it occurs from pre-transcriptional to post-translocational levels, which remains to be fully elucidated in many cases.
Collapse
Affiliation(s)
- Valentin Ageorges
  - Université Clermont Auvergne, INRAE, MEDiS, F-63000 Clermont-Ferrand, France
- Ricardo Monteiro
  - Université Clermont Auvergne, INRAE, MEDiS, F-63000 Clermont-Ferrand, France; GSK, Via Fiorentina 1, 53100 Siena, Italy
- Sabine Leroy
  - Université Clermont Auvergne, INRAE, MEDiS, F-63000 Clermont-Ferrand, France
- Catherine M Burgess
  - Food Safety Department, Teagasc Food Research Centre, Ashtown, Dublin 15, Ireland
- Frédérique Chaucheyras-Durand
  - Université Clermont Auvergne, INRAE, MEDiS, F-63000 Clermont-Ferrand, France; Lallemand Animal Nutrition SAS, F-31702 Blagnac Cedex, France
- Mickaël Desvaux
  - Université Clermont Auvergne, INRAE, MEDiS, F-63000 Clermont-Ferrand, France
3
Use of evolutionary information in the fitting of atomic level protein models in low resolution cryo-EM map of a protein assembly improves the accuracy of the fitting. J Struct Biol 2016; 195:294-305. [PMID: 27444391 DOI: 10.1016/j.jsb.2016.07.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 10/26/2015] [Revised: 07/15/2016] [Accepted: 07/18/2016] [Indexed: 11/22/2022]
Abstract
Protein-protein interface residues, especially those at the core of the interface, exhibit higher conservation than residues in solvent-exposed regions. Here, we explore the ability of this differential conservation to evaluate fittings of atomic models in low-resolution cryo-EM maps and select models from the ensemble of solutions that are often proposed by different model fitting techniques. As a prelude, using a non-redundant and high-resolution structural dataset involving 125 permanent and 95 transient complexes, we confirm that core interface residues are conserved significantly better than nearby non-interface residues, and this result is used in the cryo-EM map analysis. From the analysis of inter-component interfaces in a set of fitted models associated with low-resolution cryo-EM maps of ribosomes, chaperones and proteasomes, we note that a few poorly conserved residues occur at interfaces. Interestingly, a few conserved residues are not in the interface, though they are close to the interface. These observations raise the potential requirement of refitting the models in the cryo-EM maps. We show that sampling an ensemble of models and selection of models with high residue conservation at the interface and in good agreement with the density helps in improving the accuracy of the fit. This study indicates that evolutionary information can serve as an additional input to improve and validate fitting of atomic models in cryo-EM density maps.
4
Lawson CL, Patwardhan A, Baker ML, Hryc C, Garcia ES, Hudson BP, Lagerstedt I, Ludtke SJ, Pintilie G, Sala R, Westbrook JD, Berman HM, Kleywegt GJ, Chiu W. EMDataBank unified data resource for 3DEM. Nucleic Acids Res 2015; 44:D396-403. [PMID: 26578576 PMCID: PMC4702818 DOI: 10.1093/nar/gkv1126] [Citation(s) in RCA: 190] [Impact Index Per Article: 21.1] [Received: 10/01/2015] [Accepted: 10/15/2015] [Indexed: 01/10/2023] Open
Abstract
Three-dimensional Electron Microscopy (3DEM) has become a key experimental method in structural biology for a broad spectrum of biological specimens from molecules to cells. The EMDataBank project provides a unified portal for deposition, retrieval and analysis of 3DEM density maps, atomic models and associated metadata (emdatabank.org). We provide here an overview of the rapidly growing 3DEM structural data archives, which include maps in EM Data Bank and map-derived models in the Protein Data Bank. In addition, we describe progress and approaches toward development of validation protocols and methods, working with the scientific community, in order to create a validation pipeline for 3DEM data.
Affiliation(s)
- Catherine L Lawson
  - Department of Chemistry and Chemical Biology and Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, NJ 08854, USA
- Ardan Patwardhan
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matthew L Baker
  - Verna and Marrs McLean Department of Biochemistry & Molecular Biology, National Center for Macromolecular Imaging, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Corey Hryc
  - Verna and Marrs McLean Department of Biochemistry & Molecular Biology, National Center for Macromolecular Imaging, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Eduardo Sanz Garcia
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Brian P Hudson
  - Department of Chemistry and Chemical Biology and Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, NJ 08854, USA
- Ingvar Lagerstedt
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Steven J Ludtke
  - Verna and Marrs McLean Department of Biochemistry & Molecular Biology, National Center for Macromolecular Imaging, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Grigore Pintilie
  - Verna and Marrs McLean Department of Biochemistry & Molecular Biology, National Center for Macromolecular Imaging, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Raul Sala
  - Department of Chemistry and Chemical Biology and Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, NJ 08854, USA
- John D Westbrook
  - Department of Chemistry and Chemical Biology and Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, NJ 08854, USA
- Helen M Berman
  - Department of Chemistry and Chemical Biology and Research Collaboratory for Structural Bioinformatics, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, NJ 08854, USA
- Gerard J Kleywegt
  - Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Wah Chiu
  - Verna and Marrs McLean Department of Biochemistry & Molecular Biology, National Center for Macromolecular Imaging, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
5
Lyumkis D, Brilot AF, Theobald DL, Grigorieff N. Likelihood-based classification of cryo-EM images using FREALIGN. J Struct Biol 2013; 183:377-388. [PMID: 23872434 DOI: 10.1016/j.jsb.2013.07.005] [Citation(s) in RCA: 185] [Impact Index Per Article: 16.8] [Received: 04/24/2013] [Revised: 07/03/2013] [Accepted: 07/09/2013] [Indexed: 10/26/2022]
Abstract
We describe an implementation of maximum likelihood classification for single particle electron cryo-microscopy that is based on the FREALIGN software. Particle alignment parameters are determined by maximizing a joint likelihood that can include hierarchical priors, while classification is performed by expectation maximization of a marginal likelihood. We test the FREALIGN implementation using a simulated dataset containing computer-generated projection images of three different 70S ribosome structures, as well as a publicly available dataset of 70S ribosomes. The results show that the mixed strategy of the new FREALIGN algorithm yields performance on par with other maximum likelihood implementations, while remaining computationally efficient.
Affiliation(s)
- Dmitry Lyumkis
  - National Resource for Automated Molecular Microscopy, Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Axel F Brilot
  - Department of Biochemistry, Rosenstiel Basic Medical Sciences Research Center, Brandeis University, MS029, 415 South Street, Waltham, MA 02454, USA
- Douglas L Theobald
  - Department of Biochemistry, Rosenstiel Basic Medical Sciences Research Center, Brandeis University, MS029, 415 South Street, Waltham, MA 02454, USA
- Nikolaus Grigorieff
  - Department of Biochemistry, Rosenstiel Basic Medical Sciences Research Center, Brandeis University, MS029, 415 South Street, Waltham, MA 02454, USA; Howard Hughes Medical Institute, Brandeis University, MS029, 415 South Street, Waltham, MA 02454, USA