1
|
Leemann M, Sagasta A, Eberhardt J, Schwede T, Robin X, Durairaj J. Automated benchmarking of combined protein structure and ligand conformation prediction. Proteins 2023; 91:1912-1924. [PMID: 37885318 DOI: 10.1002/prot.26605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2023] [Revised: 09/15/2023] [Accepted: 09/21/2023] [Indexed: 10/28/2023]
Abstract
The prediction of protein-ligand complexes (PLC), using both experimental and predicted structures, is an active and important area of research, underscored by the inclusion of the Protein-Ligand Interaction category in the latest round of the Critical Assessment of Protein Structure Prediction experiment CASP15. The prediction task in CASP15 consisted of predicting both the three-dimensional structure of the receptor protein as well as the position and conformation of the ligand. This paper addresses the challenges and proposed solutions for devising automated benchmarking techniques for PLC prediction. The reliability of experimentally solved PLC as ground truth reference structures is assessed using various validation criteria. Similarity of PLC to previously released complexes are employed to judge PLC diversity and the difficulty of a PLC as a prediction target. We show that the commonly used PDBBind time-split test-set is inappropriate for comprehensive PLC evaluation, with state-of-the-art tools showing conflicting results on a more representative and high quality dataset constructed for benchmarking purposes. We also show that redocking on crystal structures is a much simpler task than docking into predicted protein models, demonstrated by the two PLC-prediction-specific scoring metrics created. Finally, we introduce a fully automated pipeline that predicts PLC and evaluates the accuracy of the protein structure, ligand pose, and protein-ligand interactions.
Collapse
Affiliation(s)
- Michèle Leemann
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Ander Sagasta
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Jerome Eberhardt
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Xavier Robin
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Janani Durairaj
- Biozentrum, University of Basel, Basel, Switzerland
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| |
Collapse
|
2
|
Hou Q, Waury K, Gogishvili D, Feenstra KA. Ten quick tips for sequence-based prediction of protein properties using machine learning. PLoS Comput Biol 2022; 18:e1010669. [PMID: 36454728 PMCID: PMC9714715 DOI: 10.1371/journal.pcbi.1010669] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022] Open
Abstract
The ubiquitous availability of genome sequencing data explains the popularity of machine learning-based methods for the prediction of protein properties from their amino acid sequences. Over the years, while revising our own work, reading submitted manuscripts as well as published papers, we have noticed several recurring issues, which make some reported findings hard to understand and replicate. We suspect this may be due to biologists being unfamiliar with machine learning methodology, or conversely, machine learning experts may miss some of the knowledge needed to correctly apply their methods to proteins. Here, we aim to bridge this gap for developers of such methods. The most striking issues are linked to a lack of clarity: how were annotations of interest obtained; which benchmark metrics were used; how are positives and negatives defined. Others relate to a lack of rigor: If you sneak in structural information, your method is not sequence-based; if you compare your own model to "state-of-the-art," take the best methods; if you want to conclude that some method is better than another, obtain a significance estimate to support this claim. These, and other issues, we will cover in detail. These points may have seemed obvious to the authors during writing; however, they are not always clear-cut to the readers. We also expect many of these tips to hold for other machine learning-based applications in biology. Therefore, many computational biologists who develop methods in this particular subject will benefit from a concise overview of what to avoid and what to do instead.
Collapse
Affiliation(s)
- Qingzhen Hou
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong, P. R. China
- National Institute of Health Data Science of China, Shandong University, Shandong, P. R. China
| | - Katharina Waury
- Department of Computer Science, Bioinformatics Group, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
| | - Dea Gogishvili
- Department of Computer Science, Bioinformatics Group, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
| | - K. Anton Feenstra
- Department of Computer Science, Bioinformatics Group, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
| |
Collapse
|
3
|
Varadi M, Anyango S, Appasamy SD, Armstrong D, Bage M, Berrisford J, Choudhary P, Bertoni D, Deshpande M, Leines GD, Ellaway J, Evans G, Gaborova R, Gupta D, Gutmanas A, Harrus D, Kleywegt GJ, Bueno WM, Nadzirin N, Nair S, Pravda L, Afonso MQL, Sehnal D, Tanweer A, Tolchard J, Abrams C, Dunlop R, Velankar S. PDBe and PDBe-KB: Providing high-quality, up-to-date and integrated resources of macromolecular structures to support basic and applied research and education. Protein Sci 2022; 31:e4439. [PMID: 36173162 PMCID: PMC9517934 DOI: 10.1002/pro.4439] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2022] [Revised: 09/02/2022] [Accepted: 09/05/2022] [Indexed: 11/26/2022]
Abstract
The archiving and dissemination of protein and nucleic acid structures as well as their structural, functional and biophysical annotations is an essential task that enables the broader scientific community to conduct impactful research in multiple fields of the life sciences. The Protein Data Bank in Europe (PDBe; pdbe.org) team develops and maintains several databases and web services to address this fundamental need. From data archiving as a member of the Worldwide PDB consortium (wwPDB; wwpdb.org), to the PDBe Knowledge Base (PDBe-KB; pdbekb.org), we provide data, data-access mechanisms, and visualizations that facilitate basic and applied research and education across the life sciences. Here, we provide an overview of the structural data and annotations that we integrate and make freely available. We describe the web services and data visualization tools we offer, and provide information on how to effectively use or even further develop them. Finally, we discuss the direction of our data services, and how we aim to tackle new challenges that arise from the recent, unprecedented advances in the field of structure determination and protein structure modeling.
Collapse
Affiliation(s)
- Mihaly Varadi
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Stephen Anyango
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Sri Devan Appasamy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - David Armstrong
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Marcus Bage
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - John Berrisford
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Preeti Choudhary
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Damian Bertoni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Mandar Deshpande
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Grisell Diaz Leines
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Joseph Ellaway
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Genevieve Evans
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Romana Gaborova
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Deepti Gupta
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Aleksandras Gutmanas
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Deborah Harrus
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Gerard J Kleywegt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | | | - Nurul Nadzirin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Sreenath Nair
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Lukas Pravda
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | | | - David Sehnal
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
- CEITEC - Central European Institute of Technology, Masaryk University, Brno, Czech Republic
- National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Brno, Czech Republic
| | - Ahsan Tanweer
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - James Tolchard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Charlotte Abrams
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Roisin Dunlop
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton
| |
Collapse
|
4
|
Simpkin AJ, Thomas JMH, Keegan RM, Rigden DJ. MrParse: finding homologues in the PDB and the EBI AlphaFold database for molecular replacement and more. Acta Crystallogr D Struct Biol 2022; 78:553-559. [PMID: 35503204 PMCID: PMC9063843 DOI: 10.1107/s2059798322003576] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2021] [Accepted: 03/29/2022] [Indexed: 11/10/2022] Open
Abstract
Crystallographers have an array of search-model options for structure solution by molecular replacement (MR). The well established options of homologous experimental structures and regular secondary-structure elements or motifs are increasingly supplemented by computational modelling. Such modelling may be carried out locally or may use pre-calculated predictions retrieved from databases such as the EBI AlphaFold database. MrParse is a new pipeline to help to streamline the decision process in MR by consolidating bioinformatic predictions in one place. When reflection data are provided, MrParse can rank any experimental homologues found using eLLG, which indicates the likelihood that a given search model will work in MR. Inbuilt displays of predicted secondary structure, coiled-coil and transmembrane regions further inform the choice of MR protocol. MrParse can also identify and rank homologues in the EBI AlphaFold database, a function that will also interest other structural biologists and bioinformaticians.
Collapse
Affiliation(s)
- Adam J. Simpkin
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Jens M. H. Thomas
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| | - Ronan M. Keegan
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
| | - Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
| |
Collapse
|
5
|
Varadi M, Anyango S, Armstrong D, Berrisford J, Choudhary P, Deshpande M, Nadzirin N, Nair SS, Pravda L, Tanweer A, Al-Lazikani B, Andreini C, Barton GJ, Bednar D, Berka K, Blundell T, Brock KP, Carazo JM, Damborsky J, David A, Dey S, Dunbrack R, Recio JF, Fraternali F, Gibson T, Helmer-Citterich M, Hoksza D, Hopf T, Jakubec D, Kannan N, Krivak R, Kumar M, Levy ED, London N, Macias JR, Srivatsan MM, Marks DS, Martens L, McGowan SA, McGreig JE, Modi V, Parra RG, Pepe G, Piovesan D, Prilusky J, Putignano V, Radusky LG, Ramasamy P, Rausch AO, Reuter N, Rodriguez LA, Rollins NJ, Rosato A, Rubach P, Serrano L, Singh G, Skoda P, Sorzano COS, Stourac J, Sulkowska JI, Svobodova R, Tichshenko N, Tosatto SCE, Vranken W, Wass MN, Xue D, Zaidman D, Thornton J, Sternberg M, Orengo C, Velankar S. PDBe-KB: collaboratively defining the biological context of structural data. Nucleic Acids Res 2022; 50:D534-D542. [PMID: 34755867 PMCID: PMC8728252 DOI: 10.1093/nar/gkab988] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/14/2021] [Revised: 10/01/2021] [Accepted: 10/14/2021] [Indexed: 12/15/2022] Open
Abstract
The Protein Data Bank in Europe - Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant improvements in the variety of available annotation data sets and user functionality. Here, we provide an overview of the consortium, highlighting the addition of annotations such as predicted covalent binders, phosphorylation sites, effects of mutations on the protein structure and energetic local frustration. In addition, we describe a library of reusable web-based visualisation components and introduce new features such as a bulk download data service and a novel superposition service that generates clusters of superposed protein chains weekly for the whole PDB archive.
Collapse
|
6
|
Noor A. Improving bioinformatics software quality through incorporation of software engineering practices. PeerJ Comput Sci 2022; 8:e839. [PMID: 35111923 PMCID: PMC8771759 DOI: 10.7717/peerj-cs.839] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Accepted: 12/13/2021] [Indexed: 06/14/2023]
Abstract
BACKGROUND Bioinformatics software is developed for collecting, analyzing, integrating, and interpreting life science datasets that are often enormous. Bioinformatics engineers often lack the software engineering skills necessary for developing robust, maintainable, reusable software. This study presents review and discussion of the findings and efforts made to improve the quality of bioinformatics software. METHODOLOGY A systematic review was conducted of related literature that identifies core software engineering concepts for improving bioinformatics software development: requirements gathering, documentation, testing, and integration. The findings are presented with the aim of illuminating trends within the research that could lead to viable solutions to the struggles faced by bioinformatics engineers when developing scientific software. RESULTS The findings suggest that bioinformatics engineers could significantly benefit from the incorporation of software engineering principles into their development efforts. This leads to suggestion of both cultural changes within bioinformatics research communities as well as adoption of software engineering disciplines into the formal education of bioinformatics engineers. Open management of scientific bioinformatics development projects can result in improved software quality through collaboration amongst both bioinformatics engineers and software engineers. CONCLUSIONS While strides have been made both in identification and solution of issues of particular import to bioinformatics software development, there is still room for improvement in terms of shifts in both the formal education of bioinformatics engineers as well as the culture and approaches of managing scientific bioinformatics research and development efforts.
Collapse
|
7
|
Waman VP, Orengo C, Kleywegt GJ, Lesk AM. Three-dimensional Structure Databases of Biological Macromolecules. Methods Mol Biol 2022; 2449:43-91. [PMID: 35507259 DOI: 10.1007/978-1-0716-2095-3_3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Databases of three-dimensional structures of proteins (and their associated molecules) provide: (a) Curated repositories of coordinates of experimentally determined structures, including extensive metadata; for instance information about provenance, details about data collection and interpretation, and validation of results. (b) Information-retrieval tools to allow searching to identify entries of interest and provide access to them. (c) Links among databases, especially to databases of amino-acid and genetic sequences, and of protein function; and links to software for analysis of amino-acid sequence and protein structure, and for structure prediction. (d) Collections of predicted three-dimensional structures of proteins. These will become more and more important after the breakthrough in structure prediction achieved by AlphaFold2. The single global archive of experimentally determined biomacromolecular structures is the Protein Data Bank (PDB). It is managed by wwPDB, a consortium of five partner institutions: the Protein Data Bank in Europe (PDBe), the Research Collaboratory for Structural Bioinformatics (RCSB), the Protein Data Bank Japan (PDBj), the BioMagResBank (BMRB), and the Electron Microscopy Data Bank (EMDB). In addition to jointly managing the PDB repository, the individual wwPDB partners offer many tools for analysis of protein and nucleic acid structures and their complexes, including providing computer-graphic representations. Their collective and individual websites serve as hubs of the community of structural biologists, offering newsletters, reports from Task Forces, training courses, and "helpdesks," as well as links to external software.Many specialized projects are based on the information contained in the PDB. Especially important are SCOP, CATH, and ECOD, which present classifications of protein domains.
Collapse
Affiliation(s)
- Vaishali P Waman
- Institute of Structural and Molecular Biology, University College London, London, UK
| | - Christine Orengo
- Institute of Structural and Molecular Biology, University College London, London, UK
| | - Gerard J Kleywegt
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Cambridge, UK
| | - Arthur M Lesk
- Department of Biochemistry and Molecular Biology and Center for Computational Biology and Bioinformatics, The Pennsylvania State University, University Park, PA, USA.
| |
Collapse
|
8
|
Appels R, Wang P, Islam S. Integrating Wheat Nucleolus Structure and Function: Variation in the Wheat Ribosomal RNA and Protein Genes. FRONTIERS IN PLANT SCIENCE 2021; 12:686586. [PMID: 35003148 PMCID: PMC8739226 DOI: 10.3389/fpls.2021.686586] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/27/2021] [Accepted: 11/08/2021] [Indexed: 06/13/2023]
Abstract
We review the coordinated production and integration of the RNA (ribosomal RNA, rRNA) and protein (ribosomal protein, RP) components of wheat cytoplasmic ribosomes in response to changes in genetic constitution, biotic and abiotic stresses. The components examined are highly conserved and identified with reference to model systems such as human, Arabidopsis, and rice, but have sufficient levels of differences in their DNA and amino acid sequences to form fingerprints or gene haplotypes that provide new markers to associate with phenotype variation. Specifically, it is argued that populations of ribosomes within a cell can comprise distinct complements of rRNA and RPs to form units with unique functionalities. The unique functionalities of ribosome populations within a cell can become central in situations of stress where they may preferentially translate mRNAs coding for proteins better suited to contributing to survival of the cell. In model systems where this concept has been developed, the engagement of initiation factors and elongation factors to account for variation in the translation machinery of the cell in response to stresses provided the precedents. The polyploid nature of wheat adds extra variation at each step of the synthesis and assembly of the rRNAs and RPs which can, as a result, potentially enhance its response to changing environments and disease threats.
Collapse
Affiliation(s)
- Rudi Appels
- AgriBio, Centre for AgriBioscience, La Trobe University, Bundoora, VIC, Australia
- Faculty of Veterinary and Agricultural Science, Melbourne, VIC, Australia
| | - Penghao Wang
- School of Veterinary and Life Sciences, Murdoch University, Murdoch, WA, Australia
| | - Shahidul Islam
- Centre for Crop Innovation, Food Futures Institute, Murdoch University, Murdoch, WA, Australia
| |
Collapse
|
9
|
BEHZADI PAYAM, GAJDÁCS MÁRIÓ. Worldwide Protein Data Bank (wwPDB): A virtual treasure for research in biotechnology. Eur J Microbiol Immunol (Bp) 2021; 11:77-86. [PMID: 34908533 PMCID: PMC8830413 DOI: 10.1556/1886.2021.00020] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2021] [Accepted: 11/23/2021] [Indexed: 12/25/2022] Open
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RSCB PDB) provides a wide range of digital data regarding biology and biomedicine. This huge internet resource involves a wide range of important biological data, obtained from experiments around the globe by different scientists. The Worldwide Protein Data Bank (wwPDB) represents a brilliant collection of 3D structure data associated with important and vital biomolecules including nucleic acids (RNAs and DNAs) and proteins. Moreover, this database accumulates knowledge regarding function and evolution of biomacromolecules which supports different disciplines such as biotechnology. 3D structure, functional characteristics and phylogenetic properties of biomacromolecules give a deep understanding of the biomolecules' characteristics. An important advantage of the wwPDB database is the data updating time, which is done every week. This updating process helps users to have the newest data and information for their projects. The data and information in wwPDB can be a great support to have an accurate imagination and illustrations of the biomacromolecules in biotechnology. As demonstrated by the SARS-CoV-2 pandemic, rapidly reliable and accessible biological data for microbiology, immunology, vaccinology, and drug development are critical to address many healthcare-related challenges that are facing humanity. The aim of this paper is to introduce the readers to wwPDB, and to highlight the importance of this database in biotechnology, with the expectation that the number of scientists interested in the utilization of Protein Data Bank's resources will increase substantially in the coming years.
Collapse
Affiliation(s)
- PAYAM BEHZADI
- Department of Microbiology, College of Basic Sciences, Shahr-e-Qods Branch, Islamic Azad University, Tehran, 37541-374, Iran
| | - MÁRIÓ GAJDÁCS
- Department of Oral Biology and Experimental Dental Research, Faculty of Dentistry, University of Szeged, 6720, Szeged, Hungary,*Corresponding author. Tel.: +36-62-342-532. E-mail:
| |
Collapse
|
10
|
Robin X, Haas J, Gumienny R, Smolinski A, Tauriello G, Schwede T. Continuous Automated Model EvaluatiOn (CAMEO)-Perspectives on the future of fully automated evaluation of structure prediction methods. Proteins 2021; 89:1977-1986. [PMID: 34387007 PMCID: PMC8673552 DOI: 10.1002/prot.26213] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/28/2021] [Revised: 08/05/2021] [Accepted: 08/07/2021] [Indexed: 11/18/2022]
Abstract
The Continuous Automated Model EvaluatiOn (CAMEO) platform complements the biennial CASP experiment by conducting fully automated blind evaluations of three‐dimensional protein prediction servers based on the weekly prerelease of sequences of those structures, which are going to be published in the upcoming release of the Protein Data Bank. While in CASP14, significant success was observed in predicting the structures of individual protein chains with high accuracy, significant challenges remain in correctly predicting the structures of complexes. By implementing fully automated evaluation of predictions for protein–protein complexes, as well as for proteins in complex with ligands, peptides, nucleic acids, or proteins containing noncanonical amino acid residues, CAMEO will assist new developments in those challenging areas of active research.
Collapse
Affiliation(s)
- Xavier Robin
- Biozentrum, University of Basel, Basel, Switzerland.,Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Juergen Haas
- Biozentrum, University of Basel, Basel, Switzerland.,Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Rafal Gumienny
- Biozentrum, University of Basel, Basel, Switzerland.,Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Anna Smolinski
- Biozentrum, University of Basel, Basel, Switzerland.,Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Gerardo Tauriello
- Biozentrum, University of Basel, Basel, Switzerland.,Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland.,Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| |
Collapse
|
11
|
Poitevin F, Kushner A, Li X, Dao Duc K. Structural Heterogeneities of the Ribosome: New Frontiers and Opportunities for Cryo-EM. Molecules 2020; 25:E4262. [PMID: 32957592 PMCID: PMC7570653 DOI: 10.3390/molecules25184262] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2020] [Revised: 09/11/2020] [Accepted: 09/15/2020] [Indexed: 12/18/2022] Open
Abstract
The extent of ribosomal heterogeneity has caught increasing interest over the past few years, as recent studies have highlighted the presence of structural variations of the ribosome. More precisely, the heterogeneity of the ribosome covers multiple scales, including the dynamical aspects of ribosomal motion at the single particle level, specialization at the cellular and subcellular scale, or evolutionary differences across species. Upon solving the ribosome atomic structure at medium to high resolution, cryogenic electron microscopy (cryo-EM) has enabled investigating all these forms of heterogeneity. In this review, we present some recent advances in quantifying ribosome heterogeneity, with a focus on the conformational and evolutionary variations of the ribosome and their functional implications. These efforts highlight the need for new computational methods and comparative tools, to comprehensively model the continuous conformational transition pathways of the ribosome, as well as its evolution. While developing these methods presents some important challenges, it also provides an opportunity to extend our interpretation and usage of cryo-EM data, which would more generally benefit the study of molecular dynamics and evolution of proteins and other complexes.
Collapse
Affiliation(s)
- Frédéric Poitevin
- Department of LCLS Data Analytics, Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, CA 94025, USA;
| | - Artem Kushner
- Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; (A.K.); (X.L.)
- Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Xinpei Li
- Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; (A.K.); (X.L.)
- Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Khanh Dao Duc
- Department of Mathematics, University of British Columbia, Vancouver, BC V6T 1Z4, Canada; (A.K.); (X.L.)
- Department of Computer Science, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Department of Zoology, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| |
Collapse
|