1
van Ree R, Sapiter Ballerda D, Berin MC, Beuf L, Chang A, Gadermaier G, Guevera PA, Hoffmann-Sommergruber K, Islamovic E, Koski L, Kough J, Ladics GS, McClain S, McKillop KA, Mitchell-Ryan S, Narrod CA, Pereira Mouriès L, Pettit S, Poulsen LK, Silvanovich A, Song P, Teuber SS, Bowman C. The COMPARE Database: A Public Resource for Allergen Identification, Adapted for Continuous Improvement. Frontiers in Allergy 2021; 2:700533. [PMID: 35386979 PMCID: PMC8974746 DOI: 10.3389/falgy.2021.700533] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/26/2021] [Accepted: 07/06/2021] [Indexed: 11/14/2022] [Open access]
Abstract
Motivation: The availability of databases identifying allergenic proteins via a transparent and consensus-based scientific approach is of prime importance to support the safety review of genetically modified foods and feeds, and public safety in general. Over recent years, screening for potential new allergen sequences has become more complex due to the exponential increase of genomic sequence information. To address these challenges, an international collaborative scientific group coordinated by the Health and Environmental Sciences Institute (HESI) was tasked to develop a contemporary, adaptable, high-throughput process to build the COMprehensive Protein Allergen REsource (COMPARE) database, a publicly accessible allergen sequence data resource along with bioinformatics analytical tools following guidelines of FAO/WHO and CODEX Alimentarius Commission. Results: The COMPARE process is novel in that it involves the identification of candidate sequences via an automated keyword-based sorting algorithm and manual curation of the annotated sequence entries retrieved from public protein sequence databases on a yearly basis; its process is meant for continuous improvement, with updates being transparently documented with each version; as a complementary approach, a yearly keyword-based search of literature databases is added to identify new allergen sequences that were not (yet) submitted to protein databases; in addition, comments from the independent peer-review panel are posted on the website to increase transparency of decision making; finally, sequence comparison capabilities associated with the COMPARE database were developed to evaluate the potential allergenicity of proteins, based on internationally recognized guidelines, FAO/WHO and CODEX Alimentarius Commission.
Affiliation(s)
- Ronald van Ree
- Departments of Experimental Immunology and of Otorhinolaryngology, Amsterdam University Medical Centers, Amsterdam, Netherlands
- Dexter Sapiter Ballerda
- Joint Institute for Food Safety and Applied Nutrition (JIFSAN), University of Maryland, College Park, MD, United States
- M. Cecilia Berin
- Department of Pediatrics, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Laurent Beuf
- Limagrain Field Seeds, Centre de Recherche, Route d'Ennezat, Chappes, France
- Alexander Chang
- Joint Institute for Food Safety and Applied Nutrition (JIFSAN), University of Maryland, College Park, MD, United States
- Gabriele Gadermaier
- Department of Biosciences, Paris Lodron University of Salzburg, Salzburg, Austria
- Paul A. Guevera
- Joint Institute for Food Safety and Applied Nutrition (JIFSAN), University of Maryland, College Park, MD, United States
- Emir Islamovic
- Regulatory Science Seeds and Traits, BASF Corporation, Morrisville, NC, United States
- Liisa Koski
- Health and Environmental Sciences Institute (HESI), Washington, DC, United States
- John Kough
- Office of Pesticide Programs, Microbial Pesticides Branch, US Environmental Protection Agency, Washington, DC, United States
- Scott McClain
- Syngenta Crop Protection LLC, Research Triangle Park, NC, United States
- Kyle A. McKillop
- Joint Institute for Food Safety and Applied Nutrition (JIFSAN), University of Maryland, College Park, MD, United States
- Clare A. Narrod
- Joint Institute for Food Safety and Applied Nutrition (JIFSAN), University of Maryland, College Park, MD, United States
- Lucilia Pereira Mouriès
- Health and Environmental Sciences Institute (HESI), Washington, DC, United States
- Correspondence: Lucilia Pereira Mouriès
- Syril Pettit
- Health and Environmental Sciences Institute (HESI), Washington, DC, United States
- Lars K. Poulsen
- Allergy Clinic, Department of Dermatology and Allergy, Gentofte Hospital, University of Copenhagen, Copenhagen, Denmark
- Andre Silvanovich
- Bayer U.S., Crop Science Regulatory Science Building FF4, Chesterfield, MO, United States
- Ping Song
- Seeds Regulatory Science, Corteva Agriscience LLC, Indianapolis, IN, United States
- Suzanne S. Teuber
- Department of Internal Medicine, School of Medicine, University of California, Davis, Davis, CA, United States
- Division of Rheumatology, Allergy, and Clinical Immunology, Davis, CA, United States
- Veterans Affairs Northern California Healthcare System, Mather, CA, United States
- Christal Bowman
- Formerly: Human Safety Regulatory Toxicology, Bayer CropScience LP, Research Triangle Park, NC, United States
2
Baek M, Anishchenko I, Park H, Humphreys IR, Baker D. Protein oligomer modeling guided by predicted interchain contacts in CASP14. Proteins 2021; 89:1824-1833. [PMID: 34324224 PMCID: PMC8616806 DOI: 10.1002/prot.26197] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/30/2021] [Revised: 07/02/2021] [Accepted: 07/23/2021] [Indexed: 01/01/2023]
Abstract
For CASP14, we developed deep learning‐based methods for predicting homo‐oligomeric and hetero‐oligomeric contacts and used them for oligomer modeling. To build structure models, we developed an oligomer structure generation method that utilizes predicted interchain contacts to guide iterative restrained minimization from random backbone structures. We supplemented this gradient‐based fold‐and‐dock method with template‐based and ab initio docking approaches using deep learning‐based subunit predictions on 29 assembly targets. These methods produced oligomer models with summed Z‐scores 5.5 units higher than the next best group, with the fold‐and‐dock method having the best relative performance. Over the eight targets for which this method was used, the best of the five submitted models had average oligomer TM‐score of 0.71 (average oligomer TM‐score of the next best group: 0.64), and explicit modeling of inter‐subunit interactions improved modeling of six out of 40 individual domains (ΔGDT‐TS > 2.0).
Affiliation(s)
- Minkyung Baek
- Department of Biochemistry, University of Washington, Seattle, Washington, USA
- Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Ivan Anishchenko
- Department of Biochemistry, University of Washington, Seattle, Washington, USA
- Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Hahnbeom Park
- Department of Biochemistry, University of Washington, Seattle, Washington, USA
- Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Ian R Humphreys
- Department of Biochemistry, University of Washington, Seattle, Washington, USA
- Institute for Protein Design, University of Washington, Seattle, Washington, USA
- David Baker
- Department of Biochemistry, University of Washington, Seattle, Washington, USA
- Institute for Protein Design, University of Washington, Seattle, Washington, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, Washington, USA
3
Scherer M, Fleishman SJ, Jones PR, Dandekar T, Bencurova E. Computational Enzyme Engineering Pipelines for Optimized Production of Renewable Chemicals. Front Bioeng Biotechnol 2021; 9:673005. [PMID: 34211966 PMCID: PMC8239229 DOI: 10.3389/fbioe.2021.673005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/26/2021] [Accepted: 05/06/2021] [Indexed: 11/13/2022] [Open access]
Abstract
To enable a sustainable supply of chemicals, novel biotechnological solutions are required that replace the reliance on fossil resources. One potential solution is to utilize tailored biosynthetic modules for the metabolic conversion of CO2 or organic waste to chemicals and fuel by microorganisms. Currently, it is challenging to commercialize biotechnological processes for renewable chemical biomanufacturing because of a lack of highly active and specific biocatalysts. As experimental methods to engineer biocatalysts are time- and cost-intensive, it is important to establish efficient and reliable computational tools that can speed up the identification or optimization of selective, highly active, and stable enzyme variants for utilization in the biotechnological industry. Here, we review and suggest combinations of effective state-of-the-art software and online tools available for computational enzyme engineering pipelines to optimize metabolic pathways for the biosynthesis of renewable chemicals. Using examples relevant for biotechnology, we explain the underlying principles of enzyme engineering and design and illuminate future directions for automated optimization of biocatalysts for the assembly of synthetic metabolic pathways.
Affiliation(s)
- Marc Scherer
- Department of Bioinformatics, Julius-Maximilians University of Würzburg, Würzburg, Germany
- Sarel J Fleishman
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Patrik R Jones
- Department of Life Sciences, Imperial College London, London, United Kingdom
- Thomas Dandekar
- Department of Bioinformatics, Julius-Maximilians University of Würzburg, Würzburg, Germany
- Elena Bencurova
- Department of Bioinformatics, Julius-Maximilians University of Würzburg, Würzburg, Germany
4
Hameduh T, Haddad Y, Adam V, Heger Z. Homology modeling in the time of collective and artificial intelligence. Comput Struct Biotechnol J 2020; 18:3494-3506. [PMID: 33304450 PMCID: PMC7695898 DOI: 10.1016/j.csbj.2020.11.007] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 08/14/2020] [Revised: 11/04/2020] [Accepted: 11/04/2020] [Indexed: 12/12/2022] [Open access]
Abstract
Homology modeling is a method for building protein 3D structures using protein primary sequence and utilizing prior knowledge gained from structural similarities with other proteins. The homology modeling process is done in sequential steps where sequence/structure alignment is optimized, then a backbone is built and later, side-chains are added. Once the low-homology loops are modeled, the whole 3D structure is optimized and validated. In the past three decades, a few collective and collaborative initiatives allowed for continuous progress in both homology and ab initio modeling. Critical Assessment of protein Structure Prediction (CASP) is a worldwide community experiment that has historically recorded the progress in this field. Folding@Home and Rosetta@Home are examples of crowd-sourcing initiatives where the community is sharing computational resources, whereas RosettaCommons is an example of an initiative where a community is sharing a codebase for the development of computational algorithms. Foldit is another initiative where participants compete with each other in a protein folding video game to predict 3D structure. In the past few years, deep machine learning on contact maps was introduced to the 3D structure prediction process, adding more information and increasing the accuracy of models significantly. In this review, we will take the reader on a journey of exploration from the beginnings to the most recent turnabouts, which have revolutionized the field of homology modeling. Moreover, we discuss the new trends emerging in this rapidly growing field.
Affiliation(s)
- Tareq Hameduh
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Yazan Haddad
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
- Vojtech Adam
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
- Zbynek Heger
- Department of Chemistry and Biochemistry, Mendel University in Brno, Zemedelska 1, CZ-613 00 Brno, Czech Republic
- Central European Institute of Technology, Brno University of Technology, Purkynova 656/123, 612 00 Brno, Czech Republic
5
Singh A, Dauzhenka T, Kundrotas PJ, Sternberg MJE, Vakser IA. Application of docking methodologies to modeled proteins. Proteins 2020; 88:1180-1188. [PMID: 32170770 DOI: 10.1002/prot.25889] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/07/2019] [Revised: 02/15/2020] [Accepted: 03/07/2020] [Indexed: 12/12/2022]
Abstract
Protein docking is essential for structural characterization of protein interactions. Besides providing the structure of protein complexes, modeling of proteins and their complexes is important for understanding the fundamental principles and specific aspects of protein interactions. The accuracy of protein modeling, in general, is still less than that of the experimental approaches. Thus, it is important to investigate the applicability of docking techniques to modeled proteins. We present new comprehensive benchmark sets of protein models for the development and validation of protein docking, as well as a systematic assessment of free and template-based docking techniques on these sets. As opposed to previous studies, the benchmark sets reflect the real case modeling/docking scenario where the accuracy of the models is assessed by the modeling procedure, without reference to the native structure (which would be unknown in practical applications). We also expanded the analysis to include docking of protein pairs where proteins have different structural accuracy. The results show that, in general, the template-based docking is less sensitive to the structural inaccuracies of the models than the free docking. The near-native docking poses generated by the template-based approach, typically, also have higher ranks than those produced by the free docking (although the free docking is indispensable in modeling the multiplicity of protein interactions in a crowded cellular environment). The results show that docking techniques are applicable to protein models in a broad range of modeling accuracy. The study provides clear guidelines for practical applications of docking to protein models.
Affiliation(s)
- Amar Singh
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
- Taras Dauzhenka
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
- Petras J Kundrotas
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
- Michael J E Sternberg
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, South Kensington, London, UK
- Ilya A Vakser
- Computational Biology Program, The University of Kansas, Lawrence, Kansas, USA
- Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, USA
6
Park T, Woo H, Baek M, Yang J, Seok C. Structure prediction of biological assemblies using GALAXY in CAPRI rounds 38-45. Proteins 2019; 88:1009-1017. [PMID: 31774573 DOI: 10.1002/prot.25859] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/19/2019] [Revised: 11/11/2019] [Accepted: 11/23/2019] [Indexed: 12/12/2022]
Abstract
We participated in CAPRI rounds 38-45 both as a server predictor and a human predictor. These CAPRI rounds provided excellent opportunities for testing prediction methods for three classes of protein interactions, that is, protein-protein, protein-peptide, and protein-oligosaccharide interactions. Both template-based methods (GalaxyTBM for monomer protein, GalaxyHomomer for homo-oligomer protein, GalaxyPepDock for protein-peptide complex) and ab initio docking methods (GalaxyTongDock and GalaxyPPDock for protein oligomer, GalaxyPepDock-ab-initio for protein-peptide complex, GalaxyDock2 and Galaxy7TM for protein-oligosaccharide complex) have been tested. Template-based methods depend heavily on the availability of proper templates and template-target similarity, and template-target difference is responsible for inaccuracy of template-based models. Inaccurate template-based models could be improved by our structure refinement and loop modeling methods based on physics-based energy optimization (GalaxyRefineComplex and GalaxyLoop) for several CAPRI targets. Current ab initio docking methods require accurate protein structures as input. Small conformational changes from input structure could be accounted for by our docking methods, producing one of the best models for several CAPRI targets. However, predicting large conformational changes involving protein backbone is still challenging, and full exploration of physics-based methods for such problems is still to come.
Affiliation(s)
- Taeyong Park
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Hyeonuk Woo
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Minkyung Baek
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Jinsol Yang
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea
- Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, Republic of Korea