1
Perez R, Li X, Giannakoulias S, Petersson EJ. AggBERT: Best in Class Prediction of Hexapeptide Amyloidogenesis with a Semi-Supervised ProtBERT Model. J Chem Inf Model 2023; 63:5727-5733. [PMID: 37552230 PMCID: PMC10777593 DOI: 10.1021/acs.jcim.3c00817] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/09/2023]
Abstract
The prediction of peptide amyloidogenesis is a challenging problem in the field of protein folding. Large language models, such as the ProtBERT model, have recently emerged as powerful tools in analyzing protein sequences for applications, such as predicting protein structure and function. In this article, we describe the use of a semisupervised and fine-tuned ProtBERT model to predict peptide amyloidogenesis from sequences alone. Our approach, which we call AggBERT, achieved state-of-the-art performance, demonstrating the potential for large language models to improve the accuracy and speed of amyloid fibril prediction over simple heuristics or structure-based approaches. This work highlights the transformative potential of machine learning and large language models in the fields of chemical biology and biomedicine.
Affiliation(s)
- Ryann Perez
  - Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Xinning Li
  - Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Sam Giannakoulias
  - Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- E. James Petersson
  - Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
2
Cory MB, Jones CM, Shaffer KD, Venkatesh Y, Giannakoulias S, Perez RM, Lougee MG, Hummingbird E, Pagar VV, Hurley CM, Li A, Mach RH, Kohli RM, Petersson EJ. FRETing about the details: Case studies in the use of a genetically encoded fluorescent amino acid for distance-dependent energy transfer. Protein Sci 2023; 32:e4633. [PMID: 36974585 PMCID: PMC10108435 DOI: 10.1002/pro.4633] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/01/2023] [Revised: 03/22/2023] [Accepted: 03/24/2023] [Indexed: 03/29/2023]
Abstract
Förster resonance energy transfer (FRET) is a valuable method for monitoring protein conformation and biomolecular interactions. Intrinsically fluorescent amino acids that can be genetically encoded, such as acridonylalanine (Acd), are particularly useful for FRET studies. However, quantitative interpretation of FRET data to derive distance information requires careful use of controls and consideration of photophysical effects. Here we present two case studies illustrating how Acd can be used in FRET experiments to study small molecule induced conformational changes and multicomponent biomolecular complexes.
Affiliation(s)
- Michael B. Cory
  - Graduate Group in Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Chloe M. Jones
  - Graduate Group in Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Kyle D. Shaffer
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Yarra Venkatesh
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Sam Giannakoulias
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Ryann M. Perez
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Marshall G. Lougee
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Eshe Hummingbird
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Vinayak V. Pagar
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Christina M. Hurley
  - Graduate Group in Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Allen Li
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Robert H. Mach
  - Department of Radiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- Rahul M. Kohli
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
  - Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
- E. James Petersson
  - Department of Chemistry, School of Arts and Sciences, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
  - Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
3
Ferraz MVF, Neto JCS, Lins RD, Teixeira ES. An artificial neural network model to predict structure-based protein-protein free energy of binding from Rosetta-calculated properties. Phys Chem Chem Phys 2023; 25:7257-7267. [PMID: 36810523 DOI: 10.1039/d2cp05644e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
The prediction of the free energy (ΔG) of binding for protein-protein complexes is of general scientific interest as it has a variety of applications in the fields of molecular and chemical biology, materials science, and biotechnology. Despite its centrality in understanding protein association phenomena and protein engineering, the ΔG of binding is a daunting quantity to obtain theoretically. In this work, we devise a novel Artificial Neural Network (ANN) model to predict the ΔG of binding for a given three-dimensional structure of a protein-protein complex with Rosetta-calculated properties. Our model was tested using two data sets, and it presented a root-mean-square error ranging from 1.67 kcal mol⁻¹ to 2.45 kcal mol⁻¹, showing a better performance compared to the available state-of-the-art tools. Validation of the model for a variety of protein-protein complexes is showcased.
Affiliation(s)
- Matheus V F Ferraz
  - Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation, FIOCRUZ, Recife, PE, Brazil
  - Department of Fundamental Chemistry, Federal University of Pernambuco, UFPE, Recife, PE, Brazil
  - Heidelberg Institute for Theoretical Studies, HITS, Heidelberg, Germany
- José C S Neto
  - Recife Center for Advanced Studies and Systems, CESAR, Recife, PE, Brazil
- Roberto D Lins
  - Department of Virology, Aggeu Magalhães Institute, Oswaldo Cruz Foundation, FIOCRUZ, Recife, PE, Brazil
  - Department of Fundamental Chemistry, Federal University of Pernambuco, UFPE, Recife, PE, Brazil
- Erico S Teixeira
  - Recife Center for Advanced Studies and Systems, CESAR, Recife, PE, Brazil
4
Zhang H, Zheng Z, Dong L, Shi N, Yang Y, Chen H, Shen Y, Xia Q. Rational incorporation of any unnatural amino acid into proteins by machine learning on existing experimental proofs. Comput Struct Biotechnol J 2022; 20:4930-4941. [PMID: 36147660 PMCID: PMC9472073 DOI: 10.1016/j.csbj.2022.08.063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Revised: 08/28/2022] [Accepted: 08/28/2022] [Indexed: 11/26/2022]
Abstract
The unnatural amino acid (UAA) incorporation technique through genetic code expansion has been extensively used in protein engineering for the last two decades. Mutations into UAAs offer more dimensions to tune protein structures and functions. However, the huge library of optional UAAs and various circumstances of mutation sites on different proteins urge rational UAA incorporations guided by artificial intelligence. Here we collected existing experimental proofs of UAA-incorporated proteins in literature and established a database of known UAA substitution sites. By program designing and machine learning on the database, we showed that UAA incorporations into proteins are predictable by the observed evolutional, steric and physiochemical factors. Based on the predicted probability of successful UAA substitutions, we tested the model performance using literature-reported and freshly-designed experimental proofs, and demonstrated its potential in screening UAA-incorporated proteins. This work expands structure-based computational biology and virtual screening to UAA-incorporated proteins, and offers a useful tool to automate the rational design of proteins with any UAA.
Affiliation(s)
- Haoran Zhang
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Chemical Biology, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Zhetao Zheng
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Chemical Biology, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Liangzhen Dong
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Chemical Biology, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Ningning Shi
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Chemical Biology, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Yuelin Yang
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Chemical Biology, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Hongmin Chen
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Chemical Biology, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Yuxuan Shen
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Chemical Biology, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China
- Qing Xia
  - State Key Laboratory of Natural and Biomimetic Drugs, Department of Chemical Biology, School of Pharmaceutical Sciences, Peking University, Beijing 100191, China