Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Hou Q, Kwasigroch JM, Rooman M, Pucci F. SOLart: a structure-based method to predict protein solubility and aggregation. Bioinformatics 2020;36:1445-1452. [PMID: 31603466 DOI: 10.1093/bioinformatics/btz773] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 06/03/2019] [Revised: 08/31/2019] [Accepted: 10/08/2019] [Indexed: 12/12/2022] Open

Cited by Other Article(s)

Zheng F, Jiang X, Wen Y, Yang Y, Li M. Systematic investigation of machine learning on limited data: A study on predicting protein-protein binding strength. Comput Struct Biotechnol J 2024;23:460-472. [PMID: 38235359 PMCID: PMC10792694 DOI: 10.1016/j.csbj.2023.12.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2023] [Revised: 12/14/2023] [Accepted: 12/16/2023] [Indexed: 01/19/2024] Open

Cárdenas-Guerra RE, Montes-Flores O, Nava-Pintor EE, Reséndiz-Cardiel G, Flores-Pucheta CI, Rodríguez-Gavaldón YI, Arroyo R, Bottazzi ME, Hotez PJ, Ortega-López J. Chagasin from Trypanosoma cruzi as a molecular scaffold to express epitopes of TSA-1 as soluble recombinant chimeras. Protein Expr Purif 2024;218:106458. [PMID: 38423156 DOI: 10.1016/j.pep.2024.106458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 02/13/2024] [Accepted: 02/21/2024] [Indexed: 03/02/2024]

Affiliation(s)

Rosa Elena Cárdenas-Guerra Departamento de Biotecnología y Bioingeniería, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Av. IPN # 2508, Col. San Pedro Zacatenco, Gustavo A. Madero, CP 07360, Mexico City, Mexico
Octavio Montes-Flores Departamento de Biotecnología y Bioingeniería, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Av. IPN # 2508, Col. San Pedro Zacatenco, Gustavo A. Madero, CP 07360, Mexico City, Mexico
Edgar Ezequiel Nava-Pintor Departamento de Biotecnología y Bioingeniería, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Av. IPN # 2508, Col. San Pedro Zacatenco, Gustavo A. Madero, CP 07360, Mexico City, Mexico
Gerardo Reséndiz-Cardiel Departamento de Biotecnología y Bioingeniería, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Av. IPN # 2508, Col. San Pedro Zacatenco, Gustavo A. Madero, CP 07360, Mexico City, Mexico
Claudia Ivonne Flores-Pucheta Departamento de Biotecnología y Bioingeniería, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Av. IPN # 2508, Col. San Pedro Zacatenco, Gustavo A. Madero, CP 07360, Mexico City, Mexico
Yasmín Irene Rodríguez-Gavaldón Departamento de Biotecnología y Bioingeniería, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Av. IPN # 2508, Col. San Pedro Zacatenco, Gustavo A. Madero, CP 07360, Mexico City, Mexico
Rossana Arroyo Departamento de Infectómica y Patogénesis Molecular, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Av. IPN # 2508, Col. San Pedro Zacatenco, Gustavo A. Madero, CP 07360, Mexico City, Mexico
Maria Elena Bottazzi Texas Children's Hospital Center for Vaccine Development, Department of Pediatrics and Molecular Virology and Microbiology, National School of Tropical Medicine, Baylor College of Medicine, Houston, TX, USA
Peter J Hotez Texas Children's Hospital Center for Vaccine Development, Department of Pediatrics and Molecular Virology and Microbiology, National School of Tropical Medicine, Baylor College of Medicine, Houston, TX, USA
Jaime Ortega-López Departamento de Biotecnología y Bioingeniería, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional (CINVESTAV-IPN), Av. IPN # 2508, Col. San Pedro Zacatenco, Gustavo A. Madero, CP 07360, Mexico City, Mexico.

Li B, Ming D. GATSol, an enhanced predictor of protein solubility through the synergy of 3D structure graph and large language modeling. BMC Bioinformatics 2024;25:204. [PMID: 38824535 DOI: 10.1186/s12859-024-05820-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 05/29/2024] [Indexed: 06/03/2024] Open

Abstract

BACKGROUND

Protein solubility is a critically important physicochemical property closely related to protein expression. For example, it is one of the main factors to be considered in the design and production of antibody drugs and a prerequisite for realizing various protein functions. Although several solubility prediction models have emerged in recent years, many of these models are limited to capturing information embedded in one-dimensional amino acid sequences, resulting in unsatisfactory predictive performance.

RESULTS

In this study, we introduce a novel Graph Attention network-based protein Solubility model, GATSol, which represents the 3D structure of proteins as a protein graph. In addition to the node features of amino acids extracted by the state-of-the-art protein large language model, GATSol utilizes amino acid distance maps generated using the latest AlphaFold technology. Rigorous testing on the independent eSOL and Saccharomyces cerevisiae test datasets has shown that GATSol outperforms most recently introduced models, especially with respect to the coefficient of determination R², which reaches 0.517 and 0.424, respectively. It outperforms the current state-of-the-art GraphSol by 18.4% on the S. cerevisiae_test set.
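For readers unfamiliar with the metric, the coefficient of determination quoted above can be computed as sketched below. This is a minimal illustration using the common definition R² = 1 - SS_res/SS_tot; the abstract does not state which definition the authors used, and the function name `r_squared` and the toy values are assumptions, not GATSol code or eSOL data.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after prediction.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: spread of the measured values around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all measured values are equal")
    return 1.0 - ss_res / ss_tot


# Toy solubility values for illustration only (hypothetical, not from eSOL).
measured = [0.10, 0.40, 0.65, 0.90]
predicted = [0.20, 0.35, 0.60, 0.80]
print(round(r_squared(measured, predicted), 3))
```

Under this definition a model that always predicts the mean of the measured values scores 0 and a perfect model scores 1, which gives some context for values such as 0.517 and 0.424.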

CONCLUSIONS

GATSol captures 3D features of proteins by building protein graphs, which significantly improves the accuracy of protein solubility prediction. Recent advances in protein structure modeling allow our method to incorporate spatial structure features extracted from predicted structures into the model by relying only on the input of protein sequences, which simplifies the entire graph neural network prediction process, making it more user-friendly and efficient. As a result, GATSol may help prioritize highly soluble proteins, ultimately reducing the cost and effort of experimental work. The source code and data of the GATSol model are freely available at https://github.com/binbinbinv/GATSol.
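The idea of turning a 3D structure into a protein graph can be illustrated as follows. This is a rough sketch, not the GATSol implementation: the helper names, the use of one representative atom per residue, and the 10 Å cutoff are all assumptions made for illustration. The abstract only says that distance maps derived from AlphaFold-predicted structures are used.

```python
import math


def distance_map(coords):
    """Pairwise Euclidean distances between residue coordinates (in Å)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]


def contact_edges(dmap, cutoff=10.0):
    """Undirected edges (i, j), i < j, for residue pairs within `cutoff` Å.

    The cutoff value is a hypothetical choice, not taken from the paper.
    """
    n = len(dmap)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if dmap[i][j] <= cutoff]


# Toy coordinates: three residues spaced 3.8 Å apart plus one distant residue.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
edges = contact_edges(distance_map(coords))
print(edges)
```

In a graph neural network setting, each residue index would additionally carry a node feature vector (for example a language-model embedding), and the edge list would define which residues exchange information.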

Li W, Lin H, Huang Z, Xie S, Zhou Y, Gong R, Jiang Q, Xiang C, Huang J. DOTAD: A Database of Therapeutic Antibody Developability. Interdiscip Sci 2024:10.1007/s12539-024-00613-2. [PMID: 38530613 DOI: 10.1007/s12539-024-00613-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2023] [Revised: 01/25/2024] [Accepted: 01/27/2024] [Indexed: 03/28/2024]

Rahbar MR, Nezafat N, Morowvat MH, Savardashtaki A, Ghoshoon MB, Mehrabani-Zeinabad K, Ghasemi Y. Targeting Efficient Features of Urate Oxidase to Increase Its Solubility. Appl Biochem Biotechnol 2024:10.1007/s12010-023-04819-w. [PMID: 38308671 DOI: 10.1007/s12010-023-04819-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/19/2023] [Indexed: 02/05/2024]

Zheng F, Liu Y, Yang Y, Wen Y, Li M. Assessing computational tools for predicting protein stability changes upon missense mutations using a new dataset. Protein Sci 2024;33:e4861. [PMID: 38084013 PMCID: PMC10751734 DOI: 10.1002/pro.4861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 11/14/2023] [Accepted: 12/06/2023] [Indexed: 12/28/2023]

Helmick H, Tonner T, Hauersperger D, Okos M, Kokini JL. Comparison of the specific mechanical energy, specific thermal energy, and functional properties of cold and hot extruded pea protein isolate. Food Res Int 2023;174:113603. [PMID: 37986466 DOI: 10.1016/j.foodres.2023.113603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 10/13/2023] [Accepted: 10/14/2023] [Indexed: 11/22/2023]

Chen Z, Wang X, Chen X, Huang J, Wang C, Wang J, Wang Z. Accelerating therapeutic protein design with computational approaches toward the clinical stage. Comput Struct Biotechnol J 2023;21:2909-2926. [PMID: 38213894 PMCID: PMC10781723 DOI: 10.1016/j.csbj.2023.04.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/15/2023] [Revised: 04/11/2023] [Accepted: 04/27/2023] [Indexed: 01/13/2024] Open

Wittmund M, Cadet F, Davari MD. Learning Epistasis and Residue Coevolution Patterns: Current Trends and Future Perspectives for Advancing Enzyme Engineering. ACS Catal 2022. [DOI: 10.1021/acscatal.2c01426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]

Qing R, Hao S, Smorodina E, Jin D, Zalevsky A, Zhang S. Protein Design: From the Aspect of Water Solubility and Stability. Chem Rev 2022;122:14085-14179. [PMID: 35921495 PMCID: PMC9523718 DOI: 10.1021/acs.chemrev.1c00757] [Citation(s) in RCA: 28] [Impact Index Per Article: 14.0] [Received: 08/31/2021] [Indexed: 12/13/2022]

Ferreira CS, Martins YC, Souza RC, Vasconcelos ATR. EpiCurator: an immunoinformatic workflow to predict and prioritize SARS-CoV-2 epitopes. PeerJ 2021;9:e12548. [PMID: 34909278 PMCID: PMC8641484 DOI: 10.7717/peerj.12548] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/03/2021] [Accepted: 11/04/2021] [Indexed: 12/12/2022] Open

Sun T, Chen Y, Wen Y, Zhu Z, Li M. PremPLI: a machine learning model for predicting the effects of missense mutations on protein-ligand interactions. Commun Biol 2021;4:1311. [PMID: 34799678 PMCID: PMC8604987 DOI: 10.1038/s42003-021-02826-3] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 04/12/2021] [Accepted: 10/26/2021] [Indexed: 02/07/2023] Open

Melo MCR, Maasch JRMA, de la Fuente-Nunez C. Accelerating antibiotic discovery through artificial intelligence. Commun Biol 2021;4:1050. [PMID: 34504303 PMCID: PMC8429579 DOI: 10.1038/s42003-021-02586-0] [Citation(s) in RCA: 59] [Impact Index Per Article: 19.7] [Received: 02/18/2021] [Accepted: 07/16/2021] [Indexed: 02/07/2023] Open

Revolutionizing enzyme engineering through artificial intelligence and machine learning. Emerg Top Life Sci 2021;5:113-125. [PMID: 33835131 DOI: 10.1042/etls20200257] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 12/17/2020] [Revised: 03/17/2021] [Accepted: 03/22/2021] [Indexed: 12/20/2022]

Ptak-Kaczor M, Banach M, Stapor K, Fabian P, Konieczny L, Roterman I. Solubility and Aggregation of Selected Proteins Interpreted on the Basis of Hydrophobicity Distribution. Int J Mol Sci 2021;22:ijms22095002. [PMID: 34066830 PMCID: PMC8125953 DOI: 10.3390/ijms22095002] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/19/2021] [Revised: 05/03/2021] [Accepted: 05/06/2021] [Indexed: 11/30/2022] Open

Postic G, Janel N, Moroy G. Representations of protein structure for exploring the conformational space: A speed-accuracy trade-off. Comput Struct Biotechnol J 2021;19:2618-2625. [PMID: 34025948 PMCID: PMC8120936 DOI: 10.1016/j.csbj.2021.04.049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/03/2021] [Revised: 04/19/2021] [Accepted: 04/20/2021] [Indexed: 11/25/2022] Open

Abstract

• We compare ten structural representations, either atomistic or coarse-grained.
• Thus, ten distance-dependent statistical potentials of mean force (PMF) were built.
• The Cβ-only and Cα + Cβ representations provide the best speed–accuracy trade-off.
• Including glycines through Cα, in a Cβ-only representation, yields a higher accuracy.
• We generalize the conclusions to the total information gain (TIG) scoring function.

The recent breakthrough in the field of protein structure prediction shows the relevance of using knowledge-based scoring functions in combination with a low-resolution 3D representation of protein macromolecules. The choice of not using all atoms is barely supported by any data in the literature, and is mostly motivated by empirical and practical reasons, such as the computational cost of assessing the numerous folds of the protein conformational space. Here, we present a comprehensive study, carried out on a large and balanced benchmark of predicted protein structures, to see how different types of structural representations rank in either accuracy or calculation speed, and which ones offer the best compromise between these two criteria. We tested ten representations, including low-resolution, high-resolution, and coarse-grained approaches. We also investigated the generalization of the findings to formalisms other than the widely used “potential of mean force” (PMF) method. Thus, we observed that representing protein structures by their β carbons, combined or not with Cα, provides the best speed–accuracy trade-off when using a “total information gain” scoring function. For statistical PMFs, using MARTINI backbone and side-chain beads is the best option. Finally, we also demonstrated the necessity of training the reference state on all atom types, and of including the Cα atoms of glycine residues in a Cβ-based representation.
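As background, a distance-dependent statistical potential of the kind mentioned above is commonly derived by inverse Boltzmann statistics, E(r) = -kT ln(p_obs(r) / p_ref(r)). The sketch below illustrates that generic form only. It is not the authors' implementation, and the function name, the pseudocount, and the toy histograms are assumptions.

```python
import math


def pmf_from_counts(observed, reference, kT=1.0, pseudocount=1e-6):
    """Per-bin pseudo-energies from observed and reference distance histograms.

    Generic inverse Boltzmann form: E = -kT * ln(p_obs / p_ref).
    The pseudocount (a hypothetical choice) avoids log(0) for empty bins.
    """
    if len(observed) != len(reference):
        raise ValueError("histograms must have the same number of bins")
    obs_total = sum(observed)
    ref_total = sum(reference)
    if obs_total == 0 or ref_total == 0:
        raise ValueError("histograms must contain at least one count")
    energies = []
    for o, r in zip(observed, reference):
        p_obs = o / obs_total + pseudocount
        p_ref = r / ref_total + pseudocount
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies


# Toy histograms over two distance bins (illustrative numbers only).
print(pmf_from_counts([30, 10], [20, 20]))
```

Bins that are populated more often than the reference state predicts receive negative (favorable) pseudo-energies, and under-populated bins receive positive ones, which is why the choice of reference state matters.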

Bhandari BK, Gardner PP, Lim CS. Solubility-Weighted Index: fast and accurate prediction of protein solubility. Bioinformatics 2021;36:4691-4698. [PMID: 32559287 PMCID: PMC7750957 DOI: 10.1093/bioinformatics/btaa578] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 12/16/2019] [Revised: 05/05/2020] [Accepted: 06/12/2020] [Indexed: 12/14/2022] Open

Chen J, Zheng S, Zhao H, Yang Y. Structure-aware protein solubility prediction from sequence through graph convolutional network and predicted contact map. J Cheminform 2021;13:7. [PMID: 33557952 PMCID: PMC7869490 DOI: 10.1186/s13321-021-00488-1] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 10/21/2020] [Accepted: 01/20/2021] [Indexed: 11/26/2022] Open

Chen Y, Lu H, Zhang N, Zhu Z, Wang S, Li M. PremPS: Predicting the impact of missense mutations on protein stability. PLoS Comput Biol 2020;16:e1008543. [PMID: 33378330 PMCID: PMC7802934 DOI: 10.1371/journal.pcbi.1008543] [Citation(s) in RCA: 93] [Impact Index Per Article: 23.3] [Received: 07/23/2020] [Revised: 01/12/2021] [Accepted: 11/16/2020] [Indexed: 12/12/2022] Open

Abstract

Computational methods that predict protein stability changes induced by missense mutations have made considerable progress over the past decades. Most of the available methods, however, have very limited accuracy in predicting stabilizing mutations because existing experimental sets are dominated by mutations reducing protein stability. Moreover, few approaches could consistently perform well across different test cases. To address these issues, we developed a new computational method, PremPS, to more accurately evaluate the effects of missense mutations on protein stability. The PremPS method is composed of only ten evolutionary- and structure-based features and parameterized on a balanced dataset with an equal number of stabilizing and destabilizing mutations. A comprehensive comparison of the predictive performance of PremPS with other available methods on nine benchmark datasets confirms that our approach consistently outperforms other methods and shows considerable improvement in estimating the impacts of stabilizing mutations. A protein could have multiple structures available, and if another structure of the same protein is used, the predicted change in stability for structure-based methods might be different. Thus, we further estimated the impact of using different structures on prediction accuracy, and demonstrate that our method performs well across different types of structures except for low-resolution structures and models built based on templates with low sequence identity. PremPS can be used for finding functionally important variants, revealing the molecular mechanisms of functional influences, and protein design. PremPS is freely available at https://lilab.jysw.suda.edu.cn/research/PremPS/, which supports large-scale mutational scanning; it takes about four minutes to perform calculations for a single mutation per protein with ~300 residues and requires ~0.4 seconds for each additional mutation.

The development of computational methods to accurately predict the impacts of amino acid substitutions on protein stability is of paramount importance for the field of protein design and understanding the roles of missense mutations in disease. However, most of the available methods have very limited predictive accuracy for mutations increasing stability and few could consistently perform well across different test cases. Here we present a new computational approach PremPS, which is capable of predicting the effects of single point mutations on protein stability. PremPS employs only ten evolutionary- and structure-based features and is trained on a symmetrical dataset consisting of the same number of cases of stabilizing and destabilizing mutations. Our method was tested against numerous blind datasets and shows a considerable improvement especially in evaluating the effects of stabilizing mutations, outperforming previously developed methods. PremPS is freely available as a user-friendly web server at http://lilab.jysw.suda.edu.cn/research/PremPS/, which is fast enough to handle the large number of cases.
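To put the timing quoted in the abstract into perspective, the arithmetic can be sketched as follows. This is a back-of-the-envelope estimate, assuming the stated figures (about four minutes for the first mutation on a ~300-residue protein, ~0.4 seconds for each additional one) scale linearly. The function name and the full-scan example are illustrative assumptions, not part of PremPS.

```python
def estimated_runtime_seconds(n_mutations, first=240.0, additional=0.4):
    """Rough runtime: `first` seconds for one mutation, `additional` for each extra.

    Defaults follow the approximate figures quoted in the abstract for a
    ~300-residue protein; linear scaling is an assumption.
    """
    if n_mutations < 1:
        raise ValueError("n_mutations must be at least 1")
    return first + additional * (n_mutations - 1)


# Hypothetical full scan: 300 positions x 19 alternative amino acids.
n = 300 * 19
print(f"{n} mutations: about {estimated_runtime_seconds(n) / 60:.0f} minutes")
```

Under these assumptions, the fixed start-up cost dominates for a handful of mutations, while a saturation scan of a ~300-residue protein would take on the order of tens of minutes.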

Ebo JS, Guthertz N, Radford SE, Brockwell DJ. Using protein engineering to understand and modulate aggregation. Curr Opin Struct Biol 2020;60:157-166. [PMID: 32087409 PMCID: PMC7132541 DOI: 10.1016/j.sbi.2020.01.005] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/01/2019] [Revised: 01/08/2020] [Accepted: 01/09/2020] [Indexed: 02/07/2023]

Mazurenko S, Prokop Z, Damborsky J. Machine Learning in Enzyme Engineering. ACS Catal 2019. [DOI: 10.1021/acscatal.9b04321] [Citation(s) in RCA: 134] [Impact Index Per Article: 26.8] [Indexed: 12/17/2022]