1. Pazos F. Computational prediction of protein functional sites-Applications in biotechnology and biomedicine. Advances in Protein Chemistry and Structural Biology 2022; 130:39-57. [PMID: 35534114 DOI: 10.1016/bs.apcsb.2021.12.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
There are many computational approaches for predicting protein functional sites based on different sequence and structural features. These methods are essential to cope with the sequence deluge that is filling databases with uncharacterized protein sequences. They complement the more expensive and time-consuming experimental approaches by pointing them to possible candidate positions. In many cases they are jointly used to characterize the functional sites in proteins of biotechnological and biomedical interest and eventually modify them for different purposes. There is a clear trend towards approaches based on machine learning and those using structural information, due to the recent developments in these areas. Nevertheless, "classic" methods based on sequence and evolutionary features are still playing an important role as these features are strongly related to functionality. In this review, the main approaches for predicting general functional sites in a protein are discussed, with a focus on sequence-based approaches.
Affiliation(s)
- Florencio Pazos
- Computational Systems Biology Group, National Center for Biotechnology (CNB-CSIC), Madrid, Spain.
2. Rauer C, Sen N, Waman VP, Abbasian M, Orengo CA. Computational approaches to predict protein functional families and functional sites. Curr Opin Struct Biol 2021; 70:108-122. [PMID: 34225010 DOI: 10.1016/j.sbi.2021.05.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 03/29/2021] [Revised: 05/13/2021] [Accepted: 05/25/2021] [Indexed: 01/06/2023]
Abstract
Understanding the mechanisms of protein function is indispensable for many biological applications, such as protein engineering and drug design. However, experimental annotations are sparse, and therefore, theoretical strategies are needed to fill the gap. Here, we present the latest developments in building functional subclassifications of protein superfamilies and using evolutionary conservation to detect functional determinants, for example, catalytic-, binding- and specificity-determining residues important for delineating the functional families. We also briefly review other features exploited for functional site detection and new machine learning strategies for combining multiple features.
Affiliation(s)
- Clemens Rauer
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Neeladri Sen
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Vaishali P Waman
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Mahnaz Abbasian
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
- Christine A Orengo
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK.
3. Fonseca NJ, Afonso MQL, Carrijo L, Bleicher L. CONAN: a web application to detect specificity determinants and functional sites by amino acids co-variation network analysis. Bioinformatics 2021; 37:1026-1028. [PMID: 32780795 DOI: 10.1093/bioinformatics/btaa713] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/29/2020] [Revised: 08/01/2020] [Accepted: 08/05/2020] [Indexed: 11/12/2022]
Abstract
Summary: CONAN is a web application developed to detect specificity determinants and function-related sites by amino acid co-variation network analysis, emphasizing local coevolutionary constraints. The software allows the characterization of structurally and functionally relevant groups of residues and their relationship with subsets of sequences by automatic cross-referencing with GO terms, UniProtKB annotations and InterPro. Availability and implementation: CONAN is free and open-source, distributed under the terms of the GPLv3 license. The software is available as a web application and as a Python script, and can be accessed at http://bioinfo.icb.ufmg.br/conan. We also provide running instructions, the source code and a user guide.
Affiliation(s)
- N J Fonseca
- Cellular Structure and 3D Bioimaging, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK; Department of Biochemistry and Immunology, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil
- M Q L Afonso
- Department of Biochemistry and Immunology, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil
- L Carrijo
- Department of Biochemistry and Immunology, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil
- L Bleicher
- Department of Biochemistry and Immunology, Institute of Biological Sciences, Federal University of Minas Gerais, Belo Horizonte 31270-901, Brazil
4. Riggs K, Chen HS, Rotunno M, Li B, Simonds NI, Mechanic LE, Peng B. On the application, reporting, and sharing of in silico simulations for genetic studies. Genet Epidemiol 2020; 45:131-141. [PMID: 33063887 PMCID: PMC7984380 DOI: 10.1002/gepi.22362] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/10/2020] [Revised: 09/11/2020] [Accepted: 09/14/2020] [Indexed: 12/31/2022]
Abstract
In silico simulations play an indispensable role in the development and application of statistical models and methods for genetic studies. Simulation tools allow for the evaluation of methods and investigation of models in a controlled manner. With the growing popularity of evolutionary models and simulation‐based statistical methods, genetic simulations have been applied to a wide variety of research disciplines such as population genetics, evolutionary genetics, genetic epidemiology, ecology, and conservation biology. In this review, we surveyed 1409 articles from five journals that publish on major application areas of genetic simulations. We identified 432 papers in which genetic simulations were used and examined the targets and applications of simulation studies and how these simulation methods and simulated data sets are reported and shared. Whereas a large proportion (30%) of the surveyed articles reported the use of genetic simulations, only 28% of these genetic simulation studies used existing simulation software, 2% used existing simulated data sets, and 19% and 12% made source code and simulated data sets publicly available, respectively. Moreover, 15% of articles provided no information on how simulation studies were performed. These findings suggest a need to encourage sharing and reuse of existing simulation software and data sets, as well as providing more information regarding the performance of simulations.
Affiliation(s)
- Kaleigh Riggs
- Department of Statistics, Rice University, Houston, Texas, USA
- Huann-Sheng Chen
- Division of Cancer Control and Population Sciences, Statistical Research and Applications Branch, Surveillance Research Program, National Cancer Institute (NCI), National Institutes of Health (NIH), Bethesda, Maryland, USA
- Melissa Rotunno
- Division of Cancer Control and Population Sciences, Genomic Epidemiology Branch, Epidemiology and Genomics Research Program, NCI, NIH, Bethesda, Maryland, USA
- Bing Li
- Department of Biostatistics, Brown University, Providence, Rhode Island, USA
- Leah E Mechanic
- Division of Cancer Control and Population Sciences, Genomic Epidemiology Branch, Epidemiology and Genomics Research Program, NCI, NIH, Bethesda, Maryland, USA
- Bo Peng
- Department of Medicine, Baylor College of Medicine, Houston, Texas, USA