1
|
Petrovskiy DV, Nikolsky KS, Rudnev VR, Kulikova LI, Butkova TV, Malsagova KA, Kopylov AT, Kaysheva AL. SAFoldNet: A Novel Tool for Discovering and Aligning Three-Dimensional Protein Structures Based on a Neural Network. Int J Mol Sci 2023; 24:14439. [PMID: 37833886; PMCID: PMC10572457; DOI: 10.3390/ijms241914439] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 09/15/2023] [Accepted: 09/19/2023] [Indexed: 10/15/2023]
Abstract
The development and improvement of methods for comparing and searching for three-dimensional protein structures remain urgent tasks in modern structural biology. To address this problem, we developed a new tool, SAFoldNet, which allows for searching, aligning, superimposing, and determining the exact coordinates of fragments of protein structures. The proposed search and alignment tool was built using a neural network. Specifically, we combined neural network predictions with the well-known BLAST algorithm for searching and aligning sequences. The proposed method involves multistage processing, comprising a stage for converting the geometry of protein structures into sequences of a structural alphabet using a neural network, a search stage for forming a set of candidate structures, and a refinement stage for calculating the structural alignment and superposition and evaluating the similarity with the starting structure of the search. The effectiveness and practical applicability of the proposed tool were compared with those of several widely used services for searching and aligning protein structures. The comparisons confirmed that the proposed method is effective and competitive relative to the available modern services. Furthermore, using the proposed approach, a service with a user-friendly web interface was developed, which allows for searching, aligning, and superimposing protein structures; determining the location of protein fragments; mapping onto a protein molecule chain; and providing structural similarity metrics (expected value and root mean square deviation).
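The abstract names root mean square deviation (RMSD) as one of the reported similarity metrics. As a minimal sketch of that metric only, assuming the two fragments have already been superimposed and their atoms paired one to one, RMSD can be computed as below. The function name `rmsd` and the toy coordinates are illustrative assumptions and are not taken from SAFoldNet; the superposition step itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumes the structures are already superimposed and that point i in
    coords_a is paired with point i in coords_b (hypothetical helper).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example: the "hit" is the query shifted by 1 unit along z.
query = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
hit = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
print(rmsd(query, hit))
```

A lower value indicates a closer geometric match between the paired atoms; the value depends on which atoms are paired and on the superposition used.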
Affiliation(s)
- Kristina A. Malsagova
- Institute of Biomedical Chemistry, 119121 Moscow, Russia; (D.V.P.); (K.S.N.); (V.R.R.); (L.I.K.); (T.V.B.); (A.T.K.); (A.L.K.)
|
2
|
Arias-Agudelo LM, Garcia-Montoya G, Cabarcas F, Galvan-Diaz AL, Alzate JF. Comparative genomic analysis of the principal Cryptosporidium species that infect humans. PeerJ 2020; 8:e10478. [PMID: 33344091; PMCID: PMC7718795; DOI: 10.7717/peerj.10478] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/18/2020] [Accepted: 11/11/2020] [Indexed: 11/25/2022]
Abstract
Cryptosporidium parasites are ubiquitous, can infect a broad range of vertebrates, and are considered the most frequent protozoa associated with waterborne parasitic outbreaks. The intestine is the target of the three species most frequently found in humans: C. hominis, C. parvum, and C. meleagridis. Despite recent advances in genome sequencing projects for this apicomplexan, a broad genomic comparison including the three species most prevalent in humans has not been published so far. In this work, we downloaded raw NGS data, assembled it under normalized conditions, and compared 23 publicly available genomes of C. hominis, C. parvum, and C. meleagridis. Although a few genomes showed highly fragmented assemblies, most of them had fewer than 500 scaffolds and mean coverage that ranged between 35X and 511X. Synonymous single nucleotide variants (SNVs) were the most common in C. hominis and C. meleagridis, while in C. parvum they accounted for around 50% of the SNVs observed. Furthermore, deleterious nucleotide substitutions shared by all three species were more common in genes associated with DNA repair, recombination, and chromosome-associated proteins. Indel events spanning up to 500 bases were observed in the 23 studied isolates. The highest number of deletions was observed in C. meleagridis, followed by C. hominis, with more than 60 species-specific deletions found in some isolates of these two species. Although several genes with indel events have been partially annotated, most of them encode proteins that remain uncharacterized.
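The distinction between synonymous and non-synonymous SNVs mentioned in the abstract can be illustrated with a short sketch. It assumes the standard genetic code (NCBI translation table 1) and a variant given as a reference codon, a 0-based position within that codon, and an alternate base. The names `CODON_TABLE` and `classify_snv` are hypothetical and do not reflect the pipeline the authors used.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(ref_codon, pos, alt_base):
    """Label a single-base change in a codon (hypothetical helper).

    ref_codon: three-letter reference codon, e.g. "GCT"
    pos:       0-based position of the change within the codon
    alt_base:  alternate base at that position
    """
    ref_codon = ref_codon.upper()
    alt_codon = ref_codon[:pos] + alt_base.upper() + ref_codon[pos + 1:]
    if alt_codon == ref_codon:
        raise ValueError("alternate base equals the reference base")
    # Synonymous if the encoded amino acid (or stop) is unchanged.
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "nonsynonymous"


print(classify_snv("GCT", 2, "C"))  # GCT (Ala) -> GCC (Ala)
print(classify_snv("GAA", 1, "T"))  # GAA (Glu) -> GTA (Val)
```

Counting these labels per genome gives the kind of synonymous fraction the abstract reports; predicting whether a substitution is deleterious requires additional tools and is outside this sketch.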
Affiliation(s)
- Laura M Arias-Agudelo
- Centro Nacional de Secuenciación Genómica - CNSG, Sede de Investigación Universitaria - SIU, Departamento de Microbiología y Parasitología, Facultad de Medicina, Universidad de Antioquia, Medellin, Antioquia, Colombia
- Gisela Garcia-Montoya
- Centro Nacional de Secuenciación Genómica - CNSG, Sede de Investigación Universitaria - SIU, Departamento de Microbiología y Parasitología, Facultad de Medicina, Universidad de Antioquia, Medellin, Antioquia, Colombia
- Felipe Cabarcas
- Centro Nacional de Secuenciación Genómica - CNSG, Sede de Investigación Universitaria - SIU, Departamento de Microbiología y Parasitología, Facultad de Medicina, Universidad de Antioquia, Medellin, Antioquia, Colombia; Grupo SISTEMIC, Departamento de Ingeniería Electrónica, Facultad de Ingeniería, Universidad de Antioquia, Medellin, Antioquia, Colombia
- Ana L Galvan-Diaz
- Grupo de Microbiología Ambiental, Escuela de Microbiología, Universidad de Antioquia, Medellin, Antioquia, Colombia
- Juan F Alzate
- Centro Nacional de Secuenciación Genómica - CNSG, Sede de Investigación Universitaria - SIU, Departamento de Microbiología y Parasitología, Facultad de Medicina, Universidad de Antioquia, Medellin, Antioquia, Colombia
|