1
Janković P, Otović E, Mauša G, Kalafatovic D. Manually curated dataset of catalytic peptides for ester hydrolysis. Data Brief 2023; 48:109290. [PMID: 37383747 PMCID: PMC10294096 DOI: 10.1016/j.dib.2023.109290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2023] [Revised: 05/12/2023] [Accepted: 05/30/2023] [Indexed: 06/30/2023]
Abstract
Catalytic peptides are low-cost biomolecules able to catalyse chemical reactions such as ester hydrolysis. This dataset provides a list of catalytic peptides currently reported in the literature. Several parameters were evaluated, including sequence length, composition, net charge, isoelectric point, hydrophobicity, self-assembly propensity and mechanism of catalysis. Along with the analysis of physico-chemical properties, the SMILES representation for each sequence was generated to provide an easy-to-use means of training machine learning models. This offers a unique opportunity for the development and validation of proof-of-concept predictive models. Being a reliable, manually curated dataset, it can also serve as a benchmark for comparing new models or models trained on automatically gathered peptide-oriented datasets. Moreover, the dataset provides insight into the currently developed catalytic mechanisms and can be used as the foundation for the development of next-generation peptide-based catalysts.
Affiliation(s)
- Patrizia Janković
  - University of Rijeka, Department of Biotechnology, Rijeka 51000, Croatia
- Erik Otović
  - University of Rijeka, Faculty of Engineering, Rijeka 51000, Croatia
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, Rijeka 51000, Croatia
- Goran Mauša
  - University of Rijeka, Faculty of Engineering, Rijeka 51000, Croatia
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, Rijeka 51000, Croatia
- Daniela Kalafatovic
  - University of Rijeka, Department of Biotechnology, Rijeka 51000, Croatia
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, Rijeka 51000, Croatia
2
Gašparović B, Morelato L, Lenac K, Mauša G, Zhurov A, Katić V. Comparing Direct Measurements and Three-Dimensional (3D) Scans for Evaluating Facial Soft Tissue. Sensors (Basel) 2023; 23:2412. [PMID: 36904614 PMCID: PMC10007047 DOI: 10.3390/s23052412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Revised: 02/14/2023] [Accepted: 02/15/2023] [Indexed: 06/18/2023]
Abstract
The inspection of patients' soft tissues and the effects of various dental procedures on their facial physiognomy are quite challenging. To minimise discomfort and simplify the process of manual measuring, we performed facial scanning and computer measurement of experimentally determined demarcation lines. Images were acquired using a low-cost 3D scanner. Two consecutive scans were obtained from 39 participants to test scanner repeatability. An additional ten persons were scanned before and after forward movement of the mandible (predicted treatment outcome). RGBD integration, a sensor technology that combines red, green, and blue (RGB) data with depth information, was used to merge frames into a 3D object. For proper comparison, the resulting images were registered to each other using Iterative Closest Point (ICP)-based techniques. Measurements on 3D images were performed using the exact distance algorithm. One operator measured the same demarcation lines directly on participants; repeatability was tested using intra-class correlations. The results showed that the 3D face scans were reproducible with high accuracy (mean difference between repeated scans <1%); the actual measurements were repeatable to some extent (excellent only for the tragus-pogonion demarcation line); computational measurements were accurate, repeatable, and comparable to the actual measurements. Three-dimensional (3D) facial scans can be used as a faster and more accurate technique, and one that is more comfortable for patients, to detect and quantify changes in facial soft tissue resulting from various dental procedures.
Affiliation(s)
- Boris Gašparović
  - Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia
  - Center for Artificial Intelligence and Cybersecurity, University of Rijeka, R. Matejčić 2, 51000 Rijeka, Croatia
- Luka Morelato
  - Faculty of Dental Medicine, University of Rijeka, Krešimirova 40-42, 51000 Rijeka, Croatia
  - Clinical Hospital Centre Rijeka, Krešimirova 42, 51000 Rijeka, Croatia
- Kristijan Lenac
  - Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia
- Goran Mauša
  - Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia
- Alexei Zhurov
  - Applied Clinical Research & Public Health, School of Dentistry, Cardiff University, College of Biomedical & Life Sciences, Heath Park, Cardiff CF14 4XY, UK
- Višnja Katić
  - Faculty of Dental Medicine, University of Rijeka, Krešimirova 40-42, 51000 Rijeka, Croatia
  - Clinical Hospital Centre Rijeka, Krešimirova 42, 51000 Rijeka, Croatia
3
Babić M, Janković P, Marchesan S, Mauša G, Kalafatovic D. Esterase Sequence Composition Patterns for the Identification of Catalytic Triad Microenvironment Motifs. J Chem Inf Model 2022; 62:6398-6410. [PMID: 36223497 DOI: 10.1021/acs.jcim.2c00977] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/07/2023]
Abstract
Ester hydrolysis is of wide biomedical interest, spanning from the green synthesis of pharmaceuticals to biomaterials development. Existing peptide-based catalysts exhibit low catalytic efficiency compared to natural enzymes, due to the conformational heterogeneity of peptides. Moreover, there is a lack of understanding of the correlation between the primary sequence and catalytic function. For this purpose, we statistically analyzed 22 EC 3.1 hydrolases with known catalytic triads, characterized by unique and well-defined mechanisms. The aim was to identify patterns at the sequence level that will better inform the creation of short peptides containing important information for catalysis, based on the catalytic triad, oxyanion holes and the triad residues' microenvironments. Moreover, fragmentation schemes of the primary sequence of selected enzymes, alongside the study of their amino acid frequencies, composition, and physicochemical properties, are proposed. The results showed highly conserved catalytic sites with distinct positional patterns and chemical microenvironments that favor catalysis and revealed variations in catalytic site composition that could be useful for the design of minimalistic catalysts.
Affiliation(s)
- Marko Babić
  - Department of Biotechnology, University of Rijeka, 51000 Rijeka, Croatia
- Patrizia Janković
  - Department of Biotechnology, University of Rijeka, 51000 Rijeka, Croatia
- Silvia Marchesan
  - Chemical and Pharmaceutical Sciences Department, University of Trieste, 34127 Trieste, Italy
- Goran Mauša
  - Faculty of Engineering, University of Rijeka, 51000 Rijeka, Croatia
- Daniela Kalafatovic
  - Department of Biotechnology, University of Rijeka, 51000 Rijeka, Croatia
  - Center for Advanced Computing and Modeling, University of Rijeka, 51000 Rijeka, Croatia
4
Lučin I, Družeta S, Mauša G, Alvir M, Grbčić L, Lušić DV, Sikirica A, Kranjčević L. Predictive modeling of microbiological seawater quality in karst region using cascade model. Sci Total Environ 2022; 851:158009. [PMID: 35987218 DOI: 10.1016/j.scitotenv.2022.158009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Revised: 08/06/2022] [Accepted: 08/09/2022] [Indexed: 06/15/2023]
Abstract
This paper presents an in-depth analysis of seawater quality measurements during the bathing seasons from 2009 to 2020 in the city of Rijeka, Croatia. Because measurements with less than excellent water quality occur rarely, the considered dataset is deeply imbalanced. Additionally, it incorporates measurements under the influence of submerged groundwater discharges (SGD), which were observed in some bathing locations. These discharges were previously thought to dry up during the summer season and are now suspected to be one of the causes of increased Escherichia coli values. Consequently, and in view of the fact that the accuracy of prediction models can be significantly influenced by temporal and spatial variation of the input data, a novel cascade prediction modeling strategy was proposed. It consists of a sequence of prediction models that aim to identify general environmental conditions which confidently lead to excellent bathing water quality. The proposed model uses environmental features that can be estimated fairly easily or obtained from the weather forecast. The model was trained on a highly biased dataset, consisting of data from locations with and without SGD influence, and for a time period spanning extremely dry and warm seasons, extremely wet seasons, as well as normal seasons. To simulate realistic application, the model was tested using temporal and spatial stratification of data. The cascade strategy was shown to be a good approach for reliably detecting environmental parameters which produce excellent water quality. The proposed model is designed as a filter method, where instances classified as less-than-excellent water quality require further analysis. The cascade model provides great flexibility, as it can be customized to the particular needs of the investigated area and dataset specifics.
Affiliation(s)
- Ivana Lučin
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia
  - Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Siniša Družeta
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia
  - Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Goran Mauša
  - Department of Computer Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia
  - Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Marta Alvir
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia
- Luka Grbčić
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia
  - Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Darija Vukić Lušić
  - Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
  - Department of Environmental Health, Faculty of Medicine, University of Rijeka, Braće Branchetta 20/1, Rijeka 51000, Croatia
  - Department of Environmental Health, Teaching Institute of Public Health of Primorje-Gorski Kotar County, Krešimirova 52a, Rijeka 51000, Croatia
- Ante Sikirica
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia
  - Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
- Lado Kranjčević
  - Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka 51000, Croatia
  - Center for Advanced Computing and Modelling, University of Rijeka, Radmile Matejčić 2, Rijeka 51000, Croatia
5
Otović E, Njirjak M, Kalafatovic D, Mauša G. Sequential Properties Representation Scheme for Recurrent Neural Network-Based Prediction of Therapeutic Peptides. J Chem Inf Model 2022; 62:2961-2972. [PMID: 35704881 DOI: 10.1021/acs.jcim.2c00526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The discovery of therapeutic peptides is often accelerated by means of virtual screening supported by machine learning-based predictive models. The predictive performance of such models is sensitive to the choice of data and its representation scheme. While the peptide physicochemical and compositional representations fail to distinguish sequence permutations, the amino acid arrangement within the sequence lacks the important information contained in physicochemical, conformational, topological, and geometrical properties. In this paper, we propose a solution to the identified information gap by implementing a hybrid scheme that combines the best traits of both approaches, with the aim of predicting antimicrobial and antiviral activities based on experimental data from the DRAMP 2.0, AVPdb, and UniProt data repositories. Using the Friedman test of statistical significance, we compared our hybrid, sequential properties approach to peptide properties, one-hot vector encoding, and word embedding schemes in the 10-fold cross-validation setting, with respect to the F1 score, Matthews correlation coefficient, geometric mean, recall, and precision evaluation metrics. Moreover, the sequence modeling neural network was employed to gain insight into the synergistic effect of both properties- and amino acid order-based predictions. The results suggest that the sequential properties scheme significantly (P < 0.01) surpasses the aforementioned state-of-the-art representation schemes. This makes it a strong candidate for increasing the predictive power of screening methods based on machine learning, applicable to any category of peptides.
Affiliation(s)
- Erik Otović
  - University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Marko Njirjak
  - University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
- Daniela Kalafatovic
  - University of Rijeka, Department of Biotechnology, 51000 Rijeka, Croatia
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
- Goran Mauša
  - University of Rijeka, Faculty of Engineering, 51000 Rijeka, Croatia
  - University of Rijeka, Center for Artificial Intelligence and Cybersecurity, 51000 Rijeka, Croatia
6
Mauša G, Galinac Grbac T. Co-evolutionary multi-population genetic programming for classification in software defect prediction: An empirical case study. Appl Soft Comput 2017. [DOI: 10.1016/j.asoc.2017.01.050] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 11/25/2022]