1
Molęda M, Małysiak-Mrozek B, Ding W, Sunderam V, Mrozek D. From Corrective to Predictive Maintenance-A Review of Maintenance Approaches for the Power Industry. Sensors (Basel) 2023; 23:5970. [PMID: 37447820 DOI: 10.3390/s23135970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2023] [Revised: 06/23/2023] [Accepted: 06/24/2023] [Indexed: 07/15/2023]
Abstract
Appropriate maintenance of industrial equipment keeps production systems in good health and ensures the stability of production processes. In specific production sectors, such as the electrical power industry, equipment failures are rare but may lead to high costs and substantial economic losses not only for the power plant but for consumers and the larger society. Therefore, the power production industry relies on a variety of approaches to maintenance tasks, ranging from traditional solutions and engineering know-how to smart, AI-based analytics to avoid potential downtimes. This review shows the evolution of maintenance approaches to support maintenance planning, equipment monitoring and supervision. We present older techniques traditionally used in maintenance tasks and those that rely on IT analytics to automate tasks and perform the inference process for failure detection. We analyze prognostics and health-management techniques in detail, including their requirements, advantages and limitations. The review focuses on the power-generation sector. However, some of the issues addressed are common to other industries. The article also presents concepts and solutions that utilize emerging technologies related to Industry 4.0, touching on prescriptive analysis, Big Data and the Internet of Things. The primary motivation and purpose of the article are to present the existing practices and classic methods used by engineers, as well as modern approaches drawing from Artificial Intelligence and the concept of Industry 4.0. The summary of existing practices and the state of the art in the area of predictive maintenance provides two benefits. On the one hand, it leads to improving processes by matching existing tools and methods. On the other hand, it shows researchers potential directions for further analysis and new developments.
Affiliation(s)
- Marek Molęda
- TAURON Wytwarzanie S.A., Promienna 51, 43-603 Jaworzno, Poland
- Bożena Małysiak-Mrozek
- Department of Distributed Systems and Informatic Devices, Silesian University of Technology, 44-100 Gliwice, Poland
- Weiping Ding
- School of Information Science and Technology, Nantong University, No. 9 Seyuan Road, Nantong 226019, China
- Vaidy Sunderam
- Department of Computer Science, Emory University, Atlanta, GA 30322, USA
- Dariusz Mrozek
- Department of Applied Informatics, Silesian University of Technology, 44-100 Gliwice, Poland
2
Hung CL, Lin KH, Lee YK, Mrozek D, Tsai YT, Lin CH. The Classification of Stages of Epiretinal Membrane using Convolutional Neural Network on Optical Coherence Tomography Image. Methods 2023; 214:28-34. [PMID: 37116670 DOI: 10.1016/j.ymeth.2023.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2022] [Revised: 03/18/2023] [Accepted: 04/22/2023] [Indexed: 04/30/2023] Open
Abstract
BACKGROUND AND OBJECTIVE: The gold standard for diagnosing epiretinal membranes is to observe the surface of the internal limiting membrane on optical coherence tomography images. The stage of the epiretinal membrane is used to assess the health of the membrane. The stages are difficult to distinguish because some of them look similar. Deep learning can be used to improve the classification accuracy.
METHODS: A combinatorial fusion of multiple convolutional neural network (CNN) algorithms is proposed to improve on the accuracy of a single image classification model. The proposed method was trained using a dataset of 1947 optical coherence tomography images diagnosed with the epiretinal membrane at the Taichung Veterans General Hospital in Taiwan. The images covered four stages: 1, 2, 3, and 4.
RESULTS: The overall accuracy of the classification was 84%. The combinations of five and of six CNN models achieved the highest testing accuracy (85%) among all combinations. Any combination with a different number of CNN models outperforms any single CNN algorithm working alone. Meanwhile, the accuracy of the proposed method is better than that of ophthalmologists with years of clinical experience.
CONCLUSIONS: We have developed an efficient epiretinal membrane classification method by using combinatorial fusion with CNN models on optical coherence tomography images. The proposed method can be used for screening purposes to help ophthalmologists make the correct diagnoses in general medical practice.
Affiliation(s)
- Che-Lun Hung
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taiwan R.O.C.; Computer Science and Communication Engineering, Providence University, Taiwan R.O.C.
- Keng-Hung Lin
- Department of Ophthalmology, Taichung Veterans General Hospital, Taiwan R.O.C.
- Yu-Kai Lee
- Department of Computer Science and Information Engineering, Providence University, Taiwan R.O.C.
- Dariusz Mrozek
- Department of Applied Informatics, Silesian University of Technology.
- Yin-Te Tsai
- Computer Science and Communication Engineering, Providence University, Taiwan R.O.C.
- Chun-Hsien Lin
- Department of Ophthalmology, Taichung Veterans General Hospital, Taiwan R.O.C.
3
Ding W, Perez JA, Cheung YM, Das S, Yue X, Mrozek D. Special issue on fuzzy systems for biomedical science in healthcare. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
4
Qi R, Sangaiah AK, Mrozek D, Zou Q. Editorial: Machine Learning Techniques on Gene Function Prediction Volume II. Front Genet 2022; 13:949285. [PMID: 35846151 PMCID: PMC9280618 DOI: 10.3389/fgene.2022.949285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 06/13/2022] [Indexed: 11/13/2022] Open
Affiliation(s)
- Ren Qi
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Dariusz Mrozek
- Institute of Informatics, Silesian University of Technology, Gliwice, Poland
- Quan Zou
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- *Correspondence: Quan Zou,
5
Grzesik P, Augustyn DR, Wyciślik Ł, Mrozek D. Serverless computing in omics data analysis and integration. Brief Bioinform 2021; 23:6367629. [PMID: 34505137 PMCID: PMC8499876 DOI: 10.1093/bib/bbab349] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/27/2021] [Revised: 06/28/2021] [Accepted: 08/06/2021] [Indexed: 11/30/2022] Open
Abstract
A comprehensive analysis of omics data can require vast computational resources and access to varied data sources that must be integrated into complex, multi-step analysis pipelines. Execution of many such analyses can be accelerated by applying the cloud computing paradigm, which provides scalable resources for storing data of different types and parallelizing data analysis computations. Moreover, these resources can be reused for different multi-omics analysis scenarios. Traditionally, developers are required to manage a cloud platform’s underlying infrastructure, configuration, maintenance and capacity planning. The serverless computing paradigm simplifies these operations by automatically allocating and maintaining both servers and virtual machines, as required for analysis tasks. This paradigm offers highly parallel execution and high scalability without manual management of the underlying infrastructure, freeing developers to focus on operational logic. This paper reviews serverless solutions in bioinformatics and evaluates their usage in omics data analysis and integration. We start by reviewing the application of the cloud computing model to multi-omics data analysis and exposing some shortcomings of the early approaches. We then introduce the serverless computing paradigm and show its applicability for performing an integrative analysis of multiple omics data sources in the context of the COVID-19 pandemic.
Affiliation(s)
- Piotr Grzesik
- Silesian University of Technology, Department of Applied Informatics, Gliwice 44-100, Poland
- Dariusz R Augustyn
- Silesian University of Technology, Department of Applied Informatics, Gliwice 44-100, Poland
- Łukasz Wyciślik
- Silesian University of Technology, Department of Applied Informatics, Gliwice 44-100, Poland
- Dariusz Mrozek
- Corresponding author: Dariusz Mrozek, Department of Applied Informatics, Silesian University of Technology, Gliwice 44-100, Poland. E-mail:
6
Mrozek D, Stępień K, Grzesik P, Małysiak-Mrozek B. A Large-Scale and Serverless Computational Approach for Improving Quality of NGS Data Supporting Big Multi-Omics Data Analyses. Front Genet 2021; 12:699280. [PMID: 34326863 PMCID: PMC8314304 DOI: 10.3389/fgene.2021.699280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2021] [Accepted: 05/28/2021] [Indexed: 11/13/2022] Open
Abstract
Various types of analyses performed over multi-omics data are driven today by next-generation sequencing (NGS) techniques that produce large volumes of DNA/RNA sequences. Although many tools allow for parallel processing of NGS data in a Big Data distributed environment, they do not make it easy to improve the quality of NGS data at large scale in a simple, declarative manner. Meanwhile, large sequencing projects and routine DNA/RNA sequencing associated with molecular profiling of diseases for personalized treatment require both good-quality data and appropriate infrastructure for efficient storage and processing of the data. To solve these problems, we adapt the concept of a Data Lake for storing and processing big NGS data. We also propose a dedicated library that allows cleaning the DNA/RNA sequences obtained with single-read and paired-end sequencing techniques. To accommodate the growth of NGS data, our solution is highly scalable on the Cloud and can rapidly and flexibly adjust to the amount of data to be processed. Moreover, to simplify the use of the data cleaning methods and the implementation of other phases of data analysis workflows, our library extends the declarative U-SQL query language, providing a set of capabilities for data extraction, processing, and storage. The results of our experiments show that the whole solution supports the requirements for ample storage and highly parallel, scalable processing that accompany NGS-based multi-omics data analyses.
Affiliation(s)
- Dariusz Mrozek
- Department of Applied Informatics, Silesian University of Technology, Gliwice, Poland
- Krzysztof Stępień
- Department of Applied Informatics, Silesian University of Technology, Gliwice, Poland
- Piotr Grzesik
- Department of Applied Informatics, Silesian University of Technology, Gliwice, Poland
- Bożena Małysiak-Mrozek
- Department of Graphics, Computer Vision and Digital Systems, Silesian University of Technology, Gliwice, Poland
7
Augustyn DR, Wyciślik Ł, Mrozek D. Perspectives of using Cloud computing in integrative analysis of multi-omics data. Brief Funct Genomics 2021; 20:198-206. [PMID: 33676373 DOI: 10.1093/bfgp/elab007] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/02/2020] [Revised: 01/25/2021] [Accepted: 01/26/2021] [Indexed: 12/11/2022] Open
Abstract
Integrative analysis of multi-omics data is usually computationally demanding. It frequently requires building complex, multi-step analysis pipelines, applying dedicated techniques for data processing and combining several data sources. These efforts lead to a better understanding of life processes, current health state or the effects of therapeutic activities. However, many omics data analysis solutions focus only on a selected problem, disease, type of data or organism. Moreover, they are implemented for general-purpose scientific computational platforms that most often do not easily scale the calculations natively. These features are not conducive to advances in understanding genotype-phenotype relationships. Fortunately, with new technological paradigms, including Cloud computing, virtualization and containerization, these functionalities could be orchestrated for easy scaling and building independent analysis pipelines for omics data. Therefore, solutions can be re-used for purposes for which they were not primarily designed. This paper shows perspectives of using Cloud computing advances and the containerization approach for such a purpose. We first review how the Cloud computing model is utilized in multi-omics data analysis and show weak points of the adopted solutions. Then, we introduce containerization concepts, which allow both scaling and linking of functional services designed for various purposes. Finally, using the Bioconductor software package as an example, we present a verified concept model of a universal solution that shows potential for performing integrative analysis of multiple omics data sources.
Affiliation(s)
- Dariusz R Augustyn
- Silesian University of Technology, Department of Applied Informatics, Gliwice 44-100, Poland
- Łukasz Wyciślik
- Silesian University of Technology, Department of Applied Informatics, Gliwice 44-100, Poland
- Dariusz Mrozek
- Silesian University of Technology, Department of Applied Informatics, Gliwice 44-100, Poland
8
Mrozek D, Koczur A, Małysiak-Mrozek B. Fall detection in older adults with mobile IoT devices and machine learning in the cloud and on the edge. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.05.070] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 10/24/2022]
9
Mrozek D. A review of Cloud computing technologies for comprehensive microRNA analyses. Comput Biol Chem 2020; 88:107365. [PMID: 32906056 DOI: 10.1016/j.compbiolchem.2020.107365] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/09/2020] [Revised: 08/05/2020] [Accepted: 08/18/2020] [Indexed: 01/08/2023]
Abstract
Cloud computing has revolutionized many fields that require ample computational power. Cloud platforms may also provide substantial support for microRNA analysis, mainly by exposing scalable resources of different types. In Clouds, these resources are available as services, which simplifies their allocation and release. This feature is especially useful during the analysis of large volumes of data, such as those produced by next-generation sequencing experiments, which require not only extended storage space but also a distributed computing environment. In this paper, we show which of the Cloud properties and service models can be especially beneficial for microRNA analysis. We also explain the most useful services of the Cloud (including storage space, computational power, web application hosting, machine learning models, and Big Data frameworks) that can be used for microRNA analysis. At the same time, we review several solutions for microRNA analysis and show that the utilization of the Cloud in this field is still weak, but can increase in the future as awareness of its applicability grows.
Affiliation(s)
- Dariusz Mrozek
- Department of Applied Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland.
10
Mrozek D, Kwiendacz J, Malysiak-Mrozek B. Protein Construction-Based Data Partitioning Scheme for Alignment of Protein Macromolecular Structures Through Distributed Querying in Federated Databases. IEEE Trans Nanobioscience 2020; 19:102-116. [DOI: 10.1109/tnb.2019.2930494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/08/2022]
11
Affiliation(s)
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Dariusz Mrozek
- Institute of Informatics, Silesian University of Technology, Gliwice, Poland
12
Affiliation(s)
- Bozena Malysiak-Mrozek
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Kamil Zur
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Dariusz Mrozek
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
13
Abstract
Summary: Popular methods for 3D protein structure similarity searching, especially those that generate high-quality alignments such as Combinatorial Extension (CE) and Flexible structure Alignment by Chaining Aligned fragment pairs allowing Twists (FATCAT), are still time-consuming. As a consequence, performing similarity searching against large repositories of structural data requires increased computational resources that are not always available. Cloud computing provides huge amounts of computational power that can be provisioned on a pay-as-you-go basis. We have developed a cloud-based system that allows scaling of the similarity searching process vertically and horizontally. Cloud4Psi (Cloud for Protein Similarity) was tested in the Microsoft Azure cloud environment and provided good, almost linearly proportional acceleration when scaled out onto many computational units.
Availability and implementation: Cloud4Psi is available as Software as a Service for testing purposes at: http://cloud4psi.cloudapp.net/. For source code and software availability, please visit the Cloud4Psi project home page at http://zti.polsl.pl/dmrozek/science/cloud4psi.htm.
Contact: dariusz.mrozek@polsl.pl
Affiliation(s)
- Dariusz Mrozek
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Bożena Małysiak-Mrozek
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Artur Kłapciński
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
14
Mrozek D, Brożek M, Małysiak-Mrozek B. Parallel implementation of 3D protein structure similarity searches using a GPU and the CUDA. J Mol Model 2014; 20:2067. [PMID: 24481593 PMCID: PMC3936136 DOI: 10.1007/s00894-014-2067-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 08/07/2013] [Accepted: 10/11/2013] [Indexed: 01/16/2023]
Abstract
Searching for similar 3D protein structures is one of the primary processes employed in the field of structural bioinformatics. However, the computational complexity of this process means that it is constantly necessary to search for new methods that can perform such a process faster and more efficiently. Finding molecular substructures that complex protein structures have in common is still a challenging task, especially when entire databases containing tens or even hundreds of thousands of protein structures must be scanned. Graphics processing units (GPUs) and general purpose graphics processing units (GPGPUs) can perform many time-consuming and computationally demanding processes much more quickly than a classical CPU can. In this paper, we describe the GPU-based implementation of the CASSERT algorithm for 3D protein structure similarity searching. This algorithm is based on the two-phase alignment of protein structures when matching fragments of the compared proteins. The GPU (GeForce GTX 560Ti: 384 cores, 2GB RAM) implementation of CASSERT (“GPU-CASSERT”) parallelizes both alignment phases and yields an average 180-fold increase in speed over its CPU-based, single-core implementation on an Intel Xeon E5620 (2.40GHz, 4 cores). In this paper, we show that massive parallelization of the 3D structure similarity search process on many-core GPU devices can reduce the execution time of the process, allowing it to be performed in real time. GPU-CASSERT is available at: http://zti.polsl.pl/dmrozek/science/gpucassert/cassert.htm.
Affiliation(s)
- Dariusz Mrozek
- Institute of Informatics, Silesian University of Technology, Gliwice, Poland
15
Mrozek D, Małysiak-Mrozek B, Siążnik A. search GenBank: interactive orchestration and ad-hoc choreography of Web services in the exploration of the biomedical resources of the National Center For Biotechnology Information. BMC Bioinformatics 2013; 14:73. [PMID: 23452691 PMCID: PMC3602006 DOI: 10.1186/1471-2105-14-73] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 08/01/2012] [Accepted: 02/22/2013] [Indexed: 11/27/2022] Open
Abstract
Background: Due to the growing number of biomedical entries in data repositories of the National Center for Biotechnology Information (NCBI), it is difficult to collect, manage and process all of these entries in one place by third-party software developers without significant investment in hardware and software infrastructure, its maintenance and administration. Web services allow development of software applications that integrate in one place the functionality and processing logic of distributed software components, without integrating the components themselves and without integrating the resources to which they have access. This is achieved by appropriate orchestration or choreography of available Web services and their shared functions. After the successful application of Web services in the business sector, this technology can now be used to build composite software tools that are oriented towards biomedical data processing.
Results: We have developed a new tool for efficient and dynamic data exploration in GenBank and other NCBI databases. A dedicated search GenBank system makes use of NCBI Web services and a package of Entrez Programming Utilities (eUtils) in order to provide extended searching capabilities in NCBI data repositories. In search GenBank users can use one of the three exploration paths: simple data searching based on the specified user’s query, advanced data searching based on the specified user’s query, and advanced data exploration with the use of macros. search GenBank orchestrates calls of particular tools available through the NCBI Web service providing requested functionality, while users interactively browse selected records in search GenBank and traverse between NCBI databases using available links. On the other hand, by building macros in the advanced data exploration mode, users create choreographies of eUtils calls, which can lead to the automatic discovery of related data in the specified databases.
Conclusions: search GenBank extends standard capabilities of the NCBI Entrez search engine in querying biomedical databases. The possibility of creating and saving macros in the search GenBank is a unique feature and has a great potential. The potential will further grow in the future with the increasing density of networks of relationships between data stored in particular databases. search GenBank is available for public use at http://sgb.biotools.pl/.
Affiliation(s)
- Dariusz Mrozek
- Institute of Informatics, Silesian University of Technology, Akademicka 16, Gliwice, 44-100, Poland.
16
17
Mrozek D, Malysiak-Mrozek B. An Improved Method for Protein Similarity Searching by Alignment of Fuzzy Energy Signatures. INT J COMPUT INT SYS 2011. [DOI: 10.2991/ijcis.2011.4.1.7] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 10/31/2022] Open
18
Mrozek D, Wieczorek D, Malysiak-Mrozek B, Kozielski S. PSS-SQL: protein secondary structure - structured query language. Annu Int Conf IEEE Eng Med Biol Soc 2010; 2010:1073-6. [PMID: 21096554 DOI: 10.1109/iembs.2010.5627303] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/07/2022]
Abstract
The secondary structure representation of proteins provides important information about a protein's general construction and shape. This representation is often used in protein similarity searching. Since existing commercial database management systems do not offer integrated exploration methods for biological data, e.g., at the level of the SQL language, structural similarity searching is usually performed by external tools. In this paper, we present our newly developed PSS-SQL language, which allows searching a database to identify proteins whose secondary structure is similar to the structure specified by the user in a PSS-SQL query. We thus provide a simple and declarative language for protein structure similarity searching.
Affiliation(s)
- Dariusz Mrozek
- Institute of Informatics, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland.