1
|
Romero O, Abelló A. A framework for multidimensional design of data warehouses from ontologies. DATA KNOWL ENG 2010. [DOI: 10.1016/j.datak.2010.07.007] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.3] [Indexed: 11/30/2022]
|
|
15 |
64 |
2
|
Romero O, Abelló A. Automatic validation of requirements to support multidimensional design. DATA KNOWL ENG 2010. [DOI: 10.1016/j.datak.2010.03.006] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Indexed: 11/26/2022]
|
|
15 |
36 |
3
|
Nadal S, Romero O, Abelló A, Vassiliadis P, Vansummeren S. An integration-oriented ontology to govern evolution in Big Data ecosystems. INFORM SYST 2019. [DOI: 10.1016/j.is.2018.01.006] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 11/24/2022]
|
|
6 |
27 |
4
|
Gallinucci E, Golfarelli M, Rizzi S, Abelló A, Romero O. Interactive multidimensional modeling of linked data for exploratory OLAP. INFORM SYST 2018. [DOI: 10.1016/j.is.2018.06.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
|
|
7 |
14 |
5
|
Theodorou V, Abelló A, Thiele M, Lehner W. Frequent patterns in ETL workflows: An empirical approach. DATA KNOWL ENG 2017. [DOI: 10.1016/j.datak.2017.08.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 11/27/2022]
|
|
8 |
12 |
6
|
Romero O, Herrero V, Abelló A, Ferrarons J. Tuning small analytics on Big Data: Data partitioning and secondary indexes in the Hadoop ecosystem. INFORM SYST 2015. [DOI: 10.1016/j.is.2014.09.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Indexed: 10/24/2022]
|
|
10 |
11 |
7
|
Maturana CR, de Oliveira AD, Nadal S, Bilalli B, Serrat FZ, Soley ME, Igual ES, Bosch M, Lluch AV, Abelló A, López-Codina D, Suñé TP, Clols ES, Joseph-Munné J. Advances and challenges in automated malaria diagnosis using digital microscopy imaging with artificial intelligence tools: A review. Front Microbiol 2022; 13:1006659. [PMID: 36458185 PMCID: PMC9705958 DOI: 10.3389/fmicb.2022.1006659] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/29/2022] [Accepted: 09/26/2022] [Indexed: 09/03/2023] Open
Abstract
Malaria is an infectious disease caused by parasites of the genus Plasmodium spp. It is transmitted to humans by the bite of an infected female Anopheles mosquito. It is the most common disease in resource-poor settings, with 241 million malaria cases reported in 2020 according to the World Health Organization. Optical microscopy examination of blood smears is the gold standard technique for malaria diagnosis; however, it is a time-consuming method and a well-trained microscopist is needed to perform the microbiological diagnosis. New techniques based on digital imaging analysis by deep learning and artificial intelligence methods are a challenging alternative tool for the diagnosis of infectious diseases. In particular, systems based on Convolutional Neural Networks for image detection of the malaria parasites emulate the microscopy visualization of an expert. Microscope automation provides a fast and low-cost diagnosis, requiring less supervision. Smartphones are a suitable option for microscopic diagnosis, allowing image capture and software identification of parasites. In addition, image analysis techniques could be a fast and optimal solution for the diagnosis of malaria, tuberculosis, or Neglected Tropical Diseases in endemic areas with low resources. The implementation of automated diagnosis by using smartphone applications and new digital imaging technologies in low-income areas is a challenge to achieve. Moreover, automating the movement of the microscope slide and image autofocusing of the samples by hardware implementation would systemize the procedure. These new diagnostic tools would join the global effort to fight against pandemic malaria and other infectious and poverty-related diseases.
|
Review |
3 |
10 |
8
|
Jovanovic P, Romero O, Simitsis A, Abelló A, Mayorova D. A requirement-driven approach to the design and evolution of data warehouses. INFORM SYST 2014. [DOI: 10.1016/j.is.2014.01.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 10/25/2022]
|
|
11 |
10 |
9
|
Wrembel R, Abelló A, Song IY. DOLAP data warehouse research over two decades: Trends and challenges. INFORM SYST 2019. [DOI: 10.1016/j.is.2019.06.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/25/2022]
|
|
6 |
7 |
10
|
Bilalli B, Abelló A, Aluja-Banet T, Wrembel R. PRESISTANT: Learning based assistant for data pre-processing. DATA KNOWL ENG 2019. [DOI: 10.1016/j.datak.2019.101727] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 10/26/2022]
|
|
6 |
5 |
11
|
Maturana CR, de Oliveira AD, Nadal S, Serrat FZ, Sulleiro E, Ruiz E, Bilalli B, Veiga A, Espasa M, Abelló A, Suñé TP, Segú M, López-Codina D, Clols ES, Joseph-Munné J. iMAGING: a novel automated system for malaria diagnosis by using artificial intelligence tools and a universal low-cost robotized microscope. Front Microbiol 2023; 14:1240936. [PMID: 38075929 PMCID: PMC10704928 DOI: 10.3389/fmicb.2023.1240936] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/16/2023] [Accepted: 11/06/2023] [Indexed: 01/03/2025] Open
Abstract
INTRODUCTION Malaria is one of the most prevalent infectious diseases in sub-Saharan Africa, with 247 million cases reported worldwide in 2021 according to the World Health Organization. Optical microscopy remains the gold standard technique for malaria diagnosis; however, it requires expertise, is time-consuming, and is difficult to reproduce. Therefore, new diagnostic techniques based on digital image analysis using artificial intelligence tools can improve diagnosis and help automate it. METHODS In this study, a dataset of 2571 labeled thick blood smear images was created. YOLOv5x, Faster R-CNN, SSD, and RetinaNet object detection neural networks were trained on the same dataset to evaluate their performance in Plasmodium parasite detection. Attention modules were applied and compared with YOLOv5x results. To automate the entire diagnostic process, a prototype of 3D-printed pieces was designed for the robotization of conventional optical microscopy, capable of auto-focusing the sample and tracking the entire slide. RESULTS Comparative analysis yielded a performance for YOLOv5x on a test set of 92.10% precision, 93.50% recall, 92.79% F-score, and 94.40% mAP0.5 for overall detection of leukocytes and early and mature Plasmodium trophozoites. F-score values for each category were 99.0% for leukocytes, 88.6% for early trophozoites, and 87.3% for mature trophozoites. Attention module performance showed no statistically significant differences when compared with the original trained YOLOv5x model. The predictive models were integrated into a smartphone-computer application for the purpose of image-based diagnostics in the laboratory. The system can perform a fully automated diagnosis by means of the auto-focus and X-Y movements of the robotized microscope, the CNN models trained for digital image analysis, and the smartphone device. The new prototype would determine whether a Giemsa-stained thick blood smear sample is positive or negative for Plasmodium infection, as well as its parasite levels. The whole system was integrated into the iMAGING smartphone application. CONCLUSION The combination of the fully automated system, via auto-focus and slide movements, with the autonomous detection of Plasmodium parasites in digital images by smartphone software and AI algorithms gives the prototype the features needed to join the global effort against malaria, neglected tropical diseases, and other infectious diseases.
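The overall F-score quoted in the abstract above can be checked against the reported precision and recall, since the F-score is the harmonic mean of the two. The following is a minimal sketch for illustration only; the function name `f_score` is an assumption and is not taken from the iMAGING software.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Overall YOLOv5x precision and recall from the abstract, as fractions.
overall = f_score(0.9210, 0.9350)
print(f"{overall:.2%}")  # prints 92.79%
```

The result agrees with the 92.79% F-score reported for the test set.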
|
research-article |
2 |
3 |
12
|
Bilalli B, Munir RF, Abelló A. A framework for assessing the peer review duration of journals: case study in computer science. Scientometrics 2020. [DOI: 10.1007/s11192-020-03742-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/24/2022]
|
|
5 |
3 |
13
|
Abstract
Data lakes (DLs) are large repositories of raw datasets from disparate sources. As more datasets are ingested into a DL, there is an increasing need for efficient techniques to profile them and to detect the relationships among their schemata, commonly known as holistic schema matching. Schema matching detects similarity between the information stored in the datasets to support information discovery and retrieval. Currently, this is computationally expensive with the volume of state-of-the-art DLs. To handle this challenge, we propose a novel early-pruning approach to improve efficiency, where we collect different types of content metadata and schema metadata about the datasets, and then use this metadata in early-pruning steps to pre-filter the schema matching comparisons. This involves computing proximities between datasets based on their metadata, discovering their relationships based on overall proximities and proposing similar dataset pairs for schema matching. We improve the effectiveness of this task by introducing a supervised mining approach for effectively detecting similar datasets that are proposed for further schema matching. We conduct extensive experiments on a real-world DL that proves the success of our approach in effectively detecting similar datasets for schema matching, with recall rates of more than 85% and efficiency improvements above 70%. We empirically show the computational cost saving in space and time by applying our approach in comparison to instance-based schema matching techniques.
|
|
5 |
1 |
14
|
Hewasinghage M, Nadal S, Abelló A, Zimányi E. Automated database design for document stores with multicriteria optimization. Knowl Inf Syst 2023. [DOI: 10.1007/s10115-023-01828-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
Document stores have gained popularity among NoSQL systems mainly due to the semi-structured data storage structure and the enhanced query capabilities. The database design in document stores expands beyond the first normal form by encouraging de-normalization through nesting. This hinders the process, as the number of alternatives grows exponentially with multiple choices in nesting (including different levels) and referencing (including the direction of the reference). Due to this complexity, document store data design is mostly carried out in trial-and-error or ad-hoc rule-based approaches. However, the choices affect multiple, often conflicting, aspects such as query performance, storage space, and complexity of the documents. To overcome these issues, in this paper, we apply multicriteria optimization. Our approach is driven by a query workload and a set of optimization objectives. First, we formalize a canonical model to represent alternative designs and introduce an algebra of transformations that can systematically modify a design. Then, using these transformations, we implement a local search algorithm driven by a loss function that can propose near-optimal designs with high probability. Finally, we compare our prototype against an existing document store data design solution purely driven by query cost, where our proposed designs have better performance and are more compact with less redundancy.
|
|
2 |
|
15
|
Abelló A, Benatallah B, Bellatreche L. Special Issue on: Model and Data Engineering. JOURNAL ON DATA SEMANTICS 2014. [DOI: 10.1007/s13740-013-0033-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
|
|
11 |
|
16
|
Dantas de Oliveira A, Rubio Maturana C, Zarzuela Serrat F, Carvalho BM, Sulleiro E, Prats C, Veiga A, Bosch M, Zulueta J, Abelló A, Sayrol E, Joseph-Munné J, López-Codina D. Development of a low-cost robotized 3D-prototype for automated optical microscopy diagnosis: An open-source system. PLoS One 2024; 19:e0304085. [PMID: 38905190 PMCID: PMC11192333 DOI: 10.1371/journal.pone.0304085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Accepted: 05/07/2024] [Indexed: 06/23/2024] Open
Abstract
In a clinical context, conventional optical microscopy is commonly used for the visualization of biological samples for diagnosis. However, the availability of molecular techniques and rapid diagnostic tests is reducing the use of conventional microscopy, and consequently the number of experienced professionals is starting to decrease. Moreover, continuous visualization through an optical microscope over long periods of time could affect the final diagnostic results due to human error and fatigue. Therefore, microscopy automation is a challenge that must be met to address this problem. The aim of the study is to develop a low-cost automated system for the visualization of microbiological/parasitological samples by using a conventional optical microscope, specially designed for implementation in laboratories in resource-poor settings. A 3D-prototype to automate the majority of conventional optical microscopes was designed. Pieces were built with 3D-printing technology and biodegradable polylactic acid material, using Tinkercad and Ultimaker Cura 5.1 slicing software. The system's components were divided into three subgroups: microscope stage pieces, storage/autofocus pieces, and smartphone pieces. The prototype is based on servo motors, controlled by the Arduino open-source electronic platform, to emulate the X-Y and auto-focus (Z) movements of the microscope. An average time of 27.00 ± 2.58 seconds is required to auto-focus a single FoV. Auto-focus evaluation demonstrates a mean average maximum Laplacian value of 11.83 with tested images. The whole automation process is controlled by a smartphone device, which is responsible for acquiring images for further diagnosis via convolutional neural networks. The prototype is specially designed for resource-poor settings, where microscopy diagnosis is still a routine process. The combination of convolutional neural network predictive models and the automation of the movements of a conventional optical microscope gives the system a wide range of image-based diagnosis applications. The accessibility of the system could help improve diagnostics and provide new tools to laboratories worldwide.
|
research-article |
1 |
|
17
|
Munir RF, Abelló A, Romero O, Thiele M, Lehner W. Configuring Parallelism for Hybrid Layouts Using Multi-Objective Optimization. BIG DATA 2020; 8:235-247. [PMID: 32397735 DOI: 10.1089/big.2019.0068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Modern organizations typically store their data in a raw format in data lakes. These data are then processed and usually stored under hybrid layouts, because they allow projection and selection operations. Thus, they allow (when required) to read less data from the disk. However, this is not very well exploited by distributed processing frameworks (e.g., Hadoop, Spark) when analytical queries are posed. These frameworks divide the data into multiple partitions and then process each partition in a separate task, consequently creating tasks based on the total file size and not the actual size of the data to be read. This typically leads to launching more tasks than needed, which, in turn, increases the query execution time and induces significant waste of computing resources. To allow a more efficient use of resources and reduce the query execution time, we propose a method that decides the number of tasks based on the data being read. To this end, we first propose a cost-based model for estimating the size of data read in hybrid layouts. Next, we use the estimated reading size in a multi-objective optimization method to decide the number of tasks and computational resources to be used. We prototyped our solution for Apache Parquet and Spark and found that our estimations are highly correlated (0.96) with the real executions. Further, using TPC-H we show that our recommended configurations are only 5.6% away from the Pareto front and provide 2.1 × speedup compared with default solutions.
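The Pareto front mentioned in the abstract above is the set of configurations that no other configuration beats on every objective at once. The sketch below illustrates that general idea only and is not the authors' method; the two objectives (execution time and cores used, both minimized) and all sample values are hypothetical.

```python
def pareto_front(points: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Return the points that no other point dominates (both objectives minimized)."""
    front = []
    for p in points:
        # q dominates p when q is no worse on both objectives and differs from p.
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    return front


# Hypothetical (execution_time_seconds, cores_used) pairs, invented for illustration.
configs = [(120, 4), (90, 8), (60, 16), (95, 12), (60, 20), (150, 4)]
front = pareto_front(configs)
print(front)  # prints [(120, 4), (90, 8), (60, 16)]
```

A recommended configuration can then be judged by how far it lies from the nearest point on this front.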
|
|
5 |
|
18
|
Abelló A, Rull M, Grau S. [Fluid therapy in anesthesia, a discussion. own experience]. REVISTA ESPANOLA DE ANESTESIOLOGIA Y REANIMACION 1975; 22:36-54. [PMID: 1144893] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
|
|
50 |
|
19
|
Rubio Maturana C, de Oliveira AD, Zarzuela F, Mediavilla A, Martínez-Vallejo P, Silgado A, Goterris L, Muixí M, Abelló A, Veiga A, López-Codina D, Sulleiro E, Sayrol E, Joseph-Munné J. Evaluation of an Artificial Intelligence-Based Tool and a Universal Low-Cost Robotized Microscope for the Automated Diagnosis of Malaria. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2024; 22:47. [PMID: 39857500 PMCID: PMC11764607 DOI: 10.3390/ijerph22010047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Revised: 12/13/2024] [Accepted: 12/21/2024] [Indexed: 01/27/2025]
Abstract
The gold standard diagnosis for malaria is the microscopic visualization of blood smears to identify Plasmodium parasites, although it is an expert-dependent technique and could trigger diagnostic errors. Artificial intelligence (AI) tools based on digital image analysis were postulated as a suitable supportive alternative for automated malaria diagnosis. A diagnostic evaluation of the iMAGING AI-based system was conducted in the reference laboratory of the International Health Unit Drassanes-Vall d'Hebron in Barcelona, Spain. iMAGING is an automated device for the diagnosis of malaria by using artificial intelligence image analysis tools and a robotized microscope. A total of 54 Giemsa-stained thick blood smear samples from travelers and migrants coming from endemic areas were employed and analyzed to determine the presence/absence of Plasmodium parasites. AI diagnostic results were compared with expert light microscopy gold standard method results. The AI system shows 81.25% sensitivity and 92.11% specificity when compared with the conventional light microscopy gold standard method. Overall, 48/54 (88.89%) samples were correctly identified [13/16 (81.25%) as positives and 35/38 (92.11%) as negatives]. The mean time of the AI system to determine a positive malaria diagnosis was 3 min and 48 s, with an average of 7.38 FoV analyzed per sample. Statistical analyses showed the Kappa Index = 0.721, demonstrating a satisfactory correlation between the gold standard diagnostic method and iMAGING results. The AI system demonstrated reliable results for malaria diagnosis in a reference laboratory in Barcelona. Validation in malaria-endemic regions will be the next step to evaluate its potential in resource-poor settings.
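The sensitivity, specificity, and overall agreement quoted in the abstract above follow directly from the reported counts (13 of 16 positives and 35 of 38 negatives correctly identified). The sketch below reproduces that arithmetic for illustration; the function name `diagnostic_metrics` is an assumption, and the false-negative and false-positive counts of 3 each are derived from those fractions rather than stated separately in the abstract.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Basic diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all actual positives
        "specificity": tn / (tn + fp),  # true negatives among all actual negatives
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Counts derived from the abstract: 13/16 positives, 35/38 negatives.
metrics = diagnostic_metrics(tp=13, fn=3, tn=35, fp=3)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
# prints:
# sensitivity: 81.25%
# specificity: 92.11%
# accuracy: 88.89%
```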
|
Evaluation Study |
1 |
|
20
|
Rubio Maturana C, Dantas de Oliveira A, Zarzuela F, Ruiz E, Sulleiro E, Mediavilla A, Martínez-Vallejo P, Nadal S, Pumarola T, López-Codina D, Abelló A, Sayrol E, Joseph-Munné J. Development of an automated artificial intelligence-based system for urogenital schistosomiasis diagnosis using digital image analysis techniques and a robotized microscope. PLoS Negl Trop Dis 2024; 18:e0012614. [PMID: 39499735 PMCID: PMC11567526 DOI: 10.1371/journal.pntd.0012614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 11/15/2024] [Accepted: 10/08/2024] [Indexed: 11/07/2024] Open
Abstract
BACKGROUND Urogenital schistosomiasis is considered a Neglected Tropical Disease (NTD) by the World Health Organization (WHO). It is estimated to affect 150 million people worldwide, with a high relevance in resource-poor settings of the African continent. The gold-standard diagnosis is still direct observation of Schistosoma haematobium eggs in urine samples by optical microscopy. Novel diagnostic techniques based on digital image analysis by Artificial Intelligence (AI) tools are a suitable alternative for schistosomiasis diagnosis. METHODOLOGY Digital images of 24 urine sediment samples were acquired in non-endemic settings. S. haematobium eggs were manually labeled in digital images by laboratory professionals and used for training YOLOv5 and YOLOv8 models, which would achieve automatic detection and localization of the eggs. Urine sediment images were also employed to perform binary classification of images to detect erythrocytes/leukocytes with the MobileNetv3Large, EfficientNetv2, and NasNetLarge models. A robotized microscope system was employed to automatically move the slide through the X-Y axis and to auto-focus the sample. RESULTS A total number of 1189 labels were annotated in 1017 digital images from urine sediment samples. YOLOv5x training demonstrated a 99.3% precision, 99.4% recall, 99.3% F-score, and 99.4% mAP0.5 for S. haematobium detection. NasNetLarge has an 85.6% accuracy for erythrocyte/leukocyte detection with the test dataset. Convolutional neural network training and comparison demonstrated that YOLOv5x for the detection of eggs and NasNetLarge for the binary image classification to detect erythrocytes/leukocytes were the best options for our digital image database. CONCLUSIONS The development of low-cost novel diagnostic techniques based on the detection and identification of S. haematobium eggs in urine by AI tools would be a suitable alternative to conventional microscopy in non-endemic settings. This technical proof-of-principle study allows laying the basis for improving the system, and optimizing its implementation in the laboratories.
|
research-article |
1 |
|