1
Asudani DS, Nagwani NK, Singh P. Impact of word embedding models on text analytics in deep learning environment: a review. Artif Intell Rev 2023; 56:1-81. [PMID: 36844886 PMCID: PMC9944441 DOI: 10.1007/s10462-023-10419-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/01/2023] [Indexed: 02/25/2023]
Abstract
The selection of word embedding and deep learning models for better outcomes is vital. Word embeddings are n-dimensional distributed representations of text that attempt to capture the meanings of words. Deep learning models utilize multiple computing layers to learn hierarchical representations of data. The word embedding technique represented by deep learning has received much attention. It is used in various natural language processing (NLP) applications, such as text classification, sentiment analysis, named entity recognition, and topic modeling. This paper reviews the representative methods of the most prominent word embedding and deep learning models. It presents an overview of recent research trends in NLP and a detailed understanding of how to use these models to achieve efficient results on text analytics tasks. The review summarizes, contrasts, and compares numerous word embedding and deep learning models and includes a list of prominent datasets, tools, APIs, and popular publications. A reference for selecting a suitable word embedding and deep learning approach is presented based on a comparative analysis of different techniques to perform text analytics tasks. This paper can serve as a quick reference for learning the basics, benefits, and challenges of various word representation approaches and deep learning models, with their application to text analytics and a future outlook on research. It can be concluded from the findings of this study that domain-specific word embedding and the long short-term memory model can be employed to improve overall text analytics task performance.
Affiliation(s)
- Deepak Suresh Asudani
- Department of Computer Science and Engineering, National Institute of Technology, Raipur, Chhattisgarh, India
- Naresh Kumar Nagwani
- Department of Computer Science and Engineering, National Institute of Technology, Raipur, Chhattisgarh, India
- Pradeep Singh
- Department of Computer Science and Engineering, National Institute of Technology, Raipur, Chhattisgarh, India
2
Dimauro G, Griseta ME, Camporeale MG, Clemente F, Guarini A, Maglietta R. An intelligent non-invasive system for automated diagnosis of anemia exploiting a novel dataset. Artif Intell Med 2023; 136:102477. [PMID: 36710064 DOI: 10.1016/j.artmed.2022.102477] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/25/2022] [Revised: 12/19/2022] [Accepted: 12/19/2022] [Indexed: 12/27/2022]
Abstract
Anemia is a condition in which the oxygen-carrying capacity of red blood cells is insufficient to meet the body's physiological needs. It affects billions of people worldwide. An early diagnosis of this disease could prevent the advancement of other disorders. Traditional methods used to detect anemia consist of venipuncture, which requires a patient to frequently undergo laboratory tests. Therefore, anemia diagnosis using noninvasive and cost-effective methods is an open challenge. The pallor of the fingertips, palms, nail beds, and eye conjunctiva can be observed to establish whether a patient suffers from anemia. This article addresses the above challenges by presenting a novel intelligent system, based on machine learning, that supports the automated diagnosis of anemia. This system is innovative from different points of view. Specifically, it has been trained on a dataset that contains eye conjunctiva photos of Indian and Italian patients. This dataset, which was created using a very strict experimental set, is now made available to the Scientific Community. Moreover, compared to previous systems in the literature, the proposed system uses a low-cost device, which makes it suitable for widespread use. The performance of the learning algorithms utilizing two different areas of the mucous membrane of the eye is discussed. In particular, the RUSBoost algorithm, when appropriately trained on palpebral conjunctiva images, shows good performance in classifying anemic and nonanemic patients. The results are very robust, even when considering different ethnicities.
Affiliation(s)
- Giovanni Dimauro
- Department of Computer Science, University of Bari 'Aldo Moro', Bari, Italy.
- Maria Elena Griseta
- Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, National Research Council of Italy, Bari, Italy.
- Felice Clemente
- Haematology Dept. of National Cancer Institute 'Giovanni Paolo II', Bari, Italy.
- Attilio Guarini
- Haematology Dept. of National Cancer Institute 'Giovanni Paolo II', Bari, Italy.
- Rosalia Maglietta
- Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, National Research Council of Italy, Bari, Italy.
3
Alharbi F, Vakanski A. Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review. Bioengineering (Basel) 2023; 10:bioengineering10020173. [PMID: 36829667 PMCID: PMC9952758 DOI: 10.3390/bioengineering10020173] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 12/22/2022] [Revised: 01/24/2023] [Accepted: 01/26/2023] [Indexed: 01/31/2023]
Abstract
Cancer is a term that denotes a group of diseases caused by the abnormal growth of cells that can spread in different parts of the body. According to the World Health Organization (WHO), cancer is the second major cause of death after cardiovascular diseases. Gene expression can play a fundamental role in the early detection of cancer, as it is indicative of the biochemical processes in tissue and cells, as well as the genetic characteristics of an organism. Deoxyribonucleic acid (DNA) microarrays and ribonucleic acid (RNA)-sequencing methods for gene expression data allow quantifying the expression levels of genes and produce valuable data for computational analysis. This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods. Both conventional and deep learning-based approaches are reviewed, with an emphasis on the application of deep learning models due to their comparative advantages for identifying gene patterns that are distinctive for various types of cancers. Relevant works that employ the most commonly used deep neural network architectures are covered, including multi-layer perceptrons, as well as convolutional, recurrent, graph, and transformer networks. This survey also presents an overview of the data collection methods for gene expression analysis and lists important datasets that are commonly used for supervised machine learning for this task. Furthermore, we review pertinent techniques for feature engineering and data preprocessing that are typically used to handle the high dimensionality of gene expression data, caused by a large number of genes present in data samples. The paper concludes with a discussion of future research directions for machine learning-based gene expression analysis for cancer classification.
4
Guo Y, Shen H, Li W, Li C, Jin C. Deep Effective k-mer representation learning for polyadenylation signal prediction via co-occurrence embedding. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.109887] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
5
Impact of COVID-19 on electricity energy consumption: A quantitative analysis on electricity. INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS 2022. [PMCID: PMC8872829 DOI: 10.1016/j.ijepes.2022.108084] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/12/2023]
Abstract
In addition to the tremendous loss of life due to coronavirus disease 2019 (COVID-19), the pandemic created challenges for the energy system, as strict confinement measures such as lockdown and social distancing imposed by governments worldwide resulted in a significant reduction in energy demand. In this study, a novel, quantitative, and simple method for estimating the energy consumption loss due to the pandemic, which was derived from epidemiological data in the beginning stages, is provided; the method couples a data-driven prediction (LSTM network) of energy consumption due to COVID-19 with an econometric model (ARDL) so that the long- and short-term impact can be synthesized with adequate statistical validation. The results show that energy loss is statistically correlated with the time-changing effective reproductive number (Rt) of the disease, which can be viewed as quantifying confinement intensity and the severity of the earlier stages of the pandemic. We detected, on average, a 1.62% decrease in electricity consumption loss for each one-percent decrease in Rt. We verify our method by applying it to Germany and 5 U.S. states with various social features and discuss implications and universality. Our results bridge the knowledge gap between key energy and epidemiological parameters and provide policymakers with a more precise estimate of the pandemic’s impact on electricity demand so that strategies can be formulated to minimize losses caused by similar crises.
6
Sahoo S, Kumar S, Abedin MZ, Lim WM, Jakhar SK. Deep learning applications in manufacturing operations: a review of trends and ways forward. JOURNAL OF ENTERPRISE INFORMATION MANAGEMENT 2022. [DOI: 10.1108/jeim-01-2022-0025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]
Abstract
Purpose: Deep learning (DL) technologies assist manufacturers to manage their business operations. This research aims to present state-of-the-art insights on the trends and ways forward for DL applications in manufacturing operations.
Design/methodology/approach: Using bibliometric analysis and the SPAR-4-SLR protocol, this research conducts a systematic literature review to present a scientific mapping of top-tier research on DL applications in manufacturing operations.
Findings: This research discovers and delivers key insights on six knowledge clusters pertaining to DL applications in manufacturing operations: automated system modelling, intelligent fault diagnosis, forecasting, sustainable manufacturing, environmental management, and intelligent scheduling.
Research limitations/implications: This research establishes the important roles of DL in manufacturing operations. However, these insights were derived from top-tier journals only. Therefore, this research does not discount the possibility of the availability of additional insights in alternative outlets, such as conference proceedings, where teasers into emerging and developing concepts may be published.
Originality/value: This research contributes seminal insights into DL applications in manufacturing operations. In this regard, this research is valuable to readers (academic scholars and industry practitioners) interested to gain an understanding of the important roles of DL in manufacturing operations as well as the future of its applications for Industry 4.0, such as Maintenance 4.0, Quality 4.0, Logistics 4.0, Manufacturing 4.0, Sustainability 4.0, and Supply Chain 4.0.
7
Wang J, Yu Z, Luan Z, Ren J, Zhao Y, Yu G. RDAU-Net: Based on a Residual Convolutional Neural Network With DFP and CBAM for Brain Tumor Segmentation. Front Oncol 2022; 12:805263. [PMID: 35311076 PMCID: PMC8924611 DOI: 10.3389/fonc.2022.805263] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/30/2021] [Accepted: 01/14/2022] [Indexed: 12/20/2022]
Abstract
Due to the high heterogeneity of brain tumors, automatic segmentation of brain tumors remains a challenging task. In this paper, we propose RDAU-Net by adding dilated feature pyramid blocks with 3D CBAM blocks and inserting 3D CBAM blocks after skip-connection layers. Moreover, a CBAM with channel attention and spatial attention facilitates the combination of more expressive feature information, thereby leading to more efficient extraction of contextual information from images of various scales. The performance was evaluated on the Multimodal Brain Tumor Segmentation (BraTS) challenge data. Experimental results show that RDAU-Net achieves state-of-the-art performance. The Dice coefficient for WT on the BraTS 2019 dataset exceeded the baseline value by 9.2%.
Affiliation(s)
- Jingjing Wang
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China
- Zishu Yu
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China
- Zhenye Luan
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China
- Jinwen Ren
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China
- Yanhua Zhao
- Obstetrics and Gynecology, Tengzhou Xigang Central Health Center, Tengzhou, China
- Gang Yu
- College of Physics and Electronics Science, Shandong Normal University, Jinan, China
8
Singh A, Dargar SK, Gupta A, Kumar A, Srivastava AK, Srivastava M, Kumar Tiwari P, Ullah MA. Evolving Long Short-Term Memory Network-Based Text Classification. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022; 2022:4725639. [PMID: 35237308 PMCID: PMC8885205 DOI: 10.1155/2022/4725639] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/05/2021] [Revised: 12/10/2021] [Accepted: 01/12/2022] [Indexed: 11/18/2022]
Abstract
Recently, long short-term memory (LSTM) networks have been extensively used for text classification. Unlike feed-forward neural networks, an LSTM has feedback connections and can therefore learn long-term dependencies. However, LSTM networks suffer from a parameter tuning problem: their initial and control parameters are generally selected by trial and error. Therefore, in this paper, an evolving LSTM (ELSTM) network is proposed. A multiobjective genetic algorithm (MOGA) is used to optimize the architecture and weights of the LSTM. The proposed model is tested on a well-known factory reports dataset. Extensive analyses are performed to evaluate the performance of the proposed ELSTM network. From the comparative analysis, it is found that the LSTM network outperforms the competitive models.
Affiliation(s)
- Arjun Singh
- Computer and Communication Engineering, School of Computing and IT, Manipal University Jaipur, Jaipur, India
- Shashi Kant Dargar
- Department of Electronics and Communication Engineering, Kalasalingam Academy of Research and Education, Virudhunagar, Tamilnadu, India
- Amit Gupta
- Department of Electronics and Communication Engineering, Narasaraopeta Engineering College, Narasaraopeta, Andhra Pradesh, India
- Ashish Kumar
- Department of Computer Science and Engineering, School of Computing and IT, Manipal University Jaipur, Jaipur, India
- Mohammad Aman Ullah
- Department of Computer Science and Engineering, International Islamic University Chittagong, Chittagong, Bangladesh
9
Wu P, Li X, Ling C, Ding S, Shen S. Sentiment classification using attention mechanism and bidirectional long short-term memory network. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107792] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 10/20/2022]
10
Morilla I. Repairing the human with artificial intelligence in oncology. Artif Intell Cancer 2021; 2:60-68. [DOI: 10.35713/aic.v2.i5.60] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2021] [Revised: 10/26/2021] [Accepted: 10/27/2021] [Indexed: 02/06/2023]
Abstract
Artificial intelligence is a groundbreaking tool for learning and analysing higher-level features extracted from any dataset at large scale. This ability makes it ideal for tackling the complex problems that arise in the biomedical domain in general and in oncology in particular. In this work, we aim to provide a global view of the outgrowth of this mathematical discipline by linking it to related subdomains such as transfer, reinforcement, and federated learning. In addition, we introduce the recently popular method of topological data analysis, which improves the performance of learning models.
Affiliation(s)
- Ian Morilla
- Laboratoire Analyse, Géométrie et Applications - Institut Galilée, Sorbonne Paris Nord University, Paris 75006, France
11
Conceição SIR, Couto FM. Text Mining for Building Biomedical Networks Using Cancer as a Case Study. Biomolecules 2021; 11:biom11101430. [PMID: 34680062 PMCID: PMC8533101 DOI: 10.3390/biom11101430] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/09/2021] [Revised: 09/24/2021] [Accepted: 09/27/2021] [Indexed: 12/15/2022]
Abstract
In the assembly of biological networks, it is important to provide reliable interactions in order to obtain the most accurate possible representation of real-life systems. Commonly, the data used to build a network come from diverse high-throughput assays; however, most of the interaction data is available through the scientific literature. This has become a challenge with the notable increase in scientific literature being published, as it is hard for human curators to track all recent discoveries without efficient tools to help them identify these interactions automatically. This challenge can be overcome by using text mining approaches, which are capable of extracting knowledge from scientific documents. One of the most important tasks in text mining for biological network building is relation extraction, which identifies relations between the entities of interest. Many interaction databases already use text mining systems, and the development of these tools will lead to more reliable networks, as well as the possibility to personalize the networks by selecting the desired relations. This review focuses on different approaches to automatic information extraction from biomedical text that can be used to enhance existing networks or create new ones, such as state-of-the-art deep learning approaches, with cancer as a case study.