1
Asghar R, Kumar S, Shaukat A, Hynds P. Classification of white blood cells (leucocytes) from blood smear imagery using machine and deep learning models: A global scoping review. PLoS One 2024; 19:e0292026. [PMID: 38885231 PMCID: PMC11182552 DOI: 10.1371/journal.pone.0292026]
Abstract
Machine learning (ML) and deep learning (DL) models are being increasingly employed for medical imagery analyses, with both approaches used to enhance the accuracy of classification/prediction in the diagnoses of various cancers, tumors and bloodborne diseases. To date, however, no review of these techniques and their application(s) within the domain of white blood cell (WBC) classification in blood smear images has been undertaken, representing a notable knowledge gap with respect to model selection and comparison. Accordingly, the current study sought to comprehensively identify, explore and contrast ML and DL methods for classifying WBCs. Following development and implementation of a formalized review protocol, a cohort of 136 primary studies published between January 2006 and May 2023 was identified from the global literature, with the most widely used techniques and best-performing WBC classification methods subsequently ascertained. Studies derived from 26 countries, with the highest numbers from high-income countries including the United States (n = 32) and The Netherlands (n = 26). While WBC classification was originally rooted in conventional ML, there has been a notable shift toward the use of DL, and particularly convolutional neural networks (CNNs), with 54.4% of identified studies (n = 74) using CNNs, particularly in combination with larger datasets and bespoke features, e.g., parallel data pre-processing, feature selection, and extraction. While some conventional ML models achieved up to 99% accuracy, accuracy was shown to decrease with decreasing dataset size. Deep learning models exhibited improved performance on more extensive datasets, with accuracy increasing as datasets grew larger. Availability of appropriate datasets remains a primary challenge, potentially resolvable using data augmentation techniques.
Moreover, medical training of computer science researchers is recommended to improve current understanding of leucocyte structure and subsequent selection of appropriate classification models. Likewise, it is critical that future health professionals be made aware of the power, efficacy, precision and applicability of computer science, soft computing and artificial intelligence contributions to medicine, and particularly in areas like medical imaging.
Affiliation(s)
- Rabia Asghar
- Spatiotemporal Environmental Epidemiology Research (STEER) Group, Technological University Dublin, Dublin, Ireland
- Sanjay Kumar
- National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Arslan Shaukat
- National University of Sciences and Technology (NUST), Islamabad, Pakistan
- Paul Hynds
- Spatiotemporal Environmental Epidemiology Research (STEER) Group, Technological University Dublin, Dublin, Ireland
2
Saini P, Kumar K, Kashid S, Saini A, Negi A. Video summarization using deep learning techniques: a detailed analysis and investigation. Artif Intell Rev 2023. [PMCID: PMC10015543 DOI: 10.1007/s10462-023-10444-0]
Abstract
One of the critical multimedia analysis problems in today's digital world is video summarization (VS). Many VS methods based on deep learning have been suggested. Nevertheless, these are inefficient at processing, extracting, and deriving information from long-duration videos in a minimal amount of time. A detailed analysis and investigation of numerous deep learning approaches was carried out to determine the root of the problems that different deep learning methods face in identifying and summarizing the essential activities in such videos. Various deep learning techniques have been investigated and examined for their event detection and summarization capability across multiple activities. Keyframe selection, event detection, categorization, and activity feature summarization correspond to each activity. The limitations of each category are also discussed in depth. Concerns about detecting low activity using deep networks on various types of public datasets are also discussed. Viable strategies are suggested to evaluate and improve the generated video summaries on such datasets. Moreover, potential recommended applications based on the literature are listed. Various deep learning tools for experimental analysis have also been discussed in the paper. Future directions are presented for further exploration of research in VS using deep learning strategies.
Affiliation(s)
- Parul Saini
- Department of Computer Science and Engineering, National Institute of Technology Uttarakhand, Srinagar Garhwal, Uttarakhand 246174 India
- Krishan Kumar
- Department of Computer Science and Engineering, National Institute of Technology Uttarakhand, Srinagar Garhwal, Uttarakhand 246174 India
- Shamal Kashid
- Department of Computer Science and Engineering, National Institute of Technology Uttarakhand, Srinagar Garhwal, Uttarakhand 246174 India
- Ashray Saini
- Department of Computer Science and Engineering, National Institute of Technology Uttarakhand, Srinagar Garhwal, Uttarakhand 246174 India
- Alok Negi
- Department of Computer Science and Engineering, National Institute of Technology Uttarakhand, Srinagar Garhwal, Uttarakhand 246174 India
3
Gupta D, Sharma A. A comprehensive study of automatic video summarization techniques. Artif Intell Rev 2023. [DOI: 10.1007/s10462-023-10429-z]
4
Mahum R, Irtaza A, Rehman SU, Meraj T, Rauf HT. A Player-Specific Framework for Cricket Highlights Generation Using Deep Convolutional Neural Networks. Electronics 2022; 12:65. [DOI: 10.3390/electronics12010065]
Abstract
Automatic generation of video summaries is a key technique for managing the huge volume of video content produced nowadays. The aim of video summaries is to provide important information to viewers in less time. Some techniques exist for video summarization in the cricket domain; however, to the best of our knowledge, our proposed model is the first to successfully produce summaries specific to an individual player in cricket videos. In this study, we provide a novel framework and a valuable technique for cricket video summarization and classification. For player-specific video summaries, the proposed technique exploits the presence of the Score Caption (SC) in frames. In the first stage, optical character recognition (OCR) is applied to extract a text summary from the SC and find all frames of the specific player, from the Start Frame (SF) to the Last Frame (LF). In the second stage, various frames of cricket videos are used to train the supervised AlexNet classifier with class labels (positive and negative) for binary classification. The pre-trained network is trained for binary classification of the frames obtained from the first phase, which show the performance of a specific player along with some additional scenes. In the third phase, a person identification technique is employed to recognize frames containing the specific player. These frames are then cropped, SIFT features are extracted from the identified person, and the frames are clustered using the fuzzy c-means method. The purpose of the third phase is to further refine the video summaries, as the frames obtained in the second stage include the partner player's frames as well. The proposed framework is successfully demonstrated on a cricket video dataset, and the technique is very efficient and useful for broadcasting cricket highlights of a specific player. The experimental results signify that our proposed method surpasses previously reported results, achieving an overall accuracy of up to 95%.
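The third-phase clustering step can be illustrated with a minimal fuzzy c-means implementation. This is a generic sketch of the algorithm, not the paper's code; the 2-D toy points, cluster count c = 2, and fuzzifier m = 2 are illustrative assumptions (the paper clusters SIFT feature vectors):

```python
import random

def fuzzy_c_means(points, c=2, m=2.0, iters=50, seed=0):
    """Minimal fuzzy c-means on 2-D points (illustrative sketch)."""
    rng = random.Random(seed)
    # random initial membership matrix u[k][i]: degree of point k in cluster i
    u = [[rng.random() for _ in range(c)] for _ in points]
    u = [[v / sum(row) for v in row] for row in u]  # each row sums to 1
    for _ in range(iters):
        # update cluster centres as membership-weighted means
        centres = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(len(points))]
            total = sum(w)
            centres.append(tuple(
                sum(wk * p[d] for wk, p in zip(w, points)) / total
                for d in range(2)))
        # update memberships from inverse relative distances to the centres
        for k, p in enumerate(points):
            dist = [max(1e-12, sum((p[j] - cen[j]) ** 2 for j in range(2)) ** 0.5)
                    for cen in centres]
            for i in range(c):
                u[k][i] = 1.0 / sum((dist[i] / dist[j]) ** (2 / (m - 1))
                                    for j in range(c))
    return centres, u

# two obvious blobs, standing in for the target player vs. the partner player
pts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centres, u = fuzzy_c_means(pts)
labels = [max(range(2), key=lambda i: row[i]) for row in u]
```

Each frame is then assigned to the cluster with its highest membership degree, and only frames in the target player's cluster are kept in the summary.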
5
Wang X, Li Y, Wang H, Huang L, Ding S. A Video Summarization Model Based on Deep Reinforcement Learning with Long-Term Dependency. Sensors (Basel) 2022; 22:7689. [PMID: 36236789 PMCID: PMC9571073 DOI: 10.3390/s22197689]
Abstract
Deep summarization models have succeeded in the video summarization field thanks to the development of gated recurrent unit (GRU) and long short-term memory (LSTM) technology. However, for some long videos, GRUs and LSTMs cannot effectively capture long-term dependencies. This paper proposes a deep summarization network with auxiliary summarization losses to address this problem. We introduce an unsupervised auxiliary summarization loss module with LSTM and a swish activation function to capture the long-term dependencies for video summarization, which can be easily integrated with various networks. The proposed model is an unsupervised deep reinforcement learning framework that does not depend on any labels or user interactions. Additionally, we implement a reward function (R(S)) that jointly considers the consistency, diversity, and representativeness of generated summaries. Furthermore, the proposed model is lightweight and can be deployed on mobile devices to enhance the experience of mobile users and reduce pressure on server operations. We conducted experiments on two benchmark datasets, and the results demonstrate that our proposed unsupervised approach obtains better summaries than existing video summarization methods. Furthermore, the proposed algorithm generates higher F-scores, with a nearly 6.3% increase on the SumMe dataset and a 2.2% increase on the TVSum dataset compared to the DR-DSN model.
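A reward R(S) of this kind scores a candidate summary without labels; a common formulation (used, e.g., by the DR-DSN baseline this paper compares against) combines a diversity term with a representativeness term. The sketch below assumes that formulation and toy 2-D frame features; the paper's exact reward, which also includes a consistency term, may differ:

```python
import math

# toy frame features: two near-duplicate pairs (assumed values, for illustration)
feats = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0), (0.1, 1.0)]

def diversity(features, selected):
    """Mean pairwise dissimilarity (1 - cosine similarity) of the selected frames."""
    sel = [features[i] for i in selected]
    if len(sel) < 2:
        return 0.0
    total, pairs = 0.0, 0
    for a in range(len(sel)):
        for b in range(len(sel)):
            if a != b:
                dot = sum(x * y for x, y in zip(sel[a], sel[b]))
                norm = math.hypot(*sel[a]) * math.hypot(*sel[b])
                total += 1.0 - dot / norm
                pairs += 1
    return total / pairs

def representativeness(features, selected):
    """exp(-mean distance of every frame to its nearest selected frame)."""
    mean_d = sum(min(math.dist(f, features[i]) for i in selected)
                 for f in features) / len(features)
    return math.exp(-mean_d)

def reward(features, selected):
    """R(S) sketch: diversity + representativeness (consistency term omitted)."""
    return diversity(features, selected) + representativeness(features, selected)
```

Selecting one frame per visual cluster (e.g. frames 0 and 2) scores higher than selecting two redundant frames (0 and 1), which is exactly the signal the reinforcement agent learns from.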
6
Zhu W, Han Y, Lu J, Zhou J. Relational Reasoning Over Spatial-Temporal Graphs for Video Summarization. IEEE Trans Image Process 2022; 31:3017-3031. [PMID: 35385384 DOI: 10.1109/tip.2022.3163855]
Abstract
In this paper, we propose a dynamic graph modeling approach to learn spatial-temporal representations for video summarization. Most existing video summarization methods extract image-level features with ImageNet pre-trained deep models. In contrast, our method exploits object-level and relation-level information to capture spatial-temporal dependencies. Specifically, our method builds spatial graphs on the detected object proposals. Then, we construct a temporal graph by using the aggregated representations of the spatial graphs. Afterward, we perform relational reasoning over the spatial and temporal graphs with graph convolutional networks and extract spatial-temporal representations for importance score prediction and key shot selection. To eliminate relation clutter caused by densely connected nodes, we further design a self-attention edge pooling module, which disregards meaningless relations in the graphs. We conduct extensive experiments on two popular benchmarks, the SumMe and TVSum datasets. Experimental results demonstrate that the proposed method achieves superior performance against state-of-the-art video summarization methods.
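The graph-convolutional reasoning above repeatedly applies a propagation step of the standard GCN form H' = ReLU(D^-1/2 (A+I) D^-1/2 H W). The sketch below shows one such step on a toy graph; it is the generic GCN update, not the paper's full spatial-temporal architecture:

```python
import math

def gcn_layer(adj, feats, weight):
    """One GCN propagation step: H' = ReLU(D^-1/2 (A+I) D^-1/2 H W)."""
    n = len(adj)
    # add self-loops so each node keeps its own features
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    # symmetrically normalise the adjacency matrix
    a = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    # aggregate neighbour features (A_hat @ H), then linear transform + ReLU
    dim_in, dim_out = len(feats[0]), len(weight[0])
    agg = [[sum(a[i][k] * feats[k][f] for k in range(n)) for f in range(dim_in)]
           for i in range(n)]
    return [[max(0.0, sum(agg[i][f] * weight[f][o] for f in range(dim_in)))
             for o in range(dim_out)] for i in range(n)]

# toy temporal graph: two mutually connected shots with 1-D features
out = gcn_layer([[0, 1], [1, 0]], [[1.0], [3.0]], [[1.0]])
```

With self-loops and symmetric normalisation, each node's output mixes its own features with its neighbours'; here both nodes end up with the average feature 2.0.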
7
Muhammad W, Ahmed I, Ahmad J, Nawaz M, Alabdulkreem E, Ghadi Y. A video summarization framework based on activity attention modeling using deep features for smart campus surveillance system. PeerJ Comput Sci 2022; 8:e911. [PMID: 35494862 PMCID: PMC9044333 DOI: 10.7717/peerj-cs.911]
Abstract
Like other business domains, digital monitoring has now become an integral part of almost every academic institution. These surveillance systems cover all the routine activities happening on campus while producing a massive volume of video data. Selecting and searching for a desired video segment in such a vast video repository is highly time-consuming. Effective video summarization methods are thus needed for fast navigation and retrieval of video content. This paper introduces a keyframe extraction method to summarize academic activities, producing a short representation of the target video while preserving all the essential activities present in the original video. First, we perform fine-grained activity recognition on a realistic Campus Activities Dataset (CAD) by modeling activity attention scores using a deep CNN model. In the second phase, we use the generated attention scores for each activity category to extract significant video frames. Finally, we compute an inter-frame similarity index, used to reduce the number of redundant frames and extract only the representative keyframes. The proposed framework is tested on different videos, and the experimental results show the performance of the proposed summarization process.
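The final redundancy-reduction step can be sketched with a simple inter-frame similarity test: keep a frame only when it is sufficiently dissimilar to the last kept keyframe. The histogram-intersection measure, the normalised-colour-histogram frame representation, and the 0.8 threshold below are illustrative assumptions, not values from the paper:

```python
def hist_intersection(h1, h2):
    """Similarity of two normalised histograms, in [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def keyframes(hists, threshold=0.8):
    """Greedy keyframe selection: drop frames too similar to the last keyframe."""
    if not hists:
        return []
    keep = [0]
    for i in range(1, len(hists)):
        if hist_intersection(hists[i], hists[keep[-1]]) < threshold:
            keep.append(i)
    return keep

# three near-duplicate frames, then a scene change (3-bin colour histograms)
frames = [
    [0.50, 0.50, 0.00], [0.48, 0.52, 0.00], [0.50, 0.49, 0.01],
    [0.00, 0.10, 0.90], [0.02, 0.08, 0.90],
]
```

Running `keyframes(frames)` keeps only the first frame of each run of similar frames, i.e. indices 0 and 3.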
Affiliation(s)
- Wasim Muhammad
- Center of Excellence in Information Technology, Institute of Management Sciences (IMSciences), Peshawar, KPK, Pakistan
- Imran Ahmed
- Center of Excellence in Information Technology, Institute of Management Sciences (IMSciences), Peshawar, KPK, Pakistan
- Jamil Ahmad
- Center of Excellence in Information Technology, Institute of Management Sciences (IMSciences), Peshawar, KPK, Pakistan
- Department of Computer Science, Islamia College Peshawar (Chartered University), Peshawar, Pakistan
- Muhammad Nawaz
- Center of Excellence in Information Technology, Institute of Management Sciences (IMSciences), Peshawar, KPK, Pakistan
- Eatedal Alabdulkreem
- Computer Sciences Department, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia
- Yazeed Ghadi
- Department of Computer Science, Al Ain University, Al Ain, UAE
8
Att-BiL-SL: Attention-Based Bi-LSTM and Sequential LSTM for Describing Video in the Textual Formation. Appl Sci (Basel) 2021. [DOI: 10.3390/app12010317]
Abstract
With the advancement of the technological field, people around the world are, day by day, gaining easier access to internet-enabled devices, and as a result video data is growing rapidly. The increase of portable devices such as action cameras, mobile cameras, motion cameras, etc., also contributes to the faster growth of video data. Data from these multiple sources need more maintenance to process for various usages according to need. Such enormous amounts of video data cannot be fully navigated by end-users. In recent times, many research works have been done to generate descriptions from images or visual scene recordings to address this issue. This description generation, also known as video captioning, is more complex than single-image captioning. Various advanced neural networks have been used in various studies to perform video captioning. In this paper, we propose an attention-based Bi-LSTM and sequential LSTM (Att-BiL-SL) encoder-decoder model for describing video in textual format. The model consists of a two-layer attention-based bi-LSTM and a one-layer sequential LSTM for video captioning. The model also extracts universal and native temporal features from the video frames for smooth sentence generation from optical frames. This paper includes word embedding with a soft attention mechanism and a beam search optimization algorithm to generate qualitative results. It is found that the architecture proposed in this paper performs better than various existing state-of-the-art models.
9
Khan MA, Mittal M, Goyal LM, Roy S. A deep survey on supervised learning based human detection and activity classification methods. Multimed Tools Appl 2021; 80:27867-27923. [DOI: 10.1007/s11042-021-10811-5]
10
Khan N, Muhammad K, Hussain T, Nasir M, Munsif M, Imran AS, Sajjad M. An Adaptive Game-Based Learning Strategy for Children Road Safety Education and Practice in Virtual Space. Sensors (Basel) 2021; 21:3661. [PMID: 34070237 PMCID: PMC8197389 DOI: 10.3390/s21113661]
Abstract
Virtual reality (VR) has been widely used as a tool to let people learn and simulate situations that are too dangerous and risky to practice in real life, one of which is road safety training for children. Traditional video- and presentation-based road safety training yields only average results, as it lacks physical practice and the involvement of children during training, and provides no practical testing examination to check a child's learned abilities before their exposure to real-world environments. Therefore, in this paper, we propose a 3D realistic open-ended VR and Kinect sensor-based training setup using the Unity game engine, wherein children are educated through and involved in road safety exercises. The proposed system applies the concepts of VR in a game-like setting to let children learn traffic rules and practice them at home without any risk of exposure to the outside environment. Thus, with our interactive and immersive training environment, we aim to minimize road accidents involving children and contribute to the generic domain of healthcare. Furthermore, the proposed framework evaluates the overall performance of the students in a virtual environment (VE) to develop their road-awareness skills. To ensure safety, the proposed system has an extra examination layer for evaluating children's abilities, whereby a child is considered fit for real-world practice only after fulfilling certain criteria by achieving set scores. To show the robustness and stability of the proposed system, we conducted four types of subjective activities involving a group of ten students with average grades in their classes. The experimental results show the positive effect of the proposed system in improving the road-crossing behavior of the children.
Affiliation(s)
- Noman Khan
- Visual Analytics for Knowledge Laboratory, Department of Software, Sejong University, Seoul 143-747, Korea
- Digital Image Processing Laboratory, Department of Computer Science, Islamia College University Peshawar, Peshawar 25000, Pakistan
- Khan Muhammad
- Visual Analytics for Knowledge Laboratory, Department of Software, Sejong University, Seoul 143-747, Korea
- Tanveer Hussain
- Department of Software, Sejong University, Seoul 143-747, Korea
- Mansoor Nasir
- Digital Image Processing Laboratory, Department of Computer Science, Islamia College University Peshawar, Peshawar 25000, Pakistan
- Muhammad Munsif
- Digital Image Processing Laboratory, Department of Computer Science, Islamia College University Peshawar, Peshawar 25000, Pakistan
- Ali Shariq Imran
- Norwegian Colour and Visual Computing Laboratory, Department of Computer Science (IDI), Norwegian University of Science and Technology (NTNU), 2815 Gjøvik, Norway
- Muhammad Sajjad
- Visual Analytics for Knowledge Laboratory, Department of Software, Sejong University, Seoul 143-747, Korea
- Digital Image Processing Laboratory, Department of Computer Science, Islamia College University Peshawar, Peshawar 25000, Pakistan
- Norwegian Colour and Visual Computing Laboratory, Department of Computer Science (IDI), Norwegian University of Science and Technology (NTNU), 2815 Gjøvik, Norway
12
Khan J, Li JP, Haq AU, Khan GA, Ahmad S, Abdullah Alghamdi A, Golilarz NA. Efficient secure surveillance on smart healthcare IoT system through cosine-transform encryption. J Intell Fuzzy Syst 2021. [DOI: 10.3233/jifs-201770]
Abstract
Emerging IoT (Internet of Things) technologies are elevating the prototype of a smart, connected ecosystem. Appropriately connected in a smart healthcare system, these ecosystems generate finer monitoring of patients, a well-organized diagnosis process, and more intensive support and care than traditional healthcare operations. But alongside these highly technological adaptations, the personal information of patients is at risk of data leakage and privacy theft. Concerned with the secure protection of patients' information, this paper emphasizes secure monitoring with the help of keyframe extraction from intelligently recorded summaries and a two-round lightweight cosine-transform encryption. The article includes, first, a regimented keyframe extraction process that retrieves meaningful image frames from a visual sensor and sends an alert (quick notice) to the authority. Second, a two-round lightweight cosine-transform encryption operation is applied to the detected keyframes to ensure security and safety against any further attacks from an adversary. The combined methodology proves highly useful, producing appropriate results: a short encryption time (0.2277-0.2607), information entropy (7.9996), correlation coefficient (0.0010), robustness (NPCR 99.6383, UACI 33.3516), and uniform histogram deviation (R 0.0359, G 0.0492, B 0.0582), comparing favorably with other keyframe or image encryption approaches. Furthermore, this method can effectively reduce communication cost, bandwidth usage, storage, and data transmission cost; enables timely and judicious analysis of observed activities; remains attack-free against any attacker or adversary through effective encryption; and preserves the confidentiality of patients' privacy in the smart healthcare system.
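The security figures quoted above (information entropy, NPCR, UACI) follow standard definitions for cipher-image evaluation. The sketch below computes them for 8-bit images flattened to lists of pixel values; it is a generic sketch of the metrics, not the authors' evaluation code:

```python
import math

def npcr(img1, img2):
    """Percentage of pixel positions whose values differ between two cipher images."""
    diff = sum(1 for a, b in zip(img1, img2) if a != b)
    return 100.0 * diff / len(img1)

def uaci(img1, img2):
    """Mean intensity change between two cipher images, as a % of the 8-bit range."""
    return 100.0 * sum(abs(a - b) for a, b in zip(img1, img2)) / (255 * len(img1))

def entropy(img):
    """Shannon entropy in bits; 8.0 is the ideal for an 8-bit cipher image."""
    counts = {}
    for v in img:
        counts[v] = counts.get(v, 0) + 1
    n = len(img)
    return -sum(c / n * math.log2(c / n) for c in counts.values())
```

A good cipher is expected to score close to the theoretical ideals: entropy near 8, NPCR near 99.6%, and UACI near 33.46%, which is what the reported values approach.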
Affiliation(s)
- Jalaluddin Khan
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Jian Ping Li
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Amin Ul Haq
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Ghufran Ahmad Khan
- School of Information Science and Technology, Southwest Jiaotong University, Chengdu, China
- Sultan Ahmad
- Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Alkharj, Saudi Arabia
- Noorbakhsh Amiri Golilarz
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
13
Detection of Ki67 Hot-Spots of Invasive Breast Cancer Based on Convolutional Neural Networks Applied to Mutual Information of H&E and Ki67 Whole Slide Images. Appl Sci (Basel) 2020. [DOI: 10.3390/app10217761]
Abstract
Ki67 hot-spot detection and its evaluation in invasive breast cancer regions play a significant role in routine medical practice. The quantification of cellular proliferation assessed by Ki67 immunohistochemistry is an established prognostic and predictive biomarker that determines the choice of therapeutic protocols. In this paper, we present three deep learning-based approaches to automatically detect and quantify Ki67 hot-spot areas by means of the Ki67 labeling index. To this end, a dataset composed of 100 whole slide images (WSIs) belonging to 50 breast cancer cases (Ki67 and H&E WSI pairs) was used. Three methods based on CNN classification were proposed and compared to create the tumor proliferation map. The best results were obtained by applying the CNN to the mutual information acquired from the color deconvolution of both the Ki67 marker and the H&E WSIs. The overall accuracy of this approach was 95%. The agreement between the automatic Ki67 scoring and the manual analysis is promising, with a Spearman's ρ correlation of 0.92. The results illustrate the suitability of this CNN-based approach for detecting hot-spot areas of invasive breast cancer in WSIs.
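The reported agreement (Spearman's ρ = 0.92) is a rank correlation, computable as the Pearson correlation of rank values. A minimal sketch of the standard definition, with average ranks assigned to ties (not the authors' code):

```python
def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    def ranks(vals):
        # assign 1-based ranks, averaging over runs of tied values
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        i = 0
        while i < len(vals):
            j = i
            while j + 1 < len(vals) and vals[order[j + 1]] == vals[order[i]]:
                j += 1
            avg = (i + j) / 2 + 1
            for k in range(i, j + 1):
                r[order[k]] = avg
            i = j + 1
        return r

    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)
```

Because ρ depends only on ranks, it rewards any monotone agreement between the automatic and manual scores, not just a linear relationship.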
14
Cover the Violence: A Novel Deep-Learning-Based Approach Towards Violence-Detection in Movies. Appl Sci (Basel) 2019. [DOI: 10.3390/app9224963]
Abstract
Movies have become one of the major sources of entertainment in the current era and are based on diverse ideas. Action movies, which contain violent scenes, have received the most attention in the last few years; although violence is an undesirable feature for some individuals, it is used to create charm and fantasy. However, these violent scenes have a negative impact on kids and are uncomfortable even for people of mature age. The best way to stop underage people from watching violent scenes in movies is to eliminate these scenes. In this paper, we propose a violence detection scheme for movies that comprises three steps. First, the entire movie is segmented into shots, and a representative frame from each shot is selected based on its level of saliency. Next, these selected frames are passed through a lightweight deep learning model, fine-tuned using a transfer learning approach, to classify shots in a movie as violent or non-violent. Finally, all the non-violent scenes are merged in sequence to generate a violence-free movie that can be watched by children as well as by people averse to violence. The proposed model is evaluated on three violence benchmark datasets, and it is experimentally proven that the proposed scheme provides fast and accurate detection of violent scenes in movies compared to state-of-the-art methods.
15
Ullah FUM, Ullah A, Muhammad K, Haq IU, Baik SW. Violence Detection Using Spatiotemporal Features with 3D Convolutional Neural Network. Sensors (Basel) 2019; 19:2472. [PMID: 31151184 PMCID: PMC6603512 DOI: 10.3390/s19112472]
Abstract
The worldwide utilization of surveillance cameras in smart cities has enabled researchers to analyze a gigantic volume of data to ensure automatic monitoring. An enhanced security system in smart cities, schools, hospitals, and other surveillance domains is mandatory for the detection of violent or abnormal activities to avoid any casualties which could cause social, economic, and ecological damages. Automatic detection of violence for quick actions is very significant and can efficiently assist the concerned departments. In this paper, we propose a triple-staged end-to-end deep learning violence detection framework. First, persons are detected in the surveillance video stream using a light-weight convolutional neural network (CNN) model to reduce and overcome the voluminous processing of useless frames. Second, a sequence of 16 frames with detected persons is passed to 3D CNN, where the spatiotemporal features of these sequences are extracted and fed to the Softmax classifier. Furthermore, we optimized the 3D CNN model using an open visual inference and neural networks optimization toolkit developed by Intel, which converts the trained model into intermediate representation and adjusts it for optimal execution at the end platform for the final prediction of violent activity. After detection of a violent activity, an alert is transmitted to the nearest police station or security department to take prompt preventive actions. We found that our proposed method outperforms the existing state-of-the-art methods for different benchmark datasets.
Collapse
Affiliation(s)
- Fath U Min Ullah
- Intelligent Media Laboratory, Digital Contents Research Institute, Sejong University, Seoul 143-747, Korea.
- Amin Ullah
- Intelligent Media Laboratory, Digital Contents Research Institute, Sejong University, Seoul 143-747, Korea
- Khan Muhammad
- Department of Software, Sejong University, Seoul 143-747, Korea
- Ijaz Ul Haq
- Intelligent Media Laboratory, Digital Contents Research Institute, Sejong University, Seoul 143-747, Korea
- Sung Wook Baik
- Intelligent Media Laboratory, Digital Contents Research Institute, Sejong University, Seoul 143-747, Korea
16
A Robust Regression-Based Stock Exchange Forecasting and Determination of Correlation Between Stock Markets. Sustainability 2018. [DOI: 10.3390/su10103702]
Abstract
Knowledge-based decision support systems for financial management are an important part of investment plans. Investors are avoiding traditional investment areas such as banks due to low return on investment, and the stock exchange is presently one of the major areas for investment. Various non-linear and complex factors affect the stock exchange, so a robust stock exchange forecasting system remains an important need. Along this line of research, we evaluate the performance of a regression-based model to check its robustness over large datasets. We also evaluate the effect of top stock exchange markets on each other. We evaluate our proposed model on the top four stock exchanges: New York, London, NASDAQ, and the Karachi stock exchange. We also evaluate our model on the top three companies: Apple, Microsoft, and Google. Twenty years of historical data were gathered from Yahoo Finance; such huge data creates a Big Data problem. The performance of our system is evaluated on 1-step, 6-step, and 12-step forecasts. The experiments show that the proposed system produces excellent results, presented in terms of Mean Absolute Error (MAE) and Root Mean Square Error (RMSE).
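The MAE and RMSE metrics used to report these forecasting results follow their standard definitions; a minimal sketch (not the authors' code):

```python
import math

def mae(actual, predicted):
    """Mean Absolute Error: average magnitude of the forecast errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Square Error: penalises large errors more strongly than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))
```

Because squaring weights outliers more heavily, RMSE is always at least as large as MAE on the same errors; a gap between the two signals occasional large forecast misses.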