1. de la Cruz G, Lira M, Luaces O, Remeseiro B. Eye-LRCN: A Long-Term Recurrent Convolutional Network for Eye Blink Completeness Detection. IEEE Trans Neural Netw Learn Syst 2024; 35:5130-5140. [PMID: 36083963] [DOI: 10.1109/tnnls.2022.3202643]
Abstract
Computer vision syndrome causes vision problems and discomfort, mainly due to dry eye. Several studies show that dry eye in computer users is caused by a reduction in the blink rate and an increase in the prevalence of incomplete blinks. In this context, this article introduces Eye-LRCN, a new eye blink detection method that also evaluates the completeness of the blink. The method is based on a long-term recurrent convolutional network (LRCN), which combines a convolutional neural network (CNN) for feature extraction with a bidirectional recurrent neural network that performs sequence learning and classifies the blinks. A Siamese architecture is used during CNN training to overcome the high class imbalance present in blink detection and the limited amount of data available to train blink detection models. The method was evaluated on three different tasks: blink detection, blink completeness detection, and eye state detection. We report superior performance to state-of-the-art methods in blink detection and blink completeness detection, and remarkable results in eye state detection.
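To make the LRCN architecture concrete, here is a minimal PyTorch sketch of the idea: a per-frame CNN encoder feeding a bidirectional LSTM that emits a blink-state label for every frame. The layer sizes, the 64x64 grayscale input, and the three-class scheme are illustrative assumptions, and the Siamese pretraining of the CNN is omitted.

```python
# Minimal LRCN sketch: per-frame CNN features -> bidirectional LSTM -> per-frame
# blink-state logits. Sizes and the 3-class scheme are assumptions, not the paper's.
import torch
import torch.nn as nn

class EyeLRCN(nn.Module):
    def __init__(self, feat_dim=128, hidden=64, num_classes=3):
        super().__init__()
        self.cnn = nn.Sequential(                       # frame-level feature extractor
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(32 * 4 * 4, feat_dim),
        )
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, num_classes)  # e.g. open / incomplete / complete

    def forward(self, frames):                          # frames: (B, T, 1, 64, 64)
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        seq, _ = self.rnn(feats)                        # sequence learning over time
        return self.head(seq)                           # (B, T, num_classes)

logits = EyeLRCN()(torch.randn(2, 16, 1, 64, 64))       # smoke test
```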
2. Saealal MS, Ibrahim MZ, Mulvaney DJ, Shapiai MI, Fadilah N. Using cascade CNN-LSTM-FCNs to identify AI-altered video based on eye state sequence. PLoS One 2022; 17:e0278989. [PMID: 36520851] [PMCID: PMC9754287] [DOI: 10.1371/journal.pone.0278989]
Abstract
Deep learning is notably successful in data analysis, computer vision, and human control. Nevertheless, this approach has also enabled the development of DeepFake video sequences and images in which alterations are not easily or explicitly detectable. Such alterations have recently been used to spread false news and disinformation. This study aims to identify DeepFake videos and images and alert viewers to the possible falsity of the information. The work presents a novel means of revealing fake face videos by cascading a convolutional network with recurrent neural network and fully connected network (FCN) models. The detection approach exploits the eye-blinking state across temporal video frames, although it is challenging to capture through this physiological signal alone (i) the artificiality of fake videos and (ii) the spatial information within individual frames. Spatial features were extracted using a VGG16 network pretrained on the ImageNet dataset, and temporal features were then extracted over every 20-frame sequence through an LSTM network. The pre-processed eye-blinking state served as a probability signal for generating a novel BPD dataset, which was fed to three models with four, three, and six hidden layers, respectively, each with a unique architecture and specific dropout value. The resulting models accurately identified tampered videos within the dataset. The approach was assessed using the BPD dataset built from one of the most complex benchmarks (FaceForensics++), achieving 90.8% accuracy, and this precision was maintained on datasets not used in the training process. The training process was also accelerated by lowering the computation prerequisites.
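A hedged sketch of the cascade described above, assuming an ImageNet-pretrained VGG16 as the spatial feature extractor, an LSTM over 20-frame windows, and a small fully connected classifier; the hidden sizes, dropout value, 512-d pooled features, and frozen backbone are placeholder choices, not the paper's tuned models.

```python
# Cascade sketch: VGG16 frame features -> LSTM over a 20-frame window -> FCN head.
import torch
import torch.nn as nn
from torchvision.models import vgg16, VGG16_Weights

class CascadeDeepfakeDetector(nn.Module):
    def __init__(self, hidden=256, dropout=0.5):
        super().__init__()
        backbone = vgg16(weights=VGG16_Weights.IMAGENET1K_V1)
        self.features = nn.Sequential(                  # pooled 512-d per-frame features
            backbone.features, nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.fcn = nn.Sequential(                       # small fully connected classifier
            nn.Linear(hidden, 64), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(64, 2),                           # real vs. fake logits
        )

    def forward(self, clip):                            # clip: (B, 20, 3, 224, 224)
        b, t = clip.shape[:2]
        with torch.no_grad():                           # frozen spatial backbone
            f = self.features(clip.flatten(0, 1)).view(b, t, -1)
        _, (h, _) = self.lstm(f)                        # temporal features per window
        return self.fcn(h[-1])

scores = CascadeDeepfakeDetector()(torch.randn(1, 20, 3, 224, 224))
```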
Affiliation(s)
- Muhammad Salihin Saealal
- Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang, Pekan Campus, Pekan, Pahang, Malaysia
- Electrical Engineering Technology Department, Faculty of Electric and Electronic Engineering Technology, Universiti Teknikal Malaysia Melaka, Durian Tunggal, Melaka, Malaysia
- Mohd Zamri Ibrahim
- Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang, Pekan Campus, Pekan, Pahang, Malaysia
- David J. Mulvaney
- School of Electronic, Electrical and Systems Engineering, Loughborough University, Loughborough, United Kingdom
- Mohd Ibrahim Shapiai
- Centre for Artificial Intelligence and Robotics, Malaysia-Japan International Institute of Technology, Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
- Norasyikin Fadilah
- Faculty of Electrical and Electronics Engineering Technology, Universiti Malaysia Pahang, Pekan Campus, Pekan, Pahang, Malaysia
3. Deep Learning Approach Based on Residual Neural Network and SVM Classifier for Driver's Distraction Detection. Appl Sci (Basel) 2022. [DOI: 10.3390/app12136626]
Abstract
In the last decade, driver distraction detection has gained significance owing to the rising number of accidents. Many solutions, including feature-based, statistical, and holistic approaches, have been proposed to solve this problem. With the advent of high processing power at lower cost, deep learning-based driver distraction detection techniques have shown promising results. This study proposes ReSVM, an approach that combines deep features from ResNet-50 with an SVM classifier for driver distraction detection. ReSVM is compared with six state-of-the-art approaches on four datasets, namely State Farm Distracted Driver Detection, Boston University, DrivFace, and FT-UMT. Experiments demonstrate that ReSVM outperforms the existing approaches, achieving a classification accuracy as high as 95.5%. The study also compares ReSVM with its variants on the aforementioned datasets.
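The ReSVM pipeline can be sketched as follows, assuming the 2048-dimensional pooled ResNet-50 activations as the deep features and an RBF-kernel SVM; the exact feature layer and SVM hyperparameters used in the study are not specified here, so these are illustrative.

```python
# Sketch: ImageNet-pretrained ResNet-50 as a frozen feature extractor + SVM classifier.
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights
from sklearn.svm import SVC

weights = ResNet50_Weights.IMAGENET1K_V2
extractor = nn.Sequential(*list(resnet50(weights=weights).children())[:-1])  # drop fc
extractor.eval()
preprocess = weights.transforms()          # resize / crop / normalize preset

@torch.no_grad()
def deep_features(images):                 # images: (N, 3, H, W) tensor
    return extractor(preprocess(images)).flatten(1).numpy()   # (N, 2048)

# X_train / y_train would be driver images and distraction-class labels:
# clf = SVC(kernel="rbf", C=10.0).fit(deep_features(X_train), y_train)
# preds = clf.predict(deep_features(X_test))
```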
4. Lu Y, Maftouni M, Yang T, Zheng P, Young D, Kong ZJ, Li Z. A novel disassembly process of end-of-life lithium-ion batteries enhanced by online sensing and machine learning techniques. J Intell Manuf 2022; 34:2463-2475. [PMID: 35462703] [PMCID: PMC9018251] [DOI: 10.1007/s10845-022-01936-x]
Abstract
An effective lithium-ion battery (LIB) recycling infrastructure is of great importance to alleviate concerns over the disposal of waste LIBs and the sustainability of the critical elements used to produce LIB components. End-of-life (EOL) LIBs come in various sizes and shapes, which creates significant challenges for automating certain unit operations (e.g., disassembly at the cell level) of the recycling process. Meanwhile, LIBs contain hazardous and flammable materials that pose serious exposure risks, making it difficult to dismantle them safely and efficiently to recover critical materials. Automation has become a competitive solution in the manufacturing world, allowing mass production at outstanding speed with great repeatability and quality, and it is imperative to develop automated disassembly solutions that can process LIBs effectively while safeguarding human workers from the hazardous environment. In this work, we demonstrate an automatic battery disassembly platform enhanced by online sensing and machine learning technologies. Computer vision is used to classify batteries by brand and size, and real-time temperature data are captured by a thermal camera. A data-driven model predicts the cutting temperature pattern so that temperature spikes can be mitigated by the closed-loop control system. Furthermore, quality control is conducted using a neural network model to detect and mitigate cutting defects. The integrated platform realizes real-time diagnosis and closed-loop control of the cutting process to optimize cutting quality and improve safety. Supplementary information: the online version contains supplementary material available at 10.1007/s10845-022-01936-x.
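As a rough illustration of the closed-loop idea, the sketch below trains a generic regressor to predict the next cutting-temperature sample from a short window of thermal-camera history and throttles the feed rate when a spike is predicted; the model choice, window length, thresholds, and synthetic data are all assumptions, not the platform's actual controller.

```python
# Illustrative closed-loop sketch: predict near-term cutting temperature from a
# sliding window of readings and slow the cut when a spike is forecast.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

WINDOW, T_LIMIT = 10, 80.0          # frames of history; hypothetical safe limit (C)

# Hypothetical training data: windows of past temperatures -> the next sample.
history = np.random.normal(60, 5, size=1000)
X = np.lib.stride_tricks.sliding_window_view(history[:-1], WINDOW)
y = history[WINDOW:]
model = RandomForestRegressor(n_estimators=50).fit(X, y)

def control_step(recent_temps, feed_rate):
    """Return an adjusted feed rate given the last WINDOW temperature samples."""
    t_next = model.predict(np.asarray(recent_temps).reshape(1, -1))[0]
    if t_next > T_LIMIT:                 # predicted spike: slow down to shed heat
        return feed_rate * 0.8
    return min(feed_rate * 1.05, 1.0)    # otherwise ramp gently back up
```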
Affiliation(s)
- Yingqi Lu
- Department of Mechanical Engineering, Virginia Tech, Blacksburg, USA
- Maede Maftouni
- Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, USA
- Tairan Yang
- Department of Mechanical Engineering, Virginia Tech, Blacksburg, USA
- Zhenyu James Kong
- Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, USA
- Zheng Li
- Department of Mechanical Engineering, Virginia Tech, Blacksburg, USA
5. Enabling Intelligent Recovery of Critical Materials from Li-Ion Battery through Direct Recycling Process with Internet-of-Things. Materials (Basel) 2021; 14:7153. [PMID: 34885314] [PMCID: PMC8658619] [DOI: 10.3390/ma14237153]
Abstract
The rapid market expansion of Li-ion batteries (LIBs) leads to concerns over the appropriate disposal of hazardous battery waste and the sustainability of the supply of critical materials for LIB production. Technologies and strategies to extend the life of LIBs and reuse their materials have long been sought. Direct recycling is a more effective recycling approach than existing ones with respect to cost, energy consumption, and emissions, and it has become increasingly feasible owing to digitalization and the adoption of the Internet-of-Things (IoT). To address the question of how IoT could enhance the direct recycling of LIBs, we first highlight the importance of direct recycling in tackling challenges in the LIB supply chain and discuss the characteristics and applications of the relevant IoT technologies. Finally, we share our perspective on a paradigm in which IoT is integrated into the direct recycling process of LIBs to improve its efficiency, intelligence, and effectiveness.
6. Recognition of Blinks Activity Patterns during Stress Conditions Using CNN and Markovian Analysis. Signals 2021. [DOI: 10.3390/signals2010006]
Abstract
This paper investigates eye behaviour through blink activity during stress conditions. Although eye blinking is a semi-voluntary action, it is considered to be affected by one's emotional state, such as arousal or stress. The blinking rate provides information in this direction; however, analysis of the entire eye aperture time series and the corresponding blinking patterns provides enhanced information on eye behaviour during stress conditions. Two experimental protocols were therefore established to induce affective states (neutral, relaxed, and stressed) systematically through a variety of external and internal stressors. The study populations included 24 and 58 participants, respectively, each performing 12 experimental affective trials. After the preprocessing phase, the eye aperture time series and the corresponding features were extracted. The behaviour of inter-blink intervals (IBI) was investigated using Markovian analysis to quantify incidence dynamics in sequences of blinks. Moreover, Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) network models were employed to discriminate stressed versus neutral tasks per cognitive process using the sequence of IBIs. The classification accuracy reached 81.3%, which is very promising considering the unimodal analysis and the noninvasive modality used.
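The Markovian analysis of inter-blink intervals can be illustrated with a short NumPy sketch that discretizes IBIs into states and estimates the state-transition matrix of the blink sequence; the three-state binning and the bin edges (in seconds) are assumptions for illustration.

```python
# Sketch: discretize inter-blink intervals (IBI) into states and estimate the
# row-stochastic transition matrix of the resulting Markov chain.
import numpy as np

def ibi_transition_matrix(ibis, edges=(2.0, 6.0)):
    """Transition matrix over short/medium/long IBI states (illustrative bins)."""
    states = np.digitize(ibis, edges)           # 0=short, 1=medium, 2=long
    n = len(edges) + 1
    counts = np.zeros((n, n))
    for a, b in zip(states[:-1], states[1:]):   # count observed transitions
        counts[a, b] += 1
    rows = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)

ibis = np.array([1.2, 0.8, 4.5, 7.0, 3.1, 0.9, 8.2])   # seconds between blinks
print(ibi_transition_matrix(ibis))
```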
7. Hao D, Peng J, Wang Y, Liu J, Zhou X, Zheng D. Evaluation of convolutional neural network for recognizing uterine contractions with electrohysterogram. Comput Biol Med 2019; 113:103394. [PMID: 31445226] [PMCID: PMC6839746] [DOI: 10.1016/j.compbiomed.2019.103394]
Abstract
Uterine contraction (UC) activity is commonly used to monitor the approach of labour and delivery. Electrohysterograms (EHGs) have recently been used to monitor UC and distinguish between efficient and inefficient contractions. In this study, we aimed to identify UC in EHG signals using a convolutional neural network (CNN). An open-access database (Icelandic 16-electrode EHG database from 45 pregnant women with 122 recordings, DB1) was used to develop a CNN model, and 14000 segments with a length of 45 s (7000 from UCs and 7000 from non-UCs, which were determined with reference to the simultaneously recorded tocography signals) were manually extracted from the 122 EHG recordings. Five-fold cross-validation was applied to evaluate the ability of the CNN to identify UC based on its sensitivity (SE), specificity (SP), accuracy (ACC), and area under the receiver operating characteristic curve (AUC). The CNN model developed using DB1 was then applied to an independent clinical database (DB2) to further test its generalisation for recognizing UCs. The EHG signals in DB2 were recorded from 20 pregnant women using our multi-channel system, and 308 segments (154 from UCs and 154 from non-UCs) were extracted. The CNN model from five-fold cross-validation achieved average SE, SP, ACC, and AUC of 0.87, 0.98, 0.93, and 0.92 for DB1, and 0.88, 0.97, 0.93, and 0.87 for DB2, respectively. In summary, we demonstrated that CNN could effectively identify UCs using EHG signals and could be used as a tool for monitoring maternal and foetal health.
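A minimal sketch of the evaluation protocol described above: a small 1-D CNN over fixed-length EHG segments scored with five-fold cross-validation. The network depth, the 900-sample segment length, and the abbreviated training loop are assumptions, and random placeholder data stand in for the EHG databases.

```python
# Sketch: tiny 1-D CNN for UC vs. non-UC segments, scored with 5-fold CV.
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import StratifiedKFold

def make_cnn():
    return nn.Sequential(
        nn.Conv1d(1, 8, 7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
        nn.Conv1d(8, 16, 7, padding=3), nn.ReLU(), nn.AdaptiveAvgPool1d(8),
        nn.Flatten(), nn.Linear(16 * 8, 2),           # UC vs. non-UC logits
    )

X = np.random.randn(200, 1, 900).astype(np.float32)  # placeholder EHG segments
y = np.random.randint(0, 2, 200).astype(np.int64)    # placeholder labels

for train_idx, test_idx in StratifiedKFold(n_splits=5, shuffle=True).split(X, y):
    net = make_cnn()
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    for _ in range(5):                                # short illustrative training
        opt.zero_grad()
        loss = nn.functional.cross_entropy(
            net(torch.from_numpy(X[train_idx])), torch.from_numpy(y[train_idx]))
        loss.backward(); opt.step()
    acc = (net(torch.from_numpy(X[test_idx])).argmax(1).numpy() == y[test_idx]).mean()
    print(f"fold accuracy: {acc:.2f}")
```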
Affiliation(s)
- Dongmei Hao
- College of Life Science and Bioengineering, Beijing University of Technology, Intelligent Physiological Measurement and Clinical Translation, Beijing International Platform for Scientific and Technological Cooperation, Beijing, 100024, China
- Jin Peng
- College of Life Science and Bioengineering, Beijing University of Technology, Intelligent Physiological Measurement and Clinical Translation, Beijing International Platform for Scientific and Technological Cooperation, Beijing, 100024, China; Medical Device and Technology Research Group, Faculty of Health, Education, Medicine and Social Care, Anglia Ruskin University, Chelmsford, CM1 1SQ, UK
- Ying Wang
- College of Life Science and Bioengineering, Beijing University of Technology, Intelligent Physiological Measurement and Clinical Translation, Beijing International Platform for Scientific and Technological Cooperation, Beijing, 100024, China
- Juntao Liu
- Department of Obstetrics, Peking Union Medical College Hospital, Beijing, 100730, China
- Xiya Zhou
- Department of Obstetrics, Peking Union Medical College Hospital, Beijing, 100730, China
- Dingchang Zheng
- Medical Device and Technology Research Group, Faculty of Health, Education, Medicine and Social Care, Anglia Ruskin University, Chelmsford, CM1 1SQ, UK
8. Park SH, Yoon HS, Park KR. Faster R-CNN and Geometric Transformation-Based Detection of Driver's Eyes Using Multiple Near-Infrared Camera Sensors. Sensors (Basel) 2019; 19:197. [PMID: 30621110] [PMCID: PMC6338982] [DOI: 10.3390/s19010197]
Abstract
Camera-based driver gaze tracking in the vehicle environment is being actively studied for vehicle interfaces and for judging driver inattention by analyzing forward attention. In existing single-camera methods, the eye information necessary for gaze tracking often cannot be observed well in the input image because the driver turns their head while driving. To solve this problem, existing studies have used multiple cameras to obtain images for tracking the driver's gaze. However, this approach has the drawback of excessive computation and processing time, as it involves detecting the eyes and extracting features in all images obtained from the multiple cameras, which makes it difficult to implement in an actual vehicle environment. To address these limitations, this study proposes a method in which a shallow convolutional neural network (CNN) adaptively selects, from the images of the driver's face acquired by two cameras, the one more suitable for detecting eye position; Faster R-CNN is then applied to the selected image to detect the driver's eyes, and the detected eye positions are mapped to the camera image of the other side through a geometric transformation matrix. Experiments were conducted using the self-built Dongguk Dual Camera-based Driver Database (DDCD-DB1), comprising images of 26 participants acquired inside a vehicle, and the open Columbia Gaze Data Set (CAVE-DB). The results confirmed that the performance of the proposed method is superior to that of existing methods.
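The geometric-transformation step can be sketched as follows: once the eyes are detected in the selected camera image, a precomputed 3x3 matrix maps those pixel positions into the other camera's image. The homography values here are a hypothetical calibration result, and the detection stage (shallow CNN plus Faster R-CNN) is represented only by example coordinates.

```python
# Sketch: map detected eye positions from camera 1 into camera 2 through a
# precomputed 3x3 homography (values are a hypothetical calibration result).
import numpy as np

H = np.array([[0.98, 0.02, 35.0],
              [-0.01, 1.01, -12.0],
              [0.0, 0.0, 1.0]])

def map_points(points, H):
    """Map (N, 2) pixel coordinates through a homography."""
    pts = np.hstack([points, np.ones((len(points), 1))])   # to homogeneous coords
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:]                    # back to pixels

eyes_cam1 = np.array([[310.0, 240.0], [380.0, 238.0]])     # detected eye centers
print(map_points(eyes_cam1, H))                             # positions in camera 2
```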
Affiliation(s)
- Sung Ho Park
- Division of Electronics and Electrical Engineering, Dongguk University, 30 Pildong-ro 1-gil, Jung-gu, Seoul 100-715, Korea
9. Lee JH, Kang T, Choi BK, Han IH, Kim BC, Ro JH. Application of Deep Learning System into the Development of Communication Device for Quadriplegic Patient. Korean J Neurotrauma 2019; 15:88-94. [PMID: 31720261] [PMCID: PMC6826084] [DOI: 10.13004/kjnt.2019.15.e17]
Affiliation(s)
- Jung Hwan Lee
- Department of Neurosurgery, Pusan National University Hospital, Busan, Korea
- Taewoo Kang
- Busan Cancer Center, Pusan National University Hospital, Busan, Korea
- Byung Kwan Choi
- Department of Neurosurgery, Pusan National University Hospital, Busan, Korea
- In Ho Han
- Department of Neurosurgery, Pusan National University Hospital, Busan, Korea
- Byung Chul Kim
- Department of Neurosurgery, Pusan National University Hospital, Busan, Korea
- Jung Hoon Ro
- Department of Biomedical Engineering, Pusan National University Hospital, Busan, Korea
10. Li B, Fu H, Wen D, Lo W. Etracker: A Mobile Gaze-Tracking System with Near-Eye Display Based on a Combined Gaze-Tracking Algorithm. Sensors (Basel) 2018; 18:1626. [PMID: 29783738] [PMCID: PMC5981618] [DOI: 10.3390/s18051626]
Abstract
Eye tracking technology has become increasingly important for psychological analysis, medical diagnosis, driver assistance systems, and many other applications. Various gaze-tracking models have been established by previous researchers; however, there is currently no near-eye display system that offers both accurate gaze tracking and a convenient user experience. In this paper, we constructed a complete prototype of the mobile gaze-tracking system 'Etracker' with a near-eye viewing device for human gaze tracking, and we proposed a combined gaze-tracking algorithm in which a convolutional neural network removes blink images and predicts a coarse gaze position, after which a geometric model yields accurate gaze tracking. Moreover, we proposed using the mean value of gazes in the calibration algorithm to resolve pupil-centre changes caused by nystagmus, so that an individual user only needs to calibrate the first time, making the system more convenient. Experiments on gaze data from 26 participants show that the eye-centre detection accuracy is 98% and that Etracker provides an average gaze accuracy of 0.53° at a rate of 30-60 Hz.
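A small sketch of the one-time calibration idea, assuming a CNN blink score is used to discard blink frames and the mean of the remaining gaze estimates at a known target serves as a per-user correction offset; the threshold and data are illustrative.

```python
# Sketch: drop blink frames (stand-in CNN score), then use the mean non-blink
# gaze at a known calibration target as a per-user correction offset.
import numpy as np

BLINK_THRESHOLD = 0.5                       # stand-in for the CNN blink-score cutoff

def calibrate(gaze_estimates, blink_scores, target):
    """One-time calibration: mean non-blink gaze vs. the known target point."""
    keep = np.asarray(blink_scores) < BLINK_THRESHOLD     # discard blinking frames
    mean_gaze = np.asarray(gaze_estimates)[keep].mean(axis=0)
    return np.asarray(target) - mean_gaze                 # correction offset

def correct(gaze, offset):
    return np.asarray(gaze) + offset

offset = calibrate(gaze_estimates=[[0.48, 0.52], [0.50, 0.49], [0.51, 0.50]],
                   blink_scores=[0.1, 0.9, 0.2], target=[0.5, 0.5])
print(correct([0.47, 0.53], offset))
```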
Affiliation(s)
- Bin Li
- Xi'an Institute of Optics and Precision Mechanics of CAS, Xi'an 710119, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Department of Computer Science, Chu Hai College of Higher Education, Tuen Mun, Hong Kong, China
- Hong Fu
- Department of Computer Science, Chu Hai College of Higher Education, Tuen Mun, Hong Kong, China
- Desheng Wen
- Xi'an Institute of Optics and Precision Mechanics of CAS, Xi'an 710119, China
- WaiLun Lo
- Department of Computer Science, Chu Hai College of Higher Education, Tuen Mun, Hong Kong, China
11. Arsalan M, Naqvi RA, Kim DS, Nguyen PH, Owais M, Park KR. IrisDenseNet: Robust Iris Segmentation Using Densely Connected Fully Convolutional Networks in the Images by Visible Light and Near-Infrared Light Camera Sensors. Sensors (Basel) 2018; 18:1501. [PMID: 29748495] [PMCID: PMC5981870] [DOI: 10.3390/s18051501]
Abstract
Recent advancements in computer vision have opened new horizons for deploying biometric recognition algorithms on mobile and handheld devices, and accurate iris recognition is now much needed in unconstrained scenarios. Such environments make the acquired iris image exhibit occlusion, low resolution, blur, unusual glint, ghost effects, and off-angles, constraints that prevailing segmentation algorithms cannot cope with. In addition, when near-infrared (NIR) light is unavailable, iris segmentation in visible-light environments is challenging owing to visible-light noise. Deep learning with convolutional neural networks (CNNs) has brought considerable breakthroughs in various applications. To address the iris segmentation issues in challenging situations captured by visible-light and near-infrared-light camera sensors, this paper proposes a densely connected fully convolutional network (IrisDenseNet) that can determine the true iris boundary even in inferior-quality images by exploiting better information gradient flow between the dense blocks. In the experiments, five datasets from visible-light and NIR environments were used. For the visible-light environment, the noisy iris challenge evaluation part-II (NICE-II, selected from the UBIRIS.v2 database) and mobile iris challenge evaluation (MICHE-I) datasets were used. For the NIR environment, the Institute of Automation, Chinese Academy of Sciences (CASIA) v4.0 interval, CASIA v4.0 distance, and IIT Delhi v1.0 iris datasets were used. Experimental results showed that the proposed IrisDenseNet achieves optimal segmentation and excellent performance over existing algorithms on all five datasets.
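The dense connectivity that gives IrisDenseNet its name can be sketched in a few lines of PyTorch: each layer receives the concatenation of all earlier feature maps, which is what improves information and gradient flow between dense blocks. The growth rate and depth are illustrative, and the full encoder-decoder segmentation network is omitted.

```python
# Sketch of a dense block: every layer sees the concatenation of all earlier
# feature maps (dense connections), improving gradient/information flow.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, in_ch, growth=12, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(in_ch + i * growth), nn.ReLU(),
                nn.Conv2d(in_ch + i * growth, growth, 3, padding=1),
            ))

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))  # dense connections
        return torch.cat(feats, dim=1)

out = DenseBlock(16)(torch.randn(1, 16, 64, 64))          # -> (1, 16 + 4*12, 64, 64)
```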
Affiliation(s)
- Muhammad Arsalan
- Division of Electronics and Electrical Engineering, Dongguk University, 30 Pildong-ro 1-gil, Jung-gu, Seoul 100-715, Korea
- Rizwan Ali Naqvi
- Division of Electronics and Electrical Engineering, Dongguk University, 30 Pildong-ro 1-gil, Jung-gu, Seoul 100-715, Korea
- Dong Seop Kim
- Division of Electronics and Electrical Engineering, Dongguk University, 30 Pildong-ro 1-gil, Jung-gu, Seoul 100-715, Korea
- Phong Ha Nguyen
- Division of Electronics and Electrical Engineering, Dongguk University, 30 Pildong-ro 1-gil, Jung-gu, Seoul 100-715, Korea
- Muhammad Owais
- Division of Electronics and Electrical Engineering, Dongguk University, 30 Pildong-ro 1-gil, Jung-gu, Seoul 100-715, Korea
- Kang Ryoung Park
- Division of Electronics and Electrical Engineering, Dongguk University, 30 Pildong-ro 1-gil, Jung-gu, Seoul 100-715, Korea
12.

13. Cao X, Wang P, Meng C, Bai X, Gong G, Liu M, Qi J. Region Based CNN for Foreign Object Debris Detection on Airfield Pavement. Sensors (Basel) 2018; 18:737. [PMID: 29494524] [PMCID: PMC5876630] [DOI: 10.3390/s18030737]
Abstract
In this paper, a novel algorithm based on a convolutional neural network (CNN) is proposed to detect foreign object debris (FOD) using optical imaging sensors. It contains two modules: an improved region proposal network (RPN) and a spatial transformer network (STN)-based CNN classifier. In the improved RPN, extra selection rules are designed and deployed to generate fewer, higher-quality candidates. Moreover, the efficiency of the CNN detector is significantly improved by introducing an STN layer. Compared with Faster R-CNN and the single shot multibox detector (SSD), the proposed algorithm achieves better results for FOD detection on airfield pavement in the experiments.
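The STN module can be sketched as follows: a small localization network predicts an affine transform, and the feature map is resampled with it so the downstream classifier sees spatially normalized candidates. The channel counts and localization-network shape are illustrative, not the paper's configuration.

```python
# Sketch of a spatial transformer: a localization net predicts affine parameters,
# and the feature map is warped with affine_grid / grid_sample.
import torch
import torch.nn as nn
import torch.nn.functional as F

class STN(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.loc = nn.Sequential(                  # localization network
            nn.Conv2d(channels, 8, 5), nn.ReLU(), nn.AdaptiveAvgPool2d(3),
            nn.Flatten(), nn.Linear(8 * 3 * 3, 6),
        )
        self.loc[-1].weight.data.zero_()           # start at the identity transform
        self.loc[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)         # per-sample affine parameters
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

warped = STN()(torch.randn(2, 32, 24, 24))         # spatially normalized features
```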
Affiliation(s)
- Xiaoguang Cao
- Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
- Peng Wang
- Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
- Cai Meng
- Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
- Xiangzhi Bai
- Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
- State Key Laboratory of Virtual Reality Technology and Systems, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
- Guoping Gong
- Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
- Miaoming Liu
- Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
- Jun Qi
- Image Processing Center, Beijing University of Aeronautics and Astronautics, Beijing 100191, China