1
Chen S, Aramvith S, Miyanaga Y. Learning-Based Rate Control for High Efficiency Video Coding. Sensors (Basel) 2023; 23:3607. [PMID: 37050667 PMCID: PMC10098671 DOI: 10.3390/s23073607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 03/24/2023] [Accepted: 03/27/2023] [Indexed: 06/19/2023]
Abstract
High efficiency video coding (HEVC) has dramatically enhanced coding efficiency compared to the previous video coding standard, H.264/AVC. However, the existing rate control updates its parameters according to a fixed initialization, which can cause errors in the prediction of bit allocation to each coding tree unit (CTU) in frames. This paper proposes a learning-based mapping method between rate control parameters and video contents to achieve an accurate target bit rate and good video quality. The proposed framework contains two main structural codings, including spatial and temporal coding. We initiate an effective learning-based particle swarm optimization for spatial and temporal coding to determine the optimal parameters at the CTU level. For temporal coding at the picture level, we introduce semantic residual information into the parameter updating process to regulate the bit correctly on the actual picture. Experimental results indicate that the proposed algorithm is effective for HEVC and outperforms the state-of-the-art rate control in the HEVC reference software (HM-16.10) by 0.19 dB on average and up to 0.41 dB for low-delay P coding structure.
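The abstract above refers to rate control parameters that start from a fixed initialization. A minimal sketch of the kind of model this usually means is the R-lambda relation, lambda = alpha * bpp^beta, followed by a lambda-to-QP mapping. The initial values of alpha and beta, the QP mapping constants, and all function names below are assumptions for illustration (they are values commonly quoted for the HM reference software), not details taken from the paper.

```python
import math

# Assumed fixed initial parameters (commonly quoted for HM); not from the abstract.
ALPHA_INIT = 3.2003
BETA_INIT = -1.367


def lambda_from_bpp(bpp, alpha=ALPHA_INIT, beta=BETA_INIT):
    """R-lambda model: lambda = alpha * bpp ** beta (bpp = target bits per pixel)."""
    if bpp <= 0:
        raise ValueError("bpp must be positive")
    return alpha * bpp ** beta


def qp_from_lambda(lam):
    """Assumed empirical lambda-to-QP mapping, clipped to the HEVC QP range 0..51."""
    qp = 4.2005 * math.log(lam) + 13.7122
    return max(0, min(51, round(qp)))


def allocate_ctu_bits(frame_bits, weights):
    """Split a frame's bit budget across CTUs in proportion to per-CTU weights."""
    total = sum(weights)
    return [frame_bits * w / total for w in weights]


if __name__ == "__main__":
    for bpp in (0.05, 0.1, 0.5):
        lam = lambda_from_bpp(bpp)
        print(bpp, round(lam, 2), qp_from_lambda(lam))
```

A learning-based scheme such as the one described would replace the fixed `ALPHA_INIT` and `BETA_INIT` with content-dependent values; how the paper does that is not specified in the abstract.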
Affiliation(s)
- Sovann Chen
  - Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand
- Supavadee Aramvith
  - Multimedia Data Analytics and Processing Research Unit, Department of Electrical Engineering, Faculty of Engineering, Chulalongkorn University, Bangkok 10330, Thailand
2
Ni CT, Huang YC, Chen PY. A Hardware-Friendly and High-Efficiency H.265/HEVC Encoder for Visual Sensor Networks. Sensors (Basel) 2023; 23:2625. [PMID: 36904828 PMCID: PMC10007421 DOI: 10.3390/s23052625] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/02/2023] [Revised: 02/21/2023] [Accepted: 02/23/2023] [Indexed: 06/12/2023]
Abstract
Visual sensor networks (VSNs) have numerous applications in fields such as wildlife observation, object recognition, and smart homes. However, visual sensors generate vastly more data than scalar sensors. Storing and transmitting these data is challenging. High-efficiency video coding (HEVC/H.265) is a widely used video compression standard. Compared to H.264/AVC, HEVC reduces the bit rate by approximately 50% at the same video quality, which yields a high compression ratio for visual data but results in high computational complexity. In this study, we propose a hardware-friendly and high-efficiency H.265/HEVC accelerating algorithm to overcome this complexity for visual sensor networks. The proposed method leverages texture direction and complexity to skip redundant processing in CU partition and accelerate intra prediction for intra-frame encoding. Experimental results revealed that the proposed method could reduce encoding time by 45.33% and increase the Bjontegaard delta bit rate (BDBR) by only 1.07% as compared to HM16.22 under all-intra configuration. Moreover, the proposed method reduced the encoding time for six visual sensor video sequences by 53.72%. These results confirm that the proposed method achieves high efficiency and a favorable balance between the BDBR and encoding time reduction.
3
Kaczyński M, Piotrowski Z, Pietrow D. High-Quality Video Watermarking Based on Deep Neural Networks for Video with HEVC Compression. Sensors (Basel) 2022; 22:7552. [PMID: 36236650 PMCID: PMC9572223 DOI: 10.3390/s22197552] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Revised: 10/01/2022] [Accepted: 10/02/2022] [Indexed: 06/16/2023]
Abstract
This article presents a method for transparent watermarking of high-capacity watermarked video under H.265/HEVC (High-Efficiency Video Coding) compression conditions while maintaining high-quality encoded image. The aim of this paper is to present a method for watermark embedding using neural networks under conditions of subjecting video to lossy compression of the HEVC codec using the YUV420p color model chrominance channel for watermarking. This paper presents a method for training a deep neural network to embed a watermark when a compression channel is present. The discussed method is characterized by high accuracy of the video with an embedded watermark compared to the original. The PSNR (peak signal-to-noise ratio) values obtained are over 44 dB. The watermark capacity is 96 bits for an image with a resolution of 128 × 128. The method enables the complete recovery of a watermark from a single video frame compressed by the HEVC codec within the range of compression values defined by the CRF (constant rate factor) up to 22.
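The quality and capacity figures quoted above (PSNR over 44 dB, 96 bits per 128 × 128 image) follow from standard definitions. The sketch below shows how such numbers are typically computed, assuming 8-bit samples; the function names are hypothetical and the code is not taken from the paper.

```python
import math


def psnr(original, distorted, max_value=255):
    """PSNR in dB between two equally sized sample sequences (8-bit by default)."""
    if len(original) != len(distorted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)


def payload_bits_per_pixel(payload_bits, width, height):
    """Watermark capacity expressed as bits per pixel."""
    return payload_bits / (width * height)


if __name__ == "__main__":
    print(psnr([10, 20, 30, 40], [11, 21, 31, 41]))
    print(payload_bits_per_pixel(96, 128, 128))
```

With these definitions, 96 bits in a 128 × 128 image corresponds to about 0.0059 bits per pixel.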
4
Almomani I, Alkhayer A, El-Shafai W. A Crypto-Steganography Approach for Hiding Ransomware within HEVC Streams in Android IoT Devices. Sensors (Basel) 2022; 22:2281. [PMID: 35336452 DOI: 10.3390/s22062281] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/21/2022] [Revised: 03/07/2022] [Accepted: 03/14/2022] [Indexed: 11/25/2022]
Abstract
Steganography is a vital security approach that hides any secret content within ordinary data, such as multimedia. This hiding aims to achieve the confidentiality of the IoT secret data, whether it is benign or malicious (e.g., ransomware) and for defensive or offensive purposes. This paper introduces a hybrid crypto-steganography approach for ransomware hiding within high-resolution video frames. This proposed approach is based on hybridizing an AES (advanced encryption standard) algorithm and an LSB (least significant bit) steganography process. Initially, AES encrypts the secret Android ransomware data, and then LSB embeds it based on random selection criteria for the cover video pixels. This research examined broad objective and subjective quality assessment metrics to evaluate the performance of the proposed hybrid approach. We used different sizes of ransomware samples and different resolutions of HEVC (high-efficiency video coding) frames to conduct simulation experiments and comparison studies. The assessment results prove the superior efficiency of the introduced hybrid crypto-steganography approach compared to other existing steganography approaches in terms of (1) achieving the integrity of the secret ransomware data, (2) ensuring higher imperceptibility of stego video frames, (3) introducing a multi-level security approach using the AES encryption in addition to the LSB steganography, (4) performing randomness embedding based on RPS (random pixel selection) for concealing secret ransomware bits, (5) succeeding in fully extracting the ransomware data at the receiver side, (6) obtaining strong subjective and objective qualities for all tested evaluation metrics, (7) embedding different sizes of secret data at the same time within the video frame, and finally (8) passing the security scanning tests of 70 antivirus engines without detecting the existence of the embedded ransomware.
5
Martínez-Rach MO, Migallón H, López-Granado O, Galiano V, Malumbres MP. Performance Overview of the Latest Video Coding Proposals: HEVC, JEM and VVC. J Imaging 2021; 7:39. [PMID: 34460638 DOI: 10.3390/jimaging7020039] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/18/2021] [Revised: 02/10/2021] [Accepted: 02/11/2021] [Indexed: 11/16/2022]
Abstract
The audiovisual entertainment industry has entered a race to find the video encoder offering the best Rate/Distortion (R/D) performance for high-quality high-definition video content. The challenge consists in providing a moderate to low computational/hardware complexity encoder able to run Ultra High-Definition (UHD) video formats of different flavours (360°, AR/VR, etc.) with state-of-the-art R/D performance results. It is necessary to evaluate not only R/D performance, a highly important feature, but also the complexity of future video encoders. New coding tools offering a small increase in R/D performance at the cost of greater complexity are being advanced with caution. We performed a detailed analysis of two evolutions of High Efficiency Video Coding (HEVC) video standards, Joint Exploration Model (JEM) and Versatile Video Coding (VVC), in terms of both R/D performance and complexity. The results show how VVC, which represents the new direction of future standards, has, for the time being, sacrificed R/D performance in order to significantly reduce overall coding/decoding complexity.
6
Hassan A, Ghafoor M, Tariq SA, Zia T, Ahmad W. High Efficiency Video Coding (HEVC)-Based Surgical Telementoring System Using Shallow Convolutional Neural Network. J Digit Imaging 2019; 32:1027-1043. [PMID: 30980262 PMCID: PMC6841856 DOI: 10.1007/s10278-019-00206-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
Abstract
Surgical telementoring systems have gained lots of interest, especially in remote locations. However, bandwidth constraint has been the primary bottleneck for efficient telementoring systems. This study aims to establish an efficient surgical telementoring system, where the qualified surgeon (mentor) provides real-time guidance and technical assistance for surgical procedures to the on-spot physician (surgeon). High Efficiency Video Coding (HEVC/H.265)-based video compression has shown promising results for telementoring applications. However, there is a trade-off between the bandwidth resources required for video transmission and quality of video received by the remote surgeon. In order to efficiently compress and transmit real-time surgical videos, a hybrid lossless-lossy approach is proposed where surgical incision region is coded in high quality whereas the background region is coded in low quality based on distance from the surgical incision region. For surgical incision region extraction, state-of-the-art deep learning (DL) architectures for semantic segmentation can be used. However, the computational complexity of these architectures is high resulting in large training and inference times. For telementoring systems, encoding time is crucial; therefore, very deep architectures are not suitable for surgical incision extraction. In this study, we propose a shallow convolutional neural network (S-CNN)-based segmentation approach that consists of encoder network only for surgical region extraction. The segmentation performance of S-CNN is compared with one of the state-of-the-art image segmentation networks (SegNet), and results demonstrate the effectiveness of the proposed network. The proposed telementoring system is efficient and explicitly considers the physiological nature of the human visual system to encode the video by providing good overall visual impact in the location of surgery. 
The results of the proposed S-CNN-based segmentation demonstrated a pixel accuracy of 97% and a mean intersection over union accuracy of 79%. Similarly, HEVC experimental results showed that the proposed surgical region-based encoding scheme achieved an average bitrate reduction of 88.8% at high-quality settings in comparison with default full-frame HEVC encoding. The average gain in encoding performance (signal-to-noise) of the proposed algorithm is 11.5 dB in the surgical region. The bitrate saving and visual quality of the proposed optimal bit allocation scheme are compared with the mean shift segmentation-based coding scheme for fair comparison. The results show that the proposed scheme maintains high visual quality in surgical incision region along with achieving good bitrate saving. Based on comparison and results, the proposed encoding algorithm can be considered as an efficient and effective solution for surgical telementoring systems for low-bandwidth networks.
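The two segmentation metrics reported above, pixel accuracy and mean intersection over union, can be illustrated on flattened label masks. This is a minimal sketch using the standard definitions; the function names and the toy masks are assumptions, and nothing here reproduces the paper's S-CNN or its data.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    matches = sum(p == t for p, t in zip(pred, truth))
    return matches / len(truth)


def mean_iou(pred, truth, num_classes=2):
    """Mean of per-class intersection over union, skipping classes absent from both masks."""
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, truth))
        union = sum(p == c or t == c for p, t in zip(pred, truth))
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Flattened toy masks: 1 = surgical region, 0 = background.
    pred = [1, 1, 0, 0]
    truth = [1, 0, 0, 0]
    print(pixel_accuracy(pred, truth), mean_iou(pred, truth))
```

The toy example also shows why the two numbers can differ widely: a single misclassified pixel costs little accuracy but halves the IoU of the small foreground class.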
Affiliation(s)
- Ali Hassan
  - Department of Computer Science, COMSATS University, Islamabad, Pakistan
- Mubeen Ghafoor
  - Department of Computer Science, COMSATS University, Islamabad, Pakistan
- Syed Ali Tariq
  - Department of Computer Science, COMSATS University, Islamabad, Pakistan
- Tehseen Zia
  - Department of Computer Science, COMSATS University, Islamabad, Pakistan
- Waqas Ahmad
  - Department of Information Systems and Technology, Mid Sweden University, Sundsvall, Sweden
7
Jeong J, Jang D, Son J, Ryu ES. 3DoF+ 360 Video Location-Based Asymmetric Down-Sampling for View Synthesis to Immersive VR Video Streaming. Sensors (Basel) 2018; 18:3148. [PMID: 30231529 PMCID: PMC6164630 DOI: 10.3390/s18093148] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/27/2018] [Revised: 09/13/2018] [Accepted: 09/14/2018] [Indexed: 05/14/2023]
Abstract
Recently, with the increasing demand for virtual reality (VR), experiencing immersive content with VR has become easier. However, a tremendous amount of calculation and bandwidth is required when processing 360 videos. Moreover, additional information such as the depth of the video is required to enjoy stereoscopic 360 content. Therefore, this paper proposes an efficient method of streaming high-quality 360 videos. To reduce the bandwidth when streaming and synthesizing the 3DoF+ 360 videos, which support limited movements of the user, a proper down-sampling ratio and quantization parameter are derived from the analysis of the graph between bitrate and peak signal-to-noise ratio. High-efficiency video coding (HEVC) is used to encode and decode the 360 videos, and the view synthesizer produces the video of the intermediate view, providing the user with an immersive experience.
Affiliation(s)
- JongBeom Jeong
  - Department of Computer Engineering, Gachon University, Seongnam 13120, Korea.
- Dongmin Jang
  - Department of Computer Engineering, Gachon University, Seongnam 13120, Korea.
- Jangwoo Son
  - Department of Computer Engineering, Gachon University, Seongnam 13120, Korea.
- Eun-Seok Ryu
  - Department of Computer Engineering, Gachon University, Seongnam 13120, Korea.
8
Abenza PPG, Malumbres MP, Piñol P, López-Granado O. Source Coding Options to Improve HEVC Video Streaming in Vehicular Networks. Sensors (Basel) 2018; 18:E3107. [PMID: 30223525 DOI: 10.3390/s18093107] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/21/2018] [Revised: 09/11/2018] [Accepted: 09/12/2018] [Indexed: 11/25/2022]
Abstract
Video delivery in Vehicular Ad-hoc NETworks has a great number of applications. However, multimedia streaming over this kind of network is a very challenging issue because (a) it is one of the most resource-demanding applications; (b) it requires high bandwidth communication channels; (c) it shows moderate to high node mobility patterns; and (d) it is common to find high communication interference levels that result in moderate to high loss rates. In this work, we present a simulation framework based on the OMNeT++ network simulator, the Veins framework, and the SUMO mobility traffic simulator that aims to study, evaluate, and also design new techniques to improve video delivery over Vehicular Ad-hoc NETworks. Using the proposed simulation framework we will study different coding options, available at the HEVC video encoder, that will help to improve the perceived video quality in this kind of network. The experimental results show that packet losses significantly reduce video quality when low interference levels are found in an urban scenario. By using different INTRA refresh options combined with appropriate tile coding, we will improve the resilience of HEVC video delivery services in VANET urban scenarios.
9
Abstract
With increasing utilization of medical imaging in clinical practice and the growing dimensions of data volumes generated by various medical imaging modalities, the distribution, storage, and management of digital medical image data sets requires data compression. Over the past few decades, several image compression standards have been proposed by international standardization organizations. This paper discusses the current status of these image compression standards in medical imaging applications together with some of the legal and regulatory issues surrounding the use of compression in medical settings.
Affiliation(s)
- Feng Liu
  - College of Electronic Information and Optical Engineering, Nankai University, Haihe Education Park, 38 Tongyan Road, Jinnan District, Tianjin 300353, P. R. China
- Miguel Hernandez-Cabronero
  - Department of Electrical and Computer Engineering, The University of Arizona; 1230 E. Speedway Blvd, Tucson, AZ, 85721, U.S.A
- Victor Sanchez
  - Department of Computer Science, University of Warwick, Coventry, CV4 7AL, United Kingdom
- Michael W. Marcellin
  - Department of Electrical and Computer Engineering, The University of Arizona; 1230 E. Speedway Blvd, Tucson, AZ, 85721, U.S.A
- Ali Bilgin
  - Department of Electrical and Computer Engineering, The University of Arizona; 1230 E. Speedway Blvd, Tucson, AZ, 85721, U.S.A
  - Department of Biomedical Engineering, The University of Arizona; 1127 E. James E. Rogers Way, Tucson, AZ, 85721, U.S.A
  - Department of Medical Imaging, The University of Arizona; 1501 N. Campbell Ave., Tucson, AZ, 85724, U.S.A
10
de Melo WC, de Lima Filho EB, da Silva Júnior WS. SEMG signal compression based on two-dimensional techniques. Biomed Eng Online 2016; 15:41. [PMID: 27091454 PMCID: PMC4835940 DOI: 10.1186/s12938-016-0158-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 10/09/2015] [Accepted: 04/05/2016] [Indexed: 11/19/2022]
Abstract
Background: Recently, two-dimensional techniques have been successfully employed for compressing surface electromyographic (SEMG) records as images, through the use of image and video encoders. Such schemes usually provide specific compressors, which are tuned for SEMG data, or employ preprocessing techniques, before the two-dimensional encoding procedure, in order to provide a suitable data organization, whose correlations can be better exploited by off-the-shelf encoders. Besides preprocessing input matrices, one may also depart from those approaches and employ an adaptive framework, which is able to directly tackle SEMG signals reassembled as images. Methods: This paper proposes a new two-dimensional approach for SEMG signal compression, which is based on a recurrent pattern matching algorithm called multidimensional multiscale parser (MMP). The mentioned encoder was modified, in order to efficiently work with SEMG signals and exploit their inherent redundancies. Moreover, a new preprocessing technique, named as segmentation by similarity (SbS), which has the potential to enhance the exploitation of intra- and intersegment correlations, is introduced, the percentage difference sorting (PDS) algorithm is employed, with different image compressors, and results with the high efficiency video coding (HEVC), H.264/AVC, and JPEG2000 encoders are presented. Results: Experiments were carried out with real isometric and dynamic records, acquired in laboratory. Dynamic signals compressed with H.264/AVC and HEVC, when combined with preprocessing techniques, resulted in good percent root-mean-square difference × compression factor figures, for low and high compression factors, respectively. Besides, regarding isometric signals, the modified two-dimensional MMP algorithm outperformed state-of-the-art schemes, for low compression factors, the combination between SbS and HEVC proved to be competitive, for high compression factors, and JPEG2000, combined with PDS, provided good performance allied to low computational complexity, all in terms of percent root-mean-square difference × compression factor. Conclusion: The proposed schemes are effective and, specifically, the modified MMP algorithm can be considered as an interesting alternative for isometric signals, regarding traditional SEMG encoders. Besides, the approach based on off-the-shelf image encoders has the potential of fast implementation and dissemination, given that many embedded systems may already have such encoders available, in the underlying hardware/software architecture.
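The abstract evaluates every scheme by percent root-mean-square difference (PRD) against compression factor (CF). A minimal sketch of both quantities follows. The PRD formula is the usual one without mean removal, and the CF definition (percentage of bits saved) is an assumption based on common SEMG compression practice; the abstract itself does not state which variants the authors use, and the function names are hypothetical.

```python
import math


def prd(original, reconstructed):
    """Percent root-mean-square difference between a signal and its reconstruction."""
    num = sum((x - y) ** 2 for x, y in zip(original, reconstructed))
    den = sum(x ** 2 for x in original)
    return 100 * math.sqrt(num / den)


def compression_factor(original_bits, compressed_bits):
    """Assumed CF definition: percentage of bits saved relative to the original size."""
    return 100 * (original_bits - compressed_bits) / original_bits


if __name__ == "__main__":
    signal = [3.0, 4.0]
    print(prd(signal, [3.0, 4.0]), prd(signal, [0.0, 0.0]))
    print(compression_factor(1000, 250))
```

A PRD × CF curve is then obtained by encoding the same record at several rates and plotting one (CF, PRD) pair per rate; lower PRD at a given CF means a better codec.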
11
Pan Z, Chen L, Sun X. Low Complexity HEVC Encoder for Visual Sensor Networks. Sensors (Basel) 2015; 15:30115-25. [PMID: 26633415 PMCID: PMC4721709 DOI: 10.3390/s151229788] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 10/26/2015] [Revised: 11/24/2015] [Accepted: 11/26/2015] [Indexed: 11/16/2022]
Abstract
Visual sensor networks (VSNs) can be widely applied in security surveillance, environmental monitoring, smart rooms, etc. However, with the increased number of camera nodes in VSNs, the volume of the visual information data increases significantly, which becomes a challenge for storing, processing, and transmitting the visual data. The state-of-the-art video compression standard, high efficiency video coding (HEVC), can effectively compress the raw visual data, while the higher compression rate comes at the cost of heavy computational complexity. Hence, reducing the encoding complexity becomes vital for the HEVC encoder to be used in VSNs. In this paper, we propose a fast coding unit (CU) depth decision method to reduce the encoding complexity of the HEVC encoder for VSNs. Firstly, the content property of the CU is analyzed. Then, an early CU depth decision method and a low complexity distortion calculation method are proposed for the CUs with homogeneous content. Experimental results show that the proposed method achieves average encoding time savings of 71.91% for the HEVC encoder for VSNs.
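The idea of stopping the CU quadtree early on homogeneous content can be sketched as follows. The abstract does not say how homogeneity is measured, so this example assumes a simple sample-variance threshold as a stand-in; the threshold value, the function names, and the toy blocks are all hypothetical, and a real encoder would combine such a test with rate-distortion checks.

```python
def block_variance(block):
    """Sample variance of a square block given as a list of rows."""
    samples = [s for row in block for s in row]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def split_quadrants(block):
    """Split a square block into its four equally sized quadrants."""
    half = len(block) // 2
    return [
        [row[:half] for row in block[:half]],
        [row[half:] for row in block[:half]],
        [row[:half] for row in block[half:]],
        [row[half:] for row in block[half:]],
    ]


def cu_depths(block, depth=0, max_depth=3, threshold=4.0):
    """Return the depth of every leaf CU; stop splitting when the block looks homogeneous."""
    if depth == max_depth or len(block) < 2 or block_variance(block) <= threshold:
        return [depth]  # early termination: deeper partitions are not evaluated
    leaves = []
    for quad in split_quadrants(block):
        leaves.extend(cu_depths(quad, depth + 1, max_depth, threshold))
    return leaves


if __name__ == "__main__":
    flat = [[128] * 8 for _ in range(8)]
    edge = [[0] * 4 + [200] * 4 for _ in range(8)]
    print(cu_depths(flat), cu_depths(edge))
```

The time saving comes from the branches that are never visited: a flat block returns at depth 0 without evaluating any of its sub-blocks.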
Affiliation(s)
- Zhaoqing Pan
  - School of Computer and Software, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing 210044, China.
  - School of Computer Science and Engineering, Hebei University of Technology, Tianjin 300401, China.
- Liming Chen
  - School of Computer and Software, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing 210044, China.
- Xingming Sun
  - School of Computer and Software, Jiangsu Engineering Center of Network Monitoring, Nanjing University of Information Science and Technology, Nanjing 210044, China.