1
|
Yang Z, Gao W, Li G, Yan Y. SUR-Driven Video Coding Rate Control for Jointly Optimizing Perceptual Quality and Buffer Control. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2023; 32:5451-5464. [PMID: 37768799 DOI: 10.1109/tip.2023.3312919] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/30/2023]
Abstract
Rate control plays an important role in video coding and has attracted considerable attention from researchers. However, the problems of human visual experience and buffer stability still remain. For scenes with drastic motion, some distortions can be masked due to the limitations of the Human Visual System (HVS), while buffers tend to suffer more overflow and underflow cases from the fluctuating bits. In this paper, we propose a novel joint rate control scheme for HEVC, composed of SUR-based perception modeling and SUR-based Perception-Buffer Rate Control (PBRC), to maximize human visual perception quality while preventing the underflow and overflow of buffers. First, to effectively model human visual quality, we introduce the perception-related Satisfied-User-Ratio (SUR) metric into the rate control process. Second, a time-efficient video quality prediction method called Fast Visual Multimethod Assessment Fusion (VMAF) Quality Prediction (FVQP) is designed for the generation of SUR curves within an affordable computational complexity. Third, a dual-objective optimization framework is established. By jointly conducting perception modeling and PBRC, we can flexibly adjust the optimization priority between human visual quality and buffer stability, and thus the quality of the reconstructed videos can be effectively improved because of the decrease in frame skipping. Experimental results demonstrate that the proposed joint rate control scheme improves the human visual experience when considering frame skipping and stabilizes the buffer more effectively than existing methods.
|
2
|
Wang X, Yin H, Lu Y, Zhao S, Chen Y. Semantically Adaptive JND Modeling with Object-Wise Feature Characterization, Context Inhibition and Cross-Object Interaction. SENSORS (BASEL, SWITZERLAND) 2023; 23:3149. [PMID: 36991860 PMCID: PMC10059135 DOI: 10.3390/s23063149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 01/12/2023] [Accepted: 01/15/2023] [Indexed: 06/19/2023]
Abstract
Performance bottlenecks have emerged in the optimization of JND modeling based on low-level, hand-crafted visual feature metrics. High-level semantics have a considerable impact on perceptual attention and subjective video quality, yet most existing JND models do not adequately account for this impact. This indicates that there is still much room for performance optimization in semantic feature-based JND models. To address this, this paper investigates the response of visual attention induced by heterogeneous semantic features with an eye on three aspects, i.e., object, context, and cross-object, to further improve the efficiency of JND models. On the object side, this paper first focuses on the main semantic features that affect visual attention, including semantic sensitivity, objective area and shape, and central bias. Following that, the coupling role of heterogeneous visual features with HVS perceptual properties is analyzed and quantified. Second, based on the reciprocity of objects and contexts, the contextual complexity is measured to gauge the inhibitory effect of contexts on visual attention. Third, cross-object interactions are dissected using the principle of bias competition, and a semantic attention model is constructed in conjunction with a model of attentional competition. Finally, to build an improved transform domain JND model, a weighting factor is used by fusing the semantic attention model with the basic spatial attention model. Extensive simulation results validate that the proposed JND profile is highly consistent with HVS and highly competitive among state-of-the-art models.
Affiliation(s)
- Xia Wang
- School of Communication Engineering, Hangzhou Dianzi University, No. 2 Street, Xiasha, Hangzhou 310018, China
- Lishui Institute of Hangzhou Dianzi University, Nanmingshan Street, Liandu, Lishui 323000, China
- Haibing Yin
- School of Communication Engineering, Hangzhou Dianzi University, No. 2 Street, Xiasha, Hangzhou 310018, China
- Lishui Institute of Hangzhou Dianzi University, Nanmingshan Street, Liandu, Lishui 323000, China
- Yu Lu
- School of Communication Engineering, Hangzhou Dianzi University, No. 2 Street, Xiasha, Hangzhou 310018, China
- Shiling Zhao
- School of Communication Engineering, Hangzhou Dianzi University, No. 2 Street, Xiasha, Hangzhou 310018, China
- Lishui Institute of Hangzhou Dianzi University, Nanmingshan Street, Liandu, Lishui 323000, China
- Yong Chen
- Hangzhou Arcvideo Technology Co., Ltd., No. 3 Xidoumen Road, Xihu, Hangzhou 310012, China
|
3
|
Zhang Z, Shang X, Li G, Wang G. Just Noticeable Difference Model for Images with Color Sensitivity. SENSORS (BASEL, SWITZERLAND) 2023; 23:2634. [PMID: 36904837 PMCID: PMC10007073 DOI: 10.3390/s23052634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2023] [Revised: 02/20/2023] [Accepted: 02/23/2023] [Indexed: 06/18/2023]
Abstract
The just noticeable difference (JND) model reflects the visibility limitations of the human visual system (HVS), which plays an important role in perceptual image/video processing and is commonly applied to perceptual redundancy removal. However, existing JND models are usually constructed by treating the color components of three channels equally, and their estimation of the masking effect is inadequate. In this paper, we introduce visual saliency and color sensitivity modulation to improve the JND model. First, we comprehensively combined contrast masking, pattern masking, and edge protection to estimate the masking effect. Then, the visual saliency of HVS was taken into account to adaptively modulate the masking effect. Finally, we built color sensitivity modulation according to the perceptual sensitivities of HVS, to adjust the sub-JND thresholds of Y, Cb, and Cr components. Thus, the color-sensitivity-based JND model (CSJND) was constructed. Extensive experiments and subjective tests were conducted to verify the effectiveness of the CSJND model. We found that the consistency between the CSJND model and HVS was better than that of existing state-of-the-art JND models.
Affiliation(s)
- Xiwu Shang
- Correspondence: ; Tel.: +86-021-6779-1084
|
4
|
Construction of multivalued cryptographic boolean function using recurrent neural network and its application in image encryption scheme. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10295-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
|
5
|
Color-Dense Illumination Adjustment Network for Removing Haze and Smoke from Fire Scenario Images. SENSORS 2022; 22:s22030911. [PMID: 35161660 PMCID: PMC8838094 DOI: 10.3390/s22030911] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/23/2021] [Revised: 01/14/2022] [Accepted: 01/21/2022] [Indexed: 12/04/2022]
Abstract
The atmospheric particles and aerosols from burning usually cause visual artifacts in single images captured from fire scenarios. Most existing haze removal methods exploit the atmospheric scattering model (ASM) for visual enhancement, which inevitably leads to inaccurate estimation of the atmospheric light and transmission matrix of the smoky and hazy inputs. To solve these problems, we present a novel color-dense illumination adjustment network (CIANet) for joint recovery of transmission matrix, illumination intensity, and the dominant color of aerosols from a single image. Meanwhile, to improve the visual effects of the recovered images, the proposed CIANet jointly optimizes the transmission map, atmospheric optical value, the color of aerosol, and a preliminary recovered scene. Furthermore, we designed a reformulated ASM, called the aerosol scattering model (ESM), to smooth out the enhancement results while keeping the visual effects and the semantic information of different objects. Experimental results on both the proposed RFSIE and NTIRE’20 demonstrate that our method performs favorably against state-of-the-art dehazing methods regarding PSNR, SSIM and subjective visual quality. Furthermore, when concatenating CIANet with Faster R-CNN, we observe an improvement in object detection performance by a large margin.
|
6
|
Goundar S, Bhardwaj A, Prakash SS, Sadal P. Use of Artificial Neural Network for Forecasting Health Insurance Entitlements. JOURNAL OF INFORMATION TECHNOLOGY RESEARCH 2022. [DOI: 10.4018/jitr.299372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
A number of numerical practices exist that actuaries use to predict the annual medical claims expense in an insurance company. This amount needs to be included in the yearly financial budgets. Inaccurate estimates generally have negative effects on the overall performance of the business. This paper presents the development of an Artificial Neural Network model that is appropriate for predicting the anticipated annual medical claims. Once the implementation of the neural network models was finished, the focus was to decrease the Mean Absolute Percentage Error by adjusting parameters such as the number of epochs, the learning rate, and the neurons in different layers. Both Feed Forward and Recurrent Neural Networks were implemented to forecast the yearly claims amount. In conclusion, the Artificial Neural Network model that was implemented proved to be an effective tool for forecasting the anticipated annual medical claims. The Recurrent Neural Network outperformed the Feed Forward Neural Network in terms of accuracy and the computation power required to carry out the forecasting.
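The Mean Absolute Percentage Error mentioned in the abstract above is a standard forecasting metric. A minimal sketch follows, assuming the usual definition (the mean of |actual - predicted| / |actual|, expressed in percent). The function name `mape` and the sample figures are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Return the mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        # The ratio |a - p| / |a| is undefined for a zero actual value.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical yearly claim totals, for illustration only.
actual_claims = [100.0, 200.0]
forecast_claims = [110.0, 180.0]
print(mape(actual_claims, forecast_claims))
```

A lower value means the forecast tracks the actual claims more closely, which is why a tuning loop over epochs, learning rate, and layer sizes would aim to reduce it.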
|
7
|
Imperceptible–Visible Watermarking to Information Security Tasks in Color Imaging. MATHEMATICS 2021. [DOI: 10.3390/math9192374] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
Digital image watermarking algorithms have been designed for intellectual property, copyright protection, medical data management, and other related fields; furthermore, in real-world applications such as official documents, banknotes, etc., they are used to deliver additional information about the documents’ authenticity. In this context, the imperceptible–visible watermarking (IVW) algorithm has been designed as a digital reproduction of real-world watermarks. This paper presents a new, improved IVW algorithm for copyright protection that can deliver additional information to the image content. The proposed algorithm is divided into two stages. In the embedding stage, a human visual system-based strategy is used to embed an owner logotype or a 2D quick response (QR) code as a watermark into a color image, maintaining high watermark imperceptibility and low image-quality degradation. In the exhibition stage, a new histogram binarization function approach is introduced to exhibit any watermark with enough quality to be recognized or decoded by any application focused on reading QR codes. The experimental results show that the proposed algorithm can embed one or more watermark patterns, maintaining the high imperceptibility and visual quality of the embedded and the exhibited watermark. The performance evaluation shows that the method overcomes several drawbacks reported in previous algorithms, including geometric and image processing attacks such as JPEG and JPEG2000.
|
8
|
Zhang Q, Wang S, Zhang X, Ma S, Gao W. Just Recognizable Distortion for Machine Vision Oriented Image and Video Coding. Int J Comput Vis 2021. [DOI: 10.1007/s11263-021-01505-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
|
9
|
Shen X, Ni Z, Yang W, Zhang X, Wang S, Kwong S. Just Noticeable Distortion Profile Inference: A Patch-Level Structural Visibility Learning Approach. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 30:26-38. [PMID: 33141668 DOI: 10.1109/tip.2020.3029428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
In this paper, we propose an effective approach to infer the just noticeable distortion (JND) profile based on patch-level structural visibility learning. Instead of pixel-level JND profile estimation, the image patch, which is regarded as the basic processing unit to better correlate with the human perception, can be further decomposed into three conceptually independent components for visibility estimation. In particular, to incorporate the structural degradation into the patch-level JND model, a deep learning-based structural degradation estimation model is trained to approximate the masking of structural visibility. In order to facilitate the learning process, a JND dataset is further established, including 202 pristine images and 7878 distorted images generated by advanced compression algorithms based on the upcoming Versatile Video Coding (VVC) standard. Extensive experimental results further show the superiority of the proposed approach over the state-of-the-art. Our dataset is available at: https://github.com/ShenXuelin-CityU/PWJNDInfer.
|
10
|
Goundar S, Prakash S, Sadal P, Bhardwaj A. Health Insurance Claim Prediction Using Artificial Neural Networks. INTERNATIONAL JOURNAL OF SYSTEM DYNAMICS APPLICATIONS 2020. [DOI: 10.4018/ijsda.2020070103] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/08/2022]
Abstract
A number of numerical practices exist that actuaries use to predict the annual medical claim expense in an insurance company. This amount needs to be included in the yearly financial budgets. Inaccurate estimates generally have negative effects on the overall performance of the business. This study presents the development of an artificial neural network model that is appropriate for predicting the anticipated annual medical claims. Once the implementation of the neural network models was finished, the focus was to decrease the mean absolute percentage error by adjusting parameters such as the number of epochs, the learning rate, and the neurons in different layers. Both feed forward and recurrent neural networks were implemented to forecast the yearly claims amount. In conclusion, the artificial neural network model that was implemented proved to be an effective tool for forecasting the anticipated annual medical claims for BSP Life. The recurrent neural network outperformed the feed forward neural network in terms of accuracy and the computation power required to carry out the forecasting.
Affiliation(s)
- Sam Goundar
- The University of the South Pacific, Suva, Fiji
|
11
|
Zhang X, Yang C, Wang H, Xu W, Kuo CCJ. Satisfied-User-Ratio Modeling for Compressed Video. IEEE TRANSACTIONS ON IMAGE PROCESSING : A PUBLICATION OF THE IEEE SIGNAL PROCESSING SOCIETY 2020; 29:3777-3789. [PMID: 31976895 DOI: 10.1109/tip.2020.2965994] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
With the explosive increase of internet video services, perceptual modeling for video quality has attracted more attention as a way to provide high quality-of-experience (QoE) for end-users subject to bandwidth constraints, especially for compressed video quality. In this paper, a novel perceptual model for satisfied-user-ratio (SUR) on compressed video quality is proposed by exploiting compressed video bitrate changes and spatial-temporal statistical characteristics extracted from both uncompressed original video and reference video. In the proposed method, an efficient video feature set is explored and established to model SUR curves against bitrate variations by leveraging the Gaussian Processes Regression (GPR) framework. In particular, the proposed model is based on the recently released large-scale video quality dataset, VideoSet, and takes both spatial and temporal masking effects into consideration. To make it more practical, we further optimize the proposed method in three aspects: feature source simplification, computational complexity reduction, and video codec adaptation. Based on experimental results on VideoSet, the proposed method can accurately model SUR curves for various video contents and predict their required bitrates at given SUR values. Subjective experiments are conducted to further verify the generalization ability of the proposed SUR model.
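To make the notion of an SUR curve in the abstract above concrete, here is a minimal sketch of an empirical SUR computed from per-subject JND points. It assumes the common reading of SUR as the fraction of subjects whose JND point lies beyond a given distortion level (that is, who cannot yet see a difference). It does not reproduce the paper's GPR-based predictor. The function names, the strict inequality, and the sample values are all hypothetical.

```python
from typing import List, Optional, Sequence


def sur_curve(jnd_points: Sequence[float], levels: Sequence[float]) -> List[float]:
    """Empirical SUR at each distortion level (e.g. a QP value).

    A subject counts as satisfied at a level if that subject's JND point
    is strictly greater than the level (an assumption of this sketch).
    """
    if not jnd_points:
        raise ValueError("at least one JND point is required")
    n = len(jnd_points)
    return [sum(1 for j in jnd_points if j > level) / n for level in levels]


def highest_level_meeting(
    jnd_points: Sequence[float], levels: Sequence[float], target: float
) -> Optional[float]:
    """Largest level whose empirical SUR is at least `target`, or None."""
    best: Optional[float] = None
    for level, sur in zip(levels, sur_curve(jnd_points, levels)):
        if sur >= target and (best is None or level > best):
            best = level
    return best


# Hypothetical JND points for five subjects, not taken from VideoSet.
subject_jnd = [30, 32, 35, 35, 40]
candidate_levels = [28, 32, 35, 41]
print(sur_curve(subject_jnd, candidate_levels))
print(highest_level_meeting(subject_jnd, candidate_levels, 0.75))
```

Reading the curve in the other direction, as `highest_level_meeting` does, corresponds to picking the strongest compression that still keeps a chosen share of viewers satisfied. The paper's bitrate prediction at a given SUR value addresses that kind of question with a learned model rather than raw subject data.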
|