1. Coskun S, Nur Yilmaz G, Battisti F, Alhussein M, Islam S. Measuring 3D Video Quality of Experience (QoE) Using a Hybrid Metric Based on Spatial Resolution and Depth Cues. J Imaging 2023;9:281. PMID: 38132699; PMCID: PMC10744539; DOI: 10.3390/jimaging9120281.
Abstract
A three-dimensional (3D) video is a special video representation with an artificial stereoscopic vision effect that increases the viewer's depth perception. The quality of a 3D video is generally measured through subjective tests that assess its similarity to the stereoscopic vision produced by the human visual system (HVS). These high-cost, time-consuming subjective tests remain necessary because no objective video Quality of Experience (QoE) evaluation method adequately models the HVS. In this paper, we propose a hybrid 3D-video QoE evaluation method based on spatial resolution associated with depth cues (i.e., motion information, blurriness, retinal-image size, and convergence). The proposed method successfully models the HVS by considering the 3D video parameters that directly affect depth perception, the most important element of stereoscopic vision. Experimental results show that the proposed hybrid method outperforms widely used existing methods in measuring 3D-video QoE and correlates highly with the HVS. Consequently, the results suggest that the proposed hybrid method can be conveniently utilized for 3D-video QoE evaluation, especially in real-time applications.
Affiliation(s)
- Sahin Coskun: Department of Electrical-Electronics Engineering, Graduate School of Natural and Applied Sciences, Gazi University, Ankara 06560, Turkey
- Gokce Nur Yilmaz: Department of Computer Engineering, TED University, Ankara 06420, Turkey
- Federica Battisti: Department of Information Engineering, University of Padova, 35131 Padova, Italy
- Musaed Alhussein: Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia
- Saiful Islam: Department of Computer Engineering, TED University, Ankara 06420, Turkey
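
The fusion rule itself is not reproduced in this listing; purely as an illustration of the kind of cue combination the abstract describes, the sketch below fuses normalized per-cue scores with a weighted sum. The cue values and weights `w` are hypothetical placeholders, not the authors' calibration.

```python
# Hypothetical per-cue scores in [0, 1]; higher means better perceived quality.
cues = {
    "spatial_resolution": 0.82,
    "motion":             0.74,
    "blurriness":         0.69,  # already inverted: 1.0 would mean no blur
    "retinal_image_size": 0.77,
    "convergence":        0.80,
}

# Placeholder weights; a real metric would calibrate these against
# subjective mean opinion scores (MOS).
w = {
    "spatial_resolution": 0.30,
    "motion":             0.20,
    "blurriness":         0.20,
    "retinal_image_size": 0.15,
    "convergence":        0.15,
}

qoe = sum(w[k] * cues[k] for k in cues)  # weighted linear fusion
print(f"hybrid 3D QoE score: {qoe:.3f}")
```

A linear pool of this kind is cheap enough to fit the real-time use case the abstract targets.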

2. Zhang Z, Tian S, Zou W, Morin L, Zhang L. EDDMF: An Efficient Deep Discrepancy Measuring Framework for Full-Reference Light Field Image Quality Assessment. IEEE Transactions on Image Processing 2023;32:6426-6440. PMID: 37966926; DOI: 10.1109/tip.2023.3329663.
Abstract
The increasing demand for immersive experiences has greatly promoted research on quality assessment of Light Field Images (LFIs). In this paper, we propose an efficient deep discrepancy measuring framework for full-reference light field image quality assessment. The main idea is to evaluate the quality degradation of distorted LFIs efficiently by measuring the discrepancy between reference and distorted LFI patches. First, a patch generation module extracts spatio-angular patches and sub-aperture patches from LFIs, which greatly reduces the computational cost. Then, a hierarchical discrepancy network based on convolutional neural networks extracts hierarchical discrepancy features between reference and distorted spatio-angular patches. In addition, local discrepancy features between reference and distorted sub-aperture patches are extracted as complementary features. The angular-dominant hierarchical discrepancy features and the spatial-dominant local discrepancy features are then combined to evaluate patch quality. Finally, the quality of all patches is pooled to obtain the overall quality of the distorted LFI. To the best of our knowledge, the proposed framework is the first patch-based, deep-learning full-reference light field image quality assessment metric. Experimental results on four representative LFI datasets show that the proposed framework achieves superior performance and lower computational complexity than other state-of-the-art metrics.
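
EDDMF's exact architecture is not given in this listing; the PyTorch toy below only illustrates the general pattern of hierarchical discrepancy measurement, where a shared encoder processes reference and distorted patches and stage-wise absolute feature differences are pooled into a patch score. All layer sizes and the regression head are invented for illustration.

```python
import torch
import torch.nn as nn

class HierarchicalDiscrepancy(nn.Module):
    """Toy stand-in for hierarchical discrepancy measurement on patch pairs."""
    def __init__(self):
        super().__init__()
        # Three conv stages acting as a shared feature encoder.
        self.stages = nn.ModuleList([
            nn.Sequential(nn.Conv2d(c_in, c_out, 3, stride=2, padding=1), nn.ReLU())
            for c_in, c_out in [(3, 16), (16, 32), (32, 64)]
        ])
        self.head = nn.Linear(16 + 32 + 64, 1)  # pooled discrepancies -> score

    def forward(self, ref, dist):
        feats = []
        for stage in self.stages:
            ref, dist = stage(ref), stage(dist)
            # Stage-wise discrepancy: global average of |f_ref - f_dist|.
            feats.append((ref - dist).abs().mean(dim=(2, 3)))
        return self.head(torch.cat(feats, dim=1))

ref = torch.rand(8, 3, 64, 64)   # 8 reference patches
dist = torch.rand(8, 3, 64, 64)  # matching distorted patches
scores = HierarchicalDiscrepancy()(ref, dist)  # one quality score per patch
print(scores.shape)              # torch.Size([8, 1])
```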

3. Gu F, Zhang Z. No-Reference Quality Assessment of Stereoscopic Video Based on Temporal Adaptive Model for Improved Visual Communication. Sensors (Basel) 2022;22:8084. PMID: 36365783; PMCID: PMC9654346; DOI: 10.3390/s22218084.
Abstract
Objective stereo video quality assessment (SVQA) strives to be consistent with human visual perception while keeping the time and labor cost of evaluation low. The temporal-spatial characteristics of video cause the volume of data processed during quality evaluation to surge, making SVQA more challenging. Targeting the effect of distortion on the stereoscopic temporal domain, this paper proposes a stereo video quality assessment method based on the temporal-spatial relation. Specifically, a temporal adaptive model (TAM) is established to describe the space-time domain of a video at both local and global levels. This model can be easily embedded into any 2D CNN backbone network. Compared with improved models based on 3D CNNs, it has clear advantages in operating efficiency. Experimental results on the NAMA3DS1-COSPAD1, WaterlooIVC 3D Video Phase I, QI-SVQA, and SIAT depth quality databases show that the model performs excellently.
Affiliation(s)
- Fenghao Gu: School of Art and Design, Changzhou University, Changzhou 213164, China
- Zhichao Zhang: College of Electrical Engineering, North China University of Science and Technology, Qinhuangdao 066008, China
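
The abstract does not spell out the TAM internals, only that it models the temporal domain at local and global levels and drops into a 2D CNN backbone. The module below is a generic sketch of that idea, assuming per-frame backbone features of shape [batch, channels, time]; both branches are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class TemporalAdapter(nn.Module):
    """Generic local + global temporal re-weighting of per-frame features."""
    def __init__(self, channels, time_steps):
        super().__init__()
        # Local branch: short-range, per-channel temporal convolution.
        self.local = nn.Conv1d(channels, channels, kernel_size=3,
                               padding=1, groups=channels)
        # Global branch: attention weights spanning the whole clip.
        self.global_fc = nn.Sequential(
            nn.Linear(time_steps, time_steps), nn.Softmax(dim=-1))

    def forward(self, x):                      # x: [B, C, T]
        local = torch.sigmoid(self.local(x))   # [B, C, T] local gates
        attn = self.global_fc(x.mean(dim=1))   # [B, T] clip-level weights
        return x * local * attn.unsqueeze(1)   # re-weighted features

x = torch.rand(4, 64, 16)        # 4 clips, 64 channels, 16 frames
y = TemporalAdapter(64, 16)(x)
print(y.shape)                   # torch.Size([4, 64, 16])
```

Because the module consumes and emits 2D-CNN feature tensors, it can sit between backbone stages without any 3D convolution, which is where the claimed efficiency advantage comes from.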

4. PM2.5 Concentration Measurement Based on Image Perception. Electronics 2022. DOI: 10.3390/electronics11091298.
Abstract
PM2.5 in the atmosphere causes severe air pollution and greatly disrupts residents' normal production and daily life. Real-time monitoring of PM2.5 concentrations therefore has important practical significance for building an ecological civilization. Mainstream PM2.5 concentration prediction approaches based on electrochemical sensors have disadvantages such as high economic cost, high labor cost, and time delay. To this end, we propose a simple and effective PM2.5 concentration prediction algorithm based on image perception. Specifically, the proposed method develops a natural scene statistics prior to estimate the saturation loss caused by the 'haze' formed by PM2.5. After extracting the prior features, a feedforward neural network learns the mapping from these features to PM2.5 concentration values. Experiments on the public Air Quality Image Dataset (AQID) show the superiority of the proposed measurement method over state-of-the-art PM2.5 concentration monitoring methods.
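
As a rough sketch of the pipeline shape described here, the snippet below computes simple saturation statistics as haze-sensitive prior features and regresses PM2.5 with a small feedforward network. The particular statistics, network size, and training data are placeholders, not the published method.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def saturation_prior(img_rgb):
    """Saturation statistics as haze-sensitive prior features.

    img_rgb: float array in [0, 1], shape [H, W, 3].
    """
    mx, mn = img_rgb.max(axis=2), img_rgb.min(axis=2)
    sat = (mx - mn) / (mx + 1e-6)          # HSV-style saturation channel
    return np.array([sat.mean(), sat.std(),
                     np.percentile(sat, 10), np.percentile(sat, 90)])

# Hypothetical training data: images paired with co-located PM2.5 readings.
rng = np.random.default_rng(0)
images = rng.random((100, 64, 64, 3))
pm25 = rng.uniform(5, 200, size=100)       # placeholder ground truth

X = np.stack([saturation_prior(im) for im in images])
model = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000).fit(X, pm25)
print(model.predict(X[:3]))                # predicted concentrations
```

The intuition is that airborne PM2.5 scatters light and washes out color saturation, so saturation loss is a usable proxy signal.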

5. Si J, Huang B, Yang H, Lin W, Pan Z. A No-Reference Stereoscopic Image Quality Assessment Network Based on Binocular Interaction and Fusion Mechanisms. IEEE Transactions on Image Processing 2022;31:3066-3080. PMID: 35394908; DOI: 10.1109/tip.2022.3164537.
Abstract
With stereoscopic images now pervasive, how to assess the visual quality of 3D images has attracted increasing attention in the field of Stereoscopic Image Quality Assessment (SIQA). Compared with 2D-IQA, SIQA is more challenging because complicated features of the Human Visual System (HVS), such as binocular interaction and binocular fusion, must be considered. In this paper, considering both the binocular interaction and fusion mechanisms of the HVS, a hierarchical no-reference stereoscopic image quality assessment network (StereoIF-Net) is proposed to simulate the whole quality perception of 3D visual signals in the human cortex, built around two key components: Binocular Interaction Modules (BIMs) and a Binocular Fusion Module (BFM). The BIMs simulate binocular interaction in the V2-V5 visual cortex regions, using a novel cross convolution to explore interaction details in each region; different output channel numbers imitate the various receptive fields of V2-V5. Furthermore, the BFM, with automatically learned weights, models binocular fusion of the HVS in higher cortex layers. Verification experiments are conducted on the LIVE 3D, IVC, and Waterloo-IVC SIQA databases, and three indices, PLCC, SROCC, and RMSE, are employed to evaluate the consistency between StereoIF-Net and the HVS. StereoIF-Net achieves almost uniformly the best results among advanced SIQA methods: its metric values are the best on LIVE 3D, IVC, and WIVC-I, and second-best on WIVC-II.
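
The cross convolution inside the BIMs is specific to the paper and is not reproduced here; the toy module below only shows one generic way a BFM-style fusion with automatically learned weights can be realized.

```python
import torch
import torch.nn as nn

class BinocularFusion(nn.Module):
    """Fuse left/right feature maps with learned, softmax-normalized weights."""
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(2))  # learned fusion weights

    def forward(self, f_left, f_right):
        w = torch.softmax(self.logits, dim=0)       # w[0] + w[1] == 1
        return w[0] * f_left + w[1] * f_right

fusion = BinocularFusion()
f_l, f_r = torch.rand(1, 64, 32, 32), torch.rand(1, 64, 32, 32)
print(fusion(f_l, f_r).shape)                       # torch.Size([1, 64, 32, 32])
```

Letting the weights be trained rather than fixed mirrors the observation that the HVS does not weight the two views equally, especially under asymmetric distortion.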

6. Yang J, Zhao Y, Jiang B, Lu W, Gao X. No-Reference Quality Evaluation of Stereoscopic Video Based on Spatio-Temporal Texture. IEEE Transactions on Multimedia 2020;22:2635-2644. DOI: 10.1109/tmm.2019.2961209.

7. No-Reference Stereoscopic Image Quality Evaluator with Segmented Monocular Features and Perceptual Binocular Features. Neurocomputing 2020. DOI: 10.1016/j.neucom.2020.04.049.

8. Zhou W, Shi L, Chen Z, Zhang J. Tensor Oriented No-Reference Light Field Image Quality Assessment. IEEE Transactions on Image Processing 2020;29:4070-4084. PMID: 32012015; DOI: 10.1109/tip.2020.2969777.
Abstract
Light field image (LFI) quality assessment is becoming increasingly important, as it helps to better guide the acquisition, processing, and application of immersive media. However, due to the inherently high-dimensional characteristics of LFIs, LFI quality assessment becomes a multi-dimensional problem that requires consideration of quality degradation in both spatial and angular dimensions. Therefore, we propose a novel Tensor-oriented No-reference Light Field image Quality evaluator (Tensor-NLFQ) based on tensor theory. Specifically, since the LFI is regarded as a low-rank 4D tensor, the principal components of four oriented sub-aperture view stacks are obtained via Tucker decomposition. Then, the Principal Component Spatial Characteristic (PCSC) is designed to measure the spatial-dimensional quality of the LFI, considering its global naturalness and local frequency properties. Finally, the Tensor Angular Variation Index (TAVI) is proposed to measure angular-consistency quality by analyzing the structural-similarity distribution between the first principal component and each view in the view stack. Extensive experimental results on four publicly available LFI quality databases demonstrate that the proposed Tensor-NLFQ model outperforms state-of-the-art 2D, 3D, multi-view, and LFI quality assessment algorithms.
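
As a rough sketch of the two ingredients named here, the snippet below applies TensorLy's Tucker decomposition to a toy view stack and computes an SSIM distribution between a rank-1 "principal component" image and each view, in the spirit of TAVI. The ranks, the 3D (rather than 4D) stack, and the pooling are placeholder assumptions, not Tensor-NLFQ itself.

```python
import numpy as np
from tensorly.decomposition import tucker
from skimage.metrics import structural_similarity as ssim

# Toy sub-aperture view stack: 5 angular views of 64x64 grayscale images.
rng = np.random.default_rng(0)
views = rng.random((5, 64, 64))

# Tucker decomposition of the view stack; rank 1 along the angular mode.
core, factors = tucker(views, rank=[1, 32, 32])

# "Principal component" image reconstructed from the rank-1 angular mode.
pc = factors[1] @ core[0] @ factors[2].T

# Angular consistency: SSIM distribution between the PC and each view.
scores = [ssim(pc, v, data_range=v.max() - v.min()) for v in views]
print(np.mean(scores), np.std(scores))
```

A distorted LFI tends to break the low-rank structure, so the spread of this SSIM distribution carries quality information even without a reference.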

9. No-Reference Depth Map Quality Evaluation Model Based on Depth Map Edge Confidence Measurement in Immersive Video Applications. Future Internet 2019. DOI: 10.3390/fi11100204.
Abstract
When evaluating the perceptual quality of digital media for overall quality-of-experience assessment in immersive video applications, two main approaches stand out: subjective and objective quality evaluation. Subjective quality evaluation offers the best representation of perceived video quality as assessed by real viewers, but it consumes a significant amount of time and effort owing to lengthy, laborious assessment procedures involving real users. It is therefore essential to develop an objective quality evaluation model. The speed advantage of an objective model that can predict the quality of rendered virtual views from the depth maps used in the rendering process allows faster quality assessments in immersive video applications. This is particularly important given the lack of a suitable reference or ground truth for comparing the available depth maps, especially when live content services are offered. This paper presents a no-reference depth map quality evaluation model based on a proposed depth map edge confidence measurement technique, to assist in accurately estimating the quality of rendered (virtual) views in immersive multi-view video content. The model is applied to depth image-based rendering in multi-view video format, providing evaluation results comparable to those in the literature and often exceeding their performance.
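
The edge-confidence technique itself is not detailed in this listing; the fragment below is only meant to convey the flavor of such a measure, scoring depth edges by local gradient strength. The threshold and normalization are placeholders.

```python
import numpy as np
from scipy import ndimage

def edge_confidence(depth, edge_thresh=0.1):
    """Toy depth-edge confidence: mean gradient strength on edge pixels.

    depth: float depth map normalized to [0, 1].
    """
    gx = ndimage.sobel(depth, axis=1)
    gy = ndimage.sobel(depth, axis=0)
    grad = np.hypot(gx, gy)
    edges = grad > edge_thresh            # crude edge mask
    if not edges.any():
        return 1.0                        # no depth edges: nothing to distrust
    # Sharp, strong depth transitions are treated as high-confidence edges.
    return float(grad[edges].mean() / grad.max())

depth = np.random.default_rng(0).random((64, 64))
print(edge_confidence(depth))
```

The motivation for focusing on edges is that depth image-based rendering artifacts concentrate around object boundaries, where unreliable depth edges produce visible warping in the synthesized view.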

10. Zhou W, Chen Z, Li W. Dual-Stream Interactive Networks for No-Reference Stereoscopic Image Quality Assessment. IEEE Transactions on Image Processing 2019;28:3946-3958. PMID: 30843835; DOI: 10.1109/tip.2019.2902831.
Abstract
The goal of objective stereoscopic image quality assessment (SIQA) is to predict the human-perceived quality of stereoscopic/3D images automatically and accurately. Compared with traditional 2D image quality assessment, the quality assessment of stereoscopic images is more challenging because of complex binocular vision mechanisms and multiple quality dimensions. In this paper, inspired by the hierarchical dual-stream interactive nature of the human visual system, we propose a stereoscopic image quality assessment network (StereoQA-Net) for no-reference stereoscopic image quality assessment. StereoQA-Net is an end-to-end dual-stream interactive network containing left- and right-view sub-networks, where the two sub-networks interact at multiple layers. We evaluate our method on the LIVE stereoscopic image quality databases. The experimental results show that StereoQA-Net outperforms state-of-the-art algorithms on both symmetrically and asymmetrically distorted stereoscopic image pairs of various distortion types. More generally, StereoQA-Net can effectively predict the perceptual quality of local regions. Cross-dataset experiments also demonstrate the generalization ability of the algorithm.
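
StereoQA-Net's layers are not reproduced here; the compact PyTorch toy below illustrates the dual-stream pattern the abstract describes, with left and right sub-networks whose features interact at every layer. The widths and the additive interaction are invented for illustration.

```python
import torch
import torch.nn as nn

class DualStreamIQA(nn.Module):
    """Toy dual-stream network with per-layer left/right interaction."""
    def __init__(self):
        super().__init__()
        def block(c_in, c_out):
            return nn.Sequential(nn.Conv2d(c_in, c_out, 3, 2, 1), nn.ReLU())
        self.left = nn.ModuleList([block(3, 16), block(16, 32)])
        self.right = nn.ModuleList([block(3, 16), block(16, 32)])
        self.head = nn.Linear(32, 1)

    def forward(self, l, r):
        for bl, br in zip(self.left, self.right):
            l, r = bl(l), br(r)
            inter = 0.5 * (l + r)        # interaction term between the streams
            l, r = l + inter, r + inter  # each stream sees the other's features
        fused = (l + r).mean(dim=(2, 3)) # global pooling of the fused map
        return self.head(fused)

l, r = torch.rand(2, 3, 64, 64), torch.rand(2, 3, 64, 64)
print(DualStreamIQA()(l, r).shape)       # torch.Size([2, 1])
```

Interaction at multiple depths, rather than a single late fusion, is what lets such a network handle asymmetric distortion, where the two views degrade differently.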

11. Appina B, Dendi SVR, Manasa K, Channappayya SS, Bovik AC. Study of Subjective Quality and Objective Blind Quality Prediction of Stereoscopic Videos. IEEE Transactions on Image Processing 2019;28:5027-5040. PMID: 31094690; DOI: 10.1109/tip.2019.2914950.
Abstract
We present a new subjective and objective study of full high-definition (HD) stereoscopic (3D or S3D) video quality. For the subjective study, we constructed an S3D video dataset with 12 pristine and 288 test videos, where the test videos were generated by applying H.264 and H.265 compression, blur, and frame-freeze artifacts. We also propose a no-reference (NR) objective video quality assessment (QA) algorithm that relies on measurements of the statistical dependencies between the motion and disparity subband coefficients of S3D videos. Inspired by a Generalized Gaussian Distribution (GGD) approach (Liu et al., 2011), we model the joint statistical dependencies between the motion and disparity components as following a Bivariate Generalized Gaussian Distribution (BGGD). We estimate the BGGD model parameters (α, β) and the coherence measure (Ψ) from the eigenvalues of the sample covariance matrix (M) of the BGGD. In turn, we model the BGGD parameters of pristine S3D videos using a Multivariate Gaussian (MVG) distribution. The likelihood of a test video's MVG model parameters coming from the pristine MVG model is computed and shown to play a key role in the overall quality estimation. We also estimate the global motion content of each video by averaging the SSIM scores between pairs of successive video frames. To estimate a test S3D video's spatial quality, we apply the popular 2D NR unsupervised NIQE image QA model frame by frame on both views. The overall quality of a test S3D video is finally computed by pooling the video's likelihood estimates, global motion strength, and spatial quality scores. The proposed algorithm, which is 'completely blind' (requiring no reference videos and no training on subjective scores), is called the Motion and Disparity based 3D video quality evaluator (MoDi3D). We show that MoDi3D delivers competitive performance on a wide variety of datasets, including the IRCCYN dataset, the WaterlooIVC Phase I dataset, the LFOVIA dataset, and our proposed LFOVIAS3DPh2 S3D video dataset.
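
The "completely blind" likelihood step described above is concrete enough to sketch: fit a Multivariate Gaussian to feature vectors from pristine videos, then score a test video by its log-likelihood under that model. The three-dimensional features below stand in for the BGGD parameters (α, β, Ψ); all numbers are synthetic.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Stand-ins for (alpha, beta, psi) feature vectors from pristine S3D videos.
pristine = rng.normal(loc=[2.0, 0.5, 0.9], scale=0.05, size=(200, 3))

# Fit the pristine MVG model.
mu = pristine.mean(axis=0)
cov = np.cov(pristine, rowvar=False)
mvg = multivariate_normal(mean=mu, cov=cov)

# Likelihood of a test video's features under the pristine model:
# high likelihood suggests statistics close to pristine content.
test_features = np.array([2.1, 0.48, 0.85])
print(mvg.logpdf(test_features))
```

In the full MoDi3D pipeline, this likelihood is pooled with a global motion term (mean SSIM between successive frames) and per-frame NIQE spatial quality scores to produce the final quality estimate.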