Image Quality Assessment Based on Contourlet and ESD Method

In recent years, the development of digital image processing has promoted research on image quality assessment (IQA). A novel metric for full-reference image quality assessment is presented. The metric, named CT-ESD, combines the contourlet transform with the energy of structural distortion (ESD). The ESD is calculated in each subband of the contourlet transform, and the comparisons between the reference and the distorted images on each subband are then integrated by a weighted sum. The advantages of the contourlet transform carry over well into the new IQA metric. Experiments performed on the TID2013 database demonstrate that the CT-ESD achieves high consistency with subjective evaluation.


Introduction
Digital images are experiencing tremendous growth in both theoretical developments and sophisticated applications. A primary issue in digital image processing is image quality assessment (IQA) [1]. Image quality assessment is the procedure of mapping changes in images to the corresponding visual preference. It can be divided into subjective evaluation and objective evaluation. Humans, as the ultimate receivers of images, are the best candidates to assess image quality. Nevertheless, subjective evaluation is inconvenient and costly, which makes it less favored. Objective image quality assessment has therefore attracted increasing attention from researchers.
According to their dependence on the reference image, objective image quality assessment algorithms fall into three categories: full-reference (FR), reduced-reference (RR) and no-reference (NR). Full-reference algorithms have the strongest dependence on the reference image. They take both a reference image and a distorted image as input and produce a scalar which measures the quality of the distorted image. Reduced-reference algorithms have less dependence on the reference image. Instead of the reference image itself, only certain features of it are available to evaluate the quality. No-reference algorithms (also known as blind image quality assessment) estimate image quality from the distorted image alone, with no reference image information. In this paper we focus on full-reference image quality assessment.

Conventional full-reference image quality assessment algorithms calculate the pixel-wise difference between a distorted image and its corresponding reference image, for example, the mean-squared error (MSE) and the peak signal-to-noise ratio (PSNR). Despite their intensive usage, these metrics are not a preferable choice because they correlate poorly with human visual perception.
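As a brief illustration of the pixel-wise metrics mentioned above, the following minimal sketch computes the MSE and the PSNR for two small grayscale images stored as lists of rows. The function names and the peak value of 255 (8-bit images) are assumptions made for this example only.

```python
import math

def mse(reference, distorted):
    """Mean-squared error between two equally sized grayscale images (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, dis_row in zip(reference, distorted):
        for r, d in zip(ref_row, dis_row):
            total += (r - d) ** 2
            count += 1
    return total / count

def psnr(reference, distorted, peak=255.0):
    """PSNR in dB; 'peak' is the assumed maximum pixel value (255 for 8-bit images)."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf  # identical images: the ratio is unbounded
    return 10.0 * math.log10(peak ** 2 / err)

ref = [[10, 20], [30, 40]]
dis = [[12, 18], [30, 44]]
print(mse(ref, dis))   # squared differences 4, 4, 0, 16 averaged over 4 pixels
print(psnr(ref, dis))
```

Both quantities depend only on the per-pixel differences, which is why they cannot distinguish distortions that are equally large numerically but very different visually.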
More FR image quality assessment algorithms have been developed in recent years. As the ultimate objective of IQA is human visual preference, it is natural to combine the characteristics of the human visual system (HVS) with IQA. An easy way to incorporate the HVS is to weight the frequency-domain error with a contrast sensitivity function (CSF), which is called the weighted signal-to-noise ratio (WSNR) [2]. Besides HVS-based and statistics-based IQA, there are structure-based IQA metrics. Although 'image structure' has no uniform definition, this class of metrics is widely preferred. Among these metrics, the universal image quality index (UQI) [10] and the structural similarity index (SSIM) [11] are of essential importance. The UQI employs cross-correlation and measures of luminance and contrast differences to estimate quality. The SSIM is an extended version of the UQI; the difference is that the SSIM adds small constants to the numerator and the denominator of each measure. Many IQA metrics are based on the SSIM. For example, CW-SSIM [12] combines a complex wavelet with the SSIM and, compared with the SSIM, is insensitive to translation, scaling and rotation of images. Apart from these SSIM-based metrics, there are other structure-based IQA metrics. The energy of structural distortion (ESD) [13] is a metric of this kind. Cailing Wang et al. extended the ESD to spectral images in both the spectral and the spatial aspects [14].
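The relation between the UQI and the SSIM described above can be made concrete with a small sketch. It evaluates both indices on a single window of pixel values given as flat lists, whereas the published metrics average over sliding windows. The function names, the constants `k1 = 0.01` and `k2 = 0.03`, and the peak value of 255 are the commonly used defaults and are assumptions of this example.

```python
def _stats(x, y):
    """Means, unbiased variances and covariance of two equally long pixel lists."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return mx, my, vx, vy, cxy

def uqi_window(x, y):
    """UQI on one window; undefined (division by zero) for flat or all-zero windows."""
    mx, my, vx, vy, cxy = _stats(x, y)
    return 4 * cxy * mx * my / ((vx + vy) * (mx ** 2 + my ** 2))

def ssim_window(x, y, peak=255.0, k1=0.01, k2=0.03):
    """SSIM on one window; the constants c1 and c2 stabilise the two ratios."""
    mx, my, vx, vy, cxy = _stats(x, y)
    c1 = (k1 * peak) ** 2
    c2 = (k2 * peak) ** 2
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

window = [52.0, 55.0, 61.0, 59.0, 79.0, 61.0, 76.0, 61.0]
flat = [100.0] * 8
print(uqi_window(window, window))   # identical windows give the maximum similarity
print(ssim_window(flat, flat))      # well defined even though the variance is zero
```

On the flat window the UQI denominator vanishes, while the SSIM stays well defined; this is the practical effect of the added constants.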
In this paper, we propose a full-reference image quality assessment metric based on the contourlet transform and the energy of structural distortion (ESD). The section Theory Basis presents the background knowledge of the contourlet transform and the ESD. The section Metric Proposed presents our IQA metric, named contourlet transform based ESD (CT-ESD). The section Experiments presents the validation details. We then conclude the paper in the section Discussions.

Theory Basis
It is well known that the 2-D wavelet transform is inefficient for image representation in spite of its excellent performance in 1-D signal processing. Therefore, it is necessary to develop multiscale geometric analysis. The contourlet transform [15] is an attractive alternative. It is constructed originally in the discrete domain while having a precise connection with continuous-domain expansions. The contourlet transform allows a different number of directions at each scale while achieving nearly critical sampling. Its implementation based on iterated filter banks makes it computationally efficient. Moreover, the contourlet transform provides a sparse representation for natural images with smooth contours. It possesses directionality and anisotropy, which are important for image representation. We therefore apply the contourlet transform in the design of our IQA metric.
The contourlet transform is a double filter bank structure. The first filter bank is the Laplacian pyramid (LP), which aims to capture point discontinuities. Here we focus on the decomposition stage of the LP. At each level, the input image is filtered by the lowpass analysis filter $H$ and downsampled by the sampling matrix $M$ to generate a lowpass version of the input. At the same time, a bandpass image is obtained by subtracting the prediction from the input at this level. The prediction is the lowpass version mentioned above after upsampling by $M$ and filtering with the synthesis filter $G$. The second filter bank is the directional filter bank (DFB). This filter bank links point discontinuities into linear structures. It decomposes the bandpass image from the LP into $2^l$ subbands via an $l$-level binary tree.
Note that the squared norm of $S_i$ is 1; in other words, the energy of $S_i$ is 1. The inner product is set to be the simplest one, and the resulting energies are given in equation (3). It should be noted that in the calculation of $E'_i$, the inner product is taken with $S_i$. The final ESD score is defined in equation (4), in which $K$ is the number of blocks. The lower the ESD score, the higher the image quality. When the distorted image is identical to the reference image, the ESD value becomes negative infinity. We therefore suggest adding a small constant before the logarithm operation.
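The Laplacian pyramid decomposition step described earlier in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the 5-tap binomial lowpass filter for $H$, the synthesis filter $G = 2H$, dyadic sampling in each dimension in place of a general matrix $M$, and border replication are all choices made for this example, not the filters of the contourlet implementation used in the paper.

```python
def filt1d(signal, kernel):
    """Filter a 1-D list with an odd-length symmetric kernel, replicating border samples."""
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - half, 0), n - 1)  # clamp the index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def filt2d(img, kernel):
    """Separable filtering: rows first, then columns."""
    rows = [filt1d(r, kernel) for r in img]
    return transpose([filt1d(c, kernel) for c in transpose(rows)])

def downsample(img):
    """Keep every second row and column."""
    return [row[::2] for row in img[::2]]

def upsample(img, height, width):
    """Insert zeros between samples to return to the original grid."""
    out = [[0.0] * width for _ in range(height)]
    for i, row in enumerate(img):
        for j, v in enumerate(row):
            out[2 * i][2 * j] = v
    return out

H = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # assumed analysis lowpass filter
G = [2 * h for h in H]                         # assumed synthesis filter

def lp_level(img):
    """One LP level: returns the lowpass image, the bandpass image and the prediction."""
    h, w = len(img), len(img[0])
    low = downsample(filt2d(img, H))
    pred = filt2d(upsample(low, h, w), G)
    band = [[img[i][j] - pred[i][j] for j in range(w)] for i in range(h)]
    return low, band, pred

image = [[float((3 * i + 5 * j) % 17) for j in range(8)] for i in range(8)]
low, band, pred = lp_level(image)
print(len(low), len(low[0]), len(band), len(band[0]))
```

Adding the bandpass image back to the prediction recovers the input, which is the simplest form of LP reconstruction. In the contourlet transform, the bandpass image is the input to the directional filter bank and the lowpass image is passed on to the next level.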

Metric Proposed
The ultimate objective of IQA is to obtain an evaluation of image quality which approximates the perception of the human visual system (HVS). Note that in equation (5) each subband of the Contourlet transform generates an energy value matrix.
At each scale $j$, the variance of the energy is calculated by equation (6). The sum in that equation ensures that $EV_j$ is a scalar.
The final proposed metric is defined by equation (7), in which $D$ is a constant that avoids taking the logarithm of zero and is small enough to have no influence on the result, and $w_j$ is the weighting coefficient of $EV_j$ at scale $j$.
Moreover, the weights sum to one ($\sum_{j=1}^{J} w_j = 1$). The weighting coefficient of each scale can be changed according to its contribution to the HVS. The flow chart of the CT-ESD is shown in Figure 2.
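The pooling step can be illustrated with a short sketch. The exact form of equation (7) is not reproduced here, so the code assumes one plausible reading: a base-10 logarithm of the weighted sum of the per-scale values $EV_j$ plus the constant $D$. The function name, the logarithm base, the value of `d` and the example weights are hypothetical and are not the authors' settings.

```python
import math

def pooled_score(ev_per_scale, weights, d=1e-6):
    """Hypothetical pooling: log10 of the weighted sum of per-scale values plus d."""
    if len(ev_per_scale) != len(weights):
        raise ValueError("one weight per scale is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    weighted = sum(w * ev for w, ev in zip(weights, ev_per_scale))
    return math.log10(weighted + d)  # d keeps the logarithm finite when all EV_j are zero

# Hypothetical weights for four scales; the second one is the largest.
example_weights = [0.1, 0.5, 0.3, 0.1]
identical = pooled_score([0.0, 0.0, 0.0, 0.0], example_weights)
distorted = pooled_score([0.2, 1.5, 0.7, 0.1], example_weights)
print(identical, distorted)
```

Under this reading, a lower score indicates higher quality, and identical images yield a finite minimum value instead of negative infinity.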

Experiments
We begin this section with the image database: the experiments use TID2013. As performance measures we choose the Spearman rank-order correlation coefficient (SROCC) and the Kendall rank-order correlation coefficient (KROCC), just as the authors of TID2013 do. They prefer rank-order correlation coefficients because these avoid fitting procedures, which may not be unique.
The SROCC and the KROCC between the mean opinion score (MOS) and the proposed metric score are expected to be high, which would imply that the new metric is a good substitute for the HVS.
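For readers unfamiliar with the two coefficients, the following minimal sketch computes them for a handful of made-up scores. The SROCC is computed as the Pearson correlation of the ranks, and the KROCC as Kendall's tau-a, which ignores tie corrections; the toy MOS and metric values are invented for this example and are not taken from TID2013.

```python
import math

def ranks(values):
    """Average 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def krocc(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs; ties count as neither."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            if prod > 0:
                s += 1
            elif prod < 0:
                s -= 1
    return s / (n * (n - 1) / 2)

toy_mos = [5.1, 3.2, 4.0, 1.5, 2.7]      # invented subjective scores
toy_metric = [0.2, 0.9, 0.5, 2.0, 0.8]   # invented scores of a "lower is better" metric
print(srocc(toy_mos, toy_metric), krocc(toy_mos, toy_metric))
```

For a metric in which a lower score means higher quality, such as the ESD, both coefficients are negative, and their magnitude is what indicates agreement with the MOS.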
As there are parameters $w_j$, $j = 1, 2, \ldots, J$, in our proposed metric, we choose the $w_j$ that lead to higher SROCC and KROCC. Experiments show that a large weight on the medium-frequency subband leads to high rank-order correlation coefficients. This phenomenon is in agreement with the characteristics of the HVS. In this paper we choose a 3-level Laplacian decomposition in the contourlet transform. It produces coefficients in four cells, corresponding to one low-frequency subband and three pyramidal levels, and each pyramidal level contains $2^3$ frequency bands. Here we set the parameters $w_j$ accordingly. Comparisons with conventional IQA metrics such as PSNR and MSE are also meaningful and necessary. The data for most of the compared metrics are provided in [17]. They can also be calculated with the MeTriX MuX Visual Quality Assessment Package [18].
The following two tables show the SROCC and the KROCC between the metrics and the MOS, respectively, for each subset. In the tables, each column except the last represents a subset of the TID2013 database. The last column, 'Full', refers to the whole database. Each row represents a metric. For example, in Table 1, the value in the row 'CT-ESD' and the column 'Full' is 0.7250, which indicates that the CT-ESD achieves an SROCC of 0.7250 with the MOS on the whole set. Because TID2013 is a challenging database, the values of the SROCC and the KROCC do not come very close to 1.
From the tables, we can conclude that the CT-ESD outperforms the other listed metrics on the whole set of distorted images. Although the CT-ESD does not outperform the others on every subset, its performance is more even across subsets. In particular, the 'Exotic' subset contains distortion types which do not occur frequently but are among the most difficult ones for IQA. On this subset the CT-ESD has the highest KROCC value in Table 2, 0.5021. In Table 1, the SROCC of the CT-ESD on this subset is 0.6894, lower only than the 0.7064 of the VSNR. On the 'New' and 'Color' subsets, which drag down the rank-order correlation coefficients of most metrics, the CT-ESD performs well. It achieves SROCC values of 0.6776 and 0.6021, both of which lead the corresponding column.
It is noteworthy that, compared with the ESD, the CT-ESD is better on the subsets 'Noise', 'Actual' and 'Exotic'. These subsets cover the most common distortion types and the most difficult ones. The improvement on the 'Exotic' subset is significant. Taken over all subsets together, the CT-ESD outperforms the ESD on the full TID2013 database. For both the SROCC and the KROCC, it beats the other metrics in the 'Full' column, with the highest values of 0.7250 and 0.5514 respectively.
Figure 3 shows the scatter plots between the MOS and the metrics. Comparing Figure 3(a) and (b), the points in (b) are more compact and show a clearer trend. Figure 3(b) is the scatter plot between the MOS and the CT-ESD, so the CT-ESD gives a better evaluation of image quality. Figure 3(c) and (d) show that different metrics lead to scatter plots of different shapes. Compared with them, the CT-ESD again performs better in terms of compactness and trend. The data and the plots above show that it is meaningful to introduce the contourlet transform into the IQA metric. We attribute the good performance of the CT-ESD to the use of the contourlet transform. The multi-resolution and the sparse representation of the contourlet transform make the IQA metric more effective and more similar to the HVS. These advantages lead to the success of the metric.

Discussions
In this paper, we proposed a novel full-reference IQA metric, named CT-ESD, that combines the ESD with the contourlet transform. Moreover, its good performance on the TID2013 database supports its validity and practicability, especially for the 'Exotic' type distortions. Our proposed metric still has potential for improvement. We will focus on exploiting the multi-directional property of the contourlet transform in IQA and on an IQA metric aimed at the 'Color' type distortions. We believe that the contourlet transform, as a 'real' two-dimensional transform, will be a powerful tool in IQA.
Among HVS-based metrics beyond the WSNR [2], N. Damera Venkata et al. proposed the noise quality measure (NQM), which consists of contrast pyramid processing and a signal-to-noise ratio computation [3]. Karen Egiazarian et al. presented the IQA metric named PSNR-HVS [4]. It removes the mean shift and then stretches the contrast with a scanning window before calculating a modified PSNR. The modification of the PSNR takes the HVS into account in the calculation of the MSE. In the next year, these researchers proposed the PSNR-HVS-M metric [5], based on PSNR-HVS. The new metric is based on a model of inter-coefficient masking of DCT basis functions and the modified version of the PSNR. Damon M. Chandler and Sheila S. Hemami presented an IQA metric based on near-threshold and suprathreshold properties of human vision, called the visual signal-to-noise ratio (VSNR) [6]. HVS-based IQA metrics depend heavily on the accuracy of the HVS models, so many researchers employ other models such as image statistics. Hamid Rahim Sheikh et al. treated the IQA problem as an information fidelity problem. They proposed a new metric named the information fidelity criterion (IFC) [7], which measures the statistical information that a distorted image has about the reference image. Similarly, Hamid Rahim Sheikh and Alan C. Bovik presented the visual information fidelity (VIF) [8], which is derived from a statistical model for natural scenes, a model for image distortions and an HVS model. In another direction, Zhou Wang and Qiang Li proposed a novel weighting approach based on a Gaussian scale mixture (GSM) model. It brought a consistent performance improvement to both PSNR-based and SSIM-based IQA; the resulting metrics are named information content weighted PSNR (IW-PSNR) and SSIM (IW-SSIM) respectively [9].

Figure 1. The flow diagram of the contourlet transform. First, the LP is employed to decompose the image into multiple scales. Then, in each bandpass channel, the DFB divides it into different directional subbands.

Having described the contourlet transform, we now introduce the energy of structural distortion (ESD) [13]. The reference and distorted images are divided into blocks of size $H \times L$. The blocks are 2-D vectors denoted by $b_i$ and $b'_i$ for the reference and the distorted image respectively. Quantities for the blocks of the distorted image are calculated in the same way as for the reference image.

... by the HVS and the Contourlet transform. Moreover, the Contourlet transform is sparse and effective for image representation, which is fundamental to IQA. It is therefore natural to employ the Contourlet transform in an IQA metric. We propose an IQA metric that combines the Contourlet transform and the energy of structural distortion (CT-ESD). The procedure is as follows. Input the distorted image and the reference image and compute their Contourlet coefficients. The superscript $D$ indicates the distorted image and $R$ indicates the reference image. The subscript $j$ represents the scale index and $k$ the orientation index. For example, $c^{D}_{j,k}$ denotes the Contourlet coefficients of the distorted image at the $j$th level in the $k$th direction. The low-frequency subband has no direction division. The Contourlet coefficients are handled in the same way as in equation (1). Calculate the energy of structural information of the distorted image and of the reference image in each subband.

Figure 2. The flow chart of the IQA metric combining the Contourlet transform and the energy of structural distortion (CT-ESD).
The weights correspond to the low-frequency subband and the three pyramidal levels. The weight $w_2$ is the largest, which means that the weighting coefficient of the medium-frequency subband is the largest. The performance of the proposed CT-ESD metric is validated and compared with representative HVS-based, statistics-based and structure-based IQA metrics. These metrics are NQM [3], PSNR-HVS [4], VSNR [6], UQI [10], SSIM [11], IFC [7], VIF [8], IW-PSNR [9] and ESD [13].

Figure 3. Scatter plots between the MOS and some metrics. (a) MOS versus ESD; (b) MOS versus CT-ESD; (c) MOS versus IFC; (d) MOS versus SSIM.
The CT-ESD is a structure-based IQA metric which measures the loss of structural information to evaluate image quality. The introduction of the contourlet transform leads to an improvement over the ESD image quality assessment.

Table 1. The SROCC values of the metrics for the database TID2013

Table 2. The KROCC values of the metrics for the database TID2013