Double HEVC Compression Detection with Different Bitrates Based on Co-occurrence Matrix of PU Types and DCT Coefficients

Detecting double video compression is of particular importance in video forensics because it partly reveals a video's processing history. In this paper, a double compression detection method is proposed for HEVC, the latest video coding standard. First, four 5×5 co-occurrence matrices are derived from the DCT coefficients along four directions: horizontal, vertical, main diagonal and minor diagonal. Then four 4×4 co-occurrence matrices are derived from the PU types, which are a new feature of HEVC and have rarely been utilized by researchers. Finally, the two feature sets are combined and fed to a support vector machine (SVM) to detect re-compressed videos. To reduce the feature dimension, only the horizontal co-occurrence matrices of the DCT coefficients and PU types are then used to identify whether a video has undergone double compression. Experimental results show the effectiveness of the proposed scheme and its robustness against frame deletion.


Introduction
Owing to the rapid development of networks and the popularity of multimedia technology, digital videos are now widely used in many fields. Meanwhile, with the spread of powerful and easy-to-use video editing software, digital videos are exposed to many kinds of forgery, which destroy their authenticity and integrity. Video forensics is therefore becoming a hot topic in the information security field. Because video data are large and strongly correlated, digital videos are generally stored and transmitted in compressed form. However, tampering has to be performed in the uncompressed domain. That is, tampering with a video usually involves decoding the video stream into an image sequence and then re-encoding the tampered sequence into a compressed stream. Double compression is thus a necessary step of video tampering, and detecting it plays an important role in revealing possible tampering.
The study of video recompression forensics began in 2006. Since then, various methods have proven effective for double compression detection, an important tool in video forensics. Chen et al. [1] detected double compression by judging whether the probability distribution of the first digits of the non-zero quantized MPEG AC coefficients is disturbed. In Ref. [2], Liao showed that the probability distribution of the quantized non-zero AC coefficients is obviously disturbed only when the QP used in the second compression is smaller than that used in the first. Ref. [3] introduced the sequence of average residual of P-frames (SARP) and used its time- and frequency-domain features to classify tampered and original videos. Sun and Xu [4][5] detected double compression of MPEG videos by studying the distribution of quantized DCT coefficients. In Ref. [6], a method based on the statistical features of the macroblock mode (MBM), which consists of the macroblock type and motion vector in P-frames, was proposed to detect double MPEG compression. Wang et al. [7] proposed a detection method based on the observation that, in doubly MPEG-compressed videos, the distribution of the doubly quantized DCT coefficients contains empty bins when the second quantization step is smaller than the first. Ref. [8] detected doubly compressed MPEG video sequences through the presence of specific static and temporal statistics. In Ref. [9], 162-D Markov statistical features based on DCT coefficients were adopted to expose double MPEG-4 compression artefacts.
The double compression detection algorithms above are all based on video coding standards that precede the latest one, High Efficiency Video Coding (HEVC). For instance, Refs. [1] and [4]-[9] studied double compression for MPEG videos, and Refs. [2] and [3] target H.264 videos. As the latest generation of coding standards, HEVC provides a lower video data rate. Compared with H.264/AVC, it improves the compression efficiency of high-resolution and high-fidelity videos, and it will undoubtedly play an important role in high-definition and ultra-high-definition video applications. Double HEVC compression detection is therefore significant, but only a few works have addressed it. Huang et al. [10] proposed a method that constructs a co-occurrence matrix of DCT coefficients to detect double compression. In our previous work [11], we observed that the PU types of I-frames change before and after tampering, and we used the standard deviation of the 4×4 PU block difference (SDoPU) to capture this change; this statistical feature indicates double compression performed with the same QP. Both [10] and [11] discuss the detection of double HEVC compression in terms of the quantization parameter, and few studies detect double HEVC compression with different bitrates. Motivated by this gap, this paper presents a new method based on the co-occurrence matrices of PU types and DCT coefficients to detect double HEVC compression with different bitrates. The experimental results show that the proposed method has high classification accuracy and strong robustness against frame-deletion attacks.
The rest of this paper is organized as follows. Section 2 gives an error analysis for double HEVC compression and then introduces the PU types and quantized DCT coefficients. Section 3 introduces the co-occurrence matrix and the detailed procedures for deriving co-occurrence matrices from PU types and DCT coefficients. Section 4 presents the experimental results and discussion, followed by conclusions and future work in Section 5.

Error Analysis of I Frame for Double HEVC Compression
HEVC video compression is widely used in the field of multimedia; it achieves better image quality while ensuring a high compression rate. HEVC still uses the traditional stages of prediction, transformation, quantization and entropy coding, but it also adopts a more flexible coding block partition structure [12]. The main generation procedure of singly-compressed and re-compressed video is shown in Fig. 1. Let $V_{ori}$ represent the uncompressed YUV video and $R_2$ represent the target bit rate.
Fig. 1 (a) shows the construction of a singly-compressed video. After the coding process of Fig. 1 (a), a singly-compressed video is obtained.

Figure 1. The generation procedure of singly-compressed video and re-compressed video: (a) singly-compressed video; (b) re-compressed video.

During the above-mentioned process, discrete cosine transform (DCT) coefficients are obtained and then quantized according to the given QP value, which produces a quantization error. In addition, reconstructing the video introduces rounding and truncation errors. In summary, the HEVC coding process introduces irreversible quantization and reconstruction errors. Moreover, the rate control algorithm regulates the QP and the PU division types, so the QP values used for the re-compressed video differ from those used for the singly-compressed video, which further affects the quantized DCT coefficients. As a result, the quantized DCT coefficients and PU divisions differ between the singly-compressed and re-compressed videos. A detailed analysis and discussion of the DCT coefficients and PU types follows.

The Interpretation of DCT Coefficients
The DCT is the main transform from the spatial domain to the frequency domain in HEVC. Eq. (1) illustrates how the residual pixels of a specific N×N TU are transformed into the corresponding DCT coefficients in the frequency domain, where $E(i, j)$ indicates the residual pixel in the $i$-th row and $j$-th column of the N×N TU, and the transform-matrix entry denotes the $n$-th DCT coefficient in the $m$-th row of the transform matrix. Traditional scalar quantization is applied to all DCT coefficients, and the quantized DCT coefficients are calculated by Eq. (2), where $\mathrm{floor}(\cdot)$ is the round-down function and $Q_{step}$ is the quantization step, which is related to the quantization parameter (QP). The value of $Q_{step}$ doubles each time QP increases by 6. HEVC provides 52 quantization parameters, ranging from 0 to 51.
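As a rough illustration of the QP-to-step relation described above, the sketch below uses the commonly quoted closed form $Q_{step} = 2^{(QP-4)/6}$, which doubles every 6 QP. This closed form, the sign handling and the function names `qstep_from_qp` and `quantize` are assumptions made for illustration and are not taken from Eq. (2); real HEVC encoders use integer arithmetic with scaling tables and a rounding offset.

```python
import math


def qstep_from_qp(qp: int) -> float:
    """Approximate quantization step for a QP in [0, 51]."""
    if not 0 <= qp <= 51:
        raise ValueError("QP must lie in [0, 51]")
    # Assumed closed form: the step is 1 at QP = 4 and doubles every 6 QP.
    return 2.0 ** ((qp - 4) / 6.0)


def quantize(coeff: float, qp: int) -> int:
    """Scalar quantization sketch: floor of |coeff| / Qstep, sign restored."""
    level = math.floor(abs(coeff) / qstep_from_qp(qp))
    return -level if coeff < 0 else level
```

For example, `qstep_from_qp(28)` is twice `qstep_from_qp(22)`, so the same coefficient maps to a coarser level at the higher QP.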
After transformation and quantization, most DCT coefficients fall into a small interval and follow a Laplace distribution with zero mean [13]. Further, according to our statistical analysis of HEVC videos, the histogram reflects the overall distribution of the quantized DCT coefficients, and Fig. 2 shows that the probability of the DCT coefficients lying in the fixed interval [-4, 4] is about 99%. In Fig. 2, the purple bars represent the probability distribution of the DCT coefficients for the singly-compressed video with a bitrate of 400 Kbps, and the blue, orange and green bars denote the distributions for the re-compressed videos with bitrates of 100 Kbps-400 Kbps, 200 Kbps-400 Kbps and 300 Kbps-400 Kbps. As can be seen from Fig. 2, the probability values of the quantized DCT coefficients of singly-compressed videos are lower than those of re-compressed videos, and the distributions differ considerably, which indicates that the probability distribution of the DCT coefficients can be used to detect re-compressed video. The gray level co-occurrence matrix is defined by the joint probability density of pixel pairs at different directions and intervals. It is commonly used to describe texture by studying the spatial correlation of gray levels, and it reveals both the distribution characteristics of the entire data and the dependence between neighbouring values. We therefore extract the co-occurrence of DCT coefficients as a classification feature to distinguish re-compressed videos from singly-compressed videos.
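The share of coefficients inside [-4, 4] quoted above can be checked with a few lines of code. The following is a minimal sketch, assuming the quantized coefficients are already available as a flat sequence of integers; the function names are hypothetical.

```python
def fraction_within(coeffs, bound=4):
    """Share of quantized coefficients whose value lies in [-bound, bound]."""
    values = list(coeffs)
    if not values:
        raise ValueError("no coefficients given")
    inside = sum(1 for v in values if -bound <= v <= bound)
    return inside / len(values)


def histogram(coeffs, bound=4):
    """Relative frequency of each integer value in [-bound, bound]."""
    values = list(coeffs)
    if not values:
        raise ValueError("no coefficients given")
    return {k: sum(1 for v in values if v == k) / len(values)
            for k in range(-bound, bound + 1)}
```

Comparing the `histogram` output of a singly-compressed stream with that of a re-compressed stream gives the kind of bar chart described for Fig. 2.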

The Analysis of PU Types
As the latest generation of video coding standards, HEVC still adopts the traditional hybrid coding framework. However, it abandons the macroblock concept used in H.264/AVC and introduces three basic units, namely the coding unit (CU), the prediction unit (PU) and the transform unit (TU), which makes it more flexible than previous video coding standards. The CU is the root unit of HEVC compression, and its size is 8×8, 16×16, 32×32 or 64×64. Assuming that the size of the CU is 2N×2N with $N \in \{4, 8, 16, 32\}$, the CU can be subdivided into one or more PUs depending on the prediction mode. HEVC provides eight kinds of PU division structure, as shown in Fig. 3. PU blocks carry abundant prediction information that indicates how the CU is predicted. Fig. 4 shows the PU partition types of the first I frame in the singly-compressed video and in the re-compressed video. The singly-compressed video (Fig. 4 (a)) is obtained by compressing the YUV video akiyo directly with a bitrate of 200 Kbps. The re-compressed video (Fig. 4 (b)) is obtained by compressing akiyo with a bitrate of 100 Kbps and then with a bitrate of 200 Kbps. In the figure, the thick solid lines mark the boundaries of the maximum CU size (64×64), and so on down to the dotted lines, which denote the smallest PU size (4×4). For the same I frame, the PU types vary subtly before and after recompression. Tab. 1 shows the number of each PU type for Fig. 4(a) and Fig. 4(b).
It is observed that the numbers of 4×4 PU blocks in the first I-frame of the singly-compressed and re-compressed videos are 680 and 568, respectively, and that the numbers of 8×8 and 16×16 PUs also differ. In addition, the PU partition types and their neighbour relationships have changed. For example, in the lower left corner there is one 16×16 PU block in the singly-compressed video but four 8×8 PU blocks in the re-compressed video, and the upper right corner differs as well. We can therefore extract the co-occurrence of PU types as another classification feature to reveal the changes in the video's statistics.

The Proposed Scheme of Double HEVC Compression Detection

The Introduction of Co-Occurrence Matrix
The gray level co-occurrence matrix is the joint probability of pixel pairs at different directions and intervals. Owing to the strong correlation between neighbouring image data (such as pixels or coefficients), the co-occurrence matrix is widely used in image classification and image recognition. Suppose the coefficient matrix $C$ of an image has size $M \times N$; its co-occurrence matrix can be defined as follows:
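One common way to write the four directional matrices is given below as a sketch. It assumes 1-based indices, an indicator function $\delta(\cdot)$ that equals 1 when its conditions hold and 0 otherwise, and normalisation by the number of valid pairs; these notational choices are assumptions for illustration and may differ from the paper's Eq. (3) to Eq. (6).

```latex
\begin{align}
P_h(a,b) &= \frac{1}{M(N-d)} \sum_{i=1}^{M} \sum_{j=1}^{N-d}
            \delta\big(C(i,j)=a,\ C(i,j+d)=b\big) \\
P_v(a,b) &= \frac{1}{(M-d)N} \sum_{i=1}^{M-d} \sum_{j=1}^{N}
            \delta\big(C(i,j)=a,\ C(i+d,j)=b\big) \\
P_d(a,b) &= \frac{1}{(M-d)(N-d)} \sum_{i=1}^{M-d} \sum_{j=1}^{N-d}
            \delta\big(C(i,j)=a,\ C(i+d,j+d)=b\big) \\
P_m(a,b) &= \frac{1}{(M-d)(N-d)} \sum_{i=1}^{M-d} \sum_{j=d+1}^{N}
            \delta\big(C(i,j)=a,\ C(i+d,j-d)=b\big)
\end{align}
```

Here $P_h$, $P_v$, $P_d$ and $P_m$ are the horizontal, vertical, major diagonal and minor diagonal matrices and $d$ is the pair distance. With values restricted to $K$ levels, each matrix has $K \times K$ entries that sum to 1.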

The Extraction of CMoDCTCs
We first extract the quantized DCT coefficients of the CTUs of a given video during the decoding process, then truncate them with a threshold $T = 4$ and take their absolute values. All DCT coefficients of the CTUs are then arranged one after another in a two-dimensional matrix $C$ defined as Eq. (7), where $C_i$ represents the DCT coefficients of the $i$-th CTU, $q$ is the number of CTUs in one video with $N$ frames, and $\lceil X \rceil$ is the ceiling function. The value of $q$ changes with the resolution of the given video. For example, 176×144 ...

ITA 2017

Once the DCT coefficient matrix $C$ is obtained, we calculate its 5×5 probability co-occurrence matrices in the horizontal, vertical, major diagonal and minor diagonal directions according to Eq. (3) to Eq. (6) in Section 3.1, which yields four corresponding 25-dimensional features. Finally, the 25-dimensional co-occurrence matrices of the four directions are combined to form a 100-dimensional quantized DCT coefficient co-occurrence matrix, CMoDCTCs.
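The extraction steps above can be sketched as follows. This is a minimal illustration, assuming the coefficients have already been arranged into a two-dimensional list of integers (the arrangement of Eq. (7) is not reproduced here) and a pair distance of 1; the names `cooccurrence`, `cmo_dctcs` and `OFFSETS` are hypothetical.

```python
# (row offset, column offset) of the second element of each pair, distance 1.
OFFSETS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "major_diagonal": (1, 1),
    "minor_diagonal": (1, -1),
}


def cooccurrence(mat, levels, di, dj):
    """Return a levels x levels matrix of joint pair probabilities."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(mat), len(mat[0])
    total = 0
    for i in range(rows):
        for j in range(cols):
            ni, nj = i + di, j + dj
            if 0 <= ni < rows and 0 <= nj < cols:
                counts[mat[i][j]][mat[ni][nj]] += 1
                total += 1
    if total == 0:
        return [[0.0] * levels for _ in range(levels)]
    return [[c / total for c in row] for row in counts]


def cmo_dctcs(coeffs, threshold=4):
    """Build the 100-D feature: four flattened 5x5 matrices for T = 4."""
    # Truncating to [-T, T] and taking the absolute value equals min(|x|, T).
    clipped = [[min(abs(int(v)), threshold) for v in row] for row in coeffs]
    feature = []
    for di, dj in OFFSETS.values():
        matrix = cooccurrence(clipped, threshold + 1, di, dj)
        feature.extend(p for row in matrix for p in row)
    return feature
```

Because the clipped values lie in {0, ..., 4}, each directional matrix is 5×5 and the concatenated vector has 4 × 25 = 100 entries.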

The Extraction of CMoPUTs
We first extract the declared bit rate $R$ of the given video during the decoding process and use the open-source HEVC stream analysis software GitlHEVC Analyzer [14] to get the PU partition of each I frame. Then each 8×8 image block is taken as one basic unit for marking the PU types, and the frequency matrices are calculated in the horizontal, vertical, major diagonal and minor diagonal directions. Fig. 5 (a) shows an example of how the PU types are marked in one image. We mark the 4×4, 8×8, 16×16 and 32×32 PU types as 0, 1, 2 and 3, respectively. Once the frequency matrix is obtained, we calculate its 4×4 probability co-occurrence matrices in the horizontal, vertical, major diagonal and minor diagonal directions according to Eq. (3) to Eq. (6) in Section 3.1. Thus, four corresponding 16-dimensional features are built. Finally, the 16-dimensional co-occurrence matrices of the four directions are combined to form a 64-dimensional PU type co-occurrence matrix, CMoPUTs.
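The marking and counting steps can be sketched as below. The input format, a list of `(x, y, size)` tuples in pixels for the PUs of one I frame, is a hypothetical stand-in for whatever the stream analyzer exports, and all function names are illustrative; the sketch also assumes a pair distance of 1.

```python
PU_MARKS = {4: 0, 8: 1, 16: 2, 32: 3}  # PU size in pixels -> mark from the text
UNIT = 8                                # side of the basic unit, in pixels
OFFSETS = [(0, 1), (1, 0), (1, 1), (1, -1)]  # horizontal, vertical, major, minor


def mark_grid(pus, width, height):
    """Mark every 8x8 unit with the type of the PU covering it."""
    cols = -(-width // UNIT)   # ceiling division
    rows = -(-height // UNIT)
    grid = [[None] * cols for _ in range(rows)]
    for x, y, size in pus:
        span = max(size // UNIT, 1)  # a 4x4 PU still marks its whole 8x8 unit
        for r in range(y // UNIT, min(rows, y // UNIT + span)):
            for c in range(x // UNIT, min(cols, x // UNIT + span)):
                grid[r][c] = PU_MARKS[size]
    return grid


def pair_probabilities(grid, di, dj, levels=4):
    """Frequency matrix of mark pairs at offset (di, dj), normalised to sum to 1."""
    freq = [[0] * levels for _ in range(levels)]
    rows, cols = len(grid), len(grid[0])
    for i in range(rows):
        for j in range(cols):
            ni, nj = i + di, j + dj
            if not (0 <= ni < rows and 0 <= nj < cols):
                continue
            a, b = grid[i][j], grid[ni][nj]
            if a is not None and b is not None:  # skip units with no PU record
                freq[a][b] += 1
    total = sum(map(sum, freq))
    return [[n / total if total else 0.0 for n in row] for row in freq]


def cmo_puts(pus, width, height):
    """Build the 64-D feature: four flattened 4x4 matrices."""
    grid = mark_grid(pus, width, height)
    feature = []
    for di, dj in OFFSETS:
        matrix = pair_probabilities(grid, di, dj)
        feature.extend(p for row in matrix for p in row)
    return feature
```

With four marks, each directional matrix is 4×4, so the concatenated vector has 4 × 16 = 64 entries.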

Video Database
In our experiments, 17 widely known YUV sequences [15] were selected as source sequences. These videos are in QCIF format (with a resolution of 176×144) and cover various kinds of content, such as people, scenes and news. To increase the number of samples, each video was split into non-overlapping clips of 100 frames, giving 36 QCIF video clips in total. HM10.0 [16] with the encoder_lowdelay_P_main configuration file was adopted for HEVC encoding and decoding; the frame rate was set to 30 frames/s and the GOP structure to IPPP. For the QCIF video clips, the compression bitrates B1 and B2 are selected from {100, 200, 300} (kbps) and {200, 300, 400} (kbps), respectively. The video sample library consists of singly-compressed and re-compressed videos. All 36 video clips (original standard YUV sequences) are first compressed with the bitrate B1. Then each compressed HEVC video stream is decoded and re-encoded with a different bitrate B2. Thus, 36 groups of datasets are generated in total.

Experimental Results and Discussion
For each combination of (B1, B2), LIBSVM [17] with a polynomial kernel (PolySVC) was selected as the classifier to distinguish original videos with bitrate B2 from re-compressed ones with B1 followed by B2. Thirty original videos and 30 re-compressed ones were randomly chosen to train a classification model, and the rest were used for testing. AR was adopted as the evaluation metric of the proposed method and is calculated from the true positive rate and the true negative rate, where $TPR = TP/(TP+FN) \times 100\%$ and $TNR = TN/(TN+FP) \times 100\%$. TP and TN denote the true positives (re-compressed videos labelled as re-compressed) and true negatives (singly-compressed videos labelled as singly-compressed), respectively; FP and FN denote the false positives (singly-compressed videos labelled as re-compressed) and false negatives (re-compressed videos labelled as singly-compressed), respectively. To reduce the effect of random sampling, the entire experimental procedure is repeated 20 times with randomly chosen training samples for each combination (B1, B2), and the average AR is taken as the final detection accuracy.
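A minimal sketch of the metric follows. It assumes that AR is the mean of TPR and TNR and reports rates as fractions rather than percentages; this averaging and the function names are assumptions for illustration.

```python
def accuracy_rate(tp, fn, tn, fp):
    """AR as the mean of TPR and TNR (assumed form), returned as a fraction."""
    tpr = tp / (tp + fn)  # re-compressed videos correctly labelled
    tnr = tn / (tn + fp)  # singly-compressed videos correctly labelled
    return (tpr + tnr) / 2


def mean_accuracy_rate(runs):
    """Average AR over repeated random splits.

    `runs` is a list of (tp, fn, tn, fp) tuples, one per repetition.
    """
    rates = [accuracy_rate(*run) for run in runs]
    return sum(rates) / len(rates)
```

With 36 clips per class and 30 used for training, each run leaves 6 test clips per class, so one misclassified singly-compressed clip in a run gives `accuracy_rate(6, 0, 5, 1)`, which is about 0.917.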
In this paper, we first combine CMoDCTCs and CMoPUTs into a 164-dimensional feature to detect HEVC double compression. Table 2 gives the detection results. The classification rates for distinguishing re-compressed videos from singly-compressed ones are all 1, which demonstrates that the proposed method works well for detecting re-compressed videos.
To reduce the joint feature dimension and the computational complexity, we combine PhoDCT and PhoPUTs, the horizontal co-occurrence matrices of the DCT coefficients and PU types, to form a 41-dimensional feature for detecting HEVC double compression. Table 3 gives the detection results for distinguishing re-compressed videos from original ones. The classification rates are above 0.9 in all cases: four video groups reach a detection rate of 1, and the lowest AR is 0.9667. We conclude that the method retains good classification ability after the joint feature dimension is reduced.
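The reduction from 164 to 41 dimensions amounts to keeping one 25-D block and one 16-D block. The sketch below assumes a hypothetical layout in which the 164-D vector stores the 100 DCT entries first and the 64 PU entries second, with the horizontal block leading each part; the actual ordering used in the experiments is not stated.

```python
DCT_BLOCK = 25   # one flattened 5x5 matrix
PU_BLOCK = 16    # one flattened 4x4 matrix
DCT_TOTAL = 4 * DCT_BLOCK  # 100 entries for the four directions


def horizontal_only(feature_164):
    """Keep the horizontal DCT block and the horizontal PU block (41-D)."""
    if len(feature_164) != DCT_TOTAL + 4 * PU_BLOCK:
        raise ValueError("expected a 164-dimensional feature vector")
    dct_h = feature_164[:DCT_BLOCK]
    pu_h = feature_164[DCT_TOTAL:DCT_TOTAL + PU_BLOCK]
    return list(dct_h) + list(pu_h)
```

The result has 25 + 16 = 41 entries, matching the reduced feature size quoted above.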
Table 2. AR of distinguishing re-compressed videos from original ones based on the mixed feature (164-D).

Table 3. AR of distinguishing re-compressed videos from original ones based on the mixed feature (41-D).

Frame deletion is another common forgery type for digital videos. Hence, the robustness of the proposed method against it is tested in this section. Videos with 10, 30 and 50 frames deleted are constructed, and the classification rates between each of them and the original videos are given in Table 4 to Table 6. The classification rates are all above 0.8 except for two detection rates, which are shown in bold in Table 5 and Table 6. Furthermore, in each of Table 4 to Table 6, four video groups have classification rates above 0.9. The results demonstrate that the combined feature set of the proposed method is robust against frame deletion.

Conclusions
With the development of advanced video editors, the authenticity and integrity of videos can no longer be guaranteed. In this paper, an effective method for detecting double HEVC compression with different bitrates is proposed. We extract the DCT coefficients and PU types of the original videos and the re-compressed ones, and then build co-occurrence matrices from the DCT coefficients and PU types to reveal the artefacts. Experimental results show that the proposed algorithm performs well when the first compression bitrate B1 is lower than the second compression bitrate B2. Our future work will focus on detecting double compression when the two bitrates are equal and when the first bitrate is higher than the second.

... Fig. 1 (a), and the rebuilt video $V_{R_1}$ is achieved by decoding $S_{R_1}$. Then, recompressing $V_{R_1}$ with the bitrate $R_2$ ...

Figure 2. Histogram of quantized DCT coefficients.

Figure 3. PU types under different prediction modes.

Figure 4. The PU types of the first I frame of the singly-compressed video and the doubly-compressed video: (a) the PU types of the singly-compressed video; (b) the PU types of the re-compressed video.

Table 1. Number of each PU type for Fig. 4(a) and Fig. 4(b).

where $d$ denotes the distance between the selected adjacent element pairs and is set to 1 in this paper; $P_h$, $P_v$, $P_d$ and $P_m$ represent the co-occurrence matrices in the horizontal, vertical, major diagonal and minor diagonal directions, respectively; and each entry is the probability of adjacent coefficient pairs with values $(a, b)$.

Co-Occurrence Matrix of DCT Coefficients and PU Types
As described in Section 2, the DCT coefficients and PU types change after HEVC videos are recompressed with different bitrates. Neighbouring DCT coefficients and PU types are correlated to some degree, but this correlation varies with the number of compressions, and the co-occurrence matrix captures this change well. Therefore, we first combine the co-occurrence matrix of DCT coefficients (CMoDCTCs) and the co-occurrence matrix of PU types (CMoPUTs) to form a joint feature, and then feed the joint feature into an SVM for video classification. The extraction of the joint feature is divided into two parts, namely the extraction of CMoDCTCs and the extraction of CMoPUTs.

Figure 5. An example of calculating the frequency matrix of PU-type pairs: (a) the marking of PU types in one image; (b) the horizontal frequency matrix of PU-type pairs. Fig. 5 (b) shows the horizontal frequency matrix of Fig. 5 (a), where r and c denote the PU types of the PU-type pair, and the number in each cell is the number of times the PU-type pair occurs with interval $d = 1$ in the horizontal direction.

Table 4. AR of distinguishing re-compressed videos that have undergone deletion of 10 frames from original ones based on the mixed feature (41-D).

Table 6. AR of distinguishing re-compressed videos that have undergone deletion of 50 frames from original ones based on the mixed feature (41-D).