A Hyperspectral Image Classification Method Based on Maximum Scatter Discriminant Analysis

To overcome the "small sample size problem" faced by some hyperspectral classification methods, the Maximum Scatter Discriminant criterion is used to analyze hyperspectral data. Maximum Scatter Discriminant analysis searches for projection axes by maximizing the difference of the between-class and within-class scatter matrices, which avoids computing the inverse of any matrix. Experimental results on the Indian Pines HSI data set show that the proposed method outperforms the other methods in terms of recognition accuracy. The proposed method is an effective and feasible method for hyperspectral data classification.


Introduction
With the constant development of imaging spectroscopy, Hyperspectral Images (HSI) collected by imaging spectrometers provide detailed spectral information about objects in hundreds of spectral bands across the electromagnetic spectrum, which increases the possibility of discriminating materials of interest more accurately [1]. However, the information content in a hyperspectral image is highly correlated across neighboring bands and suffers from the curse of dimensionality [2]. Dimensionality reduction is a common pre-processing step in the classification of hyperspectral images for applications such as land cover mapping, since any spectral feature in the sample space can be represented in a lower-dimensional subspace without losing a significant amount of information. Principal Component Analysis (PCA) [3] and Linear Discriminant Analysis (LDA) [4] are two popular techniques for finding such a lower-dimensional subspace. PCA performs dimensionality reduction by projecting the original data onto a low-dimensional linear subspace spanned by the leading eigenvectors of the covariance matrix of all the sample points. PCA projections are optimal for reconstruction from a low-dimensional basis, but they may not be optimal from a discrimination standpoint because they do not take the class information of the input data into account. LDA is a supervised learning algorithm that preserves discriminant information by maximizing the ratio of between-class to within-class scatter. However, LDA often suffers from the so-called "small sample size problem", since the number of samples is generally smaller than the dimension of the sample space, as is typical in face recognition tasks. To overcome the "small sample size problem", Song et al. [5] proposed a method, named Maximum Scatter Discriminant (MSD) analysis, which adopts the difference of the between-class and within-class scatter as the discriminant criterion. The goal of MSD is to search for projection axes by maximizing the difference of the between-class and within-class scatter matrices rather than their ratio.
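As a concrete illustration of the PCA projection described above, the following sketch (using NumPy; all function and variable names are illustrative, not from the paper) projects sample points onto the leading eigenvectors of their covariance matrix:

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X (P samples x n bands) onto the k leading
    eigenvectors of the covariance matrix of all sample points."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)   # n x n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
    A_pca = eigvecs[:, ::-1][:, :k]          # keep the k leading eigenvectors
    return X_centered @ A_pca, A_pca

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))               # 100 samples with 10 "bands"
Z, A_pca = pca_project(X, 3)
print(Z.shape, A_pca.shape)                  # (100, 3) (10, 3)
```

The retained-energy variant mentioned later in the paper would choose `k` so that the kept eigenvalues account for a desired fraction of the total variance.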
The rest of the paper is organized as follows: the maximum scatter difference criterion is introduced in Section 2. In Section 3, a series of experiments and results are described. Finally, conclusions are summarized in Section 4.

Maximum Scatter Difference (MSD) Criterion
Suppose there are c known pattern classes, P is the total number of training samples, and N_i is the number of training samples in class i. In class i, the j-th training sample, an n-dimensional vector, is denoted by $x_j^i$. Let $m_i$ be the mean of the class-i samples and $m_0$ the mean of all training samples. The between-class scatter matrix $S_b$ and the within-class scatter matrix $S_w$ are then defined in the usual way:

$$S_b = \sum_{i=1}^{c} N_i (m_i - m_0)(m_i - m_0)^T, \qquad S_w = \sum_{i=1}^{c} \sum_{j=1}^{N_i} (x_j^i - m_i)(x_j^i - m_i)^T.$$

Unlike classical LDA, which searches for projection axes on which the ratio of the between-class to the within-class scatter of the projected samples reaches its maximum, the goal of the maximum scatter difference criterion is to find projection axes on which the difference of the between-class and within-class scatter is maximized. The criterion can be written as

$$A^* = \arg\max_{A^T A = I} \operatorname{tr}\big(A^T (S_b - S_w) A\big).$$

In fact, in hyperspectral image recognition tasks the dimension of the data is very high, which consumes a great deal of memory and time. To overcome this problem, we first project the hyperspectral data set into a PCA subspace to reduce the dimension; using PCA as preprocessing also reduces the noise. A detailed description of the proposed method is as follows: 1. PCA projection: the training samples are projected into a PCA subspace that retains a certain level of energy by discarding the smallest principal components. The transformation matrix of PCA is denoted by A_PCA.

2. Constructing the scatter matrices: the within-class scatter matrix $S_w$ and the between-class scatter matrix $S_b$ are computed from the projected training samples.
3. MSD projection: the columns of A_MSD are taken as the eigenvectors of $S_b - S_w$ associated with its largest eigenvalues, and the overall transformation matrix is A = A_PCA A_MSD.
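The scatter-matrix and MSD-projection steps can be sketched as follows (a NumPy sketch applied directly to toy data rather than to PCA-projected samples; all names are illustrative, not from the paper):

```python
import numpy as np

def msd_fit(X, y, d):
    """Return d projection axes maximizing tr(A^T (S_b - S_w) A).

    X: (P, n) training samples as rows; y: (P,) class labels.
    """
    m0 = X.mean(axis=0)
    n = X.shape[1]
    S_b = np.zeros((n, n))
    S_w = np.zeros((n, n))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        S_b += len(Xc) * np.outer(mc - m0, mc - m0)   # between-class scatter
        S_w += (Xc - mc).T @ (Xc - mc)                # within-class scatter
    # Eigenvectors of the symmetric matrix S_b - S_w with the largest
    # eigenvalues; note that no matrix inverse is required.
    eigvals, eigvecs = np.linalg.eigh(S_b - S_w)
    return eigvecs[:, ::-1][:, :d]                    # (n, d) projection matrix

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 8)), rng.normal(3, 1, (30, 8))])
y = np.array([0] * 30 + [1] * 30)
A = msd_fit(X, y, 2)
print(A.shape)                                        # (8, 2)
```

Because `S_b - S_w` is symmetric, a standard symmetric eigendecomposition suffices; this is the step where MSD sidesteps the inversion of `S_w` that classical LDA needs.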

Feature Extraction and Classification
Suppose the projection matrix A has been obtained. Projecting a given spectral vector x onto A yields the column vector

$$c = A^T x. \qquad (4)$$

We call c the feature vector of x. For hyperspectral data recognition, projecting each training vector $x_1, x_2, \ldots, x_M$ onto A gives the corresponding feature vectors $c_1, c_2, \ldots, c_M$. Given a test vector x, Eq. (4) is first used to obtain its feature vector c, and a nearest-neighbor classifier is then applied, where the distance between c and $c_i$ is computed from the element-wise differences $c(j) - c_i(j)$, with $c(j)$ denoting the j-th element of vector c.
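The feature-extraction and nearest-neighbor steps can be sketched as follows (a NumPy sketch; Euclidean distance is used here for illustration, since the exact distance definition is not reproduced above, and all names are mine, not the paper's):

```python
import numpy as np

def extract_features(A, X):
    """Eq. (4): c = A^T x for each sample x (rows of X)."""
    return X @ A

def nn_classify(A, X_train, y_train, x_test):
    """Label a test vector by its nearest training feature vector."""
    C_train = extract_features(A, X_train)       # feature vectors c_1..c_M
    c = x_test @ A                               # feature vector of x_test
    dists = np.linalg.norm(C_train - c, axis=1)  # distance to each c_i
    return y_train[np.argmin(dists)]

# Toy example: an axis-aligned projection stands in for the learned A.
rng = np.random.default_rng(1)
A = np.eye(8)[:, :2]
X_train = np.vstack([rng.normal(0.0, 0.1, (10, 8)),
                     rng.normal(5.0, 0.1, (10, 8))])
y_train = np.array([0] * 10 + [1] * 10)
print(nn_classify(A, X_train, y_train, np.full(8, 5.0)))   # → 1
```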

Experiments
In this section, we conduct a set of experiments on the Indian Pines HSI data set [6] to further evaluate the effectiveness of the proposed method and compare it with PCA and LDA. For each class, 10% of the pixels are randomly selected to construct the training data set, and the remaining pixels are used for testing. Every algorithm is repeated 10 times, and the recognition accuracies of the proposed method, PCA, and LDA with varying dimension of the feature vectors are plotted in Fig. 2. It can be observed from Fig. 2 that the proposed approach achieves a better recognition rate and is more robust. Compared with LDA, the proposed method avoids computing the inverse of the scatter matrices, which reduces the algorithm complexity.
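The per-class 10% sampling protocol described above can be sketched as follows (a NumPy sketch; the function name and toy labels are illustrative, not from the paper):

```python
import numpy as np

def stratified_split(y, frac=0.1, rng=None):
    """Pick `frac` of the pixel indices from each class for training;
    the remaining indices form the test set."""
    if rng is None:
        rng = np.random.default_rng()
    train_idx = []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)          # indices of class-c pixels
        rng.shuffle(idx)
        n_train = max(1, int(round(frac * len(idx))))
        train_idx.extend(idx[:n_train])
    train = np.sort(np.array(train_idx))
    test = np.setdiff1d(np.arange(len(y)), train)
    return train, test

y = np.repeat(np.arange(3), [50, 30, 20])     # 3 toy classes of labeled pixels
train, test = stratified_split(y, 0.1, np.random.default_rng(0))
print(len(train), len(test))                  # 10 90
```

Repeating this split (and the subsequent training and classification) 10 times and averaging the accuracies matches the evaluation protocol used here.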

The Indian Pines HSI data set is a scene of Northwest Indiana gathered by the AVIRIS sensor in 1992. It consists of 145×145 pixels and 220 spectral bands within the range of 375-2500 nm. Several spectral bands affected by noise and water absorption are removed from the data set, leaving a total of 200 radiance channels to be used in the experiments. This HSI is available at https://engineering.purdue.edu/~biehl/. Sixteen ground truth classes of interest are considered in the data set. The HSI in false color and its corresponding ground truth are shown in Figure 1(a) and (b), respectively.

Conclusions
In this paper, an efficient hyperspectral image classification method based on maximum scatter discriminant analysis is presented. The major advantage of the proposed method is that it overcomes the "small sample size problem" faced by LDA. Experimental results on the Indian Pines HSI data set reveal that the proposed method achieves better recognition accuracy and is more robust.

Figure 1. The Indian Pines HSI Data Set. (a) The HSI in False Color. (b) Corresponding Ground Truth.

Figure 2. Performance Comparison of Recognition Rates Using LDA, MSD and PCA with Varying Dimension on the Indian Pines HSI Data Set.