Drowsiness Detection using EEG signals and Machine Learning Algorithms

Abstract. Drowsiness is a state of reduced consciousness and vigilance accompanied by a desire to sleep. Driver tiredness is frequently detected using wearable sensors that track vehicle movement and camera-based systems that track driver behavior. The main measure of video-based methods is the detection of physical features such as the degree of eyelid closure; however, their usefulness is limited by constraints such as brightness restrictions and by practical challenges such as driver distraction. Many alternative EEG-based drowsiness detection systems have been developed because electroencephalogram (EEG) signals can reflect human mood and are easy to obtain. This paper applies a deep learning architecture, the Convolutional Neural Network (CNN), and machine learning algorithms to the classification of EEG data for drowsiness detection. We extracted statistical features, trained classifiers including Logistic Regression, Naïve Bayes, SVM, and K-Nearest Neighbours on them, and compared their accuracy with that of a deep learning CNN model. Results demonstrate that the CNN achieved an accuracy of 94.75% while performing feature extraction itself. Compared with existing state-of-the-art drowsiness detection systems, the testing results reveal a higher detection capability. The results show that the suggested method can be used to develop a reliable EEG-based driving drowsiness detection system.


Introduction
Drowsiness is a leading cause of vehicular accidents, as drowsy drivers may experience a loss of visual acuity and overall vigilance. Driving when drowsy results in longer reaction times, especially at high speeds, and reduced performance on attention-demanding events [1]. Many contributions have explored drowsiness detection, and they can be categorised into two major classes: one uses physical features, viz. Steering Wheel Movement (SWM), eye blinking, yawn detection, and head nodding detection, while the other uses physiological features, viz. electroencephalogram (EEG) and electrocardiogram (ECG) [2]. We consider EEG in this paper due to its high temporal resolution, low cost, and practicality [3]. Compared with existing systems, EEG can provide an accurate assessment of alertness levels, but contamination of the EEG by pervasive artifacts such as eye activity and muscle noise, and by external anomalies such as line noise and electronic interference, is one of the many technical challenges that must be overcome when implementing EEG-based monitoring systems. To address this, outlier removal or dimensionality reduction algorithms can be applied to the raw data to exclude unusual readings and save computational effort. According to Jung et al. [3] and Lin et al., EEG correlates of fatigue and drowsiness appear to vary among different subjects, which suggests the need for individually customizable systems. In this paper, we present a detailed comparison of two EEG drowsiness detection approaches: one uses machine learning (ML) algorithms, namely K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Logistic Regression (LR), and Naive Bayes, while the other uses a Convolutional Neural Network (CNN), a deep learning (DL) algorithm. Conventionally, CNN is considered a notable methodology for solving problems in image classification. The CNN architecture seems suitable for other data as well, since there is no need for manual feature extraction [4].
This paper is structured as follows. Section 2 discusses the dataset used for analysis as well as feature extraction for machine learning algorithms. Section 3 describes every algorithm we have used along with their respective results.

Datasets and Inputs
We have used the "EEG Eye State Dataset" from the UCI Machine Learning Repository as the common dataset for all computations [5]. The data was acquired from an Emotiv EEG Neuroheadset over a duration of 117 seconds and comprises 14304 samples. Along with the 14 electrode values recorded at time t, the dataset also indicates whether the subject's eyes were open (EO) or closed (EC) at time t. The data is split randomly into two sections at each run: 80% of rows are set aside for training and 20% for testing. The acquisition for the 14 electrode measurements, originally labelled TF7, O2, F3, P8, T8, F4, FC6, AF3, FC5, T7, F8, P7, O1, and AF4 are shown in
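The random 80%/20% split described above can be illustrated with a minimal sketch. It uses only the Python standard library; the function name `train_test_split_rows`, the `seed` parameter, and the toy rows are assumptions made for illustration and are not taken from the original experiments.

```python
import random

def train_test_split_rows(rows, train_fraction=0.8, seed=None):
    """Shuffle row indices and split the rows into training and test subsets."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = int(round(train_fraction * len(rows)))
    train = [rows[i] for i in indices[:cut]]
    test = [rows[i] for i in indices[cut:]]
    return train, test

# Toy stand-in for the dataset: 14 electrode values plus an eye-state label.
toy_rows = [([float(i)] * 14, i % 2) for i in range(10)]
train_rows, test_rows = train_test_split_rows(toy_rows, seed=0)
print(len(train_rows), len(test_rows))
```

Leaving `seed` as `None` gives a different split at each run, which matches the per-run random split described in the text; fixing the seed makes a run reproducible.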

Feature Extraction and Selection
For ML algorithms, we select the following features [6]: Output (OP) and statistical features, viz. mean (µ), standard deviation (σ), kurtosis (β1), skewness (β2), and Discrete Cosine Transform (DCT) [7]. Table 1 shows the rows obtained by feature extraction on the original dataset. The extracted feature data frame is divided into X, the extracted features, and Y, the output (OP).

Algorithms
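Before turning to the individual classifiers, the following sketch illustrates how one feature row of the kind listed under Feature Extraction and Selection might be computed. It is a plain-Python illustration under stated assumptions: the text does not say over which values the statistics are taken, so the sketch assumes a single vector of readings (for example, the 14 electrode values of one sample), and it uses the population standard deviation, standardized moments for skewness and kurtosis, and an unnormalized DCT-II. The function names are hypothetical.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def std(xs):
    """Population standard deviation."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / len(xs))

def standardized_moment(xs, order):
    """Order 3 gives skewness, order 4 gives (non-excess) kurtosis.
    Assumes the values are not all identical, so that std(xs) > 0."""
    m, s = mean(xs), std(xs)
    return sum(((x - m) / s) ** order for x in xs) / len(xs)

def dct_ii(xs):
    """Unnormalized DCT-II of a sequence."""
    n = len(xs)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(xs))
        for k in range(n)
    ]

def extract_features(values):
    """Return the statistical features for one vector of readings."""
    return {
        "mean": mean(values),
        "std": std(values),
        "kurtosis": standardized_moment(values, 4),
        "skewness": standardized_moment(values, 3),
        "dct": dct_ii(values),
    }

features = extract_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features["mean"], features["std"])
```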

Support Vector Machine
SVM classification training aims to find the optimum hyperplane that maximises the margin between classes using inputs X and outputs Y [8]. The accuracies obtained after tuning various parameters for SVM are shown in Table 3.
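To make the idea of a maximum-margin hyperplane concrete, the sketch below trains a linear SVM by sub-gradient descent on the regularized hinge loss. This is an illustrative simplification, not the configuration behind Table 3: the kernel and parameters tuned in the experiments are not stated here, so the linear kernel, the regularization strength `lam`, the learning rate, and the toy data are all assumptions. Labels are mapped to -1 and +1, as is usual for the hinge loss.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Sub-gradient descent on lam/2 * ||w||^2 + hinge loss; y must be -1 or +1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Sample violates the margin: regularize and push it outward.
                w = [wj - lr * (lam * wj - yi * xj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Sample is outside the margin: only the regularizer acts.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def svm_predict(w, b, x):
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

# Toy, linearly separable data (hypothetical, for illustration only).
X = [[-2.0, -1.0], [-1.0, -2.0], [-1.5, -1.5], [1.0, 2.0], [2.0, 1.0], [1.5, 1.5]]
y = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(X, y)
print([svm_predict(w, b, x) for x in X])
```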

K-Nearest Neighbors
The K-Nearest Neighbors algorithm assumes that a new case is similar to the available cases closest to it and assigns the new case to the category that is most common among those nearest neighbors [9]. We observed that maximum accuracy is achieved when the number of neighbors is set to 5, as depicted in Fig 5.
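The voting rule can be sketched in a few lines of plain Python. The Euclidean distance metric and the toy data are assumptions made for illustration; only the default of k = 5 comes from the text.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_predict(train_X, train_y, query, k=5):
    """Return the majority label among the k training points nearest to query."""
    neighbours = sorted(
        (euclidean(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Two toy clusters (hypothetical data): label 0 near the origin, label 1 near (10, 10).
train_X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5],
           [10, 10], [10, 11], [11, 10], [11, 11], [10.5, 10.5]]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(knn_predict(train_X, train_y, [0.2, 0.3]))
print(knn_predict(train_X, train_y, [9.8, 10.4]))
```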

Logistic Regression
Logistic regression (LR) is a popular machine learning technique for performing predictive analysis on data whose outcome is categorical or binary. The accuracies obtained using various solvers are shown in Table 4. The maximum accuracy is observed when the 'lbfgs' solver is used.
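As an illustration of what a logistic regression solver optimizes, the sketch below fits the model by plain batch gradient descent on the log-loss. Note that this is deliberately not the 'lbfgs' solver mentioned above, which is a quasi-Newton method; gradient descent is swapped in because it fits in a few lines. The learning rate, epoch count, and toy data are assumptions.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the mean log-loss; y must be 0 or 1."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += error * xi[j]
            grad_b += error
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_class(w, b, x):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Toy one-feature data (hypothetical): negative values are class 0, positive class 1.
X = [[-1.5], [-0.5], [0.5], [1.5]]
y = [0, 0, 1, 1]
w, b = train_logistic_regression(X, y)
print([predict_class(w, b, x) for x in X])
```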

Naive Bayes Classifier
The Naive Bayes technique is a family of supervised learning algorithms based on applying Bayes' theorem under the "naive" assumption of conditional independence between every pair of features given the value of the class variable. The accuracy obtained is 58.58%.
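The conditional independence assumption means that the class-conditional likelihood factorizes into one term per feature. The sketch below shows this for a Gaussian Naive Bayes classifier. The Gaussian likelihood is an assumption, since the text does not state which Naive Bayes variant was used; the small variance floor and the toy data are also illustrative choices.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y, var_floor=1e-9):
    """Estimate a log-prior plus per-feature mean and variance for each class."""
    rows_by_class = defaultdict(list)
    for xi, yi in zip(X, y):
        rows_by_class[yi].append(xi)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model

def predict_gaussian_nb(model, x):
    """Pick the class with the highest log-posterior (up to a constant)."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        # Independence assumption: per-feature log-likelihoods simply add up.
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (xj - m) ** 2 / (2 * v)
            for xj, m, v in zip(x, means, variances)
        )
        return log_prior + log_likelihood
    return max(model, key=log_posterior)

# Toy data (hypothetical): two well-separated classes.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y = [0, 0, 0, 1, 1, 1]
nb_model = fit_gaussian_nb(X, y)
print(predict_gaussian_nb(nb_model, [1.1, 2.1]), predict_gaussian_nb(nb_model, [5.1, 5.9]))
```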

Convolutional Neural Network
A schematic diagram of the CNN used in our drowsiness detection system is depicted in Fig 4. The EEG signal propagates through two convolutional layers, followed by three fully connected layers. The first two fully connected layers use the Rectified Linear Unit (ReLU) activation function, whereas the last fully connected layer uses the sigmoid activation function. Moreover, dropout [10] is used to minimize the risk of over-fitting.

Convolutional layers
These layers mainly focus on filter application and feature extraction [11] based on input EEG signals. The convolution operation is represented by the following equation:

$$Y_i = b_i + \sum_{n} W_{in} \ast X_n$$

where $\ast$ is the convolution operation, $Y_i$ is the feature map, $b_i$ is the bias term, $W_{in}$ is the sub-kernel of the channel, and $X_n$ is the input signal. We have used 2 convolutional layers.
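The computation of one feature map, a bias plus a sum over input channels of each channel filtered by its own sub-kernel, can be sketched in plain Python. Two assumptions are made here: the signals are treated as one-dimensional sequences, and the filter is applied as a "valid" sliding dot product without flipping the kernel (cross-correlation), which is what most deep learning libraries compute under the name convolution. The function names and toy values are hypothetical.

```python
def conv1d_valid(signal, kernel):
    """Slide the kernel over the signal and take a dot product at each position."""
    k = len(kernel)
    return [
        sum(signal[t + j] * kernel[j] for j in range(k))
        for t in range(len(signal) - k + 1)
    ]

def feature_map(channels, sub_kernels, bias):
    """One output map: bias plus the sum over channels of channel filtered by its sub-kernel."""
    per_channel = [conv1d_valid(x, w) for x, w in zip(channels, sub_kernels)]
    return [bias + sum(values) for values in zip(*per_channel)]

# Two toy input channels, each with its own sub-kernel of length 2.
channels = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
sub_kernels = [[1.0, 0.0], [0.0, 1.0]]
print(feature_map(channels, sub_kernels, bias=0.5))  # [2.5, 2.5, 4.5]
```

With a kernel of length k and an input of length T, the "valid" output has T - k + 1 samples, which is why each convolutional layer shortens the signal unless padding is used.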

Dense layers
A dense layer connects every neuron of the preceding layer to every neuron of the next layer. In our architecture, we used 3 fully connected layers. For the best classification results, the first dense layer of our model is a hidden layer of 10 neurons. The second dense layer is a hidden layer of 100 neurons, chosen for improved accuracy. The third dense layer consists of a single neuron, which is sufficient to designate class "1" or "0" because this work performs binary classification.
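A forward pass through a stack of three dense layers with 10, 100, and 1 neurons, using ReLU, ReLU, and sigmoid activations respectively, can be sketched as follows. The weights are random and untrained, and the input size `N_INPUTS`, the weight range, and the 0.5 decision threshold are assumptions for illustration; the sketch shows only the layer shapes and activations, not the trained model.

```python
import math
import random

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def init_dense(n_in, n_out, rng):
    """Small random weights and zero biases for a layer with n_out neurons."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense_forward(inputs, weights, biases, activation):
    """Each neuron applies the activation to a weighted sum of all inputs."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def classifier_head(features, layers):
    out = features
    for weights, biases, activation in layers:
        out = dense_forward(out, weights, biases, activation)
    return out[0]  # the single output neuron

rng = random.Random(0)
N_INPUTS = 32  # hypothetical size of the flattened convolutional output
layers = [
    init_dense(N_INPUTS, 10, rng) + (relu,),
    init_dense(10, 100, rng) + (relu,),
    init_dense(100, 1, rng) + (sigmoid,),
]
features = [rng.random() for _ in range(N_INPUTS)]
probability = classifier_head(features, layers)
predicted_class = 1 if probability >= 0.5 else 0
print(predicted_class)
```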

Evaluation Metrics
To evaluate the performance of the ML and DL techniques mentioned in this paper, we used the metrics Accuracy, Precision, Recall, and F1-score. Table 2 shows the formula and description of each evaluation metric that we used to measure the quality of the machine learning models.
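For reference, the four metrics can be computed from the counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN) using their standard definitions. The sketch below assumes binary labels with 1 as the positive class; the function name and the toy labels are illustrative.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

In this toy example TP = 3, FP = 1, TN = 3, and FN = 1, so all four metrics evaluate to 0.75.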

Results
In this paper, we have compared the performance of ML algorithms with that of DL algorithms. For SVM, Logistic Regression, Naive Bayes, and KNN, we used extracted statistical features as input, whereas CNN does not require feature extraction. CNN achieves an accuracy of 94.75% and 97.5% for one dense layer and two dense layers respectively, with 11.5% aggregate error loss. Table 5 lists the performance of the various models in decreasing order of accuracy. Fig 6 shows the evaluation metrics of the models; we can observe that CNN shows the highest accuracy among all classifiers.

Conclusion and Discussion
In this work, a comparison of ML approaches with DL approaches for drowsiness detection using EEG is presented. Of all the techniques mentioned in this paper, CNN is found to be highly efficient, as it uses convolution of signals and filters to generate invariant features that are passed to the next layer. Multiple pooling operations aggregate the feature maps obtained from the convolution layers to an extreme spatial scale, which is a key factor in increasing the overall efficiency of the CNN architecture. The accuracy graph was obtained with 80% training data and 20% testing data while varying the number of epochs. When all 14 electrode values and two classes (0 and 1) are used, the confusion matrix obtained by CNN is shown in Fig 8, where the x-axis depicts the true class labels while the y-axis depicts CNN class predictions. The correct classifications are shown along the main diagonal, while all other entries show misclassifications.
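The layout of such a confusion matrix can be sketched as follows, with rows indexed by the predicted class and columns by the true class, matching the axis description above. The labels in the example are toy values chosen for illustration and are not results from the experiments.

```python
def confusion_matrix(y_true, y_pred, n_classes=2):
    """matrix[predicted][true] counts samples; the main diagonal holds correct ones."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[p][t] += 1
    return matrix

# Toy labels (hypothetical), with 0 and 1 as the two classes.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred)
correct = sum(cm[i][i] for i in range(len(cm)))
print(cm, correct)  # [[1, 1], [1, 2]] 3
```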