COVID-19 Detection using Deep Learning Classifiers with Histogram Equalization and Contour-Based X-Ray Images

The global health crisis caused by COVID-19 has significantly impacted both lifestyle and healthcare. Accurate and prompt medical diagnosis is crucial in combating the spread of the disease. However, the time required for laboratory interpretation and the high cost of a Computed Tomography (CT) scan can hinder accurate and prompt diagnosis of the disease. Several existing works have addressed this issue using chest X-ray (CXR) images; however, achieving high accuracy remains a challenge in this domain. In this paper, features extracted from modified CXR images, which could be more informative than those from the originals, were coupled with deep learning architectures and evaluated to address the accuracy issue. First, the original CXR images were preprocessed to generate two different sets: enhanced CXR images produced using histogram equalization and CXR contour images produced using contour-based methods. VGG16, InceptionV3, and Xception, trained on public datasets, were used as feature extractors and classifiers to classify the CXR images into three categories: healthy, pneumonia, and COVID-19. The results demonstrate that the proposed work is able to accurately differentiate CXR images based on their respective classes. The best individual model was trained using InceptionV3 with histogram equalization, achieving an accuracy of 98.25%.


Introduction
Coronavirus disease (COVID-19) is an infectious disease caused by the SARS-CoV-2 virus [1]. COVID-19 affects people differently depending on their age and medical condition: most people infected with the virus experience mild to moderate respiratory illness, while senior citizens and people with chronic medical conditions are more likely to experience severe illness [1,2]. Pneumonia is a lung infection that causes the air sacs within the lungs to become inflamed. The symptoms of pneumonia are similar to those of COVID-19, including fever, chest pain, cough, fatigue, and shortness of breath, and the condition can be fatal [3].
Deep learning (DL) has demonstrated its effectiveness in multi-class classification tasks, such as distinguishing between healthy individuals, pneumonia patients, and COVID-19 patients from chest X-ray (CXR) images [3][4][5]. Other works have utilized computed tomography (CT) scan images for COVID-19 detection, for example [6,7]. However, the accuracy of disease detection using deep learning on imbalanced CXR images remains a challenge [3]. Additional steps such as data augmentation, undersampling, and oversampling could resolve this issue [8]. Using a variety of features could enrich the information available to the classifier and thus increase detection performance [9]. On the other hand, although CT scan images have produced high sensitivity in COVID-19 detection, they have the disadvantages of high cost and exposure to radiation [10], as well as limited accessibility of CT imaging [11].
This paper presents an ensemble deep learning approach to classify CXR images as normal, pneumonia or COVID-19. Two types of features were extracted, one from enhanced images and the other from contoured images, which were then used to generate the detection models. Enhanced images provide a better view of the bone structure in CXR images [12], while contoured images represent the lung shape using contour lines, which are essential for differentiating abnormal from normal lungs [13,14]. The key contributions of this paper are as follows:
- Propose features for COVID-19 detection using the enhanced and contour-based CXR images, and
- Present a comparative analysis of VGG16, InceptionV3 and Xception for COVID-19 detection.

Methodology
The proposed approach comprises four stages (Fig. 1): (i) data acquisition, (ii) image preprocessing, (iii) classifier generation and (iv) classification.

Data acquisition
In this paper, three secondary datasets were used [3], [18]. The first dataset consists of CXR images and CT scans of various lung diseases and COVID-19 [19], while the second dataset consists of CXR images and CT scans of COVID-19 [20]. The third dataset consists of three classes of CXR images: COVID-19, pneumonia and normal [21]. Because the number of CXR images differed between classes, 196 images of each class were randomly selected, giving a total of 588 CXR images used in this paper.
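The class-balancing step described above can be sketched as follows. This is a minimal illustration, assuming each image is identified by an (identifier, label) pair; the pool sizes, label strings, seed and the function name `undersample` are hypothetical and are not taken from the datasets used in this paper.

```python
import random
from collections import defaultdict


def undersample(samples, per_class, seed=0):
    """Randomly keep `per_class` items from each class.

    `samples` is a list of (image_id, label) pairs. The seed and the
    data layout are illustrative assumptions, not details from the paper.
    """
    by_label = defaultdict(list)
    for image_id, label in samples:
        by_label[label].append(image_id)

    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        ids = by_label[label]
        if len(ids) < per_class:
            raise ValueError(f"class {label!r} has only {len(ids)} images")
        # Sample without replacement so no image is selected twice.
        balanced.extend((image_id, label) for image_id in rng.sample(ids, per_class))
    return balanced


# Hypothetical imbalanced pool of image identifiers.
pool = (
    [(f"covid_{i}", "covid19") for i in range(196)]
    + [(f"pneu_{i}", "pneumonia") for i in range(900)]
    + [(f"norm_{i}", "normal") for i in range(700)]
)
subset = undersample(pool, per_class=196)
print(len(subset))  # 588
```

Fixing the seed makes the selection reproducible, which matters when the same subset has to be reused across several models.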

Image preprocessing
Three image preprocessing tasks were conducted in this paper: (i) image resizing, (ii) histogram equalization of the resized images, and (iii) contour detection on the resized images.
All CXR images were resized to 150 x 150 pixels to standardize the pixel dimensions and render the data suitable for subsequent processing. Next, two distinct image processing techniques were applied. First, histogram equalization was conducted to enhance the contrast of abnormalities visible on the CXR. This procedure produced a set of images referred to as 'histogram equalization CXR'.
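The resizing and enhancement steps can be illustrated with a short NumPy sketch. It assumes 8-bit grayscale input and global (not adaptive) histogram equalization; nearest-neighbour interpolation is used here only as a simple stand-in, since the interpolation method is not stated. The function names and the synthetic test image are hypothetical.

```python
import numpy as np


def resize_nearest(image, size=(150, 150)):
    """Nearest-neighbour resize of a 2-D grayscale image to `size` (rows, cols)."""
    rows = np.arange(size[0]) * image.shape[0] // size[0]
    cols = np.arange(size[1]) * image.shape[1] // size[1]
    return image[rows[:, None], cols]


def equalize_histogram(image):
    """Global histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # CDF value at the darkest occupied grey level
    total = image.size
    if total == cdf_min:       # constant image: nothing to stretch
        return image.copy()
    # Map each grey level through the normalised CDF onto the range 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]


# Synthetic low-contrast image with grey levels confined to 100..140.
rng = np.random.default_rng(0)
raw = rng.integers(100, 141, size=(300, 240), dtype=np.uint8)
small = resize_nearest(raw)
enhanced = equalize_histogram(small)
print(small.shape, int(enhanced.min()), int(enhanced.max()))
```

After equalization the darkest occupied level maps to 0 and the brightest to 255, so the narrow input range is stretched over the full 8-bit scale.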
Second, a contour detection algorithm was employed to identify and highlight the edges and structural features within the CXR. The CXR was converted to grayscale before binary thresholding was applied to emphasize the contours. These processed images were denoted as 'contour-based CXR'.
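A minimal sketch of the grayscale, thresholding and contour steps is given below. The specific contour algorithm and threshold value are not stated in the text, so the threshold of 127 and the simple boundary rule (a foreground pixel with at least one background 4-neighbour) are assumptions chosen for illustration; all function names are hypothetical.

```python
import numpy as np


def to_grayscale(rgb):
    """Luminance-weighted grayscale conversion (ITU-R BT.601 weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    return np.round(rgb[..., :3] @ weights).astype(np.uint8)


def binary_threshold(gray, threshold=127):
    """Pixels strictly above `threshold` become foreground (True)."""
    return gray > threshold


def contour_mask(binary):
    """Mark foreground pixels that touch background in a 4-neighbourhood."""
    padded = np.pad(binary, 1, constant_values=False)
    interior = (
        padded[:-2, 1:-1] & padded[2:, 1:-1]    # up and down neighbours
        & padded[1:-1, :-2] & padded[1:-1, 2:]  # left and right neighbours
    )
    return binary & ~interior


# Toy image: a bright 6 x 6 square on a dark 10 x 10 background.
rgb = np.zeros((10, 10, 3), dtype=np.uint8)
rgb[2:8, 2:8] = 200
edges = contour_mask(binary_threshold(to_grayscale(rgb)))
print(int(edges.sum()))  # 20 boundary pixels of the 6 x 6 square
```

Only the outline of the bright region survives, which is the property the contour-based CXR relies on to represent lung shape.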

Classifiers generation
In this paper, three distinct convolutional neural network (CNN) architectures were utilized, namely VGG16, InceptionV3, and Xception. The pre-trained CNN architectures were employed as fixed feature extractors to generate the new models. To prevent overfitting, Dropout (DO) layers were added between the Fully Connected (FC) layers. More details on the CNN architectures used can be found in [3], [18]. Classifiers were generated for two distinct categories of CXR images: the histogram equalization CXR and the contour-based CXR. In total, six classifiers were generated and tested in this paper.
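The role of the Dropout layer between the FC layers can be illustrated with a small NumPy forward pass over features that are assumed to come from a frozen, pre-trained extractor. This is only a sketch: the feature size, hidden size, dropout rate of 0.5, random weights and all function names are hypothetical, and the actual layer configuration is the one described in [3], [18].

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(x):
    shifted = x - x.max(axis=1, keepdims=True)  # subtract row max for stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def dropout(x, rate, rng, training):
    """Inverted dropout: zero a fraction `rate` of units and rescale the rest."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)


def head_forward(features, params, rng, training=False, rate=0.5):
    """FC -> ReLU -> Dropout -> FC -> softmax on top of frozen CNN features."""
    hidden = relu(features @ params["w1"] + params["b1"])
    hidden = dropout(hidden, rate, rng, training)
    logits = hidden @ params["w2"] + params["b2"]
    return softmax(logits)


rng = np.random.default_rng(0)
n_features, n_hidden, n_classes = 512, 64, 3  # illustrative sizes only
params = {
    "w1": rng.normal(0.0, 0.05, (n_features, n_hidden)),
    "b1": np.zeros(n_hidden),
    "w2": rng.normal(0.0, 0.05, (n_hidden, n_classes)),
    "b2": np.zeros(n_classes),
}
features = rng.normal(size=(4, n_features))   # stand-in for extractor output
probs = head_forward(features, params, rng, training=True)
print(probs.shape)  # (4, 3)
```

Because the extractor is fixed, only the parameters of this head would be updated during training; dropout is active in training mode and disabled at inference.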

Results and Discussion
Three sets of experiments were conducted in this paper. First, the performances of models generated from different types of CXR images with VGG16 were assessed. Second, the performances of models generated from different CNN architectures were assessed. Third, comparisons with existing works were performed. Each of these is described in Sections 3.1 to 3.3.

Performance of models generated from different types of CXR images
The aim of this set of experiments was to identify the best types of CXR images for detecting COVID-19, pneumonia and normal images. The VGG16 architecture was selected as it was shown to produce the best performance in [3]. Three types of CXR images were used: the original CXR, the histogram equalization CXR and the contour-based CXR. The batch size was set to 50, the number of epochs to 100, and the learning rate to 0.0001. These settings follow the previous work presented in [3]. Ten-fold cross validation (TCV) was employed. Table 1 shows the performance of the generated models. In Table 1, X represents the model trained on the original CXR images, A the model trained on the histogram equalization CXR images and D the model trained on the contour-based CXR images. In this experiment, the highest classification accuracy was achieved by the model using histogram equalization, with an accuracy of 94.74%, followed by the model using contour-based features, which attained an accuracy of 92.98%. The lowest performance was observed in the model trained on the original images, with an accuracy of 91.23%.
The VGG16 architecture demonstrates strong performance in classifying COVID-19 and normal cases, with all models achieving a sensitivity higher than 90% for these classes. However, it did not perform as well in identifying pneumonia, possibly due to the indistinct shape of the lungs observed in the original CXR images of pneumonia patients.
The incorporation of histogram equalization and contour-based CXR images improved the sensitivity and overall accuracy. The contour lines drawn on the CXR images proved effective in distinguishing between normal and abnormal lung structures, while histogram equalization enhanced the visualization of the CXR images. As a result, improved detection performance was observed. Therefore, only the histogram equalization and the contour-based CXR images were considered in the following experiment.
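The ten-fold cross-validation protocol used in these experiments can be sketched as a simple index splitter. The shuffled, unstratified split, the seed and the function name `k_fold_indices` are assumptions made for illustration; the text does not state whether the folds were stratified by class. The configuration dictionary only restates the settings quoted above.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for shuffled k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Distribute the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Training settings quoted in the text.
config = {"batch_size": 50, "epochs": 100, "learning_rate": 1e-4}

fold_sizes = [len(test) for _, test in k_fold_indices(588, k=10)]
print(fold_sizes)  # [59, 59, 59, 59, 59, 59, 59, 59, 58, 58]
```

Each image appears in exactly one test fold, so every sample is used once for evaluation and nine times for training.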

Performance of different CNN architectures
The objective of this set of experiments was to identify the CNN architecture that produces the best detection performance when applied to histogram equalization and contour-based CXR images. VGG16, InceptionV3 and Xception were selected for this work. The parameters were set to the same values as detailed in the foregoing section. The performance of these CNN architectures is summarized in Table 2. In Table 2, A, B and C represent the models trained on histogram equalization CXR images with VGG16, InceptionV3 and Xception respectively. The D, E and F models, on the other hand, represent the models trained on the contour-based CXR images with VGG16, InceptionV3 and Xception respectively. From the table, the highest classification accuracy was achieved by the InceptionV3 model trained on histogram equalization images, recording an accuracy of 98.25% and exceeding 90% for both sensitivity and specificity for all classes considered. Similar performance was also recorded by InceptionV3 with contour-based CXR images, although the accuracy was slightly lower at 96.49%.
InceptionV3 produced consistent detection results when applied to both the histogram equalization and contour-based CXR images. Meanwhile, the original CXR images produced the least desirable results, recording a sensitivity below 90% in the detection of pneumonia CXR images. To sum up, all six models demonstrated strong classification capabilities for distinguishing between COVID-19, normal, and pneumonia CXR images, with each model achieving an accuracy higher than 92%.
Among the existing works, the one discussed in [3] is the most comparable, owing to its use of a similar dataset. The ensemble-based classifier reported there incorporates Contrast Limited Adaptive Histogram Equalization (CLAHE), the edge detection algorithm proposed in that work, and the VGG16 and InceptionV3 architectures, resulting in average accuracy, sensitivity, and specificity of 97.90%, 97.89%, and 98.95%, respectively. The findings presented in this paper demonstrate higher performance, achieving an average accuracy of 98.25%, average sensitivity of 98.24% and average specificity of 99.12%.
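For reference, average (macro) sensitivity and specificity in a three-class setting are typically computed one-vs-rest from a confusion matrix, as sketched below. The confusion matrix is hypothetical and is not taken from Table 1 or Table 2; the function name `macro_metrics` is likewise an illustrative choice.

```python
def macro_metrics(confusion):
    """Return (accuracy, mean sensitivity, mean specificity).

    `confusion[i][j]` counts samples of true class i predicted as class j.
    Sensitivity and specificity are computed one-vs-rest and then averaged.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))
    sensitivities, specificities = [], []
    for c in range(n_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp
        tn = total - tp - fn - fp
        sensitivities.append(tp / (tp + fn))
        specificities.append(tn / (tn + fp))
    return (
        correct / total,
        sum(sensitivities) / n_classes,
        sum(specificities) / n_classes,
    )


# Hypothetical matrix (rows and columns: COVID-19, normal, pneumonia).
confusion = [
    [19, 0, 0],
    [0, 19, 0],
    [0, 1, 18],
]
accuracy, sensitivity, specificity = macro_metrics(confusion)
print(f"{accuracy:.4f} {sensitivity:.4f} {specificity:.4f}")  # 0.9825 0.9825 0.9912
```

In this toy case a single pneumonia image misclassified as normal lowers the sensitivity of the pneumonia class and the specificity of the normal class, while the COVID-19 class is unaffected.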

Conclusion
In this paper, two types of CXR images, histogram equalization and contour-based CXR, were generated and used to distinguish COVID-19 from other lung diseases and normal images. The results show an improvement in classification performance over the original CXR images. In the experiments conducted on images acquired from the public datasets, InceptionV3 produced the most consistent performance across both image types, with the best accuracy of 98.25% obtained on the histogram equalization CXR images.
Future investigations should consider exploring a range of image preprocessing methods to optimize the CXR image quality for subsequent feature extraction and classification.Additionally, the segmentation of critical regions within CXR images and the utilization of the ensemble classifiers should also be considered if better classification performance is the desired outcome.
The presented work is partly funded by Universiti Malaysia Sabah through the grant SDK0191-2020.

Fig. 1. Process flow chart of the proposed approach.

Table 1. Model performance of different types of CXR images and VGG16.

Table 2. Model performance of different CNN architectures.