Road Pothole Detection System

A pothole is an open crack developed on roads owing to varied climatic situations and exposure to heavy-load trucks. Potholes are acting as one of the major causes of accidents and economic loss in the repair of vehicles. Hence this paper proposes a pothole detection system that assists drivers in avoiding potholes on the road by providing prior warnings using the YOLOV7 machine learning technique. The warning can be like a buzzer while the vehicle approaches a pothole. This concept can be expanded to create vehicles that detect humps and other road irregularities. The application depicted in this work effectively reduces the problem of increasing accidents caused by potholes by ensuring accuracy in real-time of about 94.5% at efficient complexity than state-of-art approaches.


Introduction
Road safety is one of humans' most important safety precautions to save our lives. Roadway pavements should be continuously monitored and repaired as needed to ensure the quality of the road surface. Well-maintained roads benefit the people as well as the economy of the country significantly. The World Health Organization predicts that Authorities have been concerned about the asphalt pavement distresses to prevent unfavorable situations. Pavements are susceptible to traffic load, weather, ageing, defective building materials, and a terrible drainage system, demonstrating two significant pavement failures, such as cracks and potholes. In essence, potholes are concave depressions in the road surface that need to be filled in because they cause terrible situations like accidents, unpleasant driving experiences, and malfunctioning automobiles. Every year humans lose more than one lakh lives on Indian roads, and the proportion of accidents due to potholes on the road is quite significant. The problem is exacerbated during the rainy season. Accidents occur mainly due to the coverage of potholes by water in rainy seasons. If these potholes are detected in real-time while driving, it will help vehicle drivers to avoid them and thus to escape from near danger. A pothole is characterized as surface damage to the road. It is typically a whole structure that has grown over time due to weather and transportation. Car wheels (dents in the rims), suspension components, deflated tyres, wheel alignment difficulty, and undercarriage damage can all be brought on by potholes. It demands the need for identification and classification of potholes in the pavement.
Hence, this work aims to create a pothole detection system that assists drivers in avoiding potholes on the road by providing warnings. To reduce their role in unpleasant events, authorities should repair potholes immediately.

Usage of Machine Learning Algorithms in Detection
Machine learning (ML) is a subset of artificial intelligence (AI) that enables software applications to become more accurate at predicting outcomes without explicitly programming them to do so. Machine learning algorithms predict new output values by using historical data as input. Machine learning is commonly used in recommendation engines. Popular applications include fraud detection, spam filtering, malware threat detection, business process automation (BPA), and predictive maintenance. There are three fundamental approaches in machine learning: supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. The algorithm that data scientists use is determined by the type of data they want to predict. Supervised learning: In this type of machine learning, data scientists provide algorithms with labeled training data and specify which variables the algorithm should look for correlations. The algorithm's input and output are both specified. Unsupervised learning: Algorithms that train on unlabeled data are used in machine learning. The algorithm scans data sets for any meaningful connections. The data used to train algorithms and the predictions or recommendations they produce are predetermined. Semi-supervised learning: It is a hybrid of the two preceding types of machine learning. Although data scientists may provide mostly labeled training data to an algorithm, the model can explore the data independently and develop its understanding of the data set.
Recently, machine learning techniques have been employed in the identification of potholes. Various computer vision-based learning algorithms are developed to detect potholes with higher efficiency and low complexity.

Object Detection with Convolutional Neural Network
Convolutional Neural Networks (CNN), also known as ConvNets, are a type of feed forward neural network supervised and well suited for computer vision tasks, particularly object recognition. The main advantage of CNN over neural networks is its unique structure in which sparse local connectivity between layers reduces the number of parameters, resulting in faster calculation speed. Shared weight (like a kernel filter) aids in capturing local signal properties. Convolutional Neural Networks, like standard neural networks, have multiple sequential layers in which the outputs of one layer are the inputs for the next layer. Most neural networks concepts, such as stochastic gradient descent and back propagation, are applied to CNNs. CNN accelerates and deepens training with more layers by employing three significant concepts: local receptive fields, pooling, and shared weights and biases.

CNN versus YOLO
A Convolutional neural network (CNN) is a phase characteristic of an input image. CNNs are inspired by the brain's architecture. Artificial neurons or nodes in CNNs, like neurons in the brain, take inputs, process them, and send the result as output. The input layer accepts image pixels in the form of arrays as input. There may be multiple hidden layers in CNNs that perform feature extraction from images by performing calculations. It is followed by Convolution, pooling, rectified linear units, and fully connected layers to detect and interpret patterns. They are regarded as the most influential architecture for image classification, retrieval, and detection tasks due to the high accuracy of their results.
However, it is highly complex, time-consuming and deficient to support input data that is spatially invariant.
You Only Look Once (YOLO) is a well-known model architecture and object detection algorithm. Its popularity stems from using one of the best neural network architectures to produce high accuracy and overall processing speed. YOLO is speedy. It has a processing speed of 45 frames per second (FPS). Furthermore, compared to other real-time systems, YOLO achieves more than twice the mean Average Precision (mAP), making it an excellent candidate for real-time processing. YOLO outperforms other cutting-edge models' accuracy, with few background errors at lower complexities.

Literature Review
Madhumathy P et.al., [1] proposed a work to detect humps and potholes and that information was recorded and later was stored in the database. Based on the detection of humps and potholes, the alerts were sent from the stored information in the database .Gurusamy P et.al., [2] explains the detection of the pit using pi camera attached to pit detected vehicles. Image processing techniques are used for detection of pits which gives the information to the municipal office on time. R.M.Sahu et.al., [3] proposed a work to detect potholes using ultrasonic sensors also to measure their depth and height, respectively. The proposed system captures the geographical location coordinates of the potholes and humps using a global positioning system receiver. K.Mohanprakash et.al., [4] proposed a work to notify the Municipal Corporation regarding potholes which will avoid accidents and make roads a better place to ride. This is achieved by establishing a detection system which employs an ultrasonic sensor to measure the area and depth of a pothole. This related information is then processed by the controller and the equivalent coordinates of the location obtained from a global positioning system, are sent via global system for module.
RajeshwariSundar et.al., [5] presents an intelligent traffic control system to pass emergency vehicles smoothly. Each individual vehicle is equipped with special radio frequency identification (RFID) tag (placed at a strategic location), which makes it identification of unsuitable path of pavement. J. Lin et.al., [6] proposed an texture measure based on the histogram that is extracted as the features of the image region, and the nonlinear support vector machine is built up to identify whether a target region is a pothole. Based on this, an algorithm for recognizing the potholes of the pavement is proposed. S. S. Rode et.al., [7] proposed a solution for detection of potholes and humps on the road and indicate the road maintenance authority for maintenance. The distance sensor senses the potholes and humps which is given to microcontroller. Microcontroller fetches the location of that pothole or hump by using GPS and that GPS locations are send with help of GSM. The GSM at the server part takes that locations and indicate pothole or hump on the map of that area. KshitijPawar et.al., [8] proposed a solution to recognize and detect a pothole on the road. The data was acquired from the smart phone sensors, making the approach costeffective. Efficient pothole detection is done using smart phone sensors.
Hyunwoo Song et.al., [9] proposed an efficient method to recognize a pothole on a road from the viewpoint of cost and implementation. This is a handy way because sensor data is acquired using a smart phone. To make the implementation easier, they utilized Inception V3 and Transfer Learning which gives us a very flexible way for application. The interesting aspect of this study is that general knowledge works for a specific domain problem. Koch et.al., [10] studied vision tracking of potholes of pavement for road surveys. The pothole detection was done using video sequences. The texture of potholes is compared with a reference texture of pavement to detect it. They used a kernel based tracking of pothole. The pothole is detected first and then it is tracked. Mat lab's image process toolbox was utilized for data processing. This technique gives the number of potholes detected in a recorded video.
Rajab et.al. [11] studied pavement potholes and evaluated their area through image processing. They used curve fitting on points of pothole in 2D images. Yao et.al., [12] used the second order moment operator on pavement surface images and calculated major axis, minor axis and orientation of potholes. Salari et.al., [13] used stereo vision for pavement distress evaluation. They reconstructed the road surface and aimed to calculate severity of distress in future. Cruz et.al., [14] surveyed the Kinect sensor and its application in robotics, games, and in natural interfaces, etc and conveyed important information about it. SamyakKathane et.al., [15] proposed a prototype of an IoT based pothole and hump detection system that can be integrated with the vehicle and provide timely information to maintenance authorities so that necessary steps can be taken for safety of drivers.
Each literature of study, focus on different scenario of detection and intimation to authorities, rather than enforcing for accuracy and less complexity. In real time, a model with detection at high accuracy and low processing time ensuring its reliability at all climatic situation and luminosity is required.

Proposed Architecture
Initially, the dataset is collected, and each image is explicitly annotated. Before sending the annotated data to deep learning models like the YOLO family, data is divided into training and testing samples. After training, positive weights evaluate the model's performance on testing data. The block diagram of proposed methodology of real-time pothole detection is depicted in Figure 1.

Technical Components of Proposed system
• Hardware Requirements • Laptop or computer: A 64-bit laptop with 8 GB Memory and Nvidia 1660 GPU is used for processing as YOLO needs high-quality GPU to run efficiently. • Camera: A camera is attached to the system to take pictures or videos of potholes that the system detects in the road. • Software Requirements • Google Colab: Collaboratory, sometimes called "Colab," is a data analysis and machine learning tool that enables you to integrate rich text, charts, photos, executable Python code, HTML, Latex, and more into a single Google Drive document used in coding.

Data collection
Pothole datasets are taken from Kaggle. Kaggle allows users to find datasets used in building AI models, publish datasets, and work with other data scientists and machine learning engineers. The dataset used for training impacts the models' effectiveness and dependability demanding realistic pothole photographs to be included in the dataset. Thus, the most recent pothole image dataset that is publicly available is utilized. This data set consists of 1265 training images, 401 validation images and 118 test images. Finally, it is validated to ensure the utmost accuracy in real time.

Extracting dataset
Images of potholes and non-potholes are collected and labeled to extract a dataset for pothole detection. The dataset is then split into training, validation, and testing sets. The training set is used to train the model, the validation set to tune hyper-parameters, and the testing set to evaluate model performance. The dataset is preprocessed to ensure normalization and improve model robustness. The dataset's quality is essential to ensure the accuracy of the pothole detection model.

Preprocessing of Dataset
Data preprocessing is a vital step in data mining that involves modifying, removing, or adding data before it is used to ensure and enhance performance. Garbage in, trash out is a saying that is particularly applicable to data mining and machine learning projects. With the real-time detection system, images are automatically pulled from live recordings made by the camera. They are further processed using the YOLO V7 Algorithm trained as a model.

Deep Learning Approach
The YOLO V7 algorithm is used in the proposed system. The YOLO stands for "You Only Look Once. It is a deep learning-based object detection algorithm that uses a single convolutional neural network (CNN) to simultaneously predict bounding boxes and class probabilities for objects in an image. It is essential to approach deep learning with YOLO, prepare a dataset, choose a YOLO model, fine-tune the model, evaluate the performance, and deploy it to predict new images or video streams. It is employed based on the roots of deep learning for object detection using YOLO.

Detection of potholes
Initially, load the YOLOv7 model with weights and dataset and use associated labels to predict object classes and bounding boxes for each image in the dataset. It predicts and detects the potholes as object classes using bounding boxes on the images. The YOLO algorithm attempts to predict an object's class and the bounding box that defines the object's location on the input image. It uses three numbers to identify each bounding box: Center of the bounding box ((b_{x}, b_{y})) (1) Width of the box (b_{w}) (2) Height of the box (b_{h}) Initially, YOLO determines potholes by classifying them from the rest of the other boxes, localizes their position, and learns over them to detect them. Fig 2, 3, and 4 show the visualization of potholes detected from datasets.   Initially, clone the YOLOv7 repository from GitHub and install the required dependencies. Then run the training script with the dataset and configuration settings.

ii) Create Data yaml file
In machine learning, a data YAML file is a configuration file that specifies the location and properties of a dataset used for training, testing, or validation. The YAML file typically includes information such as the dataset's location on disk, the number of classes in the dataset, and the size and format of the input images. Additionally, the data YAML file may include information on preprocessing the dataset, such as resizing or normalizing images, and data augmentation techniques like random cropping or flipping. Having a data YAML file allows for the easy configuration of the dataset for use in machine learning models, as the dataset's properties can be easily modified by changing the values in the YAML file without modifying the source code. The YOLOv7 Tiny model with fixed resolution can be trained by modifying the configuration file, preparing the dataset, downloading pre-trained weights, or training from scratch. The training script is run with the modified configuration file and trained weights to start the training process. Training progress is monitored and then evaluated on validation to hyper-tune parameters of epochs, weights, etc. It was then tested on test sets for accurate time detection of potholes yielding high accuracy of 94% at a lower cost.
iv) Testing the model    CNNs' accuracy varies depending on model complexity, training data size and quality, and testing technique. CNN promises an accuracy of 55% to 98% on realtime data.

YOLO V7
The most effective object detection algorithm is YOLOv7 (2023 Guide). Viso Suite is the only all-in-one computer vision platform that allows building, deploying and scaling all applications 10x faster. YOLO works in a single stage to detect objects by dividing the image into N grids, each SxS in size.
YOLO models have achieved stateof-the-art performance on object detection benchmarks, with an average accuracy of 50% to 96% at a significantly faster and lower computational cost.
Potholes are critical risks to road safety. The detection and identification of potholes turns to be ambiguous in rainy and unclear climatic conditions. This work developed the pothole detection system using the YOLO V7 algorithm. The decision to use YOLO V7 was excellent because the main benefit of using YOLO is its incredible speed -it can process 45 frames per second. YOLO is also aware of generalized object representation. It is one of the best object detection algorithms, with performance comparable to CNN algorithms in terms of processing speed and complexity in real time. Eventually, YOLOv7based pothole detection resulted in an accuracy of 94.5% at less computation time suggestions its implementation in real time.

Conclusion
Potholes are critical risks to road safety. The detection and identification of potholes turns to be ambiguous in rainy and unclear climatic conditions. This work developed the pothole detection system using the YOLO V7 algorithm. The decision to use YOLO V7 was excellent because the main benefit of using YOLO is its incredible speed -it can process 45 frames per second. YOLO is also aware of generalized object representation. It is one of the best object detection algorithms, with performance comparable to CNN algorithms in terms of processing speed and complexity in real time. Eventually, YOLOv7-based pothole