Earthquake prognosis using machine learning

Earthquakes are among the deadliest and riskiest natural disasters. They often occur without any warning. Their prognosis is therefore extremely important for mankind as well as the environment. In this project, machine learning techniques are applied to different elements of the research, with the aim of making a more accurate short-term prognosis of upcoming earthquakes. The algorithm used for the research is the Random Forest Classifier.


Introduction
Although the world continues to advance in science and technology, there are still a few things that human beings cannot stop, and natural disasters are one of them. Their consequences are large numbers of deaths, loss of property, and damage to the environment as well as mankind. Humans cannot avoid them, but early prediction and appropriate precautions can reduce the casualties and the loss of life as well as property. The earthquake is one of the most destructive of these disasters, and its after-effects are equally deadly because it can trigger further disasters such as landslides, tsunamis and volcanic eruptions.

Literature Review
At present, no specific technique exists for predicting an earthquake, which makes earthquakes all the more destructive. One example is the 2023 Turkey earthquake, in which three consecutive earthquakes, all of magnitude 5+, were followed by landslides and resulted in 25,000+ deaths.
In today's era, data analytics is one of the tools used to predict future outcomes from current and past observations. For example, data analytics is used in the medical field to improve patients' health. Sports, defence, marketing and education are some of the other fields where it is used. In sports, it is used in the management of sports big data [1]. In the medical field, data analytics has been used to predict the demand for oxygen, which is very important for diseases like Covid-19 [2]. Its use in defence and finance has also been reported to be helpful [3][4]. A. Moghar presented a paper in which machine learning was used to predict stocks [5]. The use of machine learning in agriculture to decide which crop should be grown is also significant [6].
Earthquake prediction using machine learning has been carried out with six machine learning algorithms. To choose the best model, the algorithms were applied separately and their accuracy levels on the training and testing datasets were compared [7,8]. In their work on earthquake damage prediction using machine learning, D. T. Nandwani et al. [9] reported how to reduce the damage caused by earthquakes. Earthquake trend prediction using a long short-term memory RNN predicts the future trend of earthquake occurrence [10,11]. SVR and ANN have been used in earthquake prediction models [12][13][14][15]. Dynamic Fourier analysis has been used to distinguish between seismic signals from natural earthquakes and mining explosions [16]. GIS technology has been used to predict landslides [17]. ANFIS has been used to predict earthquakes [18]. The use of mobile gravity data to predict earthquakes has been reported [19]. Earthquake-induced landslide susceptibility has been predicted using real-time earthquake-induced landslide data [20]. Very recently, research on post-earthquake damaged reinforced concrete has been carried out [21][22][23]. Recently, S. Gentili et al. presented a paper in which an innovative machine learning approach called NESTORE is used to predict the probability of an earthquake [24]. In the present work, a Random Forest Classifier is used to predict the magnitude of an earthquake. Three parameters were considered for the prediction: latitude, longitude and depth from the epicentre.

Methodology
Random forest is a supervised machine learning algorithm for classification and regression. The Random Forest classifier used here is based on the wisdom-of-crowds idea, which holds that many diverse models acting as a committee can outperform any of the individual constituent models. It is a well-known machine learning algorithm that falls under the umbrella of ensemble learning techniques. The method creates several decision trees, each built from a randomly selected subset of the features and samples of the training data. The final prediction is based on the consensus of the predictions of the separate decision trees, each of which is trained on a distinct subset of the training data. It is a strong method that handles high-dimensional datasets and a large number of features with ease, and it is less prone to overfitting than a single decision tree.
The dataset used in the project is taken from the United States Geological Survey [25]. The dataset includes time, latitude, longitude, depth, magnitude, magnitude type, nst, gap, dmin, rms, place, horizontalError, depthError, magError, magNst, locationSource and magSource. The parameters considered for the research are latitude, longitude, depth and magnitude. Figure 1 depicts the block diagram of the project.
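The bagging-and-voting idea described above can be illustrated with a small sketch. This is not the implementation used in the project, which does not name its library; a real project would more likely rely on a library class such as scikit-learn's RandomForestClassifier. The sketch below is a deliberately simplified toy: each "tree" is a one-split stump trained on a bootstrap sample and a single randomly chosen feature, and the forest predicts by majority vote. The function names, the class labels "4-5" and "5+", and all data values are hypothetical and synthetic.

```python
import random
from collections import Counter


def train_stump(rows, labels, feature):
    """Fit a one-split decision tree (a stump) on a single feature."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(lab != left_label for lab in left)
        errors += sum(lab != right_label for lab in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return feature, threshold, left_label, right_label


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train stumps on bootstrap samples, each using one random feature."""
    rng = random.Random(seed)
    n_rows, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n_rows) for _ in range(n_rows)]  # bootstrap
        feature = rng.randrange(n_features)  # random feature choice
        forest.append(train_stump([rows[i] for i in sample],
                                  [labels[i] for i in sample], feature))
    return forest


def predict_forest(forest, row):
    """Majority vote over the predictions of all stumps."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Synthetic (latitude, longitude, depth) rows with made-up magnitude classes.
rows = [(10, 20, 5), (12, 22, 8), (11, 21, 10), (13, 23, 12), (14, 24, 15),
        (30, 40, 60), (32, 42, 70), (31, 41, 80), (33, 43, 90), (34, 44, 100)]
labels = ["4-5"] * 5 + ["5+"] * 5

forest = train_forest(rows, labels)
print(predict_forest(forest, (9, 19, 4)))
print(predict_forest(forest, (35, 45, 110)))
```

Full random forests grow deeper trees and consider a random subset of features at every split, but the two ingredients shown here, bootstrap sampling and majority voting, are what make the ensemble less sensitive to the quirks of any single tree.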

Results and Discussion
The map in Fig. 2 shows the earthquakes that have occurred in the past throughout the world. The areas marked in red indicate where these past earthquakes occurred. Similarly, Fig. 4 and Fig. 5 show the past earthquakes in specific continents that are included in the dataset. A lot of seismic activity has taken place in Europe, and a lot of earthquakes have also occurred in South America. Together, these maps give a clear picture of the earthquakes that have occurred before. All of the maps were generated with Basemap (a matplotlib extension).

Fig. 5. Past Earthquakes in South America
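Before a continent-level map can be drawn, the events for that region have to be selected from the worldwide dataset. The paper does not describe how this was done, so the following is only a minimal sketch of one plausible approach: filtering events by a latitude/longitude bounding box. The bounding-box values are rough, illustrative assumptions, the function name events_in_region is hypothetical, and the three sample events are made up.

```python
# Rough, illustrative bounding boxes (assumed values, not taken from the paper):
# (min_latitude, max_latitude, min_longitude, max_longitude)
REGIONS = {
    "Europe": (35.0, 71.0, -10.0, 40.0),
    "South America": (-56.0, 13.0, -82.0, -34.0),
}


def events_in_region(events, region):
    """Return the events whose coordinates fall inside the region's box."""
    min_lat, max_lat, min_lon, max_lon = REGIONS[region]
    return [
        event for event in events
        if min_lat <= event["latitude"] <= max_lat
        and min_lon <= event["longitude"] <= max_lon
    ]


# Made-up sample events for illustration only.
events = [
    {"latitude": 41.0, "longitude": 29.0, "magnitude": 5.1},
    {"latitude": -33.4, "longitude": -70.6, "magnitude": 4.7},
    {"latitude": 35.7, "longitude": 139.7, "magnitude": 6.0},
]

for region in REGIONS:
    print(region, len(events_in_region(events, region)))
```

A rectangular box is only an approximation of a continent's outline; the filtered latitude and longitude lists can then be passed to whichever plotting library draws the map.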
Given the latitude, longitude and depth as inputs, the model predicts the magnitude of the earthquake that would occur at that place. A study of the underground formation of fractures and faults would help to decide the depth for a specific place. The accuracy of the model was 62.90%. The F1 score of the model was 0.60, which indicates that the model works moderately well. The values for precision and recall are 0.59 and 0.62 respectively. A precision of 0.59 means that, out of all the positive predictions made by the model, only 59% were correct. A recall of 0.62 means that the model correctly identified 62% of the actual positive instances. Fig. 8 shows the performance metrics of the model.
The earthquake that occurred recently in Turkey on 23/2/2023 had a magnitude of 5.4 on the Richter scale. Fig. 11 shows the previous earthquakes that have occurred in Turkey. The F1 score and precision are shown in the following graphs. The above figures show the magnitude of past earthquakes against their number of occurrences. The majority of the earthquakes occurred in the magnitude range 4 to 5 on the Richter scale.
When the same model was applied to the Himalayan range, using the dataset of previous years, its accuracy was 67.82%, as shown in Fig. 14. After entering the coordinates and depth of the latest earthquake in the Himalayan range, which occurred in Bhadarwah on 13/06/2023, the model reported a magnitude of 4 on the Richter scale, which is again reasonably close to the recorded magnitude of 5. Fig. 15 shows the past earthquakes in the Himalayan range. Fig. 16 and Fig. 17 show, respectively, the F1 score, precision and recall, and the magnitude against the number of occurrences, for the Himalayan dataset.
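The accuracy, precision, recall and F1 score quoted above can be computed from the true and predicted magnitude classes. The sketch below assumes a multi-class setting with support-weighted averaging across classes; the paper does not state which averaging was used, so this is an assumption. With weighted averaging, recall is always equal to accuracy, which would be consistent with the reported recall of 0.62 and accuracy of 62.90%. The function name weighted_scores and the class labels are hypothetical, and the sample labels are made up.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall and F1."""
    support = Counter(y_true)
    total = len(y_true)
    precision = recall = f1 = 0.0
    for label in sorted(support):
        true_pos = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        # Per-class scores; a class that is never predicted scores 0.
        p_c = true_pos / predicted if predicted else 0.0
        r_c = true_pos / support[label]
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[label] / total  # share of true samples in this class
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    return accuracy, precision, recall, f1


# Made-up magnitude classes for illustration only.
y_true = ["4-5", "4-5", "4-5", "5-6", "5-6", "6+"]
y_pred = ["4-5", "4-5", "5-6", "5-6", "4-5", "4-5"]

accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

In this toy example the class "6+" is never predicted, which pulls the weighted precision and F1 score below the accuracy, the same pattern as in the reported figures.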
Hence, it is clear how important earthquake prediction remains, as even today thousands of people die due to this disaster. The accuracy of the model could be increased further by including more geological factors, such as fault density, as well as construction datasets.

Conclusion
This research paper presented a model for earthquake magnitude prediction using seismic data. The dataset includes the latitude, longitude, depth and magnitude of earthquakes that have occurred worldwide. The model takes latitude, longitude and depth as inputs to predict the earthquake's magnitude. The model's efficiency was tested through case studies of earthquake data from Turkey and the Himalayan range, and it was found to predict the magnitude of earthquakes in these regions with a moderate degree of precision and recall. This study highlights the potential of using such models to predict an earthquake's magnitude from seismic data. The model's predictions can be useful in real-time earthquake monitoring systems to help mitigate the damage caused by earthquakes and save lives.

Acknowledgement
We want to thank the honourable Prof. (Dr.) R. M. Jalnekar, Director of the Vishwakarma Institute of Technology (VIT), Pune, and Prof. (Dr.) C. M. Mahajan, HOD, for their encouragement and strong moral support. Additionally, we would like to express our gratitude to Dr. Sachin S. Sawant, our project's mentor, for his invaluable advice.