Predicting the baccalaureate students admission: The influence of teacher and administration

In recent years, a large amount of data on education in Morocco has been generated following the introduction of the Education Information System in 2013, and this data can be used to study baccalaureate results. According to statistics from the Moroccan Ministry of Education, the baccalaureate pass rate did not exceed 60% before 2017; from 2018 onward, results began to rise due to several changes in the Moroccan educational system. However, 67,000 students dropped out of qualifying secondary school in 2019-2020, a rate of 7.4%. In this paper, the challenge is to build a model, using several machine learning techniques, that predicts the results of the current school year from the results of previous years. This will help decision-makers improve students’ performance and explore the important factors that influence the prediction of end-of-year outcomes.


INTRODUCTION
Over the past twenty years, the Moroccan Ministry of National Education has undergone several reforms: the National Charter for Education and Training from 2000 to 2008, the Emergency Program from 2009 to 2013, and Framework Law 51.17, which codified the Strategic Vision 2015-2030. The Program for International Student Assessment (PISA 2018) [1], an international assessment of 15-year-old students' abilities in reading, mathematics, and science, placed Morocco 75th out of 79 participating countries. Moroccan students in the fourth year of primary school (CM1) and the second year of secondary school ranked among the last five countries in the 2019 edition of TIMSS, which assesses knowledge in mathematics and science. The evaluation of students is one of the most important axes of the previous reforms. In this research, we focus on exploiting the baccalaureate results of recent school years to predict the results of the current school year. We need to build a model that predicts the results of baccalaureate students as soon as the results of the first semester of the current school year are issued. The predicted results will be handed over to the departments concerned so that they can identify the students who urgently need school support. This enables the administration and teachers to know which students the model predicts will fail, so that early intervention can support them academically, psychologically or socially. The Moroccan baccalaureate comprises two levels: the first year, known as the regional year, and the second baccalaureate year.
The rest of this paper is organized as follows. The second part presents the reasons for the research. The third cites related works. The fourth refines the dataset and configures it for use by the selected algorithms. The fifth implements the models by experimenting with a set of algorithms used in forecasting. We conclude the paper with notes and suggestions that may be useful for further research on the topic.

RESEARCH REASONS
The art of predicting student performance is highly beneficial to everyone especially to the educational administrators and students [2]. Early prediction of the students' academic performance at the end of a training course is one of the challenges in educational centers [3]. By analyzing students' performance, a strategic program can be well planned during their period of studies in an institution [4].
Baccalaureate statistics show tangible progress in the results: the pass rate moved from 60% in 2016 to 78% in 2019 [5]. This development is due to several reasons, including:
- The development of the «MASSAR» information system to manage schooling and make use of results.
- The administration's intention to fight cheating in exams.
- The framework of expanded regionalization launched in 2016, which gives more powers to the regional academies of education and training to monitor educational matters closely while encouraging competition among those concerned.
- Framework Law 51.17, which adopted nine projects all aimed at improving educational work.
The intervention of officials in the academic field to raise students' level of achievement will inevitably lead to an increase in school results. There are differences in results between institutions, directorates and academies. For example, the Boulemane directorate of the Fez-Meknes academy has ranked first nationally over the past three years with a pass rate above 90%, even though the region is rural, vast, located in the Atlas Mountains, and subject to periods of severe cold. The reason, according to official sources, is that civil society associations provide school support to baccalaureate candidates, especially as the exams approach.
The aim of the research is to help decision makers intervene early to support students expected to fail. Teachers can also focus their efforts and identify students with weaknesses in particular subjects, in order to address them with support inside or outside the classroom. The administration can inform parents that their role is to provide material and psychological support to their children. Likewise, students who learn that they are not expected to succeed may increase their efforts and perseverance. Directorates of national education, institutions and educators can also determine their school support needs in a timely manner, according to the subjects whose marks put results at risk, in order to achieve success at the end of the school year.
The ability to predict the educational results of students is very important in the educational system. Recently, machine learning techniques have been applied to analyze educational data focused on retention and graduation rates to develop a more efficient and responsive system to support students.
The national program for assessing learning, PNEA 2016 [6], focused on the quality of teaching at the national level through targeted questions concerning the first year of the baccalaureate. The study examined students' opinions and their satisfaction with the level of teaching and supervision, and derived the shares of influence on learning achievement attributable to the administration and to the student. On this basis, the administration's cooperation with teachers, together with other factors, may contribute about 20% to the success rate at the end of the year, while the remaining 80% depends on the student's efforts [7]. The administration will be able to identify the students whom the model predicts will fail and to intervene with academic and psychological support, whereas social and economic support remains the prerogative of other parties. This approach should enable secondary school management to develop an educational policy that more easily supports students at risk and increases the motivation of students expected to achieve high marks.
Predicting successful and unsuccessful students at an early stage of the degree program helps academia not only to concentrate more on the bright students but also to put more effort into developing programs for the weaker ones, in order to improve their progress while attempting to avoid student dropouts [8].

RELATED WORKS
Predicting student performance is an important application of educational data mining, as stated in the paper "Predicting Critical Courses Affecting Students Performance: A Case Study" by Yasmeen Altujjar, Wejdan Altamimi, Isra Al-Turaiki and Muna Al-Razgan [9]. In the study "Review on Predicting Students Graduation Time Using Machine Learning Algorithms" [10], Nurafifah Mohammad Suhaimi from Malaysia compared students' performance in terms of gender and found that male and female students study in different ways. In addition, some researchers considered the age factor and concluded that, with increasing age, physical changes occur in the brain that make it more difficult to remember or learn efficiently. K. Juliani and A. Mewati [11] used the Apriori algorithm to compare students' performance in common courses at the undergraduate and postgraduate levels; they discovered associations and then identified factors that determined students' chances of success or failure, such as syllabus plan, student's interest, and teaching and evaluation techniques. B. Baradwaj and S. Pal (2011) [12] performed a study using Bayesian classification on Bachelor of Computer Applications students from Awadh University in India; the authors found that students' performance is more strongly correlated with elements other than students' effort, such as family income, students' routine and other factors. Strecht, Cruz, Soares, Mendes-Moreira, and Abreu (2015) [13] predicted students' success or failure and grade in a course using social variables such as age, sex, marital status, nationality, displacement (whether the student lived outside the district), scholarship, special needs, type of admission, type of student (regular, mobility, extraordinary), and status of student (ordinary, employed, athlete, etc.).
ElGamal (2013) [14] predicted students' grades in a programming course by considering different factors such as the student's mathematical background. Huang and Fang (2013) [15] predicted course performance based on students' performance in prerequisite courses and midterm examinations. Romero, Lopez, Luna, and Ventura (2013) [16] investigated the appropriateness of quantitative, qualitative and social network information about forum usage, as well as the appropriateness of classical classification and clustering algorithms, for predicting students' success or failure in a course. Arnold [20] attempted to predict student success using linear regression models and a neural network; the results of that research indicate that logistic regression models do not predict student behavior as well as artificial neural network models. Thomas and Hass (2001) [21] compared the performance of three different data mining techniques for predicting students' behavior: neural networks, mass algorithms and decision trees, with the neural network model achieving the best results. Delgado (2006) [22] used neural networks to predict student success in exams, expressed as binary classes (success or failure).

DATA PREPARATION AND PROCESSING
Big data processing is a crucial task for many researchers, administrators, organizations and companies that collect and analyze huge amounts of data. Data preparation is highly recommended for many reasons, such as dataset quality, the data analysis process, and the ability to apply algorithms that remove noisy and missing data and increase data reliability; in short, high-quality models require high-quality data [23]. Data processing transforms the data into a basic form that is easy to work with.

Data Understanding
For a specific data analysis project, the first step is to understand the data requirements of the project in conjunction with the problem definitions and expected objectives [24].
The main objective is to conduct experiments on baccalaureate results using data mining techniques and to derive knowledge that will prove useful to the Directorate of the Ministry of National Education in El Hajeb, which is our field of research, so that it can make the decisions necessary to increase the success rate.
The dataset used consists of the baccalaureate results for the previous school year; we will use it in a predictive model for the academic results of the first semester of the current school year. Two aspects of student performance are relied upon: the regional average for the first baccalaureate year (regional) and the continuous monitoring average for the first semester of the current second baccalaureate year. We use the exam dataset for the two years, collected from the MASSAR system. This dataset has 8 features, 1 target, and 2024 records, described below:

Data Preparation
This step is very important for any data initialization process, as it specifies the data to be extracted and covers collecting data from the source, preparing it and preprocessing it so that it is consistent and the data analysis process is easier. This step should be carried out before applying analysis and modeling processes. For most educational data, preprocessing should include the following steps:
Data translation: digitizing records by changing names to numbers. This step is important for understanding the data and simplifies the process of using and merging it. We convert the binary columns into numerical columns (one-hot encoding).
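The data translation step can be illustrated with a minimal plain-Python sketch of one-hot encoding. The records and the helper name `one_hot` are hypothetical and only echo the feature names mentioned in this paper; in practice a library routine such as a pandas or scikit-learn encoder would normally be used.

```python
def one_hot(records, column):
    """Return copies of the records with `column` replaced by 0/1 indicator columns."""
    categories = sorted({row[column] for row in records})
    encoded = []
    for row in records:
        # Keep every other field unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per category value.
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical records (assumption): not taken from the MASSAR dataset.
students = [
    {"gender": "F", "reg_average": 12.5},
    {"gender": "M", "reg_average": 9.75},
    {"gender": "F", "reg_average": 15.0},
]

encoded_students = one_hot(students, "gender")
for row in encoded_students:
    print(row)
```

Each category becomes its own numeric column, so no artificial ordering is imposed on the original labels.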
Data cleaning: standardizing the values used to indicate the same thing (for example, some institutions store grades as letters while others store them as numeric values), removing unreasonable, repeated or contradictory values, and filling missing data with reasonable values. It also means examining unusual situations, such as a student obtaining marks in the national exam that far exceed the marks from continuous monitoring (outlier points). Outlier points, like rare values in categorical variables, tend to cause over-fitting of the model.
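A minimal sketch of two of the cleaning operations described above, in plain Python. The marks, the column name `national_exam`, the median fill rule and the 6-point gap threshold are all assumptions chosen for illustration, not values used in this study.

```python
from statistics import median


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fallback = median(observed)
    return [fallback if v is None else v for v in values]


def flag_suspicious(cc_marks, exam_marks, max_gap=6.0):
    """Return indices where the exam mark exceeds continuous monitoring by more than max_gap."""
    return [
        i
        for i, (cc, exam) in enumerate(zip(cc_marks, exam_marks))
        if exam - cc > max_gap
    ]


# Hypothetical marks on a 0-20 scale (assumption).
cc_average = [11.0, None, 14.5, 8.0, 12.0]
national_exam = [10.5, 12.0, 15.0, 16.5, 11.0]

cc_clean = fill_missing(cc_average)
suspicious = flag_suspicious(cc_clean, national_exam)
print(cc_clean)
print(suspicious)
```

Flagged records are only candidates for review; whether to drop or keep them is a separate decision.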
Data reduction: retaining the data that provides knowledge usable in the analysis. The explosive growth in the electronic data of qualifying secondary education institutions creates the need to extract useful information from this large amount of data. Data reduction has been described as "the process used for transforming raw data compiled by education systems into useful information that could be used by lecturers to take corrective actions and answer research questions" [25]. It is probably better for researchers to emphasize the specific ways in which variable importance is operationalized [26]. This step includes checking which columns are relevant and dropping irrelevant columns (feature selection). Further analysis shows that students' grades in the first-semester exam, as well as the regional average from the first baccalaureate year, have a significant impact on the overall success of studies. On the other hand, the gender of students and nb_day_born are not of great importance for predicting student success. We therefore keep only the regional average (reg_average) for the first baccalaureate year, the continuous monitoring average (cc_average) for the second year, and the end-of-year result (ga_average). We note a correlation between the continuous monitoring average and the general average: students who score well in continuous monitoring have a high pass rate. Understanding the relationship between variables is useful because the value of one variable can be used to predict the value of the other; a positive correlation indicates that as one variable changes in value, the other tends to change in the same direction. In our case, cc_average had an 83% effect on the prediction of ga_average.
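The relationship between two columns can be checked with the Pearson correlation coefficient. The sketch below computes it from first principles; the five pairs of marks are hypothetical and are not the study's data, so the resulting value is illustrative only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical marks (assumption): continuous monitoring vs. general average.
cc_average = [8.0, 10.5, 12.0, 14.0, 16.5]
ga_average = [7.5, 10.0, 12.5, 13.0, 16.0]

r = pearson(cc_average, ga_average)
print(round(r, 3))
```

A value close to 1 means the two averages rise together, which is the pattern described for cc_average and ga_average.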
Several tools are available for preparing the dataset, including Python libraries such as pandas, NumPy, and scikit-learn. In Figure 2, all the features have approximately normal rather than skewed distributions; we therefore have no difficulty using a regression model with machine learning algorithms. Several factors must be considered when choosing between algorithms. Most of them work well with small datasets, whereas deep neural networks are recommended when a large amount of data is available, and they require time and hardware to train. The objective is to find the students' results for the school year. The problem therefore falls under supervised learning and is represented by a linear regression model. Important considerations when choosing an algorithm include the problem statement and the kind of output we want, the type and size of the data, the available computational time, the number of features, and other observations in the data. We must choose the algorithms and hyperparameters that allow the model to make good predictions on new data. Machine learning is a set of methods that make it possible to build, from data, regression models for prediction and decision-making. We work with various machine learning algorithms and an artificial neural network with the aim of finding the best predictive models for the end-of-year results and of identifying the factors which have had a decisive impact on students' overall success.

Deployment
Our main aim is to build a linear regression model with two variables that gives a good prediction of the final exam result. The variables on which we base our predictions are the regional average and the monitoring average, referred to as X = (x1, x2). The variable we are predicting is the general average, referred to as Ŷ.
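Written out, a two-predictor linear model takes the standard form below. The coefficient symbols β0, β1, β2 and the per-student index i are notation assumed here for illustration, since the text names only X = (x1, x2) and Ŷ. The least-squares criterion shown is the usual way such coefficients are chosen and is not necessarily the exact procedure used in this study.

```latex
\hat{Y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2,
\qquad
(\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2)
  = \arg\min_{\beta_0, \beta_1, \beta_2}
    \sum_{i=1}^{n} \bigl( y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} \bigr)^2
```

Here x1 is the regional average, x2 is the continuous monitoring average, and y_i is the observed general average of student i.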

Deployment with many algorithms from the Scikit learn library
We applied many machine learning algorithms to the dataset using the PyCaret [27] library, which reports prediction and error rates. We use 80% of the data to train the model and 20% to test whether it has learned the data well. A linear regression model tries to find a linear relationship between the inputs and the outputs. To find this relationship, there is a method for calculating the optimal coefficients of the regression formula, which are then applied to the inputs to predict the output. Unless our data lies on a perfectly straight line, the model will not hit all of the data points exactly. Error metrics measure the differences between predicted and actual values, called residuals; however, we cannot know how much the error has contributed to the discrepancy. In our case, the average of the residuals is small, which implies that the model does a good job of predicting. For building a prediction model, many experts use gradient boosting regression, a machine learning technique for regression and classification problems.
Table 3. The results and performance of the algorithms used by PyCaret.
The comparison of the models presented in Table 3 depends on the problem statement. Every model was trained using the default hyperparameters, and the performance metrics were evaluated using cross-validation.
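The 80/20 split, the least-squares fit and the residuals can be sketched in plain Python as follows. The data is synthetic, generated from an arbitrary rule plus noise, and the helper names `solve_linear_system`, `fit_ols` and `predict` are hypothetical; this is not the PyCaret pipeline used in the study, only an illustration of the steps it automates.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, targets):
    """Least-squares coefficients [b0, b1, b2] for y = b0 + b1*x1 + b2*x2."""
    design = [[1.0, x1, x2] for x1, x2 in rows]
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(3)]
    return solve_linear_system(xtx, xty)  # normal equations


def predict(coefs, row):
    return coefs[0] + coefs[1] * row[0] + coefs[2] * row[1]


# Synthetic marks (assumption): the generating rule below is arbitrary.
rng = random.Random(0)
rows, targets = [], []
for _ in range(200):
    reg = rng.uniform(5, 18)
    cc = rng.uniform(5, 18)
    rows.append((reg, cc))
    targets.append(1.0 + 0.3 * reg + 0.6 * cc + rng.gauss(0, 0.5))

# The rows were generated in random order, so slicing gives a random 80/20 split.
split = int(0.8 * len(rows))
train_rows, test_rows = rows[:split], rows[split:]
train_y, test_y = targets[:split], targets[split:]

coefs = fit_ols(train_rows, train_y)
residuals = [y - predict(coefs, row) for row, y in zip(test_rows, test_y)]
mean_residual = sum(residuals) / len(residuals)
print(coefs, mean_residual)
```

A mean residual near zero on the held-out 20% indicates that the fitted line is not systematically over- or under-predicting.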
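Cross-validation can be sketched as follows: the data is divided into k folds, and each fold in turn is held out for scoring while the model is fitted on the rest. The number of folds (5), the single-predictor model and the synthetic data are assumptions made to keep the example short; the text does not state the settings used.

```python
import random


def fit_line(xs, ys):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    slope = numerator / denominator
    return mean_y - slope * mean_x, slope


def k_fold_mae(xs, ys, k=5):
    """Mean absolute error measured on each of k held-out folds."""
    indices = list(range(len(xs)))
    folds = [indices[i::k] for i in range(k)]  # interleaved folds
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [abs(ys[i] - (intercept + slope * xs[i])) for i in held_out]
        scores.append(sum(errors) / len(errors))
    return scores


# Synthetic marks (assumption): the generating rule is arbitrary.
rng = random.Random(1)
cc = [rng.uniform(5, 18) for _ in range(100)]
ga = [0.5 + 0.9 * x + rng.gauss(0, 0.5) for x in cc]

fold_scores = k_fold_mae(cc, ga, k=5)
cv_mae = sum(fold_scores) / len(fold_scores)
print(fold_scores, cv_mae)
```

Averaging the per-fold scores gives a performance estimate that depends less on one particular train/test split.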

4.4.1.1. Evaluation Metrics
PyCaret makes it possible to compare models easily. We compare the models on the evaluation metrics (MAE, MSE, RMSE…) using the test data and choose the model with the best predictive result.
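For reference, the three metrics named above can be computed directly from their definitions. The four actual and predicted marks are hypothetical values chosen so that the arithmetic is easy to follow.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the marks."""
    return sqrt(mse(actual, predicted))


# Hypothetical general averages (assumption).
actual = [12.0, 9.5, 15.0, 10.0]
predicted = [11.0, 10.5, 14.0, 12.0]

print(mae(actual, predicted))   # 1.25
print(mse(actual, predicted))   # 1.75
print(rmse(actual, predicted))
```

MSE and RMSE penalize the single 2-point miss more heavily than MAE does, which is why the models can rank differently depending on the metric.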

4.4.1.2. Deployment with deep learning framework
Artificial neural networks (ANNs) represent a large class of machine learning algorithms whose architecture resembles the neurons in the human brain. Deep networks, or neural networks in general, are recommended when the available data size is large. An advantage of neural networks is their ability to detect all possible interactions between predictor variables [28]. Keras is a higher-level deep learning framework that makes it easy to create models in the Python programming language through an API. The goal of Keras is the creation and training of neural networks, rather than machine learning models in general. The Keras library has many general-purpose functions built in, such as optimizers, activation functions, and layer properties. It wraps the efficient numerical computation libraries Theano and TensorFlow and allows us to define and train neural network models in a few short lines of code. The same data used with the previous algorithms in PyCaret was applied to the Keras library and gave good results.
For each training run, we used the MSE loss function. Our model was trained for 60 epochs using the stochastic gradient descent (SGD) optimizer. Using backpropagation, the optimizer updates the weight parameters to minimize the loss function. The learning rate is a hyperparameter that controls how much the model weights change at each update. Dropout [29] is another powerful regularization technique for neural networks and deep learning models.
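The weight update performed by SGD on an MSE loss can be illustrated with a deliberately small sketch: a single linear unit trained for 60 epochs in plain Python. This is not the Keras network used in the study, whose architecture is not reproduced here; the learning rate of 0.05, the scaling of marks to the range 0 to 1, and the synthetic data are all assumptions.

```python
import random


def train_sgd(samples, targets, learning_rate=0.05, epochs=60, seed=0):
    """Fit y = w1*x1 + w2*x2 + b by per-sample gradient descent on squared error."""
    rng = random.Random(seed)
    w1 = w2 = b = 0.0
    order = list(range(len(samples)))
    history = []
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each epoch
        for i in order:
            x1, x2 = samples[i]
            error = (w1 * x1 + w2 * x2 + b) - targets[i]
            # Gradient of error**2 with respect to each parameter.
            w1 -= learning_rate * 2 * error * x1
            w2 -= learning_rate * 2 * error * x2
            b -= learning_rate * 2 * error
        epoch_mse = sum(
            ((w1 * x1 + w2 * x2 + b) - y) ** 2
            for (x1, x2), y in zip(samples, targets)
        ) / len(samples)
        history.append(epoch_mse)
    return (w1, w2, b), history


# Synthetic marks divided by 20 (assumption), generated from an arbitrary rule.
data_rng = random.Random(42)
samples = [
    (data_rng.uniform(0.25, 0.9), data_rng.uniform(0.25, 0.9)) for _ in range(200)
]
targets = [0.05 + 0.3 * x1 + 0.6 * x2 + data_rng.gauss(0, 0.02) for x1, x2 in samples]

(w1, w2, b), history = train_sgd(samples, targets)
print(w1, w2, b, history[-1])
```

A larger learning rate makes each step bigger and can cause the loss to oscillate; a smaller one slows convergence, which is why it is treated as a hyperparameter to tune.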

4.4.1.3. Definition of the ANN Model
Our model is described below. Figure 5 shows that the model converged: the two curves for training and testing come together. The MSE loss of our neural network is small.

4.4.1.4. Implementation
Classroom management, as well as academic and social interactions between students and teachers, has a direct influence on students' learning [30]. The teacher is the main pillar in providing educational support to students. Another way to assess the impact of teachers is to calculate the observed averages of the classes they teach and compare them. After extracting the predicted results from the model, we obtain a list of the students whom the model predicts will fail. The process is not complete without the concerted efforts of the various stakeholders, including managers, inspectors, mentors, and students' parents.
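Producing the list of students predicted to fail amounts to filtering the predicted general averages against a pass mark. In the sketch below, the student identifiers, the predicted values and the pass mark of 10 on a 0-20 scale are assumptions for illustration.

```python
PASS_MARK = 10.0  # assumed pass threshold on a 0-20 scale


def students_at_risk(predictions, pass_mark=PASS_MARK):
    """Return (student_id, predicted_average) pairs below the pass mark, weakest first."""
    at_risk = [
        (student_id, predicted)
        for student_id, predicted in predictions.items()
        if predicted < pass_mark
    ]
    return sorted(at_risk, key=lambda item: item[1])


# Hypothetical model output (assumption): predicted general averages per student.
predicted_ga = {"S001": 12.4, "S002": 8.9, "S003": 9.7, "S004": 15.1}

risk_list = students_at_risk(predicted_ga)
print(risk_list)
```

Sorting the weakest predictions first lets the administration and teachers prioritize the students who need support most urgently.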

CONCLUSION
In this paper, our research focused mainly on predicting students' success at the end of their studies, using their information from the first and second baccalaureate years as input variables. Among the different approaches used to develop the model, the neural network gives the best performance compared with the other techniques. The model provides an opportunity to identify aspects of the educational plan and programs that should be improved in order to motivate students to work more seriously and improve their knowledge in specific disciplines. In addition, teachers can identify and help students who are expected to fail.
Future research can examine students' social and health status and the history of the educational process to find the causes of student dropout and possible ways to limit or reduce it.