Modeling of child labour exploitation status in Indonesia using multilevel binary logistic regression

The Indonesian constitution guarantees the right of the child to rest and leisure and to engage in play and recreational activities appropriate to the child's age, so children should not be working. Employers are also prohibited from employing children. However, many children work because of poverty, even though child labour is close to exploitation. Theoretically, individual and contextual factors affect the exploitation status of child labour. This study aims to analyze the variables that influence the exploitation of child labour in Indonesia based on data from the National Socio-Economic Survey (Susenas) in 2018. The random effect test shows that there are differences between regencies/municipalities, so multilevel binary logistic regression performs better than one-level binary logistic regression. More than 80 percent of child labourers are exploited in terms of education and working hours. Variables that significantly influence the exploitation status of child labour at the individual level are gender, the occupation sector of child labour, and the occupation sector of the household head. Meanwhile, poverty rates and mean years of schooling significantly influence the exploitation status of child labour at the regional level.

Child workers are different from child labour. Child workers are children who are involved in working activities during the reference period, while child labour refers to children who work below the minimum age permitted by law for a certain time period or who are involved in hazardous work [1]. Hazardous work can be identified from working hours. Child labourers tend to work outside their families. They are vulnerable to exploitation and to work that is not appropriate for their age and that can interfere with their education and their social and mental life. This reduces their chances of a better life in the future [2]. Child labour is an obstacle to children growing up well and prosperous. As many as 59.3 percent of child labourers aged 10-17 years in 2018 were no longer in school.
There is not enough information to tell whether child labour causes children to leave school or whether children become child labourers because they have left school. However, these data show that more than half of child labourers have stopped their education although they should still be in school. With a low level of education, they find it difficult to get a decent job as adults and are more likely to end up in menial, poorly paid jobs.
Based on Law Number 23 of 2002 concerning Child Protection, children are people who are under 18 years old, including babies in the womb. Every child has a right to receive protection to ensure that he/she can live, grow, and develop optimally and be protected from violence and discrimination. The law also provides that the state, society, family, and parents are responsible for child protection.
The results of the Child Labour Survey in 2009 show that 6.9 percent of children aged 5-17 years were child workers and 43.3 percent of them were child labour. This proportion was quite large: around 1.76 million children were working as child labour [1]. Susenas data from Statistics Indonesia show that the number of child labourers reached 1 million in 2016 and increased to 1.2 million in 2017, or around 3.06 percent of all children aged 5-17 years. In 2018 it decreased to 982 thousand children, or around 2.65 percent of all children aged 5-17 years. This is still a large number considering the SDGs target to end all forms of child labour by 2025. The Indonesian government also aims to eliminate child labour and has drawn up a road map towards an Indonesia free of child labour by 2022.
The problem of child labour is not only that children enter the world of work with long working hours to earn wages, but also that they are vulnerable to exploitation in terms of wages, working hours, and education, because society perceives children as weak and unable to stand up for themselves. Many employers employ child labour because children can be paid less than adult workers. Employers sometimes ignore the regulations on working hours for child labour, so many child labourers work longer than the regulations allow. This disrupts their education because their time is used up by work. The combination of heavy demands and the necessities of life for themselves and their families leads child labourers to be exploited and even to do dangerous work [3].
The exploitation of child labour refers to the existence of child abuse. Children are forced to do something to meet economic or social needs without regard for their rights [4]. Data from the Indonesian Children Protection Commission (KPAI) show that around 29 percent of child protection cases in Indonesia in 2015 were cases of child labour exploitation [5]. Although Susenas data show that the number and percentage of child labour had decreased, the percentage of exploited child labour increased from year to year. In 2016, 75.89 percent of child labourers were exploited. This increased to 86 percent in 2017 and 87 percent in 2018.
Several studies have been conducted to analyze child labour and exploitation. Based on Susenas 2017 data, about 41.7 percent of child labourers work in the agriculture, forestry, and fisheries sectors. In more detail, the pattern of occupation sectors differs between rural and urban areas. As many as 60 percent of child labourers in rural areas work in the agriculture, forestry, and fisheries sectors, while most child labourers in urban areas work in the trade sector and in factories in the manufacturing industry. Iryani and Priyarsono (2013) analyzed the severity of exploitation of child workers and concluded that DKI Jakarta, Banten, and West Java Provinces had high exploitation severity. Using logistic regression analysis, they concluded that the education of the household head affects exploitation in terms of working hours and education, while gender affects exploitation in terms of wages [3]. Dewi Sulastri's research (2016) in the traditional gold mine of Kelian Dalam Village, Central Kalimantan, stated that the low economic level of the family, inadequate educational facilities, the influence of environment and culture, and the character of the village head lead to the exploitation of child labour [4]. Rizqa Fithriani's research (2013) in Lampung stated that poverty and gender affected the chance of children becoming child labour: children from poor families were more likely to become child labour, and the likelihood also differed by gender. The effects of poverty and economic levels on child labour were analyzed using micro and macro approaches. The micro approach used binary logistic regression and the macro approach used multilevel linear regression. The macro approach showed that an increase in community welfare reduces the number of child labourers, while an increase in the poverty rate creates child labour. The micro approach gave a different result, in which poverty is not the reason children become child labour [6].
Based on the background above, this study aims to obtain an overview of the exploitation of child labour and to analyze the variables that influence it, so that the chain of exploitation of child labour can be ended. In the problem of child labour exploitation, there is an interaction between child labourers and their community. Hence, this study proposes a new approach by employing multilevel logistic regression to capture the impact of explanatory variables at the individual and regional levels on child labour exploitation status.

Materials and data sources
This study uses data from the National Socio-Economic Survey (Susenas) Core March 2018, which was held by Statistics Indonesia. Of the 389,889 children covered in Susenas 2018, 4,747 child labourers aged 10 to under 18 years were analyzed in this study. The criteria for child labour in this study follow the concept of child labour in the Child Labour Survey in Indonesia in 2009, namely (1) children aged 10-12 years who worked, regardless of the number of working hours, (2) children aged 13-14 years who worked more than 15 hours per week, and (3) children aged 15-17 years who worked more than 40 hours per week [1]. The exploitation status of child labour is viewed from two sides: the number of working hours and education. Child labourers are considered exploited in terms of working hours if they work more than 20 hours per week, and exploited in terms of education if they are not yet in school or no longer in school.
The response variable in this study is the exploitation status of child labour: exploited (code = 1) or not exploited (code = 0). The explanatory variables at the individual level (level 1) are the gender of child labour (boy, girl), occupation sector of child labour (formal, informal), region area (urban, rural), occupation sector of household head (working in the agriculture sector, working in the non-agriculture sector), and education of household head (primary education, secondary education, high education). The explanatory variables at the regional level (level 2) are the mean years of schooling and the poverty rate.
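The classification rules described above can be sketched in code. This is a minimal illustration only: the function names, the school status labels, and the choice to code a child as exploited when either the working-hours rule or the education rule applies are assumptions made for the example, not details confirmed by the survey documentation.

```python
def is_child_labour(age: int, weekly_hours: float) -> bool:
    """Apply the age and working-hours criteria for child labour described in the text."""
    if weekly_hours <= 0:
        return False  # the child did not work in the reference week
    if 10 <= age <= 12:
        return True  # any amount of work counts at this age
    if 13 <= age <= 14:
        return weekly_hours > 15
    if 15 <= age <= 17:
        return weekly_hours > 40
    return False  # outside the 10 to under 18 age range analyzed here


def exploitation_status(weekly_hours: float, school_status: str) -> int:
    """Return 1 (exploited) or 0 (not exploited).

    Assumption: being exploited on either dimension is coded as exploited.
    The school status labels are hypothetical.
    """
    exploited_hours = weekly_hours > 20
    exploited_education = school_status in {"not_yet_in_school", "no_longer_in_school"}
    return int(exploited_hours or exploited_education)


# Hypothetical example: a 16-year-old working 45 hours per week who is still in school
print(is_child_labour(16, 45), exploitation_status(45, "in_school"))
```

Under these assumptions, a child labourer is coded 0 only when working 20 hours or fewer per week while still attending school.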

Analysis method
This study uses descriptive analysis to obtain an overview of the exploitation status of child labour in Indonesia. This will be presented through tables and diagrams. The inferential analysis is multilevel binary logistic regression. It analyzes variables that influence the exploitation status of child labour at level 1 and level 2.

Binary logistic regression
Logistic regression analysis is most frequently used to determine the influence of the explanatory variable (X) on the response variable (Y) in the form of qualitative data with a nominal or ordinal scale [7]. Explanatory variables in binary logistic regression can be quantitative data with interval or ratio scale or qualitative data with nominal or ordinal scale. If the response variable has two categories, usually called "success" (Y = 1) and "failure" (Y = 0), it is called binary logistic regression. The variable Y follows the Bernoulli distribution for every single observation. The general form of the logistic regression model with p independent variables is

$$\pi(x) = \frac{\exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)}{1 + \exp(\beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p)} \tag{1}$$

where $\pi(x) = E(Y \mid x)$ is the conditional mean of Y given x. The relationship between the response variable and the explanatory variables is not linear, so it is necessary to carry out a transformation to linearize the relationship in equation (1). It is called the logit transformation, so the logit model is obtained as follows:

$$g(x) = \ln\left[\frac{\pi(x)}{1 - \pi(x)}\right] = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p \tag{2}$$

In logistic regression, the response variable is formulated as follows:

$$y = \pi(x) + \varepsilon \tag{3}$$

The error value ($\varepsilon$) has two possible values: if y = 1, then $\varepsilon = 1 - \pi(x)$ with probability $\pi(x)$, and if y = 0, then $\varepsilon = -\pi(x)$ with probability $1 - \pi(x)$. Therefore, the error ($\varepsilon$) has a binomial distribution with mean zero and variance $\pi(x)[1 - \pi(x)]$.

The method used to estimate the parameters is maximum likelihood estimation (MLE). This method generates parameter values by maximizing the probability of the observed data using the likelihood function. This function expresses the probability of the observed data as a function of the unknown parameter as follows:

$$l(\beta) = \prod_{i=1}^{n} \pi(x_i)^{y_i} \left[1 - \pi(x_i)\right]^{1 - y_i} \tag{4}$$

The estimated value of parameter $\beta$ is obtained by maximizing the likelihood function in equation (4). The maximization process is easier, mathematically, if the natural logarithm is used:

$$L(\beta) = \ln l(\beta) = \sum_{i=1}^{n} \left\{ y_i \ln \pi(x_i) + (1 - y_i) \ln\left[1 - \pi(x_i)\right] \right\} \tag{5}$$

The log-likelihood in equation (5) is differentiated with respect to $\beta$. The value of $\beta$ that makes the first derivative equal to zero is the value of $\beta$ that maximizes equation (5). It is the parameter estimate that we want.
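To illustrate how the maximum of the log-likelihood can be found numerically, the following sketch applies Newton-Raphson iterations to a logistic model with an intercept and a single predictor. The data are hypothetical (10 observations per group, with 2 and 6 successes), the function name is invented for the example, and the paper does not state which numerical algorithm its software used, so this is only one possible approach.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Newton-Raphson maximum likelihood for logit(pi) = b0 + b1 * x (one predictor)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # first derivatives of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix entries
        for xi, yi in zip(x, y):
            pi = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = pi * (1.0 - pi)
            g0 += yi - pi
            g1 += (yi - pi) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 system (information matrix) * step = (first derivatives)
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: x = 0 has 2 successes out of 10, x = 1 has 6 out of 10
x = [0] * 10 + [1] * 10
y = [1] * 2 + [0] * 8 + [1] * 6 + [0] * 4
b0_hat, b1_hat = fit_logistic(x, y)
print(b0_hat, b1_hat)
```

With a single binary predictor the estimates can be checked by hand: the intercept equals the log odds in the x = 0 group, ln(0.2/0.8), and the slope equals the log of the odds ratio between the two groups, ln(6).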

Multilevel binary logistic regression
Social research discusses the interactions between individuals and groups of people. Individuals are influenced by the groups they are in, and the communities are influenced by the individuals who make them up. Individuals who nest in groups build a hierarchical system where individuals and groups are defined at different levels in this hierarchical system. The relationship between variables with individual characteristics and group characteristics is discussed in a multilevel analysis [8].
In multilevel analysis, a sample drawn from a population can be described as a multistage sample. First, we take a sample unit from a higher level (level 2), for example, a regency/municipality, and then take a sub-unit from a lower level (level 1), namely individuals in the regency/municipality concerned. In this condition, the observed individuals are generally not independent of one another. Individuals from the same regency/municipality tend to be similar to each other. For example, the education levels of individuals in the same regency/municipality tend to be similar because they have the same school facilities. As a result, the average correlation between variables measured for individuals in the same regency/municipality will be greater than the average correlation between variables measured for individuals from different regencies/municipalities. The violation of the assumption of independence between observations makes the estimated standard errors smaller than their true values. Because of this, if we use one-level regression, the statistical tests are invalid because a variable that is not significant may be judged significant [8].
Multilevel analysis distinguishes fixed effect and random effect models based on the effect of the explanatory variables on the response variable. This effect is seen through the regression coefficient value. If the regression coefficients are the same for all observations, the model is said to have a fixed effect. If the regression coefficients differ between two or more groups, the model has a random effect. A model with a random effect that differs only in the intercept between groups is called a random intercept model, while a model with different regression coefficients (including the intercept) is called a random slope model [9]. In a model with a random slope, the effect of the explanatory variables differs from group to group.
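If the response variable in a multilevel analysis is categorical with two categories, the appropriate model is multilevel binary logistic regression. This study uses two-level binary logistic regression, where level 1 (the individual level) is the household and level 2 (the group level) is the regency/municipality. In this study, we want to see the influence of the explanatory variables at each level on the response variable without having to distinguish the size of the influence between groups. Thus, we use a multilevel binary logistic regression model with a random intercept. The multilevel binary logistic regression model with a random intercept at level 1 is [8]

$$\operatorname{logit}(\pi_{ij}) = \ln\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \beta_{0j} + \sum_{p=1}^{P} \beta_{pj} X_{pij} \tag{6}$$

where $\pi_{ij}$ is the probability of success for individual i in group j and p = 1, 2, …, P indexes the explanatory variables at level 1. The intercept in the multilevel binary logistic regression with a random intercept is assumed to differ between groups because of the influence of the explanatory variables at level 2, so

$$\beta_{0j} = \gamma_{00} + \sum_{q=1}^{Q} \gamma_{0q} Z_{qj} + u_{0j} \tag{7}$$

where $u_{0j}$ is the random effect of group j. We write $\beta_{pj} = \gamma_{p0}$ so we can combine the notation of the slopes at level 1 and level 2. We obtain

$$\operatorname{logit}(\pi_{ij}) = \gamma_{00} + \sum_{p=1}^{P} \gamma_{p0} X_{pij} + \sum_{q=1}^{Q} \gamma_{0q} Z_{qj} + u_{0j} \tag{8}$$

The stages of analysis in multilevel binary logistic regression are: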
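As a rough illustration of what a random intercept does, the sketch below computes the predicted probability for an individual from a linear predictor made of a fixed intercept, level-1 terms, level-2 terms, and a group-specific random effect. All coefficient values, covariate values, and names here are hypothetical and chosen only for the example; they are not the estimates reported in this study.

```python
import math


def predicted_probability(gamma_00, level1_coefs, x, level2_coefs, z, u_0j=0.0):
    """Inverse logit of a random-intercept linear predictor.

    gamma_00     : fixed intercept
    level1_coefs : coefficients of the individual-level variables x
    level2_coefs : coefficients of the group-level variables z
    u_0j         : random effect of group j (shifts the intercept only)
    """
    eta = gamma_00 + u_0j
    eta += sum(g * v for g, v in zip(level1_coefs, x))
    eta += sum(g * v for g, v in zip(level2_coefs, z))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients: two level-1 dummies and two level-2 continuous variables
level1 = [0.8, 2.4]
level2 = [-0.02, -0.27]

# Two individuals with identical covariates in groups with different random effects
p_low = predicted_probability(1.5, level1, [1, 0], level2, [10.0, 8.0], u_0j=-0.5)
p_high = predicted_probability(1.5, level1, [1, 0], level2, [10.0, 8.0], u_0j=0.5)
print(p_low, p_high)
```

Because only the intercept varies by group, two individuals with the same covariates differ in predicted probability solely through the random effect of their group, while the effect of each explanatory variable on the logit scale is the same in every group.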

Testing for the significance of the random effect
The random effect arises because of variation at level 2. For this reason, a goodness-of-fit test is conducted to assess whether a model with a random effect or a model without a random effect fits the data better. The test statistic used is called the deviance. It compares the -2 ln likelihoods of a logistic model without random effects and a logistic model with random effects. The deviance value is also called the likelihood ratio and is calculated by the following formula:

$$D = -2 \ln\left(\frac{L_0}{L_1}\right) = -2\left(\ln L_0 - \ln L_1\right) \tag{9}$$

where $L_0$ is the likelihood of the model without random effects and $L_1$ is the likelihood of the model with random effects. D is distributed as chi-square with one degree of freedom. If $D > \chi^2_{\alpha,1}$, then the random effect is significant. It means that the variation in the response variable is significant between groups, so the multilevel binary logistic model is more suitable than the one-level binary logistic model.
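The decision rule for the random effect test can be sketched as follows. The example uses the deviance D = 694.79 reported in the results of this study and a 10 percent significance level; the function names are made up for the illustration. It relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so only the Python standard library is needed.

```python
from statistics import NormalDist


def deviance(loglik_without_random_effect, loglik_with_random_effect):
    """D = -2 * (ln L0 - ln L1), where both arguments are log-likelihoods."""
    return -2.0 * (loglik_without_random_effect - loglik_with_random_effect)


def random_effect_test(d, alpha=0.10):
    """Compare D with the chi-square critical value for one degree of freedom."""
    # P(chi2_1 > c) = alpha  is equivalent to  P(|Z| > sqrt(c)) = alpha
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    critical = z ** 2
    return critical, d > critical


critical, significant = random_effect_test(694.79)
print(round(critical, 2), significant)
```

The critical value rounds to 2.71, matching the value the paper compares the deviance against.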

Testing for the significance of the parameter simultaneously
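The significance of the parameter estimates is tested simultaneously using the likelihood ratio with the following test statistic:

$$G = -2 \ln\left(\frac{L_0}{L_p}\right) \tag{10}$$

where $L_0$ is the likelihood of the model without explanatory variables and $L_p$ is the likelihood of the model with explanatory variables. If $G > \chi^2_{\alpha,v}$, where v is the number of parameters tested, then at least one of the explanatory variables at level 1 and level 2 significantly influences the response variable.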
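A small sketch of the simultaneous test is given below. It assumes eight degrees of freedom, which is an assumption based on counting one coefficient for each dummy or continuous explanatory variable listed in the data section (the paper does not state the degrees of freedom explicitly in the surviving text). For an even number of degrees of freedom the chi-square upper-tail probability has a closed form, so no external library is required. The function names are hypothetical.

```python
import math


def likelihood_ratio_statistic(loglik_null, loglik_full):
    """G = -2 * (ln L0 - ln Lp), computed from the two log-likelihoods."""
    return -2.0 * (loglik_null - loglik_full)


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k      # builds (x/2)^k / k!
        total += term
    return math.exp(-half) * total


# Upper-tail probability at the critical value 13.36 quoted in the results (assumed df = 8)
tail = chi2_upper_tail_even_df(13.36, 8)
print(round(tail, 3))
```

Under this assumption the upper-tail probability at 13.36 is approximately 0.10, which is consistent with the 90 percent confidence level used in the paper.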

Testing for the significance of the parameter partially
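If the simultaneous test of the parameters is significant, it is followed by partial testing of every explanatory variable. The partial test uses the Wald test, whose test statistic is distributed as N(0,1) or Z, as follows:

$$W = \frac{\hat{\gamma}}{SE(\hat{\gamma})} \tag{11}$$

where $\hat{\gamma}$ is the estimated coefficient of an explanatory variable and $SE(\hat{\gamma})$ is its standard error. The explanatory variable significantly influences the response variable if $|W| > Z_{\alpha/2}$.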
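The partial test can be illustrated with a few lines of code. The coefficient and standard error in the example are hypothetical values chosen for illustration, not estimates from this study, and the function name is invented for the sketch.

```python
from statistics import NormalDist


def wald_test(estimate, std_error, alpha=0.10):
    """Two-sided Wald test of a single coefficient against zero."""
    w = estimate / std_error
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(w)))
    return w, p_value, abs(w) > critical


# Hypothetical coefficient 0.78 with hypothetical standard error 0.10
w, p_value, significant = wald_test(0.78, 0.10)
print(w, p_value, significant)
```

A coefficient whose estimate is small relative to its standard error, for example an estimate equal to one standard error, would not be judged significant at the 10 percent level.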

Calculate and interpret the odds ratio
In regression analysis where the response variable is categorical, the parameters are interpreted using the odds ratio. The odds ratio, denoted OR, compares the odds of a success event (y = 1) in one category (x = 1) with the odds in another category (x = 0). The odds are the ratio between the probability of a success event (y = 1) and the probability of a failure event (y = 0). The value of OR is calculated by the following formula [7]:

$$OR = \frac{\pi(1) / \left[1 - \pi(1)\right]}{\pi(0) / \left[1 - \pi(0)\right]} = \exp(\beta)$$
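The definition of the odds ratio can be made concrete with a short sketch. The example uses the rounded descriptive proportions reported in the overview of this paper (89 percent of boys and 83 percent of girls who are child labourers are exploited). The result is a crude, unadjusted odds ratio, so it is not expected to equal the adjusted odds ratio from the multilevel model, which controls for the other variables. The function names are hypothetical.

```python
import math


def odds(probability):
    """Odds of success: probability of success divided by probability of failure."""
    return probability / (1.0 - probability)


def odds_ratio(p_category_1, p_category_0):
    """Odds of success in category x = 1 relative to category x = 0."""
    return odds(p_category_1) / odds(p_category_0)


def odds_ratio_from_coefficient(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


crude_or = odds_ratio(0.89, 0.83)
print(round(crude_or, 3))
```

The same quantity is obtained from a fitted model by exponentiating the coefficient of the corresponding dummy variable.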

Overview of child labour exploitation and the related variables
Children should be studying, playing, and socializing with their peers so that their development is not disrupted. The law prohibits employers from employing children in both the formal and informal sectors. Even so, many children still work, some of them as child labour. To make matters worse, based on data from Susenas Core March 2018, 86.8 percent of child labourers are exploited. This shows that child labour is vulnerable to exploitation. They are exploited in terms of working hours and education. The average working time of child labour in Indonesia is 45 hours per week, which even exceeds the normal working hours of adult workers. In terms of school participation, 61.89 percent of child labourers no longer go to school, which means that only 38.11 percent of them are still in school. This also shows that child labour status interferes with children's school participation. The percentage of exploited child labour in each province is shown in Fig. 1. The provinces with the largest percentage of exploited child labour were DKI Jakarta, followed by Banten and Central Java. This result is in line with the research of Iryani and Priyarsono (2013), which stated that DKI Jakarta, Banten, and Central Java are the three provinces with the highest level of exploitation severity in terms of working hours [3]. All child labourers in DKI Jakarta (100 percent) are exploited. This is because the majority of child labourers in DKI Jakarta (87.2 percent) work in the formal sector. Workers in the formal sector must follow standard working hours like adult workers, which certainly exceed the limit of working hours allowed for children. The average working time of child labour in DKI Jakarta is 57.2 hours, with a maximum of 84 hours, which is not an outlier. In terms of school participation, 67.5 percent of child labourers in DKI Jakarta are no longer in school. Besides that, the entire DKI Jakarta area is an urban area that is also the center of Indonesia's economy. All of these factors create great potential for the exploitation of child labour in DKI Jakarta [3].
Table 1 shows the percentage of exploited child labour by category of the individual explanatory variables. Based on Table 1, the proportion of exploited child labour is greater for boys than for girls: 89 percent of boys who are child labourers are exploited, compared with 83 percent of girls. This is inseparable from the greater responsibility of males to support the family economy [6]. This result is also supported by the fact that there are more boys than girls among child labourers. Table 1 also shows that the proportion of exploited child labour in the formal sector is 98.3 percent, which is greater than the 76.8 percent in the informal sector. Almost all child labourers in the formal sector are exploited. This condition can occur because jobs in the formal sector have long working hours that are not suitable for children, which makes them vulnerable to exploitation, especially in terms of working hours.
The percentages of exploited child labour in urban and rural areas are both high. However, the proportion of exploited child labour in urban areas is greater than in rural areas. This is because more child labourers living in urban areas work in the formal sector: twice as many as those working in the informal sector. The individual explanatory variables used in this study include the characteristics of the household head as the party responsible for the development of a child. Based on the occupation sector of the household head, the proportion of exploited child labourers is higher among children whose household head works in the agricultural sector than among children whose household head works in the non-agricultural sector. Based on the education of the household head, the percentage of exploited child labour increases as the level of education of the household head declines. Higher education goes in line with higher income, so households with a highly educated household head are less likely to have children who are child labourers, including exploited ones.
The exploitation of child labour is presumed to be influenced not only by individual characteristics but also by the characteristics of the groups in which child labourers live, in this case the regency/municipality. The poverty rate and the mean years of schooling of every regency/municipality were used as contextual variables in this study. The highest poverty rate in Indonesia is in Deiyai Regency, Papua, where 43.49 percent of the population have expenses below the poverty line. The second highest is in Intan Jaya Regency, Papua, where 42.71 percent of the population live below the poverty line. These values are very different from the poverty rate in South Tangerang City, where only 1.68 percent of the population are below the poverty line. Based on the boxplot of the poverty rate in Fig. 2(a), the distribution of the poverty rate is right-skewed. This shows that many regions have a low poverty rate. Many regions are outliers, which means their poverty rates are very high compared with the others.
The mean years of schooling vary widely between regencies/municipalities in Indonesia. Many regencies/municipalities are outliers in mean years of schooling, both at the top and at the bottom, as shown in Fig. 2(b). The region with the smallest mean years of schooling was Nduga Regency, with 0.85 years. This means that on average Nduga residents did not complete grade 1 of elementary school. This figure is very small compared with the mean years of schooling in Banda Aceh, the region with the highest mean years of schooling in Indonesia, namely 12.60 years. On average, residents of Banda Aceh City have completed high school.

Fig. 2.
Boxplot of (a) poverty rate, (b) mean years of schooling by regency/municipality.

The influence of the individual and contextual variables on exploitation status of child labour
The influence of the individual variables, namely the characteristics of child labourers, and the contextual variables, namely the characteristics of the regency/municipality where the children reside, on the exploitation status of child labour was analyzed through multilevel binary logistic regression. The first stage of the multilevel binary logistic regression analysis is testing for the significance of the random effect. At this stage, it is checked whether there is variation between groups. Using the formula in equation (9), the likelihood ratio value obtained is the deviance D = 694.79. This value is greater than $\chi^2_{0.1,1}$ = 2.71, which shows that with a 90 percent confidence level there is variation between regencies/municipalities. It means that the two-level (multilevel) binary logistic model fits the data better than the one-level binary logistic model.
Furthermore, the parameters are tested to check whether the explanatory variables simultaneously influence the exploitation status of child labour. The likelihood ratio statistic in equation (10), which compares the model without explanatory variables and the model with explanatory variables, exceeds the chi-square critical value of 13.36. It can be said that with a 90 percent confidence level there is at least one explanatory variable that significantly influences the exploitation status of child labour. The simultaneous test is then followed by a partial test to identify which explanatory variables significantly influence the exploitation status of child labour. The Wald test statistic is calculated using equation (11), and the results are shown in Table 2. Based on Table 2, gender, occupation sector of child labour, occupation sector of household head, poverty rate, and mean years of schooling significantly influence the exploitation status of child labour in Indonesia in 2018. The multilevel binary logistic regression model is formed from the parameter estimates in Table 2.
Based on the partial test, gender significantly influences the exploitation status of child labour. The odds of being exploited for boys who are child labourers are 2.18 times those for girls. This result is in accordance with the research of [6], which found that boys have a greater responsibility to support the family's welfare, so they are more vulnerable to exploitation. The odds of being exploited for child labourers in the formal sector are 11.53 times those for child labourers in the informal sector. The descriptive analysis shows that 52.9 percent of exploited child labourers work in the formal sector. Child labourers are categorized as working in the formal sector if they are labourers or employees whose working hours follow the rules set by the employer. This causes child labourers to have working hours like adult workers, so they are exploited in terms of working hours. Child labourers with long working hours are vulnerable to disruption of their education, so they are also exploited in terms of education. The occupation sector of the household head also significantly influences the exploitation status of child labour. The odds of being exploited for child labourers whose household head works in the agricultural sector are 1.22 times those for child labourers whose household head works in the non-agricultural sector. Parents tend to encourage their children to work on agricultural land as unpaid family workers [1]. The majority of child labourers whose household heads work in the agricultural sector come from poor families. Poverty makes the economic burden borne by the family heavier, so children have to work and are then exploited.
Both contextual variables used in this study significantly influence the exploitation status of child labour. The regression coefficients for both contextual variables are negative, which means that a higher poverty rate or a higher mean years of schooling in a regency/municipality decreases the tendency of child labour in that region to be exploited. Every one percent increase in the poverty rate in a regency/municipality multiplies the odds of child labour in that region being exploited by 0.98, that is, it lowers them by about 2 percent. This result is contrary to the research of [6], which found that a decrease in the number of poor people is accompanied by a decrease in the number of child labourers. Poverty gives children a duty to earn a living by becoming child labour. The pressure of poverty makes a child actively involved in economic activities [2]. On the other hand, many children from non-poor families are allowed by their parents to become child labour because it is considered a learning process through which the child gains skills and work experience that will increase the child's independence [10].
Every one-year increase in the mean years of schooling multiplies the odds of child labour being exploited by 0.76, that is, it lowers them by about 24 percent. An increase in the mean years of schooling shows that the level of education of the community in that region is better, so people are less inclined to let their children enter the world of work. Besides, the availability of work opportunities for child labour and the availability of educational facilities in the area of residence influence the incidence of child labour exploitation [11].
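The interpretation of the reported odds ratios can be checked with simple arithmetic. The sketch below takes the five odds ratios quoted in the text and converts each into a percentage change in the odds and into the implied logit coefficient. The dictionary labels and function names are made up for this illustration, and the five-unit example assumes the effect is constant on the log-odds scale, as the model specifies.

```python
import math

# Odds ratios as reported in the text
reported_odds_ratios = {
    "gender (boy vs girl)": 2.18,
    "sector of child labour (formal vs informal)": 11.53,
    "sector of household head (agriculture vs non-agriculture)": 1.22,
    "poverty rate (per one percent increase)": 0.98,
    "mean years of schooling (per one-year increase)": 0.76,
}


def percent_change_in_odds(odds_ratio, units=1.0):
    """Percentage change in the odds for an increase of `units` in the variable."""
    return (odds_ratio ** units - 1.0) * 100.0


def implied_coefficient(odds_ratio):
    """Logit coefficient implied by an odds ratio (its natural logarithm)."""
    return math.log(odds_ratio)


for name, value in reported_odds_ratios.items():
    print(name, round(percent_change_in_odds(value), 1), round(implied_coefficient(value), 3))
```

An odds ratio below 1 corresponds to a negative coefficient and a decrease in the odds, which is why the poverty rate and the mean years of schooling both lower the odds of exploitation. Under the same model, a five-point increase in the poverty rate would multiply the odds by 0.98 raised to the fifth power, a decrease of roughly 9.6 percent.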