Maintaining Religious Harmony through Predicting the Level of Lawlessness using Linear Regression

Crimes are fundamentally attributed to weak faith and ultimately disrupt harmony, including inter-religious harmony. The purpose of this study is therefore to predict the number of law violations in Yogyakarta Province in 2022, as a case study, using a linear regression algorithm trained on data from 2012 through the partial data available for 2022. The method used is a forecasting method consisting of the stages of data collection, data preprocessing, data modeling


I. INTRODUCTION
Indonesia is a country whose people adhere to different religions, including Islam, Catholicism, Protestantism, Hinduism, Buddhism, and Confucianism. This religious diversity is bound together by the values of Pancasila, the basis of the state, to create harmony between religious communities [1]. When weak faith leads a person to commit a crime, inter-religious harmony can be damaged [2].
Crime is commonly understood as behavior that violates the law and for which a person can be punished. It occurs when someone breaks the law directly or indirectly, or through negligence that can result in punishment. In Yogyakarta Province, the number of law violations varies from year to year. These numbers can be tallied annually and predicted using a variety of methods. In this study, we use a forecasting method to predict the number of law violations that will occur in 2022.
Various forecasting methods have been used to predict the number of law violations, including Double [3], Triple Exponential Smoothing [4], Fuzzy Logic [5], and Exponential Smoothing [6]. These methods have been applied in various places, such as the Indonesian seas [3], Probolinggo Regency, and Batam City.
Yogyakarta is a province that many people visit for activities such as study tours, holidays, and recreation because of its uniqueness. The more people there are, the greater the likelihood that law violations will occur. A system for predicting the number of law violations is therefore needed to support public order and security, so that Yogyakarta Province remains safe and peaceful. In this research, we built a prediction system for the number of law violations in Yogyakarta Province using a linear regression algorithm. Accordingly, this study aims to predict the level of crime, with Yogyakarta as a case study for the machine learning model, in order to help maintain religious harmony.

II. RELATED WORKS
Forecasting with a linear regression algorithm has been applied in various studies, including predicting the supply of tablet-type drugs [7], predicting long-term electricity usage [8], and predicting the number of students enrolled at Maranatha Christian University [9]. In addition, several studies use regression to predict or analyze crime: (1) a machine learning model to predict the trends and patterns of crime in Bangladesh [10]; (2) an analysis and prediction model for crimes using many variables [11]; (3) regression discontinuity evidence on organizational crime and formal employment in Colombia [12]; (4) an examination of crime and the shared housing industry, part of the sharing economy, using a spatially weighted regression method [13]; (5) detailed neighborhood-level trends in crime in Chicago during the COVID-19 epidemic [14]; and (6) fuzzy rough nearest neighbor and sequential minimal optimization with logistic regression for detecting credit card fraud [15].

A. Research activities
The research method is the procedure used to collect data from various sources. It consists of a literature study, a review of references related to this research, and the collection of data needed to address the problem formulation. The research stages describe how the study is carried out. These stages are shown in Figure 1, beginning with problem identification and ending with the conclusions drawn from the prediction.
• Identification of problems: Formulating or identifying the problem is the first step in research, since a clearly stated problem makes it possible to search for a solution. In this study, the problem to be solved is how to predict the number of law violations that will occur in Yogyakarta Province in 2022 using a linear regression model.
• Study of literature: At this stage, we review journals and other publications that discuss topics related to this research.
• Data collection: The data used in this study is official data sourced from http://bappeda.jogjaprov.go.id/. The data covers the past 11 years (2012-2022) and contains the types of crimes together with the number of cases of each type. To produce a predictive output, the data is divided into input and output categories.
• Data preprocessing: Data preprocessing transforms the data into a simpler and more effective format in order to obtain more accurate values and reduce computation time for large-scale problems, making the data values smaller without changing the information they contain.
• Modeling: At this stage, a new linear regression model is created so that predictions can be made on the data.
• Prediction with regression: In the prediction stage, the model is fit on the training data and validated on the test data.
• Model evaluation: Accuracy is calculated for both training and testing so that the difference in accuracy between the two stages can be seen.
• Conclusion drawing: Conclusions are drawn from the predicted data, which is then compared with the original data.
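The data collection and preprocessing stages above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the counts are hypothetical (they are not the BAPPEDA figures), the helper names are invented for this example, and min-max scaling is only one assumed way to make "data values smaller without changing the information they contain".

```python
# Hypothetical yearly counts for two offence types (NOT the BAPPEDA figures).
years = [2012, 2013, 2014, 2015, 2016]
raw_counts = {
    "motor_vehicle_theft": [40, 36, 33, 30, 28],
    "drugs": [80, 88, 95, 101, 110],
}


def to_input_output(year_list, counts):
    """Divide the data into an input series (years) and an output series (counts)."""
    return list(year_list), list(counts)


def min_max_scale(values):
    """Rescale values to [0, 1]; also return (min, max) so the step is reversible."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant series carries no variation; map every value to 0.0.
        return [0.0 for _ in values], (lo, hi)
    return [(v - lo) / span for v in values], (lo, hi)


def inverse_scale(scaled, bounds):
    """Map scaled values back to the original units."""
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled]


for offence, counts in raw_counts.items():
    x, y = to_input_output(years, counts)
    y_scaled, bounds = min_max_scale(y)
    print(offence, [round(v, 3) for v in y_scaled], bounds)
```

Keeping the `(min, max)` pair alongside the scaled values is what makes the transformation lossless: predictions made on the scaled series can be converted back to case counts with `inverse_scale`.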

B. Linear regression
In general, two aspects of the relationship between two or more variables are of interest: the form of the relationship and the closeness of the relationship. Regression analysis is used to determine the form of the relationship, while correlation analysis measures its closeness. Linear regression is also described as a statistical method used to model the relationship between the dependent variable (Y) and one or more independent variables (X) [16]-[19], with the aim of estimating and predicting the population mean, or mean value, of the dependent variable from known values of the independent variables. The result of the regression analysis is a coefficient for each independent variable X. These coefficients define the equation used to predict the value of the dependent variable Y, and they are calculated so as to minimize the deviation between the actual and estimated values of Y based on the existing data. Regression analysis comprises two equation models: simple linear regression and multiple linear regression. Formulas (1) and (2) give the simple and multiple linear regression models, respectively.
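Formulas (1) and (2) do not appear in this text. For reference, the standard textbook forms of the two models are given below; the symbols $a$, $b$, $b_k$, and the estimator expressions are the conventional notation and are an assumption here, not necessarily the notation the authors used.

```latex
% (1) Simple linear regression: one independent variable X
Y = a + bX \tag{1}

% (2) Multiple linear regression: n independent variables X_1, ..., X_n
Y = a + b_1 X_1 + b_2 X_2 + \dots + b_n X_n \tag{2}

% Least-squares estimates for (1), given m observations (x_i, y_i)
% with sample means \bar{x} and \bar{y}:
b = \frac{\sum_{i=1}^{m} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{m} (x_i - \bar{x})^2},
\qquad
a = \bar{y} - b\,\bar{x}
```

Here $a$ is the intercept and $b$ (or each $b_k$) is the regression coefficient. The least-squares estimates are the values that minimize the sum of squared deviations between the actual and estimated values of $Y$.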

A. Data Pre-processing
We do not model the data obtained from the official website of the Yogyakarta provincial development planning agency directly; instead, we first carry out a data pre-processing stage. Data pre-processing converts raw data into a form that is easier to understand [20], [21]. This step is important because raw data often lacks a regular format. It also lets us examine the data from a statistical point of view, so that the correlation between one variable and another becomes visible.
• Correlation between variables: The last stage of data pre-processing was to examine the correlation between variables. In this research, we look for correlations between the types of law violation that occurred in Yogyakarta Province. The results show that most types of law violation increase when other types increase. In a minority of cases, three types of law violation decrease when others increase: rape, illegal mining, and illegal logging. Illegal logging cases decrease when illegal mining cases increase, and vice versa. A correlation was also found between these two types and rape cases: when cases of illegal mining and illegal logging increase, rape cases decrease.
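The paper does not state which correlation measure was used. As a sketch, assuming the Pearson correlation coefficient, the sign of the relationship between two offence types could be checked as below. The two series are hypothetical and chosen only to show a negative correlation; they are not the study's data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        # Correlation is undefined for a constant series.
        return float("nan")
    return cov / (dev_x * dev_y)


# Hypothetical yearly case counts (illustrative only).
illegal_mining = [2, 3, 5, 6, 8]
illegal_logging = [9, 8, 6, 5, 3]

r = pearson(illegal_mining, illegal_logging)
print("negative" if r < 0 else "non-negative", "correlation")
```

A negative coefficient corresponds to the "one decreases when the other increases" pattern described above, while a positive coefficient corresponds to violations that rise together.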

B. Data Indexing
After data pre-processing, indexing is carried out to initialize the variables used in the prediction process with the linear regression algorithm. Indexing determines the position of the data in Cartesian coordinates, so that the result takes the form of predicted numbers of law violations for 2022. In this step, the first variable is the law violation data that we already have, and the second variable is the law violation data to be predicted. After indexing, law violations in 2022 are predicted using the linear regression algorithm. After modeling, the predictions obtained for law violations in Yogyakarta Province in 2022 were 20 cases of motor vehicle theft, 1 case of circulation of counterfeit money, 111 drug cases, 10 cases of serious maltreatment, and 235 cases of fraud/fraudulent acts.
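One possible reading of the indexing step is sketched below: each offence type becomes a point in Cartesian coordinates, with the known count as x and the count to be predicted as y, and a least-squares line is fit through those points. This interpretation, the function names, and all numbers are assumptions made for illustration and are not taken from the study.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Evaluate the fitted line at each value in x."""
    return [intercept + slope * xi for xi in x]


# Hypothetical counts per offence type: x = an earlier year, y = the following year.
known_counts = [10, 20, 30, 40]
next_counts = [12, 22, 32, 42]

intercept, slope = fit_line(known_counts, next_counts)

# Apply the fitted line to the most recent counts to forecast one year ahead.
forecast = predict(intercept, slope, next_counts)
print([round(v) for v in forecast])
```

Rounding the forecast to whole numbers mirrors the way the predicted case counts are reported as integers.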

C. Prediction Results
This section discusses the prediction of the level of lawlessness in Yogyakarta Province in 2022 using a linear regression model. To make this prediction, the dataset is split into training data and test data, with the test data making up 40% of the dataset. The resulting accuracy is 99% on the training data and 94% on the test data. The gap between the two, with the test accuracy lower than the training accuracy, indicates that there is an important change or difference in the underlying data (Pyle et al., 1999). In this study, data on criminal acts was taken from the previous 9 years, from 2012 to 2021.
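The 40% hold-out evaluation can be sketched as follows. The paper does not define "accuracy"; this sketch assumes it refers to the coefficient of determination (R²), which is a common score for regression models. The data, the seed, and the helper names are hypothetical.

```python
import random


def train_test_split(x, y, test_size=0.4, seed=0):
    """Shuffle the row indices and hold out a fraction of rows for testing."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(x) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([x[i] for i in train_idx], [x[i] for i in test_idx],
            [y[i] for i in train_idx], [y[i] for i in test_idx])


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: 10 rows with a roughly linear trend (illustrative only).
x = list(range(10))
y = [12, 15, 14, 18, 21, 20, 24, 27, 26, 30]

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4)
intercept, slope = fit_line(x_train, y_train)

r2_train = r2_score(y_train, [intercept + slope * v for v in x_train])
r2_test = r2_score(y_test, [intercept + slope * v for v in x_test])
print(f"train R^2 = {r2_train:.2f}, test R^2 = {r2_test:.2f}")
```

Scoring the model on rows it was not fit on is what makes the test figure an estimate of how well the line generalizes, and it is why a test score below the training score is the usual pattern.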
Linear regression is a machine learning approach used to test the extent of a causal relationship between variables (Kamal A, 2012). The causal variable is usually written as X, and the effect or response variable is written as Y. After pre-processing and splitting the data, the model is fit on the X and Y training data and predictions are made on the X test data, yielding an array of predicted values: [25, 1, 111, 1, 11, 243, 1, 1, 0

V. CONCLUSION
Based on the analysis, we conclude that forecasting can be carried out with a linear regression algorithm by following clearly defined stages. The process begins with data collection; in this research, crime data were taken from the previous 9 years, from 2012 to 2021. It then proceeds through data preprocessing and data modeling, and ends with prediction using the linear regression algorithm. The predicted law violations for 2022 are 25 cases of motor vehicle theft, 1 case of counterfeit money, 111 drug cases, 1 case of illegal logging, 11 cases of serious abuse, 243 cases of fraud/fraudulent acts, 1 case of mass riots, 1 case of terrorism, 3 cases of cybercrime, and 17 cases of violent theft. These predictions are intended to support efforts to maintain and increase religious harmony. Future research may use other algorithms or machine learning methods.